I've started Python some days ago and wrote a small Telnet implementation according to RFC 854 and RFC 5198. It is pretty low-level and sending messages directly isn't even supported. The highest abstractions are commands and user data where commands include negotiation (like "Do") and "general-purpose" commands (like "Go Ahead").
The best thing about it is that, using the __bytes__
method, these data can be converted to their binary representation and directly embedded into a data stream using TCP or just storing it temporarily, making it very flexible.
Code (libtelnet.py
):
# This module contains code and data to provide a slight abstraction from
# the socket module to provide Telnet client facilities.
import enum
import socket
import struct
# The Telnet protocol is based on communication between two NVTs (Network
# Virtual Terminals). They are an abstraction of terminals, so the protocol
# can operate independently of the actual terminal used.
# Standard Telnet option negotation syntax (partially quoted from
# RFC 854):
# - WILL XXX: indicate local party's desire (an offer) to begin performing
# option XXX
# - DO/DON'T XXX: positive/negative acknowledgements of WILL XXX
# - DO XXX: tell other party to begin performing option XXX
# - WILL/WON'T XXX: positive/negative acknowledgements of DO XXX
# An NVT consists of a printer and a keyboard; the former for outputting
# information, the ladder for inputting it. By default, the NVT is a half-duplex
# device in line-buffered mode. This can be altered by options, though.
# The Go Ahead (GA) command is to be issued when the Telnet client has sent all
# data to the printed and no input from the keyboard is waiting to support
# "lockable" keyboards at a half-duplex connection, for instance, the IBM 2741:
# it is locked when the other end is sending. The GA command unlocks it again.
# The GA command's purpose is not to signalize an EOL; usually, the machine
# has other mechanisms for handling and identifying such a state, so a GA
# command would be superflous.
# Other Telnet commands:
# - IP (Interrupt Process): interrupt process due to an infinite loop, for
# example
# - AO (Abort Output): abort output of text. Example: cat test.txt, where
# test.txt is so huge, it takes some seconds to print
# all the content. If an AO is issued in between, the
# rest of the output is discarded and the shell awaits
# the next command.
# - AYT (Are You There): issued to check responsiveness of the local system
# when, for example, a computation takes very long.
# - EC (Erase Character): deletes the most recent character in the data stream.
# Takes control characters into account.
# - EL (Erase Line): delete the current line of input.
# Valid control characters. All other control characters are ignored by the NVT.
# "\r\n" is a newline, "\r\x00" is a pure carriage return, as specified by
# RFC 854.
PORT = 23
# Old CtrlChars class.
#class CtrlChars(enum.Enum):
# NUL = "\x00"
# LF = "\n"
# CR = "\r"
# RFC 5198 ADVISES AGAINST using the lower control characters; they "SHOULD
# NOT be used unless required by exceptional circumstances."
# BEL = "\a"
# BS = "\b"
# HT = "\t"
# VT = "\v"
# FF = "\f"
# GOOD_CTRL_CHARS = (NUL, LF, CR)
# BAD_CTRL_CHARS = (BEL, BS, HT, VT, FF)
# TODO: RFC 5198 talks about "undefined control codes" - add support for
# these and find out what they are!
# "Strictly-conforming" control characters as specified by RFC 854 and 5198.
class CtrlChars(enum.Enum):
CRLF = "\r\n"
CRNUL = "\r\x00"
BEL = "\a"
BS = "\b"
HT = "\t"
VT = "\v"
FF = "\f"
# CONSTRAINTS IMPOSED BY RFC 5198:
# 1. The sequence CR NUL should not be used.
# 2. LF CR should not appear except for consecutive CR LF sequences
# (CR LF CR LF).
# 3. HT and a bare LF are problematic. The RFC doesn't discourage its usage
# explicitly but I dare propose avoiding them anyway
# TODO: RFC 5198 says:
# Appendix D. A Note about Related Future Work
#
# Consideration should be given to a Telnet (or SSH [RFC4251]) option
# to specify this type of stream and an FTP extension [RFC0959] to
# permit a new "Unicode text" data TYPE.
# FIND OUT IF THIS HAS BEEN DESIGNED SO FAR AND IMPLEMENT IT THEN!
# Didn't know where else to put this...
IAC = 255 # Interpret As Cmd
class Commands(enum.IntEnum):
SE = 240 # end of subnegotation parameters
NOP = 241 # no-op
DM = 242 # Data Mark portion of SYNCH
BRK = 243 # NVT character BRK (indicating Break or Attention key was hit)
IP = 244 # Interrupt Process
AO = 245 # Abort Output
AYT = 246 # Are You There
EC = 247 # Erase Character
EL = 248 # Erase Line
GA = 249 # Go Ahead
SB = 250 # subnegotation
# Negotation codes:
WILL = 251
WONT = 252
DO = 253
DONT = 254
NEG_CMD_RANGE = [ Commands.WILL, Commands.DO ]
# Telnet options as specified by the IANA (Internet Assigned Numbers
# Authority):
class Options(enum.IntEnum):
"""The Telnet options as specified the IANA (Internet Assigned Numbers
Authority). However, this class only contains the ones recognized by
this implementation. The implementation will refuse every option not
contained in here.
"""
TRANSMIT_BIN = 0
ECHO = 1
RECON = 2
SUPPRESS_GA = 3
APPROX_MSG_SIZE_NEG = 4
STATUS = 5
TIMING_MARK = 6
RCTE = 7
OUT_LINE_WIDTH = 8
OUT_PAGE_SIZE = 9
NAOCRD = 10
NAOHTS = 11
NAOHTD = 12
NAOFFD = 13
NAOVTS = 14
NAOVTD = 15
NAOLFD = 16
EXT_ASCII = 17
LOGOUT = 18
BM = 19
DET = 20
SUPDUP = 21
SUPDUP_OUT = 22
SEND_LOC = 23
TERMINAL_TYPE = 24
EOR = 25
TUID = 26
OUTMRK = 27
TTYLOC = 28
N3270_REGIME = 29 # A Python identifier must not start with a digit, so
# the N ("Negotiate") has been added
X3PAD = 30
NAWS = 31
TERMINAL_SPEED = 32
TOOGLE_FLOW_CTRL = 33
LINEMODE = 34
XDISPLOC = 35
ENVIRON = 36
AUTHN = 37
ENCRYPT = 38
NEW_ENVIRON = 39
TN3270E = 40
XAUTH = 41
CHARSET = 42
RSP = 43
COM_PORT = 44
SUPPRESS_LOCAL_ECHO = 45
START_TLS = 46
KERMIT = 47 # y'know Kermit the frog? ^^ Maybe derived from him...
SEND_URL = 48
FORWARD_X = 49
# Big gap with unassigned option codes...
TELOPT_PRAGMA_LOGON = 138
TELOPT_SSPI_LOGON = 139
TELOPT_PRAGMA_HEARTBEAT = 140
# Another big gap with unassigned option codes...
EXOPL = 255
class SuboptHandlers:
"""Contains handler functions, which are called when subnegotation
should take place. They return a bytes object containing an adequate
response to the subnegotation from the Telnet server.
"""
#def hLINEMODE():
# TODO: yet to be written
def _repr(self):
return "<" + (", ".join([str(key) + "=" + str(value) for key, value in
self.__dict__.items()])) + ">"
class Command:
def _calc_bytes(self):
data = [IAC, self.cmd.value]
if self.option != None:
data.append(self.option.value)
data.extend([i.value for i in self.option_args])
return bytes(data)
def __init__(self, cmd, option=None, option_args=[]):
# Used to protect against undefined commands. To add a new command,
# augment the Command enum.
if (not isinstance(cmd, Commands)):
raise TypeError("invalid Telnet command")
# Similar to the case above.
if (not isinstance(option, Options)):
raise TypeError("invalid Telnet option")
self.__dict__["cmd"] = cmd
self.__dict__["option"] = option
# Direct assignment to calculate byte representation.
self.option_args = option_args
def __bytes__(self):
return self._bytes
def __setattr__(self, name, value):
self.__dict__[name] = value
# Actually, it's superflous to recalculate the WHOLE bytes object but
# it's the easiest to implement.
# TODO: ELIDE THIS INEFFICIENCY!
self.__dict__["_bytes"] = self._calc_bytes()
def __repr__(self):
return _repr(self)
class UserData:
def __init__(self, data):
self._data_str = ""
# I think (it's untested) the lopp be replaced by
# "".join([if (isinstance(i, CtrlChars) i.value else i for i in data])
# but that would require (I think it would) two iterations over the
# list. Half as fast as the lower loop.
for i in data:
if isinstance(i, CtrlChars):
self._data_str += i.value
else:
# TODO: we don't check the normal strings for control
# characters here yet. ADD THIS **EFFICIENTLY**!
self._data_str += i
def __bytes__(self):
return bytes(self._data_str, "ASCII")
def __str__(self):
return self._data_str
def __repr__(self):
return _repr(self)
# TODO: add support for options and suboptions!
# options: http://www.iana.org/assignments/telnet-options/telnet-options.xhtml
The API isn't finished yet but I am just asking about the basic structure of it:
How is the inclusion convention in Python? In C++, you shouldn't do
using namespace xxx;
but apparently it's quite Pythonic to dofrom xxx import *
. If this is true, my API will be problematic because things likePORT
are not directly associated to Telnet and might collide with a TCP implementation usingPORT
as well.Do I use
enum
s well? Or do I overuse them? Do they make sense here at all?Is the code readable, comprehensible, and alright overall? My worst fear is that it is too similar to C++- and C-ish and I miss opportunities to write it the true Pythonic way.
I talked flexibility in the second paragraph. What do you think of that approach using
__bytes__
?
1 Answer 1
Q&A
1) This is wrong. Actually from xxx import *
is considered bad practice, for
the exact reason you point out. It is only acceptable for very specific modules
like pylab
, which have a specific intent doing so (setup a MATLAB-like
environment in this case).
2) Enums are very new in Python, so it is easy to argue that you don't need them at all. But I feel your use is consistent, even if it seems quite proeminent at the moment in front of the actual do-something part of your code.
String Manipulation, Generator Expression
I focus on the _repr
function. Avoid concatenating strings with the +
operator. This is because strings are immutable, so you actually create a copy
of concatenated strings "each time" (okay, sometimes Python is more intelligent
than that but this is off-topic for a start). Use format
and join
whenever you can. You will improve readability.
You know what list comprehensions do. Well done. But remember that they
build a list, eating up memory. For all functions that accept an iterable,
you can replace this by a generator expression (just remove the []
).
def _repr(self):
return "<{members}>".format(
members=", ".join(
"{key:!s}={value:!s}".format(key=key, value=value)
for key, value in self.__dict__.items()
)
)
Same remark for UserData.__init__
:
for + str + "+=" = FAIL
Place all string chunks in a list, join them at the end.
class UserData:
def __init__(self, data):
data_elements = []
for i in data:
if isinstance(i, CtrlChars):
data_elements.append(i.value)
else:
data_elements.append(i)
self._data_str = "".join(data_elements)
Your note on inefficiency
You gain a lot by waiting for the very last moment to process data. At least in performance: if you don't need, you don't do. You avoid unnecessary actions. But also in readability. If you do things when you need them, it's more clear why you do it, and also what you are doing.
Now is better than never.
Although never is often better than right now.
— The Zen of Python, Tim Peters.
One idea is to setup an invalidation mechanism. This delays the creation of the bytes to when it is really needed. Below is only an example.
class Command:
def __init__(...):
super().__setattr__("_bytes_valid", False)
super().__setattr__("_bytes", b'')
# ...
def __bytes__(self):
# Recalc on access if necessary (cache-like)
if not self._bytes_valid:
super().__setattr__("_bytes", self._calc_bytes())
super().__setattr__("_bytes_valid", True)
return self._bytes
def __setattr__(self, name, value):
super().__setattr__(name, value)
super().__setattr__("_bytes_valid", False)
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
— Structured Programming with Goto Statements, Donald Knut.
The question being are we really in that critical 3%? Considering how often
the __bytes__
method is supposed to be called (probably once before sending
the command), and how many times an attribute is set (at least 3 times with
cmd
, option
and option_args
), IMHO the best solution in that case is
class Command:
def __bytes__(self):
return self._calc_bytes()
Tips & Tricks
class Command:
def __repr__(self):
return _repr(self)
can be replaced by
class Command:
__repr__ = _repr
This will save you a function call.
The last parameter of Command.__init__
is a good candidate for argument unpacking.
def __init__(self, cmd, option=None, *option_args):
Blocks cannot be empty. That's why you have to use the pass
statement.
class SuboptHandlers:
"""Contains handler functions, which are called when subnegotation
should take place. They return a bytes object containing an adequate
response to the subnegotation from the Telnet server.
"""
pass # There
Final thoughts on design
Of course, this is MY review, not THE review. My opinion can be discussed.
(Beware, big blahblah ahead).
It is bad practice to code without having layed out a solid design. Roughly speaking, you collect use cases and requirements, select an architecture and refine your design. Only then you write code.
Extensibility shall be chosen by design and not by failing a design. Because it comes at the price of performance, readability and simplicity, it has to be a required feature of your design. "if I want to add something" is not a good reason. "when I will add that feature" is.
That's why, considering the actual state of your code, I suggest you use functions rather than classes and standard containers rather than objects.
Simple is better than complex. [...]
Flat is better than nested. — The Zen of Python, Tim Peters.
def userdata_as_bytes(data):
data_elements = []
for i in data:
if isinstance(i, CtrlChars):
data_elements.append(i.value)
else:
data_elements.append(i)
return ("".join(data_elements)).encode("ascii")
def command_as_bytes(command):
# In this implementation example, command is just a tuple or a list
# containing (cmd, option, ...). This is not required, but shows you
# how standard containers can also be suited for encapsulation.
# You could instead declare
# command_as_bytes(cmd, option=None, *option_args)
cmd, option, *option_args = command
# Validation
if (not isinstance(cmd, Commands)):
raise TypeError("invalid Telnet command")
if (not isinstance(option, Options)):
raise TypeError("invalid Telnet option")
data = [IAC, cmd.value]
if option is not None:
data.append(option.value)
data += [optarg.value for optarg in option_args]
return bytes(data)
Be lazy. Delay actions till you are sure you (will) need them.
— A stupid quote, Me
-
1\$\begingroup\$ Why not keep
command_as_bytes(cmd, option=None, *option_args)
as a signature as proposed earlier? \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2016年02月09日 07:55:29 +00:00Commented Feb 9, 2016 at 7:55 -
\$\begingroup\$ I used
self.__dict__
inself.__init__
to circumventself.__setattr__
, which will calculate the byte representation even though the needed fields haven't been set yet. Isn't that right? Also, why not classes? Is conversion via functions really the way to go despiteself.__bytes__
? Classes seem also more extensible if I want to add something. And regarding "[...] waiting for the very last moment to process data [...]": which part are you referring to in my/your code? \$\endgroup\$cadaniluk– cadaniluk2016年02月09日 11:34:06 +00:00Commented Feb 9, 2016 at 11:34 -
\$\begingroup\$ @MathiasEttinger: One use of a class is to encapsulate information as a single object you can manipulate easily. Here the point is to show that you can use a tuple instead of an object and gain the same functionality (a single entity that you build and forget). I shall probably explain this in the answer. \$\endgroup\$Cilyan– Cilyan2016年02月09日 13:45:58 +00:00Commented Feb 9, 2016 at 13:45
-
1\$\begingroup\$ Didn't you meant to ping @cad instead of me with your previous comment? \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2016年02月09日 13:54:02 +00:00Commented Feb 9, 2016 at 13:54
-
\$\begingroup\$ @cad:
self.__dict__
: My, bad. I removed it. I extended sections Your note on inefficiency and Final thoughts on complexity to answer your other questions. @MathiasEttinger: No, I was answering your question. Maybe the ping practice is different here than on SO? I'll check. \$\endgroup\$Cilyan– Cilyan2016年02月09日 16:53:39 +00:00Commented Feb 9, 2016 at 16:53