Here is my solution to reddit.com/r/dailyprogrammer #93 (easy)., which is a Python script that translates between English and Morse code.
Is String_To_Translate
a good example of a class? Am I violating convention? How can I improve this code and make it more readable and pythonic?
import re
morse_code_dict = { 'a':'.-',
'b':'-...',
'c':'-.-.',
'd':'-...',
'e':'..-.',
'f':'..-.',
'g':'--.',
'h':'....',
'i':'..',
'j':'.---',
'k':'-.-',
'l':'.-..',
'm':'--',
'n':'-.',
'o':'---',
'p':'.--.',
'q':'--.-',
'r':'.-.',
's':'...',
't':'-',
'u':'..-',
'v':'...-',
'w':'.--',
'x':'-..-',
'y':'-.--',
'z':'--..',
'1':'.----',
'2':'..---',
'3':'...--',
'4':'....-',
'5':'.....',
'6':'-....',
'7':'--...',
'8':'---..',
'9':'----.',
'0':'-----',
',':'--..--',
'.':'.-.-.-',
'?':'..--..',
'/':'-..-.',
'-':'-....-',
'(':'-.--.',
')':'-.--.-',
' ':' ',
}
class String_To_Translate:
def __init__(self, string):
self.string = string
self.Translated = ''
self.lang = ''
def determine_lang(self):
exp = r'[a-zA-Z,円\?\/\(\)]'#regexp for non morse code char
if re.search(exp, self.string) == None:
self.lang = 'm'
else:
self.lang = 'eng'
def prep(self):
if self.lang == 'eng':
#get rid of whitespace & warn user
(self.string, numWhitespaceReplacements) = re.subn(r'\s', ' ', self.string)
if numWhitespaceReplacements > 0:
print 'Some whitespace characters have been replaced with spaces.'
#get rid of characters not in dict & warn user
(self.string, numReplacements) = re.subn(r'[^a-zA-Z,円\?\/\(\)\.\-\s]', ' ', self.string)
if numReplacements > 0:
print 'Some ({0}) characters were unrecognized and could not be translated, they have been replaced by spaces.'.format(numReplacements)
#convert to lowercase
self.string = self.string.lower()
def translate_from_morse(self):
eng_dict = dict([(v, k) for (k, v) in morse_code_dict.iteritems()])
def split_morse(string):
d = ' '
charList = [[char for char in word.split() + [' ']] for word in string.split(d) if word != '']
charList = [item for sublist in charList for item in sublist]
print charList
return charList
charList = split_morse(self.string)
for char in charList:
self.Translated += eng_dict[char]
def translate_from_eng(self):
charList=list(self.string)
for char in charList:
self.Translated += morse_code_dict[char] + ' '
def translate(self):
self.determine_lang()
if self.lang == 'm':
self.translate_from_morse()
elif self.lang == 'eng':
self.prep()
self.translate_from_eng()
else:
print 'shit'
if __name__=="__main__":
translateme = String_To_Translate(raw_input('enter a string: '))
translateme.translate()
print translateme.Translated
print 'translating back'
test = String_To_Translate(translateme.Translated)
test.translate()
print test.Translated
2 Answers 2
The first thing that stands out from your code is the big table giving the morse code: it's lengthy, hard to read, and hard to check. (And consequently, it contains two mistakes.) I would write something like this:
from itertools import izip def pairwise(s): """ Generate pairs from the sequence `s`. >>> list(pairwise('abcd')) [('a', 'b'), ('c', 'd')] """ it = iter(s) return izip(it,it) morse_encoding = dict(pairwise(''' a .- b -... c -.-. d -.. e . f ..-. g --. h .... i .. j .--- k -.- l .-.. m -- n -. o --- p .--. q --.- r .-. s ... t - u ..- v ...- w .-- x -..- y -.-- z --.. 1 .---- 2 ..--- 3 ...-- 4 ....- 5 ..... 6 -.... 7 --... 8 ---.. 9 ----. 0 ----- , --..-- . .-.-.- ? ..--.. / -..-. - -....- ( -.--. ) -.--.- ' .----. ! -.-.-- & .-... : ---... ; -.-.-. = -...- + .-.-. _ ..--.- " .-..-. $ ...-..- @ .--.-. '''.split())) morse_decoding = {v:k for k,v in morse_encoding.iteritems()}
You rebuild the decoding table every time you start a new translation. This seems wasteful: I suggest building it only once, as above.
Not everything needs to be a class. A class is a data abstraction whose instances represent some thing together with methods for operating on that thing. You don't actually have any persistent data here: all you are using the class for is to store the temporary state for a single translation). It makes more sense to structure this code as functions.
Your classes and methods have no docstrings and no doctests. Even a small doctest or two might have caught the mistakes in your Morse code table.
In your
determine_lang
method, you use a regular expression that matches strings that are not Morse code. It would be simpler to do the test the other way round: to determine that the string is Morse code (because Morse code contains only dots, dashes and spaces). Python's sets provide a clear and concise way to express this test:def is_morse_code(s): """ Return True if the string `s` resembles Morse code (contains only dots, dashes and spaces), False otherwise. >>> is_morse_code('... --- ...') True >>> is_morse_code('abc') False """ return set(s) <= set('.- ')
The original challenge may have referred to "English", but there's nothing actually English-specific about Morse code. It translates any language that can be written in the Latin alphabet. So I would prefer names that didn't presume that we were translating English.
When a character cannot be encoded or decoded, you print a message to inform the user. The usual approach in Python is to raise an exception.
Your translation code seems very complex for what it is doing. All you need to do is split the string into words, split each word into letters, translate each letter, join the translated letters into words, and then join the translated words into the result. This can be done neatly and concisely using
split
andjoin
and a couple of loops:def morse_encode(s): """ Encode the string `s` in Morse code and return the result. Raise KeyError if it cannot be encoded. >>> morse_encode('morse code') '-- --- .-. ... . -.-. --- -.. .' """ return ' '.join(' '.join(morse_encoding[l] for l in word) for word in s.lower().split(' ')) def morse_decode(s): """ Decode the string `s` from Morse code and return the result. Raise KeyError if it cannot be decoded. >>> morse_decode('-- --- .-. ... . -.-. --- -.. .') 'morse code' """ return ' '.join(''.join(morse_decoding[l] for l in word.split(' ')) for word in s.split(' '))
-
\$\begingroup\$ and for extra fun, return a function from determine_lang:
return lambda: morse_encode( str ) if re.search( exp, str ) else lambda: morse_decode( str )
\$\endgroup\$Glenn Rogers– Glenn Rogers2012年08月31日 11:33:14 +00:00Commented Aug 31, 2012 at 11:33 -
\$\begingroup\$ An excellent, thorough response. thank you Gareth. \$\endgroup\$dylan– dylan2012年09月04日 01:13:44 +00:00Commented Sep 4, 2012 at 1:13
Minor points:
self.Translated
should beself.translated
.- The use of
re
may be bit of overkill for what you want it for. - If you use a space in the text, you get told that it's being replaced by a space.
- Create constants instead of using 'm' and 'eng'
Points to consider (as in, what I would do!):
- Rather than having a single class, create separate classes for each direction, then select which in a function. The two directions share no code in common.
- Rather than storing the result, return it directly.
So I might end up with something like:
class TranslationToMorse:
def __init__( self, str ):
self.src_lang = ENGLISH
self.lang_dict = morse_code_dict
self.str = str
def prep( self, str ):
#etc
def translate( self ):
#etc
return trans
class TranslationToEnglish:
def __init__( self, str ):
self.src_lang = MORSE
self.lang_dict = dict([(v, k) for (k, v) in morse_code_dict.iteritems()])
self.str = str
def translate( self ):
#etc
return trans
def get_translator( str ):
exp = r'[a-zA-Z,円\?\/\(\)]'#regexp for non morse code char
if re.search( exp, str ) == None:
return TranslationToEnglish( str )
else:
return TranslationToMorse( str )
and:
print get_translator( msg ).translate()
-
1\$\begingroup\$ I'd avoid comparing
None
using the==
operator. Either useif re.search(exp, str):
orif re.search(exp, str) is None:
. \$\endgroup\$Bakuriu– Bakuriu2012年08月31日 09:54:02 +00:00Commented Aug 31, 2012 at 9:54