Python implementation of a Morse code translator

Question 1

Here is my solution to reddit.com/r/dailyprogrammer #93 (easy)., which is a Python script that translates between English and Morse code.

Is String_To_Translate a good example of a class? Am I violating convention? How can I improve this code and make it more readable and pythonic?

import re
morse_code_dict = { 'a':'.-',
 'b':'-...',
 'c':'-.-.',
 'd':'-...',
 'e':'..-.',
 'f':'..-.',
 'g':'--.',
 'h':'....',
 'i':'..',
 'j':'.---',
 'k':'-.-',
 'l':'.-..',
 'm':'--',
 'n':'-.',
 'o':'---',
 'p':'.--.',
 'q':'--.-',
 'r':'.-.',
 's':'...',
 't':'-',
 'u':'..-',
 'v':'...-',
 'w':'.--',
 'x':'-..-',
 'y':'-.--',
 'z':'--..',
 '1':'.----',
 '2':'..---',
 '3':'...--',
 '4':'....-',
 '5':'.....',
 '6':'-....',
 '7':'--...',
 '8':'---..',
 '9':'----.',
 '0':'-----',
 ',':'--..--',
 '.':'.-.-.-',
 '?':'..--..',
 '/':'-..-.',
 '-':'-....-',
 '(':'-.--.',
 ')':'-.--.-',
 ' ':' ',
 }
class String_To_Translate:
 def __init__(self, string):
 self.string = string
 self.Translated = ''
 self.lang = ''
 def determine_lang(self):
 exp = r'[a-zA-Z,円\?\/\(\)]'#regexp for non morse code char
 if re.search(exp, self.string) == None:
 self.lang = 'm'
 else:
 self.lang = 'eng'
 def prep(self):
 if self.lang == 'eng':
 #get rid of whitespace & warn user
 (self.string, numWhitespaceReplacements) = re.subn(r'\s', ' ', self.string)
 if numWhitespaceReplacements > 0:
 print 'Some whitespace characters have been replaced with spaces.'
 #get rid of characters not in dict & warn user
 (self.string, numReplacements) = re.subn(r'[^a-zA-Z,円\?\/\(\)\.\-\s]', ' ', self.string)
 if numReplacements > 0:
 print 'Some ({0}) characters were unrecognized and could not be translated, they have been replaced by spaces.'.format(numReplacements)
 #convert to lowercase
 self.string = self.string.lower()
 def translate_from_morse(self):
 eng_dict = dict([(v, k) for (k, v) in morse_code_dict.iteritems()])
 def split_morse(string):
 d = ' '
 charList = [[char for char in word.split() + [' ']] for word in string.split(d) if word != '']
 charList = [item for sublist in charList for item in sublist]
 print charList
 return charList
 charList = split_morse(self.string)
 for char in charList:
 self.Translated += eng_dict[char]
 def translate_from_eng(self):
 charList=list(self.string)
 for char in charList:
 self.Translated += morse_code_dict[char] + ' '
 def translate(self):
 self.determine_lang()
 if self.lang == 'm':
 self.translate_from_morse()
 elif self.lang == 'eng':
 self.prep()
 self.translate_from_eng()
 else:
 print 'shit'
if __name__=="__main__":
 translateme = String_To_Translate(raw_input('enter a string: '))
 translateme.translate()
 print translateme.Translated
 print 'translating back'
 test = String_To_Translate(translateme.Translated)
 test.translate()
 print test.Translated

Question 2

The first thing that stands out from your code is the big table giving the morse code: it's lengthy, hard to read, and hard to check. (And consequently, it contains two mistakes.) I would write something like this:

from itertools import izip
def pairwise(s):
 """
 Generate pairs from the sequence `s`.
 >>> list(pairwise('abcd'))
 [('a', 'b'), ('c', 'd')]
 """
 it = iter(s)
 return izip(it,it)
morse_encoding = dict(pairwise('''
 a .- b -... c -.-. d -.. e . f ..-. 
 g --. h .... i .. j .--- k -.- l .-..
 m -- n -. o --- p .--. q --.- r .-.
 s ... t - u ..- v ...- w .-- x -..-
 y -.-- z --..
 1 .---- 2 ..--- 3 ...-- 4 ....- 5 ..... 
 6 -.... 7 --... 8 ---.. 9 ----. 0 -----
 , --..-- . .-.-.- ? ..--.. / -..-. - -....-
 ( -.--. ) -.--.- ' .----. ! -.-.-- & .-...
 : ---... ; -.-.-. = -...- + .-.-. _ ..--.-
 " .-..-. $ ...-..- @ .--.-.
'''.split()))
morse_decoding = {v:k for k,v in morse_encoding.iteritems()}

You rebuild the decoding table every time you start a new translation. This seems wasteful: I suggest building it only once, as above.
Not everything needs to be a class. A class is a data abstraction whose instances represent some thing together with methods for operating on that thing. You don't actually have any persistent data here: all you are using the class for is to store the temporary state for a single translation). It makes more sense to structure this code as functions.
Your classes and methods have no docstrings and no doctests. Even a small doctest or two might have caught the mistakes in your Morse code table.
In your determine_lang method, you use a regular expression that matches strings that are not Morse code. It would be simpler to do the test the other way round: to determine that the string is Morse code (because Morse code contains only dots, dashes and spaces). Python's sets provide a clear and concise way to express this test:
```
def is_morse_code(s):
 """
 Return True if the string `s` resembles Morse code (contains only
 dots, dashes and spaces), False otherwise.
 >>> is_morse_code('... --- ...')
 True
 >>> is_morse_code('abc')
 False
 """
 return set(s) <= set('.- ')
```
The original challenge may have referred to "English", but there's nothing actually English-specific about Morse code. It translates any language that can be written in the Latin alphabet. So I would prefer names that didn't presume that we were translating English.
When a character cannot be encoded or decoded, you print a message to inform the user. The usual approach in Python is to raise an exception.

Your translation code seems very complex for what it is doing. All you need to do is split the string into words, split each word into letters, translate each letter, join the translated letters into words, and then join the translated words into the result. This can be done neatly and concisely using split and join and a couple of loops:

def morse_encode(s):
 """
 Encode the string `s` in Morse code and return the result. 
 Raise KeyError if it cannot be encoded.
 >>> morse_encode('morse code')
 '-- --- .-. ... . -.-. --- -.. .'
 """
 return ' '.join(' '.join(morse_encoding[l] for l in word)
 for word in s.lower().split(' '))
def morse_decode(s):
 """
 Decode the string `s` from Morse code and return the result.
 Raise KeyError if it cannot be decoded.
 >>> morse_decode('-- --- .-. ... . -.-. --- -.. .')
 'morse code'
 """
 return ' '.join(''.join(morse_decoding[l] for l in word.split(' '))
 for word in s.split(' '))

Question 3

and for extra fun, return a function from determine_lang: return lambda: morse_encode( str ) if re.search( exp, str ) else lambda: morse_decode( str )

Question 4

An excellent, thorough response. thank you Gareth.

Question 5

Minor points:

self.Translated should be self.translated.
The use of re may be bit of overkill for what you want it for.
If you use a space in the text, you get told that it's being replaced by a space.
Create constants instead of using 'm' and 'eng'

Points to consider (as in, what I would do!):

Rather than having a single class, create separate classes for each direction, then select which in a function. The two directions share no code in common.
Rather than storing the result, return it directly.

So I might end up with something like:

class TranslationToMorse:
 def __init__( self, str ):
 self.src_lang = ENGLISH
 self.lang_dict = morse_code_dict
 self.str = str
 def prep( self, str ):
 #etc
 def translate( self ):
 #etc
 return trans
class TranslationToEnglish:
 def __init__( self, str ):
 self.src_lang = MORSE
 self.lang_dict = dict([(v, k) for (k, v) in morse_code_dict.iteritems()])
 self.str = str
 def translate( self ):
 #etc
 return trans
def get_translator( str ): 
 exp = r'[a-zA-Z,円\?\/\(\)]'#regexp for non morse code char
 if re.search( exp, str ) == None:
 return TranslationToEnglish( str )
 else:
 return TranslationToMorse( str )

and:

print get_translator( msg ).translate()

Question 6

I'd avoid comparing None using the == operator. Either use if re.search(exp, str): or if re.search(exp, str) is None:.

Gareth Rees Gareth Rees 50.1k3 gold badges130 silver badges210 bronze badges · Accepted Answer · 2012-08-31 10:19:37Z

The first thing that stands out from your code is the big table giving the morse code: it's lengthy, hard to read, and hard to check. (And consequently, it contains two mistakes.) I would write something like this:

from itertools import izip
def pairwise(s):
 """
 Generate pairs from the sequence `s`.
 >>> list(pairwise('abcd'))
 [('a', 'b'), ('c', 'd')]
 """
 it = iter(s)
 return izip(it,it)
morse_encoding = dict(pairwise('''
 a .- b -... c -.-. d -.. e . f ..-. 
 g --. h .... i .. j .--- k -.- l .-..
 m -- n -. o --- p .--. q --.- r .-.
 s ... t - u ..- v ...- w .-- x -..-
 y -.-- z --..
 1 .---- 2 ..--- 3 ...-- 4 ....- 5 ..... 
 6 -.... 7 --... 8 ---.. 9 ----. 0 -----
 , --..-- . .-.-.- ? ..--.. / -..-. - -....-
 ( -.--. ) -.--.- ' .----. ! -.-.-- & .-...
 : ---... ; -.-.-. = -...- + .-.-. _ ..--.-
 " .-..-. $ ...-..- @ .--.-.
'''.split()))
morse_decoding = {v:k for k,v in morse_encoding.iteritems()}

You rebuild the decoding table every time you start a new translation. This seems wasteful: I suggest building it only once, as above.
Not everything needs to be a class. A class is a data abstraction whose instances represent some thing together with methods for operating on that thing. You don't actually have any persistent data here: all you are using the class for is to store the temporary state for a single translation). It makes more sense to structure this code as functions.
Your classes and methods have no docstrings and no doctests. Even a small doctest or two might have caught the mistakes in your Morse code table.
In your determine_lang method, you use a regular expression that matches strings that are not Morse code. It would be simpler to do the test the other way round: to determine that the string is Morse code (because Morse code contains only dots, dashes and spaces). Python's sets provide a clear and concise way to express this test:
```
def is_morse_code(s):
 """
 Return True if the string `s` resembles Morse code (contains only
 dots, dashes and spaces), False otherwise.
 >>> is_morse_code('... --- ...')
 True
 >>> is_morse_code('abc')
 False
 """
 return set(s) <= set('.- ')
```
The original challenge may have referred to "English", but there's nothing actually English-specific about Morse code. It translates any language that can be written in the Latin alphabet. So I would prefer names that didn't presume that we were translating English.
When a character cannot be encoded or decoded, you print a message to inform the user. The usual approach in Python is to raise an exception.

Your translation code seems very complex for what it is doing. All you need to do is split the string into words, split each word into letters, translate each letter, join the translated letters into words, and then join the translated words into the result. This can be done neatly and concisely using split and join and a couple of loops:

def morse_encode(s):
 """
 Encode the string `s` in Morse code and return the result. 
 Raise KeyError if it cannot be encoded.
 >>> morse_encode('morse code')
 '-- --- .-. ... . -.-. --- -.. .'
 """
 return ' '.join(' '.join(morse_encoding[l] for l in word)
 for word in s.lower().split(' '))
def morse_decode(s):
 """
 Decode the string `s` from Morse code and return the result.
 Raise KeyError if it cannot be decoded.
 >>> morse_decode('-- --- .-. ... . -.-. --- -.. .')
 'morse code'
 """
 return ' '.join(''.join(morse_decoding[l] for l in word.split(' '))
 for word in s.split(' '))

and for extra fun, return a function from determine_lang: return lambda: morse_encode( str ) if re.search( exp, str ) else lambda: morse_decode( str )

Stack Exchange Network

Python implementation of a Morse code translator

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Python implementation of a Morse code translator

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions