6
\$\begingroup\$

I decided to brush up on my (limited) Python skills after not coding for a long time.

Here's a very simple self-contained program to convert the characters of an email address into their HTML entities (the purpose is to hide the email address from spam bots).

  • Any suggestions for improvement? (e.g. argument parsing, code style, etc.)

  • Apart from the -h option in argparse, what is the standard way to add some documentation for it, such as a manpage or some embedded help?

#!/usr/bin/python
#
# mung - Convert a string into its HTML entities
#
import argparse

parser = argparse.ArgumentParser(description = 'Convert a string into its HTML entities.')
parser.add_argument('string_to_mung', help = 'String to convert')
parser.add_argument('-l', '--link', action = 'store_true', help = 'Embed the munged string into a mailto: link')
args = parser.parse_args()

def mung(plain):
    munged = ''
    for c in plain:
        munged = munged + '&#' + str(ord(c)) + ';'
    return munged

string_munged = mung(args.string_to_mung)

if (args.link):
    print('<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;')
    print(string_munged + '">')
    print(string_munged + '</a>')
else:
    print(string_munged)
200_success
asked Oct 7, 2018 at 13:11
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Spam bots deal rather well with X(HT)ML escape sequences since they often use appropriate parser libraries. \$\endgroup\$ Commented Oct 8, 2018 at 8:53
  • \$\begingroup\$ Actually my experience suggests otherwise, as an obfuscated email address gets ten times less spam; see bit.ly/2UOuAOA \$\endgroup\$ Commented Dec 15, 2018 at 15:23

1 Answer

10
\$\begingroup\$

The code is straightforward and reads well, but:

  1. String concatenation is not very efficient, especially in a loop. You'd be better off using str.join on an iterable.
  2. Encoding the mailto: part yourself impairs readability and maintenance. If only you had a function to do it for you. Oh wait...
  3. The comment at the beginning would be better as a module docstring; you would then be able to use it as the argparse description via __doc__.
  4. You should avoid interleaving code and function definitions, and protect top-level code using an if __name__ == '__main__': guard.
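On the embedded-help question, point 3 can be pushed a little further. The following is only a rough sketch: the multi-line docstring, the example address and the name build_parser are made up for illustration, and it assumes you want the docstring's line breaks preserved in the -h output, which is what argparse.RawDescriptionHelpFormatter is for.

```python
"""Convert a string into its HTML entities.

Example:
    mung.py -l user@example.com
"""
import argparse


def build_parser():
    # Hypothetical helper name. The module docstring doubles as the
    # --help description; the raw formatter keeps its line breaks
    # instead of re-wrapping them into one paragraph.
    return argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )


if __name__ == '__main__':
    build_parser().print_help()
```

That way the same text is available to pydoc, help() and the command line without being written twice.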

Proposed improvements:

#!/usr/bin/python
"""Convert a string into its HTML entities"""

import argparse


def command_line_parser():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('string_to_mung', help='String to convert')
    parser.add_argument('-l', '--link', action='store_true',
                        help='Embed the munged string into a mailto: link')
    return parser


def mung(plain):
    return ''.join('&#{};'.format(ord(c)) for c in plain)


if __name__ == '__main__':
    args = command_line_parser().parse_args()
    string_munged = mung(args.string_to_mung)
    if (args.link):
        string_munged = '<a href="{0}{1}">{1}</a>'.format(mung('mailto:'), string_munged)
    print(string_munged)
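A quick sanity check you could run on mung, assuming Python 3 (html.unescape lives in the standard library's html module). The address is a placeholder chosen for illustration. The round trip confirms that the output is valid, and also shows how little work it takes a parser to reverse it.

```python
import html


def mung(plain):
    # Same approach as above: one decimal character reference per character.
    return ''.join('&#{};'.format(ord(c)) for c in plain)


address = 'user@example.com'  # placeholder address, for illustration only
munged = mung(address)

# html.unescape resolves the numeric references back to the original text.
assert html.unescape(munged) == address

# Spot check on a short input: ord('a') == 97, ord('@') == 64, ord('b') == 98.
assert mung('a@b') == '&#97;&#64;&#98;'
```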
answered Oct 7, 2018 at 15:00
\$\endgroup\$
2
  • \$\begingroup\$ Python 3.6+: '&#{};'.format(ord(c)) → f'&#{ord(c)};' \$\endgroup\$ Commented Oct 7, 2018 at 22:55
  • 2
    \$\begingroup\$ @RomanOdaisky Given that there is no version tag, I prefer to keep format. \$\endgroup\$ Commented Oct 7, 2018 at 23:20
