4
\$\begingroup\$

As a beginner Python programmer, I wrote a simple program that counts how many times each letter appears in a text file. It works fine, but I'd like to know if it's possible to improve it.

def count_letters(filename, case_sensitive=False):
 alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
 with open(filename, 'r') as f:
 text = f.read()
 if not case_sensitive:
 alphabet = alphabet[:26]
 text = text.lower()
 letter_count = {ltr: 0 for ltr in alphabet}
 for char in text:
 if char in alphabet:
 letter_count[char] += 1
 for key in sorted(letter_count):
 print(key, letter_count[key])
 print("total:", sum(letter_count.values()))
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jul 26, 2013 at 13:43
\$\endgroup\$
0

2 Answers 2

6
\$\begingroup\$

@flornquake points at the good direction (use string and collections.Counter) but I'd still modify some details:

  • alphabet = alphabet[:26] and text = text.lower(): My advice is not to override existing variables with new values, it makes code harder to understand. Use different names.

  • if char in alphabet: Make sure you perform inclusion predicates with hashes, sets or similar data structures, not lists/arrays. O(1) vs O(n).

  • Functions should return values (hopefully related with their name). Here it makes sense to return the counter.

I'd write:

import collections
import string
def count_letters(filename, case_sensitive=False):
 with open(filename, 'r') as f:
 original_text = f.read()
 if case_sensitive:
 alphabet = string.ascii_letters
 text = original_text
 else:
 alphabet = string.ascii_lowercase
 text = original_text.lower()
 alphabet_set = set(alphabet)
 counts = collections.Counter(c for c in text if c in alphabet_set)
 for letter in alphabet:
 print(letter, counts[letter])
 print("total:", sum(counts.values()))
 return counts
answered Jul 26, 2013 at 20:46
\$\endgroup\$
7
  • \$\begingroup\$ You didn't explicitly mention it, but one of the best improvements is the initialization of the Counter from a generator expression. Nice. \$\endgroup\$ Commented Jul 27, 2013 at 9:03
  • \$\begingroup\$ @codesparkle: I was to, but I realized that would be reviewing flownquake's answer, not OP :-) I'l add a comment to his question. \$\endgroup\$ Commented Jul 27, 2013 at 12:09
  • \$\begingroup\$ Wouldn't using original_text = original_text.lower() be better? Avoids unneeded variable. Also what about non-existent files? That would create an unrecognizable error for user as saying that file doesn't exist in error is quite different from the error that original_text undeclared. \$\endgroup\$ Commented Jul 29, 2013 at 16:33
  • \$\begingroup\$ @AseemBansal 1) reuse variables: Are you familiar with the functional paradigm? in FP you never ever use the same variable name (or symbol) for two different values (in the same scope, of course), because it's harder to reason about the algorithm. It's like in maths, you don't have x = 1, and just a step later x = 2, that makes no sense; the same when programming. More on this: en.wikipedia.org/wiki/Functional_programming 2) "what about non-existent files" The OP didn't stated what was to be done, so I did the same he did: nothing. Let the exception blow or be caught by the caller. \$\endgroup\$ Commented Jul 29, 2013 at 16:39
  • 1
    \$\begingroup\$ @AseemBansal: In C is difficult (and usually unrealistic) to apply FP principles, but in Python (like in Ruby, Perl, Lua and other imperative languages), you definitely can (of course an inpure approach to FP, with side-effects). If you give it a try, you'll see that code improves noticiably, it's more much more declarative (describing not how but what you are doing). Perhaps the starting point would be: docs.python.org/3/howto/functional.html. If you happen to know Ruby, I wrote this: code.google.com/p/tokland/wiki/RubyFunctionalProgramming \$\endgroup\$ Commented Jul 29, 2013 at 18:08
4
\$\begingroup\$

Your coding style is good, especially if you're only starting to program in Python. But to improve your code, you should learn your way around the standard library. In this case, the string and collections modules can make life easier for you.

import collections
import string
def count_letters(filename, case_sensitive=False):
 with open(filename, 'r') as f:
 text = f.read()
 if case_sensitive:
 alphabet = string.ascii_letters
 else:
 alphabet = string.ascii_lowercase
 text = text.lower()
 letter_count = collections.Counter()
 for char in text:
 if char in alphabet:
 letter_count[char] += 1
 for letter in alphabet:
 print(letter, letter_count[letter])
 print("total:", sum(letter_count.values()))

collections.Counter is, in essence, a dictionary with all values defaulting to 0.

answered Jul 26, 2013 at 16:01
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Note that collection.Counter may take an iterable, so there is no need to add values with a side-effect (letter_count[char] += 1). \$\endgroup\$ Commented Jul 27, 2013 at 12:11
  • \$\begingroup\$ @tokland Good point. \$\endgroup\$ Commented Jul 27, 2013 at 18:27

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.