counting how often the same word appears in a txt file...But my code only prints the last line entry in the txt file

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 19 06:03:21 EST 2012


On Wed, 19 Dec 2012 02:45:13 -0800, dgcosgrave wrote:
> Hi Iam just starting out with python...My code below changes the txt
> file into a list and add them to an empty dictionary and print how often
> the word occurs, but it only seems to recognise and print the last entry
> of the txt file. Any help would be great.
>
> tm = open('ask.txt', 'r')
> dict = {}
> for line in tm:
> 	line = line.strip()
> 	line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
> 	line = line.lower()
> 	list = line.split(' ')

Note: you should use descriptive names. Since this is a list of WORDS, a 
much better name would be "words" rather than "list". Also, list is the 
name of a built-in type, and you may run into trouble when you 
accidentally re-use that as a name. Same with using "dict" as you do.
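
To make the naming point concrete, here is a small sketch of what can go wrong once "list" has been rebound. The function name shadowing_demo and the sample sentence are made up for illustration, and the code is written so that it should run on both Python 2 and Python 3.

```python
def shadowing_demo():
    # Rebinding the name "list" hides the built-in inside this function.
    list = "the quick brown fox".split(' ')
    try:
        # Normally list("abc") builds ['a', 'b', 'c'], but "list" now
        # refers to a list object, which cannot be called.
        list("abc")
    except TypeError as err:
        return str(err)
    return None


message = shadowing_demo()
print(message)
```

The same thing happens with "dict": once the name points at your own dictionary, calls such as dict(a=1) stop working in that scope.
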
Apart from that, so far so good. For each line, you generate a list of 
words. But that's when it goes wrong, because you don't do anything with 
the list of words! The next block of code is *outside* the for-loop, so 
it only runs once the for-loop is done. So it only sees the last list of 
words.
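
A minimal sketch of that effect, using a hypothetical in-memory list of lines in place of ask.txt, and the better names "words" and "word_counts":

```python
# Hypothetical stand-in for the lines of ask.txt.
lines = ["the cat sat", "on the mat", "the end"]

for line in lines:
    words = line.lower().split()  # overwritten on every pass

# This loop is NOT inside the for-loop above, so it runs only once, after
# the final pass, when "words" holds just the words of the last line.
word_counts = {}
for word in words:
    word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts)
```

Only "the" and "end" from the last line are counted; everything from the first two lines was thrown away when "words" was overwritten.
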
> for word in list:

The problem here is that you lost the indentation. You need to indent the 
"for word in list" (better: "for word in words") so that it starts level 
with the line above it.
> 		if word in dict:
> 			count = dict[word]
> 			count += 1
> 			dict[word] = count

This bit is fine.
> else:
> 	dict[word] = 1

But this fails for the same reason! You have lost the indentation.

A little-known fact: Python for-loops take an "else" block too! It's a 
badly named statement, but sometimes useful. You can write:

for value in values:
    do_something_with(value)
    if condition:
        break  # skip to the end of the for...else
else:
    print "We never reached the break statement"
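
Here is a runnable sketch of the same for...else behaviour. The function find_first_negative and its inputs are invented for the example, and it should run on both Python 2 and Python 3.

```python
def find_first_negative(values):
    for value in values:
        if value < 0:
            result = "found %d" % value
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break.
        result = "no negative value"
    return result


print(find_first_negative([3, -1, 4]))
print(find_first_negative([3, 1, 4]))
```

A loop that contains no break at all always runs its else block once, after the last iteration.
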
So by pure accident, you lined up the "else" statement with the for loop, 
instead of what you needed:

for line in tm:
    ... blah blah blah
    for word in words:
        if word in word_counts:  # better name than "dict"
            ... blah blah blah
        else:
            ...
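
Putting the pieces together, one possible complete version might look like the sketch below. It is an illustration only: count_words and the sample lines are made up, an in-memory list stands in for ask.txt, and punctuation is stripped with a join over string.punctuation rather than the Python 2-only two-argument translate(), so the sketch should also run on Python 3. It also uses split() with no argument, which (unlike split(' ')) does not produce empty strings when there are repeated spaces.

```python
import string


def count_words(lines):
    word_counts = {}
    for line in lines:
        line = line.strip().lower()
        # Remove punctuation characters one by one.
        line = ''.join(ch for ch in line if ch not in string.punctuation)
        words = line.split()
        # This loop is indented, so it runs for every line.
        for word in words:
            if word in word_counts:
                word_counts[word] += 1
            else:  # lined up with "if", not with "for"
                word_counts[word] = 1
    return word_counts


# Hypothetical sample input in place of the real file.
sample = ["The cat sat.", "The cat ran!", "A dog sat?"]
counts = count_words(sample)
for word, count in sorted(counts.items()):
    print(word + ":" + str(count))
```

To use it on the real file, pass the open file object instead of the sample list, since iterating over a file yields its lines.
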
> for word, count in dict.iteritems():
> 	print word + ":" + str(count)

And this bit is okay too.

Good luck!
-- 
Steven

