Is this the most efficient solution or can the performance be improved? Just opens a file and prints random words (lines with newlines stripped out).
import random
filename = "20k.txt"
def main(filename):
words_file = open(filename, 'r')
words = words_file.read().split("\n")
temp_words = []
for word in words:
temp_words.append(word.strip("\r"))
words = temp_words
while True:
length = int(raw_input("Words:"))
for i in range(0,length):
print(words[random.randrange(0,len(words)+1)]),
print("\n")
if __name__ == "__main__":
main(filename)
2 Answers 2
First, you should always close
your files. Or, use the with
keyword, which takes care of this for you.
Then, the 'r'
mode is already the default mode for opening a file. Also, consider storing your words in a set
, instead of a list (gets rid of duplicate words, which might be wanted, or not if you want a higher probability for some words).
You should consider using random.choice
to get a random element from an iterable (actually an iterable with len
defined).
I would also separate the file reading into a separate method. and make it more explicit what the prompt actually means (the number of words to generate).
I would use str.join
and give it a generator expression rather than multiple calls to print
.
import random
def get_words(file_name):
with open(filename) as word_file:
return list(set(word.strip() for word in word_file))
def random_words(n, words)
return " ".join(random.choice(words) for _ in range(n))
def main(file_name):
words = get_words(file_name)
while True:
n = int(raw_input("Number of words:"))
print(random_words(n, words))
if __name__ == "__main__":
file_name = "20k.txt"
main(file_name)
-
\$\begingroup\$ Right, I forgot about the
close
. Using aset
is a better option, yes. Didn't know aboutrandom.choice
. Thanks a lot. \$\endgroup\$Samuel Shifterovich– Samuel Shifterovich2016年10月04日 16:19:09 +00:00Commented Oct 4, 2016 at 16:19 -
\$\begingroup\$ Your code ran:
TypeError: 'set' object does not support indexing
\$\endgroup\$Samuel Shifterovich– Samuel Shifterovich2016年10月04日 16:23:46 +00:00Commented Oct 4, 2016 at 16:23 -
\$\begingroup\$ What does
.strip()
do without any arguments? \$\endgroup\$Samuel Shifterovich– Samuel Shifterovich2016年10月04日 16:27:15 +00:00Commented Oct 4, 2016 at 16:27 -
\$\begingroup\$ @SamuelShifterovich It strips away whitespace from the edges, including all kinds of line-breaks. So
" asgsg \n\r\n".strip() == 'asgsg'
. \$\endgroup\$Graipher– Graipher2016年10月04日 16:28:33 +00:00Commented Oct 4, 2016 at 16:28 -
\$\begingroup\$ Your answer contains
words_file = open(filename)
andwith open(filename) as word_file:
. I guess the first line is a mistake? \$\endgroup\$Samuel Shifterovich– Samuel Shifterovich2016年10月04日 16:30:00 +00:00Commented Oct 4, 2016 at 16:30
You would greatly improve performance if you stop reading the whole 20k lines file into memory. The difference for 100 executions (3 words per execution) is 0.018 vs 0.793 seconds.
I suggest you to use word_file.seek method instead. It manipulates stream position to read just the word you need, not the whole file.
To get random position, use random.randint(0, EOF).
Details are here: https://github.com/artgromov/CorrectHorse.
I have made pull request to your repo.
-
\$\begingroup\$ I should've specified it's Python 2.7 \$\endgroup\$Samuel Shifterovich– Samuel Shifterovich2016年10月05日 13:20:13 +00:00Commented Oct 5, 2016 at 13:20
Explore related questions
See similar questions with these tags.