Tweeted twitter.com/StackCodeReview/status/650068482315415552

occurred Oct 2, 2015 at 22:02

added 238 characters in body; edited title

edited Oct 2, 2015 at 19:16

Aside: I've intentionally used regular expressions here instead of .strip() and .replace(), just to get an idea of what regular expressions are and how they work (in addition to there perhaps being non-zero improvements because re is implemented in C, right?). The file parsing itself doesn't really affect performance, I've noticed, but any input on standard Python convention, and on when to use strip() and the built-in string methods vs. Python's re module, is very welcome.
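For comparison, here is a minimal sketch of the two parsing approaches side by side. It assumes the file is a single line of comma-separated, double-quoted names; the sample string and both helper names are made up for illustration and are not taken from names.txt.

```python
import re

# Hypothetical sample in the assumed layout: "NAME","NAME",...
SAMPLE = '"MARY","PATRICIA","LINDA"\n'


def parse_with_re(text):
    # Capture whatever sits between each pair of double quotes.
    return re.findall(r'"(.*?)"', text)


def parse_with_str_methods(text):
    # Drop surrounding whitespace, remove the quotes, then split on commas.
    return text.strip().replace('"', '').split(',')


print(parse_with_re(SAMPLE))
print(parse_with_str_methods(SAMPLE))
```

On input of this shape both helpers return the same list. They can diverge on edge cases: for an empty string, for example, the regex version returns an empty list while the split version returns a list containing one empty string.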

deleted 76 characters in body; edited title

Source Link

edited Oct 2, 2015 at 19:16

Jamal

Project Euler # 22: Python, Name Scores, enumerate(), regex, and .index. Speed issue

General code criticism as well as specific syntax very welcome, thank you!

edit Added a link to names.txt in question description.

Project Euler # 22: Names scores

General code criticism as well as specific syntax very welcome.

Source Link

asked Oct 2, 2015 at 19:13

mburke05

Project Euler # 22: Python, Name Scores, enumerate(), regex, and .index. Speed issue

Problem 22, here, asks the following:

Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49714.

What is the total of all the name scores in the file?
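The COLIN example can be checked with a short sketch. It assumes names contain only uppercase ASCII letters, and alphabetical_value is a hypothetical helper name, not part of the problem statement.

```python
def alphabetical_value(name):
    # ord('A') is 65, so subtracting 64 maps 'A'..'Z' to 1..26.
    return sum(ord(char) - 64 for char in name)


value = alphabetical_value("COLIN")
print(value)        # 53
print(938 * value)  # 49714
```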

My solution is as follows:

import re
import string


def create_names_list(file_name):
    # create a sorted names list by parsing the file
    names = open(file_name).read()
    names = re.findall(r'"(.*?)"', names)
    names.sort()
    return names


def name_weights(file_name):
    # fetch name-list
    names = create_names_list(file_name)
    # create a letters dictionary e.g. {'A' : 1, 'B' : 2, ... 'Z' : 26}
    letters = string.ascii_uppercase
    letters_map_reversed = dict(enumerate(letters))
    letters_map = {value: key+1 for key, value in letters_map_reversed.iteritems()}
    # sum all letters using letter score * index
    return sum(letters_map[char]*(names.index(name)+1) for name in names for char in name)

%timeit name_weights('names.txt')

1 loops, best of 3: 1.18 s per loop

I noticed in the thread for the solutions that many Python solutions appeared to be in the 8-12 ms range, roughly 100x faster than my solution. As a result, I was concerned about the performance of my solution.

One thing I've tried in order to tweak my solution is replacing the letters_map[char] lookup with ord(char)-64, which I noticed a few people were using. It seems like a clever, and certainly shorter, way of getting the letter's number value that I hadn't thought of. It didn't change performance, though, which makes me think the problem is in the final expression, where a nested for loop multiplies the value of every character in every name by that name's position in the list.

Other solutions made similar use of nested for loops, but I noticed that nobody had used index() to get the numeric index of the name in the list. I wonder if that is what is hurting performance, but I suspect it may also just be my misunderstanding of how to structure these nested for loops optimally (for instance, I could have used enumerate() on names, as I did on letters, to get each name's position directly, which would forgo the need for index()).
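To make the index() versus enumerate() comparison concrete, here is a minimal sketch of both forms. The function names and the three-name sample list are made up for illustration; the sketch assumes uppercase ASCII names with no duplicates.

```python
def total_with_index(names):
    # names.index(name) rescans the list from the start on every call,
    # and here it is called once per character of every name.
    return sum((ord(char) - 64) * (names.index(name) + 1)
               for name in names for char in name)


def total_with_enumerate(names):
    # enumerate(names, 1) yields (position, name) pairs with 1-based
    # positions, so no lookup is needed, and each name's letter sum is
    # computed once before being multiplied by its position.
    return sum(position * sum(ord(char) - 64 for char in name)
               for position, name in enumerate(names, 1))


# Made-up sample data, not the real names.txt contents.
names = sorted(["COLIN", "ANNA", "BOB"])
print(total_with_index(names), total_with_enumerate(names))
```

Since list.index() is a linear search, the first form does work proportional to a name's position for every character, while the second does a single pass over the list. One caveat: index() returns the first match, so the two forms would give different totals if a name appeared more than once.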

General code criticism as well as specific syntax very welcome, thank you!

edit Added a link to names.txt in question description.
