Project Euler 22: Names scores alternative

Question 1

Can someone tell me how to improve my following python code.

Code:

a = open('/home/Username/1st few programms/names.txt')
for y in a:
 c = y.split(",")
 c.sort()
dict = {'"':0, "A": 1, "B": 2,"C": 3,"D": 4, "E": 5,"F": 6,"G": 7,"H": 8,"I": 9,"J": 10,"K": 11,"L": 12,"M": 13,"N": 14,"O": 15,"P": 16,"Q": 17,"R": 18, "S": 19, "T": 20, "U": 21, "V": 22,"W": 23,"X": 24, "Y": 25, "Z": 26}
sum = 0
for x in range(len(c)):
 localsum = 0
 for y in c[x]:
 localsum += dict[y]
 sum += localsum*(x+1)
print sum

Problem statement:

Using names.txt, a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 +たす 15 +たす 12 +たす 9 +たす 14 =わ 53, is the 938th name in the list. So, COLIN would obtain a score of 938 ×ばつかける 53 =わ 49714.

What is the total of all the name scores in the file?

Question 2

Welcome to Code Review! As we all want to make our code more efficient or improve it in one way or another, try to write a title that summarizes what your code does, not what you want to get out of a review. For examples of good titles, check out Best of Code Review 2014 - Best Question Title Category You may also want to read How to get the best value out of Code Review - Asking Questions.

Question 3

Also what format is names.txt in?

Question 4

You should name your post according to the problem statement, which I regcognize to be "Project Euler 22: Names Score". This problem has been reviewed twice for Python, here and here. I advice you to read those as well, as answers provided here.

Question 5

a, dict, c and sum are all problematic names, for two different reasons. Single character variable names should be used sparingly, really only in brief contexts where the intent is quite clear like for i in range(10) where it's obvious that i is an integer and follows for loop conventions. for y in a is very unclear because you haven't made it obvious what y is or should be, and a would be much harder to follow if not for the fact that it's declared just above. Even so the choice of letters seems totally arbitrary. I'd replace a with f for file, since it's localised.

The other naming issue is that using the names dict and sum are actually overwriting builtin functions and should be avoided, especially as I'm going to suggest that you use sum, which you can't do if it's been overwritten.

Also instead of using a plain open, you should use with open(). That's Python's context manager which handles file closing for you. It ensures that a file is always closed correctly, even if an error arises or you don't call file.close(). Note, you didn't call a.close() so clearly this is useful:

with open('/home/Username/1st few programms/names.txt') as f:
 for line in f:
 names = line.split(",")
 names.sort()

Next, this code you're using makes no sense. You're iterating over each line in the file but just creating a new list of names each time. If all the names are stored on one line then this works but is redundant, and I'd instead suggest this:

with open('/home/Username/1st few programms/names.txt') as f:
 names = f.read().split(",")
names.sort()

read() will just read the full string of the file in so that then split can separate it based on commas. If you do have names on multiple lines, then you want the loop but you want to add more values to names each time:

names = []
with open('/home/Username/1st few programms/names.txt') as f:
 for line in f:
 names += line.split(",")
names.sort()

The algorithm that determines a person's score is quite unusual, so it would actually be best to create a function for it. One that takes a name and position in the list and returns their score. It can do largely what your inner loop does:

def name_score(index, name):
 localsum = 0
 for character in name:
 localsum += char_points[character]
 return localsum * (index + 1)

However, this is one of the places I'm going to suggest you use sum. You can pass a generator expression into sum to do this calculation efficiently in one line. A generator expression is a for loop like expression, here's how it'd look for this function:

def name_score(index, name):
 return sum(char_points[character] for character in name) * (index + 1)

You can see how there's the same for _ in syntax as before, just that now it's collapsed into one line. You no longer need to have a localsum and sum is faster than using a for loop.

Now with this function, the main loop is simplified to:

sum = 0
for i in range(len(names)):
 sum += name_score(i, names[i])
print sum

First though, let's get rid of names[i] as that's ugly. Use Python's enumerate function that allows you to both get the indices and values of a list together, like this:

for i, name in enumerate(names):
 sum += name_score(i, name)

This is much cleaner now, but we can also use sum in this case to remove your whole loop, in fact it can skip straight to printing the sum:

print sum(name_score(i, name) for i, name in enumerate(names))

Again, this is both more readable and much faster. This also means that you no longer have a sum value that needs to be renamed.

Question 6

Try these 3 statements instead of the first 3 lines in your code, it should work

a = open('names.txt','r')
k = a.read()
c = k.split(",")
c.sort()

And instead of actually typing out the whole dictionary, you can ask python to do it for you, like this

dict={}
mynum=1
allTheLetters = string.uppercase
for i in allTheLetters:
 dict[i]=mynum
 mynum+=1

imports :

import string

Remove the first for loop, it's useless. In your code you don't read the data inside the file at all, using the handler (a) use the read() function to read out data. Now the object k, has all the data, upon that you split the data considering comma(,) as the delimiter. object (c) receives it and you sort it out. And the rest is your problem's logic which you've got it right!

score 5 · Accepted Answer · 2015-12-02 12:09:04Z

a, dict, c and sum are all problematic names, for two different reasons. Single character variable names should be used sparingly, really only in brief contexts where the intent is quite clear like for i in range(10) where it's obvious that i is an integer and follows for loop conventions. for y in a is very unclear because you haven't made it obvious what y is or should be, and a would be much harder to follow if not for the fact that it's declared just above. Even so the choice of letters seems totally arbitrary. I'd replace a with f for file, since it's localised.

The other naming issue is that using the names dict and sum are actually overwriting builtin functions and should be avoided, especially as I'm going to suggest that you use sum, which you can't do if it's been overwritten.

Also instead of using a plain open, you should use with open(). That's Python's context manager which handles file closing for you. It ensures that a file is always closed correctly, even if an error arises or you don't call file.close(). Note, you didn't call a.close() so clearly this is useful:

with open('/home/Username/1st few programms/names.txt') as f:
 for line in f:
 names = line.split(",")
 names.sort()

Next, this code you're using makes no sense. You're iterating over each line in the file but just creating a new list of names each time. If all the names are stored on one line then this works but is redundant, and I'd instead suggest this:

with open('/home/Username/1st few programms/names.txt') as f:
 names = f.read().split(",")
names.sort()

read() will just read the full string of the file in so that then split can separate it based on commas. If you do have names on multiple lines, then you want the loop but you want to add more values to names each time:

names = []
with open('/home/Username/1st few programms/names.txt') as f:
 for line in f:
 names += line.split(",")
names.sort()

The algorithm that determines a person's score is quite unusual, so it would actually be best to create a function for it. One that takes a name and position in the list and returns their score. It can do largely what your inner loop does:

def name_score(index, name):
 localsum = 0
 for character in name:
 localsum += char_points[character]
 return localsum * (index + 1)

However, this is one of the places I'm going to suggest you use sum. You can pass a generator expression into sum to do this calculation efficiently in one line. A generator expression is a for loop like expression, here's how it'd look for this function:

def name_score(index, name):
 return sum(char_points[character] for character in name) * (index + 1)

You can see how there's the same for _ in syntax as before, just that now it's collapsed into one line. You no longer need to have a localsum and sum is faster than using a for loop.

Now with this function, the main loop is simplified to:

sum = 0
for i in range(len(names)):
 sum += name_score(i, names[i])
print sum

First though, let's get rid of names[i] as that's ugly. Use Python's enumerate function that allows you to both get the indices and values of a list together, like this:

for i, name in enumerate(names):
 sum += name_score(i, name)

This is much cleaner now, but we can also use sum in this case to remove your whole loop, in fact it can skip straight to printing the sum:

print sum(name_score(i, name) for i, name in enumerate(names))

Again, this is both more readable and much faster. This also means that you no longer have a sum value that needs to be renamed.

Stack Exchange Network

Project Euler 22: Names scores alternative

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Project Euler 22: Names scores alternative

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions