4
\$\begingroup\$

I am getting back into python after a year of not doing it at all. I just wrote a bit of code to create a file that looks like this:

1,2
1,1,1,1
-1,0,1
42,17

Then I decided to take the average of each line and put it into a list. For this I wrote the following:

def line_averages():
 out_file = open("data.csv", "w") #create data file
 out_file.write("1,2\n1,1,1,1\n-1,0,1\n42,17") #write numbers into data file
 out_file.close() 
 f = open("data.csv", "r") #read data file
 lines = f.readlines() 
 avLines= [] #create a list to put averages into 
 for line in lines:
 line2 = line.replace("\n", "") #remove /n from lines
 line2 = line2.split(",") #split the lines by commas
 Nlines = [] #create a list to put floats into
 for i in line2:
 i = float(i)
 Nlines.append(i) #put the floats into the list
 sums = sum(Nlines) #total up each line from the file
 avLines.append(sums/len(line2)) #put the averages of each line into a list
 return avLines

I'm pretty happy as this does exactly what I wanted but I can't help but wonder if it can't be shortened/simplified. It seems a bit inefficient to me to have lots of "placeholders" (like Nlines or line2) which are used to do an operation rather than being able to do the operation directly on them. What would you guys do (if anything) to make this more compact? I'm sure it could be done more succinctly!

200_success
145k22 gold badges190 silver badges478 bronze badges
asked Jun 12, 2016 at 14:41
\$\endgroup\$
1
  • \$\begingroup\$ Why do you write the data to a file then immediately read it back out again? \$\endgroup\$ Commented Jun 12, 2016 at 14:51

4 Answers 4

4
\$\begingroup\$

Firstly, note that you are not closing the file after opening it. A Better way to read from a file would be to use the with statement. Read What is the python "with" statement designed for? to know more about the with statement.

The code would be,

with open("data.csv") as filename:
 # Code Follows

Other things to note here are

  • The mode of the open statement is 'r' by default and hence, you don't need to explicitly mention it.
  • Use a meaningful variable name like filename. This is to help your future self from not asking What is the f?

Now coming to the core of the problem. You can use the fileobject directly to iterate over the lines, You do not need to read the entire file into a list. This is helpful for large files. The code will be

for line in filename:
 # Code follows

The line line2 = line.replace("\n", "") is quite inadequate as there might be other whitespace like a trailing space or a \r. To solve all these issues, you can use str.strip.

stripped_line = line.strip()

Again, Note the name of the variable. Coming to this code block

line2 = line2.split(",")
Nlines = [] 
for i in line2:
 i = float(i)
 Nlines.append(i) 
sums = sum(Nlines) 

You can solve this issue using split and a generator expression.

stripped_line = stripped_line .split(',')
line_sum = sum(float(i) for i in stripped_line)

Alternatively you can use map

line_sum = sum(map(float,stripped_line))

(You can use literal_eval to directly do sum(literal_eval(stripped_line)), But the string functions are the right (and preffered way))

Hence the complete code will be reduced to (excluding the data creation part)

def line_averages():
 with open("test.csv", "r") as filename: 
 avLines= [] 
 for line in filename:
 stripped_line = line.strip() 
 stripped_line = stripped_line.split(',')
 line_sum = sum(float(i) for i in stripped_line ) 
 avLines.append(line_sum/len(stripped_line )) 
 return avLines

Code Golfing the solution, We can use a List Comprehension. Read this to kow more about the list comprehensions.

def line_averages():
 with open("test.csv", "r") as filename: 
 lines = [[float(i) for i in line.strip().split(',')] for line in filename]
 return [sum(i)/len(i) for i in lines]
answered Jun 12, 2016 at 15:20
\$\endgroup\$
3
\$\begingroup\$

One crucial thing is to seperate each task in a seperate function. One function for reading, one for writing and one for calculating the average. This way you can also test/verify each function and detect errors more easily. It's also useful for code reuse (inside the same project or maybe for your next project). I don't quite understand why you have your values hardcoded in form of a string in the first place. In most cases they should be stored as values.

def save(filename, matrix):
 with open(filename, "w") as out_file:
 for l in matrix:
 out_file.write(",".join(map(str, l)) + "\n")
def load(filename):
 with open(filename) as in_file:
 matrix = []
 for line in in_file:
 matrix.append([float(s) for s in line.split(",")])
 return matrix
def averages(matrix):
 return [sum(row)/len(row) for row in matrix]
def main():
 matrix = [
 [1,2],
 [1,1,1,1],
 [-1,0,1],
 [42,17]]
 filename = "data.csv"
 save(filename, matrix)
 matrix_read = load(filename)
 print("averages: %s" % averages(matrix_read))
main()
answered Jun 12, 2016 at 15:34
\$\endgroup\$
2
\$\begingroup\$

Honestly, i think you are doing pretty well in terms of efficiency. There are a few spots though where you created variables unnecessarily. For example you created the sums variable when you would only be using it in the next line. It would be simpler to just say

 avLines.append(sum(Nlines)/len(line2))

That way, you don't have to create the sums variable at all. There were a few other spots where you could have easily avoided creating new variables (like line2 could have been avoided by just changing line without renaming it). Hope that answers your question!

answered Jun 12, 2016 at 15:15
\$\endgroup\$
-1
\$\begingroup\$

Just for the "average" part.

with open('data.csv', 'r') as fo:
 lines = fo.readlines()
 av_lines = []
 for line in lines:
 n_lines = map(lambda x: float(x), line.split(','))
 av_lines.append(sum(n_lines) / len(n_lines))
answered Jun 12, 2016 at 15:07
\$\endgroup\$
6
  • \$\begingroup\$ You have an error in your code, appen => append. \$\endgroup\$ Commented Jun 12, 2016 at 15:57
  • \$\begingroup\$ Fixed. @pacmaninbw . So vote me down for this? \$\endgroup\$ Commented Jun 12, 2016 at 16:03
  • \$\begingroup\$ Questions have to be working code, I would assume this means answers should be working code as well. Your answer wouldn't work. Keep in mind I lose a point when I downvote an answer. \$\endgroup\$ Commented Jun 12, 2016 at 16:06
  • 2
    \$\begingroup\$ You have provided an alternate way but didn't review the code. Please explain what was wrong and how your solution improves upon the original so that OP can learn from your thought process. \$\endgroup\$ Commented Jun 12, 2016 at 19:03
  • 3
    \$\begingroup\$ lambda x: float(x) is just a long-winded way to write float. \$\endgroup\$ Commented Jun 13, 2016 at 4:21

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.