I am using Python 3.2 and have the following code:
    for row2 in reader2:
        for row1 in reader1:
            if row1['identification_column'] == row2['identification_column']:
                row2['updated_col'] = row1['updated_col']
        writer.writerow(row2)
reader1 is a csv.DictReader object over a file that looks like the following:
identification_column,type
1, bike
2, guitar
3, drums
4, airplane
5, computer
reader2 is similar to reader1, except that it reads a much longer and more comprehensive file.
The problem is this:
The program runs through the whole inner loop and, if it doesn't find a match, it neither writes the line nor moves on to the next iteration of the outer loop as I expected. It just stops. Initially it raised an error, until I read a post here where someone suggested adding extrasaction='ignore' to the writer declaration. That didn't solve my problem.
I would greatly appreciate any feedback for fixing this logic. In my mind the following was what would be happening:
A) In the case where the inner loop does not find the value in question from the outer loop, the program outputs the line in the outer loop with no changes
B) In the case where the inner loop has the exact value that the outer loop is iterating over, change the values in one of the columns in the row and then output that row
I can see that as is, the program just stops after the first iteration of the inner loop, but I don't understand why this is the case.
2 Answers
reader1 and reader2 are csv.DictReader objects wrapping file objects. They are iterators that can only be consumed once: after the inner loop reaches the end of the file on its first pass, reader1 is exhausted, so on every later iteration of the outer loop the inner for loop has nothing left to do.
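To see the exhaustion for yourself, here is a minimal sketch. The in-memory data is a made-up stand-in for the file behind reader1, so treat it as an illustration rather than your actual setup:

```python
import csv
import io

# Hypothetical in-memory stand-in for the file behind reader1.
data = io.StringIO("identification_column,type\n1,bike\n2,guitar\n")
reader1 = csv.DictReader(data)

first_pass = [row["type"] for row in reader1]   # consumes every row
second_pass = [row["type"] for row in reader1]  # reader is already exhausted

print(first_pass)   # ['bike', 'guitar']
print(second_pass)  # []
```

The second loop runs zero times, which is what happens to your inner loop from the second outer iteration onward.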
Solution:
Read the file into a list and use that to refresh the DictReader:
    read_1 = myfile1.readlines()
    for row2 in reader2:
        reader1 = csv.DictReader(read_1)
        for row1 in reader1:
            # etc.
Even better, read that csv file into a list of dictionaries once - that should be faster:
    reader1 = list(csv.DictReader(myfile))
    for row2 in reader2:
        for row1 in reader1:
            # etc.
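For completeness, here is how the list-of-dictionaries approach might look end to end. This is only a sketch: the in-memory files, their contents, and the names file1, file2, out and rows1 are made up for illustration, and the break assumes each identification_column value appears at most once in the first file.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two input files and the output file.
file1 = io.StringIO("identification_column,updated_col\n1,bike\n3,drums\n")
file2 = io.StringIO("identification_column,updated_col\n1,old\n2,old\n3,old\n")
out = io.StringIO()

# Read the first file once into a list of dicts so it can be looped over repeatedly.
rows1 = list(csv.DictReader(file1))

reader2 = csv.DictReader(file2)
writer = csv.DictWriter(out, fieldnames=reader2.fieldnames, lineterminator="\n")
writer.writeheader()

for row2 in reader2:
    for row1 in rows1:
        if row1["identification_column"] == row2["identification_column"]:
            row2["updated_col"] = row1["updated_col"]
            break  # assumption: ids are unique, so stop at the first match
    writer.writerow(row2)  # written whether or not a match was found

print(out.getvalue())
```

Rows 1 and 3 come out updated and row 2 comes out unchanged, which matches cases A and B in the question. If the first file is large, a dict keyed by identification_column would probably avoid rescanning the list for every row.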
2 Comments
I'm not experienced with csv, but I'd guess that the reader is exhausted once you have iterated to the last line, and then you need to restart it to iterate again. So, maybe you should try to reassign reader1 before using it in the inner loop:
    for row2 in reader2:
        reader1 = csv.DictReader(open('my.csv'), ...)
        for row1 in reader1:
1 Comment
Use a with statement: as it stands, my.csv never gets closed.
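A rough sketch of what that could look like, re-opening the file inside a with block on each outer iteration. The temporary file, its contents, and the ids_to_check list (standing in for the rows of reader2) are all hypothetical:

```python
import csv
import os
import tempfile

# Hypothetical sample file standing in for my.csv.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "my.csv")
with open(path, "w", newline="") as f:
    f.write("identification_column,type\n1,bike\n2,guitar\n")

ids_to_check = ["2", "1", "9"]  # stands in for the rows of reader2
found = []

for wanted in ids_to_check:
    # Re-open the file on each outer iteration; the with block closes it afterwards.
    with open(path, newline="") as f:
        for row1 in csv.DictReader(f):
            if row1["identification_column"] == wanted:
                found.append(row1["type"])
                break

print(found)  # ['guitar', 'bike']
```

This fixes the unclosed file, but it still re-reads the file once per outer row, so reading it into a list once is likely the cheaper option.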