I have a CSV file, call it csv_file. It has the following content:
Username, Password
name1, pass1
name2, pass2
...
I also have a dictionary, call it mydict. It has the following content:
mydict = {
    "name2" : "pass2",
    "name3" : "pass3"
    ...
}
I want to update my CSV file to now include name3, pass3, since those aren't in the CSV file but they are in the dictionary.
What's the most efficient, pythonic way of doing this?
Right now, here's what I have, but I don't think it's very efficient:
with open(csv_file, 'rb') as infile, open(new_csv_file, 'wb') as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, r.fieldnames)
    w.writeheader()
    temp_dict = {row['Username'] : row['Password'] for row in r}
    for k in mydict:
        if k not in temp_dict:
            temp_dict[k] = mydict[k]
    for value in temp_dict:
        w.writerow({'Username' : value, 'Password' : temp_dict[value]})
I'm sure there's something I can do to make this better. Any suggestions?
1 Answer
There's no better way than creating a temporary dictionary to quickly update the contents of the entire file the way you want. However, you can speed things up by not using csv.DictReader and csv.DictWriter, because they require building a separate temporary dictionary for each row processed.
Here's a more efficient version based on that supposition that also effectively updates the file "in-place". Note that the order of the rows in the file will be changed as a result of storing them temporarily in the dictionary. If that's important, use a collections.OrderedDict instead.
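As a small illustration of the ordering point, the sketch below builds the temporary mapping with collections.OrderedDict so that rows keep their file order and new users are appended at the end. The sample rows are made up for the example. On Python 3.7+ a plain dict also preserves insertion order, so OrderedDict matters mainly on older versions.

```python
from collections import OrderedDict

# Made-up sample data standing in for the rows read from the CSV file.
rows = [("name1", "pass1"), ("name2", "pass2")]
mydict = {"name2": "pass2", "name3": "pass3"}

# An OrderedDict remembers insertion order, so the file's order survives.
temp_dict = OrderedDict(rows)

# Append only the users that are not in the file yet.
for key, value in mydict.items():
    if key not in temp_dict:
        temp_dict[key] = value

# Rows in the order they would be written back to the file.
ordered_rows = list(temp_dict.items())
```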
Also noteworthy is that it would be even more efficient to use @user3757614's suggestion and instead do a less complicated mydict.update(temp_dict) (and then write mydict.items() out as the updated version of the file). If you want to preserve mydict, just make a copy of it first and then update that with temp_dict's contents.
import csv
import os

mydict = {
    "name2" : "pass2",
    "name3" : "pass3"
    # ...
}

csv_file = 'users.csv'  # file to be updated
tempfilename = os.path.splitext(csv_file)[0] + '.bak'
try:
    os.remove(tempfilename)  # delete any existing temp file
except OSError:
    pass
os.rename(csv_file, tempfilename)

# create a temporary dictionary from the input file
# (Python 2 file modes; on Python 3 use mode='r'/'w' with newline='' instead)
with open(tempfilename, mode='rb') as infile:
    reader = csv.reader(infile, skipinitialspace=True)
    header = next(reader)  # skip and save header
    temp_dict = {row[0]: row[1] for row in reader}

# only add items from mydict that weren't already present
temp_dict.update({key: value for (key, value) in mydict.items()
                  if key not in temp_dict})

# create updated version of file
with open(csv_file, mode='wb') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(header)
    writer.writerows(temp_dict.items())

os.remove(tempfilename)  # delete backed-up original
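The simpler mydict.update(temp_dict) variant mentioned above could look roughly like the sketch below. It assumes Python 3 (text mode with newline='' rather than 'rb'/'wb') and a two-column file with a header row. The helper name merge_users_into_csv is made up for illustration. Because the file's entries are applied last, passwords already in the file take precedence over those in mydict.

```python
import csv


def merge_users_into_csv(csv_path, new_users):
    """Add users from new_users that are missing from the CSV file.

    Hypothetical helper: assumes a two-column file with a header row.
    """
    # Read the existing rows into a dictionary keyed by username.
    with open(csv_path, newline='') as infile:
        reader = csv.reader(infile, skipinitialspace=True)
        header = next(reader)  # save the header row
        file_users = {row[0]: row[1] for row in reader if row}

    # Copy first so the caller's dictionary is left untouched, then let
    # the file's entries overwrite any duplicate usernames.
    merged = dict(new_users)
    merged.update(file_users)

    # Rewrite the file with the header followed by the merged rows.
    with open(csv_path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(merged.items())
    return merged
```

Calling merge_users_into_csv('users.csv', mydict) would then rewrite users.csv in one pass. Unlike the version above, this sketch keeps no .bak copy, so an interrupted write could lose data.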
Very interesting! I like this a lot! However, could you explain the purpose of appending '.bak' to tempfilename? Edit: Never mind, I did some reading and understand the purpose. Thanks! – codycrossley, Aug 1, 2015 at 1:10
If you found my answer helpful, please consider up-voting and possibly accepting it. See What should I do when someone answers my question? – martineau, Aug 1, 2015 at 12:05
[...] new_csv_file or I can replace the old one. But I haven't decided what I'll do at that point yet.
[...] mydict but not update the password of any existing ones?