I need to automate replacing the content of one column in a .csv file using Python. I could open the .csv file in Notepad and replace the column's content by hand, but the file is very large and that takes a long time. This is the new input file format:
Name ID class Num
"kanika",""University ISD_po.log";" University /projects/asd/new/high/sde"","MBA","12"
"Ambika",""University ISD_po.log";" University /projects/asd/new/high/sde"","MS","13"
In the above, I need to replace the content of the ID column. The ID column is very inconsistent: it contains long runs of spaces and symbols such as ; , and /. The new content of the ID column should be "input".
The ID column is enclosed in two double quotes and has some extra spaces as well, whereas the other columns are enclosed in only one double quote.
Is there any way to do it in python?
2 Answers
You could use the csv module in Python to achieve this.
csv.reader returns each row as a list of strings. You can then use csv.writer to write each row back out, modifying the ID column as you go. Note that this creates a new file.
So:
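import csv

# Python 3: open CSV files in text mode with newline=''
with open('file.csv', newline='') as fin, open('outfile.csv', 'w', newline='') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for row in reader:
        # keep Name, class and Num; replace the ID column
        writer.writerow([row[0], "input", row[2], row[3]])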
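If the output should keep double quotes around every field, as in the input, the writer can be asked to quote everything. The following is a minimal sketch, assuming Python 3, comma-separated rows, and no commas inside the ID value. The function name `replace_id_column` and the in-memory sample are illustrative only and not part of the original answer.

```python
import csv
import io

def replace_id_column(src, dst, new_value="input"):
    """Copy CSV rows from src to dst, replacing the second column."""
    reader = csv.reader(src)
    # QUOTE_ALL wraps every output field in double quotes.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        # Rows with fewer than two fields are copied through unchanged.
        if len(row) >= 2:
            row[1] = new_value
        writer.writerow(row)

# Sample rows copied from the question, held in memory for the demo.
sample = (
    '"kanika",""University ISD_po.log";" University /projects/asd/new/high/sde"","MBA","12"\n'
    '"Ambika",""University ISD_po.log";" University /projects/asd/new/high/sde"","MS","13"\n'
)
out = io.StringIO()
replace_id_column(io.StringIO(sample), out)
print(out.getvalue(), end="")
```

For a real file, the two `io.StringIO` objects would be replaced by files opened in text mode with `newline=''`.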
Read the .csv line by line, split each line on commas, and replace the second column with "input".
Write it out (to a different file) as you go:
# Python 3: text mode; newline='' keeps the original line endings
with open('mycsv.csv', newline='') as f, open('out.csv', 'w', newline='') as fo:
    # go through each line of the file
    for line in f:
        bits = line.split(',')
        # change second column
        bits[1] = '"input"'
        # join it back together and write it out
        fo.write(','.join(bits))
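One caveat with a plain split: if the ID value itself contains a comma, the second piece covers only part of it. A possible workaround, sketched here under the assumption that Name, class and Num never contain commas, is to cut at the first comma and at the last two commas, and discard whatever lies between. The function name `replace_id` and the sample line (the question's row with a comma added inside the ID) are hypothetical.

```python
def replace_id(line, new_value='"input"'):
    """Replace everything between the first comma and the last two commas."""
    # Split off the first column (Name) at the first comma.
    name, rest = line.split(",", 1)
    # Split off the last two columns (class, Num) from the right;
    # whatever remains on the left is the old ID, commas included.
    _old_id, cls, num = rest.rsplit(",", 2)
    return ",".join([name, new_value, cls, num])

# Hypothetical sample: an ID value that contains a comma.
line = '"kanika",""University ISD_po.log";" University, /projects/asd"","MBA","12"\n'
print(replace_id(line), end="")
```

A line with fewer than three commas raises `ValueError` in this sketch, so a header row in a different layout would need to be copied through separately.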
Then you can rename it to replace the original file if you like.
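The rename step can also be done from Python. This is a minimal sketch, assuming Python 3.3 or later for `os.replace`. The helper name `rewrite_in_place` and the `.tmp` suffix are hypothetical, and the per-line change is passed in as a function.

```python
import os

def rewrite_in_place(path, transform):
    """Write transformed lines to a temporary file, then swap it over the original."""
    tmp_path = path + ".tmp"  # hypothetical temporary name next to the original
    with open(path, "r", newline="") as src, open(tmp_path, "w", newline="") as dst:
        for line in src:
            dst.write(transform(line))
    # os.replace overwrites the destination if it already exists.
    os.replace(tmp_path, path)
```

Writing to a separate file first means the original stays intact until the new file has been written completely.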
5 Comments
If commas appear inside "", you should not split on them.
As for the csv answer, it just feels safer to use a purpose-built CSV parser than homebrew code, especially when it's a built-in module that doesn't bloat your code at all!