Let's say I have a string like this:
ab, "cd
ef", gh, "ijk, lm"
and this
a,b,c
d,e,f
and I want to parse them with the Python csv module. How can I do it? The second one should be treated as two rows, but the first one should not.
I thought they needed to be loaded into csv.reader(), so my first idea was to split them on commas with .split(','), but that has a problem with the second string, as it ignores the newline. I also thought of .splitlines(), but that would mess up the first string.
I've been trying to solve this for a whole day and I'm out of ideas... any help?
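The issue you are having is that each comma in the first example is followed by a space, so your actual delimiter is ', '. By default the csv module treats a quote as opening a quoted field only when it is the first character of the field, so the " that follows a space is read as ordinary text, and the newline and comma inside the quotes then split the data.
Luckily, you are not the first with this issue. Pass skipinitialspace=True to csv.reader() to solve it.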
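To see why the space matters, here is a small sketch that parses the first example with the default settings, so without skipinitialspace. The variable names are only for illustration, and the string is fed through io.StringIO instead of a file:

```python
import csv
import io

# The first example from the question, held in a string.
text = 'ab, "cd\nef", gh, "ijk, lm"\n'

# Default settings: the quote after the space is not seen as an opening
# quote, so the newline ends the row and the inner comma splits a field.
rows_default = list(csv.reader(io.StringIO(text, newline='')))

for row in rows_default:
    print(f"len: {len(row)}, row: {row}")
```

This should give two rows instead of one, with the quote characters and leading spaces left inside the fields.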
Given:
$ cat file1.csv
ab, "cd
ef", gh, "ijk, lm"
And:
$ cat file2.csv
a,b,c
d,e,f
You can do:
import csv

# newline='' leaves line endings alone so csv can handle newlines inside quotes
with open('file1.csv', newline='') as f:
    for row in csv.reader(f, quotechar='"', skipinitialspace=True):
        print(f"len: {len(row)}, row: {row}")
Prints:
len: 4, row: ['ab', 'cd\nef', 'gh', 'ijk, lm']
And the same settings work for the second example, which has a plain , delimiter with no space after it:
with open('file2.csv', newline='') as f:
    for row in csv.reader(f, quotechar='"', skipinitialspace=True):
        print(f"len: {len(row)}, row: {row}")
Prints:
len: 3, row: ['a', 'b', 'c']
len: 3, row: ['d', 'e', 'f']
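Since the question is about strings, not files, the same reader can probably be pointed at an in-memory buffer. A minimal sketch, assuming the data is already in a str; parse_csv_text is a hypothetical helper name, not part of the csv module:

```python
import csv
import io

def parse_csv_text(text):
    """Parse CSV data held in a string and return a list of rows."""
    # io.StringIO makes the string behave like an open text file;
    # newline='' keeps newlines inside quoted fields unchanged.
    buffer = io.StringIO(text, newline='')
    return list(csv.reader(buffer, quotechar='"', skipinitialspace=True))

first = 'ab, "cd\nef", gh, "ijk, lm"\n'
second = 'a,b,c\nd,e,f\n'

rows_first = parse_csv_text(first)    # one row, newline kept inside 'cd\nef'
rows_second = parse_csv_text(second)  # two rows of three fields each

print(rows_first)
print(rows_second)
```

There is no need to call .split(',') or .splitlines() first: csv.reader pulls lines from the buffer itself and keeps reading when a quoted field spans a line break.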