My .csv data looks like this:
June 8, 2009 Monday
June 8, 2009 Monday
June 6, 2009 Saturday
June 6, 2009 Saturday Correction Appended
June 6, 2009 Saturday
June 6, 2009 Saturday
June 6, 2009 Saturday
etc...
The data spans 10 years. I need to separate the months and years (and don't care about the dates and days).
To extract the month I have the following code:
for row in reader:
    date = row[1]
    month = date.partition(' ')[0]
    print(month)
However, I can't figure out how to extract the numeric year from the string. Would I have to use regex for this?
1 Answer
Try:
for row in reader:
    row_split = row[1].split()
    month = row_split[0]
    year = int(row_split[2])
Explanation
row[1] == "June 8, 2009 Monday"
Therefore:
row[1].split() == ["June", "8,", "2009", "Monday"]
So, your month and year are extracted as follows:
"June" == row[1].split()[0]
2009 == int(row[1].split()[2])
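As a minimal sketch of the approach above, the extraction can be wrapped in a small helper. The function names month_and_year and month_and_year_re and the pattern DATE_RE are hypothetical, introduced here only for illustration. Since the question asks about regex, a regex variant is included as an alternative; it is not required, and it assumes every value follows the "Month D, YYYY ..." layout shown in the question.

```python
import re


def month_and_year(date_string):
    """Return (month, year) from a string such as 'June 8, 2009 Monday'."""
    parts = date_string.split()  # no argument: split on any run of whitespace
    return parts[0], int(parts[2])


# Alternative (assumed pattern): month name, day with a comma, 4-digit year.
DATE_RE = re.compile(r"^\s*([A-Za-z]+)\s+\d{1,2},\s*(\d{4})")


def month_and_year_re(date_string):
    """Same result as month_and_year, but using a regular expression."""
    match = DATE_RE.match(date_string)
    if match is None:
        raise ValueError("unrecognised date: %r" % date_string)
    return match.group(1), int(match.group(2))


print(month_and_year("June 6, 2009 Saturday Correction Appended"))
```

Both helpers ignore anything after the year, so trailing text such as "Correction Appended" does not affect the result.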
answered Jun 9, 2015 at 10:45
thodic
6 Comments
bereal
Shouldn't both be [0]?
thodic
@bereal There is a space at the start of the second column. If OP's data is truly comma separated this adds a blank string at the start of the split array. See the edit explanation.
bereal
' a b c '.split() returns ['a', 'b', 'c'], unlike ' a b c '.split(' ')
thodic
@bereal I originally had split(' '), however, a suggested edit changed it to split(). Needless to say I've fixed the issue from the suggested edit in my answer. +1 for the tip.
Zlo
Thanks for the explanation, but actually the data format I provided above is already extracted from row[1] in the .csv. So row[1] contains the full string, e.g., June 6, 2009 Saturday Correction Appended.
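The difference between split() and split(' ') raised in the comments can be checked directly. This is a small illustration, assuming a value with a leading space; the variable names are hypothetical.

```python
text = " June 8, 2009 Monday"  # note the leading space

no_arg = text.split()          # splits on runs of whitespace, drops empty strings
single_space = text.split(" ") # splits on each single space, keeps empty strings

print(no_arg)        # ['June', '8,', '2009', 'Monday']
print(single_space)  # ['', 'June', '8,', '2009', 'Monday']
```

With split() the month is at index 0 and the year at index 2 whether or not the value starts with a space; with split(' ') a leading space shifts every index by one.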
month is June 6, then you can get 6 by month.split(" ")[1]