I have a file in the following format:
Berlin, Germany
New Delhi , India
New York , USA
Mumbai , India
Seattle, USA
I need to parse the file and print the output as
Germany : Berlin
India: New Delhi , Mumbai
USA: New York, Seattle
I wrote this code:
def check():
    datafile=open('logfile.py','rU')
    found=False
    for line in datafile:
        if 'India' in line:
            lines=line.split()
            print("India"+":"+lines[0])
        if 'Germany' in line:
            lines=line.split()
            print("Germany"+":"+lines[0])
        if 'USA' in line:
            lines=line.split()
            print("USA"+":"+lines[0])
    datafile.close()
check()
This code gives the following output:
Germany:Berlin
India:NewDelhi
USA:NewYork
India:Mumbai
USA:Seattle
Please help.
PaulMcG
2 Answers
Another approach is to use defaultdict from collections:
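from collections import defaultdict

def check():
    # Map each country to the list of its cities.
    d = defaultdict(list)
    # Plain 'r' is used here: the 'U' mode flag was removed in Python 3.11.
    with open('logfile.py', 'r') as datafile:
        for line in datafile:
            # Each line looks like "City , Country".
            data = line.split(',')
            d[data[1].strip()].append(data[0].strip())
    return d

res = check()
for k, v in res.items():
    print("{} : {}".format(k, ', '.join(v)))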
Output (the country order may differ, since dicts only preserve insertion order from Python 3.7 on):
India : New Delhi, Mumbai
Germany : Berlin
USA : New York, Seattle
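If the order of the countries matters, one option is to sort the keys before printing. The following is a minimal sketch, assuming the lines are already available as an iterable of strings. The names group_cities, format_groups and SAMPLE are illustrative and not part of the code above.

```python
from collections import defaultdict

# Sample input copied from the question; in practice the lines would come from the file.
SAMPLE = [
    "Berlin, Germany",
    "New Delhi , India",
    "New York , USA",
    "Mumbai , India",
    "Seattle, USA",
]

def group_cities(lines):
    """Group city names by country from 'City, Country' lines."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Assumes exactly one comma per line, as in the sample data.
        city, country = (part.strip() for part in line.split(","))
        groups[country].append(city)
    return groups

def format_groups(groups):
    """Return one 'Country : City, City' string per country, sorted by country name."""
    return ["{} : {}".format(c, ", ".join(groups[c])) for c in sorted(groups)]

for row in format_groups(group_cities(SAMPLE)):
    print(row)
```

With the sample data this prints Germany, India and USA in alphabetical order, with the cities of each country in the order they appear in the input.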
answered Apr 20, 2016 at 21:31
idjaw
1 Comment
Jongware
Nice – now it makes me wonder how to get that irrational spacing in the original 'required' list.
Instead of directly printing everything, you could save it to a data structure like a dictionary or collections.defaultdict.
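# defaultdict is a class, not a submodule, so it has to be imported with "from ... import".
from collections import defaultdict as dd

result = dd(list)
# Plain 'r' is used here: the 'U' mode flag was removed in Python 3.11.
with open('logfile.py', 'r') as datafile:
    for line in datafile:
        city,country = map(str.strip, line.strip().split(','))
        result[country].append(city)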
Then print your results:
for country in result:
    print(country+':', ', '.join(result[country]))
If you think there may be duplicate country/city listings and you don't want them, use set and add instead of list and append.
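A minimal sketch of that set-based variant, assuming the lines are passed in as an iterable of strings rather than read from the file. The function name group_unique_cities and the sample data are hypothetical.

```python
from collections import defaultdict

def group_unique_cities(lines):
    """Group cities by country, ignoring repeated city/country pairs."""
    result = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        city, country = map(str.strip, line.split(","))
        result[country].add(city)  # add() silently ignores duplicates
    return result

# Hypothetical sample with a duplicate entry for Mumbai.
sample = ["Mumbai , India", "New Delhi , India", "Mumbai , India"]
unique = group_unique_cities(sample)
for country in unique:
    # Sets are unordered, so the cities are sorted for stable output.
    print(country + ":", ", ".join(sorted(unique[country])))
```

The trade-off is that a set does not remember the order in which cities were read, which is why the sketch sorts them before printing.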
answered Apr 20, 2016 at 21:26
TigerhawkT3