I have to iterate through multiple text files. For each file I read its contents and append each line to its corresponding dictionary to then build a JSON file.
Each text file has the following structure:
- Line 1: The title key.
- Line 2: The name key.
- Line 3: The date key.
- Line 4: The feedback key.
Here is an example of two of these files:
001.txt
Great Customer Service
John
2017年12月21日
The customer service here is very good. They helped me find a 2017 Camry with good condition in reasonable price. Compared to other dealers they provided the lowest price. Definitely recommend!
002.txt
You will find what you want here
Tom
2019年06月05日
I've being look around for a second handed Lexus RX for my family and this store happened to have a few of those. The experience was similar to most car dealers. The one I ended up buying has good condition and low mileage. I am pretty satisfied with the price they offered.
My approach works, but I wonder if there is a better and faster way of assigning each line to its corresponding dictionary key.
Additionally, do I need to write with open('file', 'r') for each file? Even when I use os.listdir() I still have the same issue.
import json

l1 = []
l2 = []
with open("C:/Users/user/Desktop/001.txt") as file1, open("C:/Users/user/Desktop/002.txt") as file2:
    for line1, line2 in zip(file1, file2):
        if not line1.isspace() and not line2.isspace():
            l1.append(line1.rstrip())
            l2.append(line2.rstrip())

Dict = {}
Dict['dictio1'] = {'title': "", "name": "", "date": "", "feedback": ""}
Dict['dictio2'] = {'title': "", "name": "", "date": "", "feedback": ""}
Dict['dictio1']["title"] = l1[0]
Dict['dictio1']["name"] = l1[1]
Dict['dictio1']["date"] = l1[2]
Dict['dictio1']["feedback"] = l1[3]
Dict['dictio2']["title"] = l2[0]
Dict['dictio2']["name"] = l2[1]
Dict['dictio2']["date"] = l2[2]
Dict['dictio2']["feedback"] = l2[3]

with open('file.json', 'w') as file_json:
    json.dump(Dict, file_json, indent=2)
{
  "dictio1": {
    "title": "Great Customer Service",
    "name": "John",
    "date": "2017年12月21日",
    "feedback": "The customer service here is very good. They helped me find a 2017 Camry with good condition in reasonable price. Compared to other dealers they provided the lowest price. Definitely recommend!"
  },
  "dictio2": {
    "title": "You will find what you want here",
    "name": "Tom",
    "date": "2019年06月05日",
    "feedback": "I've being look around for a second handed Lexus RX for my family and this store happened to have a few of those. The experience was similar to most car dealers. The one I ended up buying has good condition and low mileage. I am pretty satisfied with the price they offered."
  }
}
1 Answer
There are some ways you can improve your code:
Rather than building a dictionary and then manually assigning each value, you can assign l1[0] etc. straight away. Instead of:

    Dict['dictio1'] = {'title': "", "name": "", "date": "", "feedback": ""}
    Dict['dictio1']["title"] = l1[0]
    Dict['dictio1']["name"] = l1[1]
    Dict['dictio1']["date"] = l1[2]
    Dict['dictio1']["feedback"] = l1[3]

you can write:

    Dict["dictio1"] = {
        "title": l1[0],
        "name": l1[1],
        "date": l1[2],
        "feedback": l1[3],
    }
You should use a for loop over the paths and have the with inside it, building only one dictionary at a time.

    for key, path in ...:
        with open(path) as f:
            lines = []
            for line in f:
                if not line.isspace():
                    lines.append(line.rstrip())
        Dict[key] = {
            "title": lines[0],
            "name": lines[1],
            "date": lines[2],
            "feedback": lines[3],
        }
We can use a list comprehension to build lines with some sugar.

    lines = [line.rstrip() for line in f if not line.isspace()]
Putting this all together we can get:
import json

data = {}
paths = [
    ("dictio1", "C:/Users/user/Desktop/001.txt"),
    ("dictio2", "C:/Users/user/Desktop/002.txt"),
]
for key, path in paths:
    with open(path) as f:
        lines = [line.rstrip() for line in f if not line.isspace()]
    data[key] = {
        "title": lines[0],
        "name": lines[1],
        "date": lines[2],
        "feedback": lines[3],
    }

with open('file.json', 'w') as file_json:
    json.dump(data, file_json, indent=2)
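If the four lines always arrive in the same order, the pairing of keys and lines can also be written with zip, which avoids indexing each line by hand. This is only a sketch: the names FIELDS and parse_review are made up for illustration, the sample feedback line is abbreviated from 001.txt, and it assumes blank lines should be skipped as in the code above.

```python
# Hypothetical constant: field names in the same order as the lines in each file.
FIELDS = ("title", "name", "date", "feedback")


def parse_review(lines):
    """Pair each non-blank line with its field name, in order."""
    cleaned = [line.rstrip() for line in lines if not line.isspace()]
    if len(cleaned) != len(FIELDS):
        # Fail loudly instead of silently producing a partial dictionary.
        raise ValueError(f"expected {len(FIELDS)} non-blank lines, got {len(cleaned)}")
    return dict(zip(FIELDS, cleaned))


# Any iterable of lines works here, including an open file object.
sample = [
    "Great Customer Service\n",
    "John\n",
    "\n",
    "2017年12月21日\n",
    "The customer service here is very good.\n",
]
review = parse_review(sample)
```

Inside the loop this would replace the hand-written dictionary with data[key] = parse_review(f), and the length check gives a clear error for a file that does not have exactly four non-blank lines.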
I would recommend changing your JSON structure to remove the outer dictionary and use a list instead. This would make all your code simpler, both when building it here and when consuming it later.
This would look like:
import json

data = []
paths = [
    "C:/Users/user/Desktop/001.txt",
    "C:/Users/user/Desktop/002.txt",
]
for path in paths:
    with open(path) as f:
        lines = [line.rstrip() for line in f if not line.isspace()]
    data.append({
        "title": lines[0],
        "name": lines[1],
        "date": lines[2],
        "feedback": lines[3],
    })

with open('file.json', 'w') as file_json:
    json.dump(data, file_json, indent=2)
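On the os.listdir() part of the question: every file still has to be opened, but with a loop the with statement is written only once. Below is a sketch that collects the paths with pathlib.Path.glob instead of listing them by hand. The function names load_reviews and write_reviews are made up, and it assumes that every .txt file in the folder follows the four-line layout and is UTF-8 encoded.

```python
import json
from pathlib import Path


def load_reviews(folder):
    """Return one dictionary per .txt file in folder, ordered by file name."""
    reviews = []
    for path in sorted(Path(folder).glob("*.txt")):
        # encoding="utf-8" is an assumption about how the files were saved.
        with open(path, encoding="utf-8") as f:
            lines = [line.rstrip() for line in f if not line.isspace()]
        reviews.append({
            "title": lines[0],
            "name": lines[1],
            "date": lines[2],
            "feedback": lines[3],
        })
    return reviews


def write_reviews(folder, out_path):
    """Dump all reviews found in folder to out_path as a JSON list."""
    with open(out_path, "w", encoding="utf-8") as file_json:
        # ensure_ascii=False keeps non-ASCII text (such as the dates) readable.
        json.dump(load_reviews(folder), file_json, indent=2, ensure_ascii=False)
```

Calling write_reviews("C:/Users/user/Desktop", "file.json") would then pick up 001.txt, 002.txt and any files added later without changing the code.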
if not line1.isspace() and not line2.isspace(): ? Is that important? It's really really weird.
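To illustrate why that combined check is fragile, here is a small sketch that uses io.StringIO objects as stand-ins for the two open files. The file contents are made up; the loop body is the one from the question.

```python
import io

# Stand-ins for two open files; the second has a blank line after the title.
file1 = io.StringIO("Title A\nName A\nDate A\nFeedback A\n")
file2 = io.StringIO("Title B\n\nName B\nDate B\nFeedback B\n")

l1, l2 = [], []
for line1, line2 in zip(file1, file2):
    # Same condition as in the question: a blank line in EITHER file
    # skips the append for BOTH files.
    if not line1.isspace() and not line2.isspace():
        l1.append(line1.rstrip())
        l2.append(line2.rstrip())

# l1 is now ['Title A', 'Date A', 'Feedback A']: "Name A" was dropped because
# it was paired with the blank line of the other file.
# l2 is now ['Title B', 'Name B', 'Date B']: zip stopped when the shorter
# file ran out, so "Feedback B" was never read.
```

Reading each file in its own loop, as the answer does, avoids both problems because a blank line in one file can no longer affect another.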