The project outline:
Write a program to read in the contents of several text files (you can make the text files yourself) and insert those contents into a spreadsheet, with one line of text per row. The lines of the first text file will be in the cells of column A, the lines of the second text file will be in the cells of column B, and so on.
Use the readlines() File object method to return a list of strings, one string per line in the file. For the first file, output the first line to column 1, row 1. The second line should be written to column 1, row 2, and so on. The next file that is read with readlines() will be written to column 2, the next file to column 3, and so on.
My solution:
# A program to read text files and write it to an Excel file with one line per row
# Usage: python text_to_spreadsheet.py "folder to search for text files" "save location for spreadsheet"
import sys, openpyxl
from pathlib import Path
def main(search_folder, save_path):
    workbook = openpyxl.Workbook()
    sheet = workbook.active
    for column_index, filename in enumerate(search_folder.glob("*.txt")):
        with open(filename, "r", encoding="utf-8") as text_file:
            for row_index, line in enumerate(text_file.readlines()):
                sheet.cell(row=row_index + 1, column=column_index + 1).value = line.strip()
    workbook.save(save_path)
if __name__ == "__main__":
    search_folder = Path(sys.argv[1]) # the folder with text files to search
    save_path = Path(sys.argv[2]) # the path of the new spreadsheet (must end in .xlsx)
    main(search_folder, save_path)
1 Answer
You have not stated any review goals, so I will share my general thoughts and ideas about your code:
General
Overall your code looks clean, follows pythonic style and is readable. You are using descriptive variable names and the structure is easy to follow.
Formatting / PEP8
There are two minor PEP8 violations I spotted:
- Imports should be on separate lines, i.e. sys and openpyxl should be imported using separate import statements.
- There should be two blank lines before and after a top-level function definition.
Using Argparse
I like the comments describing the command line arguments and the usage of the program. This is also the area in which I think you could improve it the most, because with very little effort you could turn these comments into argparse help texts. This does not negatively impact the readability of the program (in my opinion) but helps the user of the program.
if __name__ == "__main__":
    from argparse import ArgumentParser
    parser = ArgumentParser(
        description="A program to read text files and write it to an Excel file with one line per row"
    )
    parser.add_argument(
        "search_folder",
        help="the folder with text files to search",
    )
    parser.add_argument(
        "save_path",
        help="the path of the new spreadsheet (must end in .xlsx)",
    )
    args = parser.parse_args()
    main(Path(args.search_folder), Path(args.save_path))
When the new code is called with the --help flag, it outputs the following message:
usage: test.py [-h] search_folder save_path

A program to read text files and write it to an Excel file with one line per row

positional arguments:
  search_folder  the folder with text files to search
  save_path      the path of the new spreadsheet (must end in .xlsx)

options:
  -h, --help     show this help message and exit
Additionally, a useful error message is generated if too few or too many arguments are provided.
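To see the argument-count check in action without running the script from a shell, here is a minimal sketch. The helper names `build_parser` and `exit_code_for` are my own and not part of your program; the parser itself is the same as above. On a usage error, argparse prints the usage line plus an error message to stderr and exits with status 2, which the sketch catches as `SystemExit`.

```python
from argparse import ArgumentParser


def build_parser():
    """Build the same two-argument parser as in the snippet above (hypothetical helper)."""
    parser = ArgumentParser(
        description="A program to read text files and write it to an Excel file with one line per row"
    )
    parser.add_argument("search_folder", help="the folder with text files to search")
    parser.add_argument("save_path", help="the path of the new spreadsheet (must end in .xlsx)")
    return parser


def exit_code_for(argv):
    """Return the exit status argparse uses for argv, or 0 if parsing succeeds (hypothetical helper)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as error:
        # argparse reports usage errors by printing to stderr and calling sys.exit(2)
        return error.code
    return 0


if __name__ == "__main__":
    print(exit_code_for(["texts"]))                        # too few arguments
    print(exit_code_for(["texts", "out.xlsx"]))            # exactly two arguments
    print(exit_code_for(["texts", "out.xlsx", "extra"]))   # too many arguments
```

With the original `sys.argv` indexing, a missing argument would instead surface as an `IndexError` traceback, and extra arguments would be silently ignored.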