Another little exercise with openpyxl:
Say you have an Excel file like this:
The goal is to make it look like this:
My code to accomplish this:
inversion.py
"""
Inverts the content of table row and columns
"""
from copy import copy

import openpyxl
from openpyxl.utils import get_column_letter


def save_workbook_excel_file(workbook, filename: str):
    """Tries to save created data to excel file"""
    try:
        workbook.save(filename)
    except PermissionError:
        print("Error: No permission to save file.")


def invert_row_column(filename: str):
    """
    Main loop to invert rows and column
    """
    workbook = openpyxl.load_workbook(filename)
    sheet_names = workbook.sheetnames
    sheet = workbook[sheet_names[0]]
    workbook.create_sheet(index=0, title='tmp_sheet')
    tmp_sheet = workbook['tmp_sheet']
    data = []
    for row in sheet:
        cells = []
        for cell in row:
            cells.append(cell)
        data.append(cells)
    for x in range(0, len(data)):
        for y in range(0, len(data[x])):
            column = get_column_letter(x + 1)
            row = str(y + 1)
            tmp_sheet[column + row] = copy(data[x][y].value)
    sheet_name = sheet.title
    del workbook[sheet_name]
    tmp_sheet.title = sheet_name
    save_workbook_excel_file(workbook, 'updated_' + filename)


invert_row_column("test.xlsx")
How can this be improved? Can the naming be better? Is there a better or shorter solution?
A small tip: don't overwrite the original file, but write it to a different filename, or rename the original file. – Maarten Fabré, Dec 18, 2018 at 16:51
3 Answers
Disclaimer: I'm not familiar with openpyxl. I hope this review won't be nonsense. Do tell me!
The posted code copies the content of the first sheet into data, writes inverted (transposed?) content into a new sheet tmp_sheet, copies attributes of the original sheet to tmp_sheet and finally deletes the original sheet.
What I don't get is: why not update the original sheet directly? You could loop over the coordinates of the cells below the diagonal of the sheet, compute the coordinates of the cell to swap with, and use a temporary variable to swap the two values. The cells on the diagonal can be left alone; they don't need to be swapped with anything.
This approach would have the advantages that if there are multiple sheets in the file, the content of the first sheet stays on the first sheet, and you don't need to worry about copying properties of the sheet such as the title.
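A minimal sketch of that in-place swap, with the caveat from the disclaimer above: the function name transpose_in_place and the sample data are made up for illustration, and the workbook is built in memory rather than loaded from a file. It only uses Worksheet.cell, max_row and max_column.

```python
import openpyxl


def transpose_in_place(sheet):
    """Swap every cell below the diagonal with its mirror cell above it."""
    # Use the larger dimension so non-square tables are fully covered;
    # cells outside the original data simply hold None.
    size = max(sheet.max_row, sheet.max_column)
    for row in range(2, size + 1):
        for col in range(1, row):
            below = sheet.cell(row=row, column=col)
            above = sheet.cell(row=col, column=row)
            # Tuple assignment acts as the temporary storage for the swap.
            below.value, above.value = above.value, below.value


# Hypothetical sample data: a 2 x 3 table that becomes 3 x 2.
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.append(["a", "b", "c"])
sheet.append([1, 2, 3])
transpose_in_place(sheet)
```

Only cell values are swapped here; formatting, merged cells and column widths stay where they were, which may or may not be what you want.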
Another way to solve this, without using openpyxl and thereby slightly defeating the purpose of learning more about that, would be to use pandas, where this is quite short:
import pandas as pd

file_name = "test.xlsx"  # the file from the question

# This would load the first sheet by default, but we need the name to save it again
# df = pd.read_excel(file_name)
sheets = pd.read_excel(file_name, sheet_name=None)  # all sheets
sheet_name, df = next(iter(sheets.items()))  # first sheet
df = df.T  # transpose
df.to_excel(file_name, sheet_name=sheet_name)  # write back
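This uses pandas.read_excel to read the file and pandas.DataFrame.to_excel to write it back. When this was written you needed to have the xlrd module installed for this to work; current pandas versions read and write .xlsx files through openpyxl instead.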
Feel free to wrap it in functions again if needed.
This should be faster than the manual iteration in Python, since the transpose should happen at C speed.
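One caveat with the pandas route: by default read_excel treats the first row as column labels, and to_excel writes the labels and the index back out, so the result is not a pure cell-for-cell transpose. A sketch that avoids this, assuming pandas with openpyxl installed as the Excel engine; the function name transpose_first_sheet and the temporary file names are made up for illustration:

```python
import os
import tempfile

import pandas as pd


def transpose_first_sheet(source: str, target: str) -> None:
    """Write the transposed first sheet of `source` to `target`."""
    # header=None keeps the first row as data instead of column labels.
    sheets = pd.read_excel(source, sheet_name=None, header=None)
    sheet_name, df = next(iter(sheets.items()))
    # index=False and header=False stop pandas from writing its own labels.
    df.T.to_excel(target, sheet_name=sheet_name, index=False, header=False)


# Round trip on a throwaway file with hypothetical sample data.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "test.xlsx")
    target = os.path.join(folder, "updated_test.xlsx")
    pd.DataFrame([["a", "b", "c"], [1, 2, 3]]).to_excel(
        source, index=False, header=False
    )
    transpose_first_sheet(source, target)
    result = pd.read_excel(target, header=None).values.tolist()
```

Note that this writes only the first sheet to the target file; any other sheets in the source are not carried over.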
Simple approach, with openpyxl
import openpyxl

current_wb = openpyxl.load_workbook('./data/items.xlsx')
current_sheet = current_wb.active  # get_active_sheet() is deprecated
new_wb = openpyxl.Workbook()
new_sheet = new_wb.active

for row in range(1, current_sheet.max_row + 1):
    for col in range(1, current_sheet.max_column + 1):
        # Write each value to the mirrored position (row and column swapped).
        new_sheet.cell(row=col, column=row).value = current_sheet.cell(row=row, column=col).value

new_wb.save('./dist/items.xlsx')
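The same idea can arguably be written without explicit index arithmetic. The following is a sketch, assuming openpyxl 2.6 or later (for the values_only argument of iter_rows); the function name transposed_copy and the in-memory sample workbooks are made up for illustration:

```python
import openpyxl


def transposed_copy(sheet, new_sheet):
    """Append the columns of `sheet` as rows of `new_sheet`."""
    # values_only=True yields one plain tuple of cell values per row;
    # zip(*rows) regroups those values column by column.
    for column_values in zip(*sheet.iter_rows(values_only=True)):
        new_sheet.append(column_values)


# Hypothetical sample data instead of a file on disk.
source_wb = openpyxl.Workbook()
source_sheet = source_wb.active
source_sheet.append(["a", "b", "c"])
source_sheet.append([1, 2, 3])

target_wb = openpyxl.Workbook()
target_sheet = target_wb.active
transposed_copy(source_sheet, target_sheet)
rows = [list(row) for row in target_sheet.iter_rows(values_only=True)]
```

As with the loop version, only values are copied; styles and other cell attributes are not.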
Welcome to Code Review! You have presented an alternative solution, but haven't reviewed the code. Please edit to show what aspects of the question code prompted you to write this version, and in what ways it's an improvement over the original. It may be worth (re-)reading How to Answer. – Toby Speight, Jul 23, 2019 at 8:59
This is a nice approach since it uses openpyxl, like the code in my original question, and it is so much shorter. – Sandro4912, Jul 24, 2019 at 17:10