My objective is to enter/type in values into Excel cells if the column names and indices match between Excel and dataframe.
So, my code does the below
Gets the ID values from the dataframe
for each ID, see whether it is present in excel file
If they are present, get the column name list for those corresponding IDs (from dataframe)
Filter only non-NA columns
Check whether those columns names are present in excel file
If both ID and column names (between dataframe and excel) match, key in the column value (from dataframe) into the excel sheet at appropriate position
You can refer this post for more details on the problem.
My code looks like below:
sales = xlwings.book('file1.xlsx')
df_value = region_1['ID'].tolist()
for val in df_value:
for a_cell in sales.sheets[0]['B5:B8']:
if a_cell.value == val:
rn1 = a_cell.row
temp = region_1.loc[[val]]
temp = temp.dropna(axis=1, how='all')
colu = temp.columns.tolist()
for col in colu:
for b_cell in sales.sheets[0]['G3:J3']:
if b_cell.value == col:
rn2 = b_cell.row
data_entry_loc = str(b_cell.address[1]) + str(rn1)
enter_val = temp[col].values
sales.sheets[0][data_entry_loc].value = enter_val
However, am not sure whether this code is elegant and efficient enough to handle big data. While it worked for sample data provided in the above post, I would like to seek your opinion/suggestions on how it can be improved. You can please write the improved version of the code as an answer below.
In case, if you wish to try out the sample data, you can find below
Sample dataframe
ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20
region_1 = pd.read_clipboard(sep=',')
region_1.set_index(['ID','DIV','APP1','APP2','APP3'],inplace=True)
Input excel file
I expect my output to be like as shown below
1 Answer 1
Step by step,
(1) Rather than "getting the ID", just make sure your indices are sane (they basically are already)
(2-5) Don't do this. Instead, align to the indices read from the Excel file.
(6) alignment takes care of this.
Showing template reading and alignment but not the final rendering step,
import io
import pandas as pd
# The sales template has no data, only indices. We load it to align to those indices.
sales_template = pd.read_excel(
'file1.xlsx',
header=[1, 2], index_col=list(range(1, 6)),
).iloc[:, 1:]
sales_template.columns.names = 'region', 'column'
print('Template:')
print(sales_template)
print()
with io.StringIO(
'''ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20''') as f:
region_1 = pd.read_csv(f, index_col=['ID', 'DIV', 'APP1', 'APP2', 'APP3'])
_, aligned = sales_template['Region 1'].align(region_1)
print('Aligned:')
print(aligned)
Template:
region Region 1 Region 2
column Col1 Col2 Col3 Col4 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3
1 A AB1 ABC1 ABCD1 NaN NaN NaN NaN NaN NaN NaN NaN
2 A AB1 ABC2 ABCD2 NaN NaN NaN NaN NaN NaN NaN NaN
3 B BC1 BCD1 BCDE1 NaN NaN NaN NaN NaN NaN NaN NaN
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN NaN NaN NaN NaN
Aligned:
Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3
1 A AB1 ABC1 ABCD1 20.0 NaN NaN 40.0
2 A AB1 ABC2 ABCD2 60.0 NaN NaN NaN
3 B BC1 BCD1 BCDE1 10.0 NaN NaN 20.0
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN
You must log in to answer this question.
Explore related questions
See similar questions with these tags.