3
\$\begingroup\$

My objective is to enter/type in values into Excel cells if the column names and indices match between Excel and dataframe.

So, my code does the below

  1. Gets the ID values from the dataframe

  2. for each ID, see whether it is present in excel file

  3. If they are present, get the column name list for those corresponding IDs (from dataframe)

  4. Filter only non-NA columns

  5. Check whether those columns names are present in excel file

  6. If both ID and column names (between dataframe and excel) match, key in the column value (from dataframe) into the excel sheet at appropriate position

You can refer this post for more details on the problem.

My code looks like below:

sales = xlwings.book('file1.xlsx')
df_value = region_1['ID'].tolist()
for val in df_value:
 for a_cell in sales.sheets[0]['B5:B8']:
 if a_cell.value == val:
 rn1 = a_cell.row
 temp = region_1.loc[[val]]
 temp = temp.dropna(axis=1, how='all')
 colu = temp.columns.tolist()
 for col in colu:
 for b_cell in sales.sheets[0]['G3:J3']:
 if b_cell.value == col:
 rn2 = b_cell.row
 data_entry_loc = str(b_cell.address[1]) + str(rn1)
 enter_val = temp[col].values
 sales.sheets[0][data_entry_loc].value = enter_val

However, am not sure whether this code is elegant and efficient enough to handle big data. While it worked for sample data provided in the above post, I would like to seek your opinion/suggestions on how it can be improved. You can please write the improved version of the code as an answer below.

In case, if you wish to try out the sample data, you can find below

Sample dataframe

ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20
region_1 = pd.read_clipboard(sep=',') 
region_1.set_index(['ID','DIV','APP1','APP2','APP3'],inplace=True)

Input excel file

enter image description here

I expect my output to be like as shown below

enter image description here

asked Mar 22, 2022 at 14:27
\$\endgroup\$
0

1 Answer 1

5
\$\begingroup\$

Step by step,

(1) Rather than "getting the ID", just make sure your indices are sane (they basically are already)

(2-5) Don't do this. Instead, align to the indices read from the Excel file.

(6) alignment takes care of this.

Showing template reading and alignment but not the final rendering step,

import io
import pandas as pd
# The sales template has no data, only indices. We load it to align to those indices.
sales_template = pd.read_excel(
 'file1.xlsx',
 header=[1, 2], index_col=list(range(1, 6)),
).iloc[:, 1:]
sales_template.columns.names = 'region', 'column'
print('Template:')
print(sales_template)
print()
with io.StringIO(
'''ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20''') as f:
 region_1 = pd.read_csv(f, index_col=['ID', 'DIV', 'APP1', 'APP2', 'APP3'])
_, aligned = sales_template['Region 1'].align(region_1)
print('Aligned:')
print(aligned)
Template:
region Region 1 Region 2 
column Col1 Col2 Col3 Col4 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 NaN NaN NaN NaN NaN NaN NaN NaN
2 A AB1 ABC2 ABCD2 NaN NaN NaN NaN NaN NaN NaN NaN
3 B BC1 BCD1 BCDE1 NaN NaN NaN NaN NaN NaN NaN NaN
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN NaN NaN NaN NaN
Aligned:
 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 20.0 NaN NaN 40.0
2 A AB1 ABC2 ABCD2 60.0 NaN NaN NaN
3 B BC1 BCD1 BCDE1 10.0 NaN NaN 20.0
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN
answered Dec 15, 2024 at 20:28
\$\endgroup\$
0

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.