Python data entry into excel based on equality check with dataframe

Question 1

My objective is to enter/type in values into Excel cells if the column names and indices match between Excel and dataframe.

So, my code does the below

Gets the ID values from the dataframe
for each ID, see whether it is present in excel file
If they are present, get the column name list for those corresponding IDs (from dataframe)
Filter only non-NA columns
Check whether those columns names are present in excel file
If both ID and column names (between dataframe and excel) match, key in the column value (from dataframe) into the excel sheet at appropriate position

You can refer this post for more details on the problem.

My code looks like below:

sales = xlwings.book('file1.xlsx')
df_value = region_1['ID'].tolist()
for val in df_value:
 for a_cell in sales.sheets[0]['B5:B8']:
 if a_cell.value == val:
 rn1 = a_cell.row
 temp = region_1.loc[[val]]
 temp = temp.dropna(axis=1, how='all')
 colu = temp.columns.tolist()
 for col in colu:
 for b_cell in sales.sheets[0]['G3:J3']:
 if b_cell.value == col:
 rn2 = b_cell.row
 data_entry_loc = str(b_cell.address[1]) + str(rn1)
 enter_val = temp[col].values
 sales.sheets[0][data_entry_loc].value = enter_val

However, am not sure whether this code is elegant and efficient enough to handle big data. While it worked for sample data provided in the above post, I would like to seek your opinion/suggestions on how it can be improved. You can please write the improved version of the code as an answer below.

In case, if you wish to try out the sample data, you can find below

Sample dataframe

ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20
region_1 = pd.read_clipboard(sep=',') 
region_1.set_index(['ID','DIV','APP1','APP2','APP3'],inplace=True)

Input excel file

enter image description here

I expect my output to be like as shown below

enter image description here

Question 2

Step by step,

(1) Rather than "getting the ID", just make sure your indices are sane (they basically are already)

(2-5) Don't do this. Instead, align to the indices read from the Excel file.

(6) alignment takes care of this.

Showing template reading and alignment but not the final rendering step,

import io
import pandas as pd
# The sales template has no data, only indices. We load it to align to those indices.
sales_template = pd.read_excel(
 'file1.xlsx',
 header=[1, 2], index_col=list(range(1, 6)),
).iloc[:, 1:]
sales_template.columns.names = 'region', 'column'
print('Template:')
print(sales_template)
print()
with io.StringIO(
'''ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20''') as f:
 region_1 = pd.read_csv(f, index_col=['ID', 'DIV', 'APP1', 'APP2', 'APP3'])
_, aligned = sales_template['Region 1'].align(region_1)
print('Aligned:')
print(aligned)

Template:
region Region 1 Region 2 
column Col1 Col2 Col3 Col4 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 NaN NaN NaN NaN NaN NaN NaN NaN
2 A AB1 ABC2 ABCD2 NaN NaN NaN NaN NaN NaN NaN NaN
3 B BC1 BCD1 BCDE1 NaN NaN NaN NaN NaN NaN NaN NaN
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN NaN NaN NaN NaN
Aligned:
 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 20.0 NaN NaN 40.0
2 A AB1 ABC2 ABCD2 60.0 NaN NaN NaN
3 B BC1 BCD1 BCDE1 10.0 NaN NaN 20.0
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN

Reinderien 71.1k5 gold badges76 silver badges256 bronze badges · Answer 1 · 2024-12-15 20:28:04Z

Step by step,

(1) Rather than "getting the ID", just make sure your indices are sane (they basically are already)

(2-5) Don't do this. Instead, align to the indices read from the Excel file.

(6) alignment takes care of this.

Showing template reading and alignment but not the final rendering step,

import io
import pandas as pd
# The sales template has no data, only indices. We load it to align to those indices.
sales_template = pd.read_excel(
 'file1.xlsx',
 header=[1, 2], index_col=list(range(1, 6)),
).iloc[:, 1:]
sales_template.columns.names = 'region', 'column'
print('Template:')
print(sales_template)
print()
with io.StringIO(
'''ID,DIV,APP1,APP2,APP3,Col1,Col4
1,A,AB1,ABC1,ABCD1,20,40
2,A,AB1,ABC2,ABCD2,60,
3,B,BC1,BCD1,BCDE1,10,20''') as f:
 region_1 = pd.read_csv(f, index_col=['ID', 'DIV', 'APP1', 'APP2', 'APP3'])
_, aligned = sales_template['Region 1'].align(region_1)
print('Aligned:')
print(aligned)

Template:
region Region 1 Region 2 
column Col1 Col2 Col3 Col4 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 NaN NaN NaN NaN NaN NaN NaN NaN
2 A AB1 ABC2 ABCD2 NaN NaN NaN NaN NaN NaN NaN NaN
3 B BC1 BCD1 BCDE1 NaN NaN NaN NaN NaN NaN NaN NaN
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN NaN NaN NaN NaN
Aligned:
 Col1 Col2 Col3 Col4
ID DIV APP1 APP2 APP3 
1 A AB1 ABC1 ABCD1 20.0 NaN NaN 40.0
2 A AB1 ABC2 ABCD2 60.0 NaN NaN NaN
3 B BC1 BCD1 BCDE1 10.0 NaN NaN 20.0
4 B BC1 BCD2 BCDE2 NaN NaN NaN NaN

Stack Exchange Network

Python data entry into excel based on equality check with dataframe

1 Answer 1

You must log in to answer this question.

Hot Network Questions

Python data entry into excel based on equality check with dataframe

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions