In the example below there is two dataframes, df and df1. I also have a MySQL database with two tables. Tables1 and Table2, both has the same columns as df.
What I want to do is:
- Use df1 to locate the correct row in table1,
- Move the row to table2 and update the columns 'REMOVED' (From False to True) and 'REMOVED_DATE (from None to '2022-01-10).
- Remove the row from table1
Below you see my effort. But it dosen't work, and I can imagine there is a better solution available. Any help would be much appreciated.
df=pd.DataFrame({'ID':[25, 10],
'DATE':['2021-11-13', '2021-12-03'],
'LOC': ['NY', 'ML'],
'MAIL':['[email protected]', '[email protected]'],
'NR': ['100000', '200000'],
'REMOVED': [False, False],
'REMOVED_DATE':[None,None]})
df1=pd.DataFrame({'ID':[25],
'REMOVED':[True],
'REMOVED_DATE':['2022-01-11']})
cols = "`,`".join([str(i) for i in df.columns.tolist()]) #use df to mirror the columns in the tables in the database.
for i, row in df1.iterrows():
id=row['ID']
sql=f"SELECT ID FROM table1 WHERE id={id}" # Check if ID is in table1.
cursor.execute(sql)
z=cursor.fetchall()
if not z:
print('insert') # if not, move on.
else: #If ID in, update, move and delete
sql=f"""UPDATE table1 SET (`" +cols + "`) VALUES (" + "%s,"*(len(row)-1) + "%s) WHERE id={id}""" # Update the whole row where ID match.
cursor.execute(sql, tuple(row))
mydb.commit()
sql=f"""INSERT INTO table2 SELECT * FROM table1 WHERE id={id}""" #insert it into table2
cursor.execute(str(sql), tuple(row))
mydb.commit()
sql=f"""DELETE from table1 WHERE id={id}""" # Delete it from table1
cursor.execute(sql)
mydb.commit()
1 Answer 1
You don't need to update table1 -- there's nothing in the specification that indicates this is needed, and you're going to delete it anyway.
Your code doesn't change REMOVED and REMOVED_DATE when it's copying from table1 to table2. You can do this by replacing SELECT * with selecting all the columns other than those, and substituting the new values you want in the SELECT list.
You should also use parametrized queries rather than substituting values with f-strings.
cols = ",".join(f"`{col}`" for col in df.columns.tolist())
new_cols = ",".join('%s' if col in ('REMOVED', 'REMOVED_DATE') else f"`{col}`" for col in df.columns.tolist())
insert_sql=f"""INSERT INTO table2 ({cols}) SELECT {new_cols} FROM table1 WHERE id=%s""" #insert it into table2
delete_sql = """DELETE from table1 WHERE id=%s"""
select_sql = "SELECT ID FROM table1 WHERE id=%s"
for i, row in df1.iterrows():
id=row['ID']
cursor.execute(select_sql, (id,))
row = cursor.fetchone()
if not row:
print('insert') # if not, move on.
else: #If ID in, update, move and delete
cursor.execute(insert_sql, (False, '2022-01-10', id))
cursor.execute(delete_sql, (id,))
mydb.commit()
You should only commit once for each set of operations when moving and deleting, so that they're an atomic set.
9 Comments
{col}" for col in df.columns.tolist())tuple(row) in execute() if you need to substitute all the columns from the df into the query, and in the same order. In most cases you only need some columns, so you write (row['col1'], row['col2'])
cursor.execute()for theUPDATEstatement.