ENH: pd.to_sql(upsert=True, upsert_on_columns=['game_id']) #60434

Open
Labels: Enhancement; IO SQL (to_sql, read_sql, read_sql_query); Needs Triage (issue that has not been reviewed by a pandas team member)
@vile319

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I think it would be fantastic if upserting were built into the pandas .to_sql() function. Right now I have to give my tables a unique index in a separate step before calling to_sql, which is annoying.

Feature Description

Just go ahead and make the upsert add a unique constraint to the target table on the columns the user specifies. Of course you should warn them that this will remove duplicates (but if they want to upsert, they probably don't want duplicates in the first place). Just do something similar to this when the user calls df.to_sql(..., upsert=True, upsert_on_columns=['game_id', 'player_id']):

import sqlalchemy
from sqlalchemy import text


def create_table_with_unique_constraint(table_name, engine, unique_columns):
    """Create a new table with unique constraints and copy data from old table."""
    # Get column names and types from existing table
    inspector = sqlalchemy.inspect(engine)
    columns = inspector.get_columns(table_name)

    # Create column definitions preserving data types
    cols_sql = ', '.join([f'{col["name"]} {col["type"]}' for col in columns])
    unique_cols = ', '.join(unique_columns)

    # Create new table with unique constraint
    # (a UNIQUE constraint with ON CONFLICT REPLACE is SQLite syntax)
    temp_table = f"{table_name}_temp"
    create_sql = f'CREATE TABLE {temp_table} ({cols_sql}, UNIQUE({unique_cols}) ON CONFLICT REPLACE)'

    with engine.begin() as conn:
        # Create new table
        conn.execute(text(create_sql))

        # Copy data from old to new table; on a key conflict the later row wins
        conn.execute(text(f"INSERT INTO {temp_table} SELECT * FROM {table_name}"))

        # Drop old table
        conn.execute(text(f"DROP TABLE {table_name}"))

        # Rename new table to original name
        conn.execute(text(f"ALTER TABLE {temp_table} RENAME TO {table_name}"))


def to_sql_upsert(df, table_name, engine, unique_columns):
    """
    Write DataFrame to SQL with upsert functionality.
    If table exists with unique constraint, appends directly.
    If not, creates table with constraint after appending.

    Args:
        df: DataFrame to write
        table_name: Name of target SQL table
        engine: SQLAlchemy engine
        unique_columns: List of columns for unique constraint
    """
    inspector = sqlalchemy.inspect(engine)

    # Check if table exists and has unique constraint
    has_constraint = False
    if inspector.has_table(table_name):
        unique_constraints = inspector.get_unique_constraints(table_name)
        for constraint in unique_constraints:
            if set(constraint['column_names']) == set(unique_columns):
                has_constraint = True
                break

    # Write data (this creates the table if it does not exist yet)
    df.to_sql(table_name, engine, if_exists='append', index=False)

    # Add constraint if needed by rebuilding the table
    if not has_constraint:
        create_table_with_unique_constraint(table_name, engine, unique_columns)
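
To back the duplicate warning mentioned above, something like the following could count how many rows the rebuild would silently drop. This is only a sketch using the standard-library sqlite3 module; the function name count_rows_lost_to_unique and the stats table are made up for illustration and are not part of pandas.

```python
import sqlite3


def count_rows_lost_to_unique(conn, table_name, unique_columns):
    """Return how many rows would disappear if unique_columns became unique.

    Hypothetical helper. Assumes the key columns contain no NULLs:
    DISTINCT groups NULLs together, while a SQLite UNIQUE constraint
    treats each NULL as a different value.
    """
    cols = ", ".join(f'"{c}"' for c in unique_columns)
    total = conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0]
    distinct = conn.execute(
        f'SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM "{table_name}")'
    ).fetchone()[0]
    return total - distinct


# Illustrative data: (game_id=1, player_id=10) appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (game_id INTEGER, player_id INTEGER, points INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 10, 7), (1, 11, 3), (2, 10, 9)],
)

lost = count_rows_lost_to_unique(conn, "stats", ["game_id", "player_id"])
if lost:
    print(f"Adding the unique constraint would drop {lost} duplicate row(s)")
```

A check like this could run before the table is rebuilt, so the warning can state an actual row count.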

Alternative Solutions

I don't know of any existing functionality for this, and a solution as simple as this does not seem to exist in third-party packages.

Additional Context

This SHOULD be able to handle three cases:
A. You have no table and it is being made for the first time (it will automatically be made into a unique index table).
B. You have a table but it does not have unique on conflict replace (it will copy the old data over into the new unique index table).
C. You have a table and it does have unique on conflict replace (it will just to_sql append). From there, SQL will automatically handle the upserting.
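
Case C can be illustrated without pandas. The sketch below uses the standard-library sqlite3 module and a made-up stats table; it assumes SQLite, since a UNIQUE constraint with ON CONFLICT REPLACE is SQLite syntax. Once the constraint is in place, a plain INSERT behaves like an upsert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table with the constraint already in place (case C).
conn.execute(
    "CREATE TABLE stats ("
    "game_id INTEGER, player_id INTEGER, points INTEGER, "
    "UNIQUE (game_id, player_id) ON CONFLICT REPLACE)"
)

# First batch of rows.
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 11, 3)],
)

# Second batch: (1, 10) already exists, (2, 10) is new.
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?)",
    [(1, 10, 8), (2, 10, 9)],
)

rows = conn.execute(
    "SELECT game_id, player_id, points FROM stats ORDER BY game_id, player_id"
).fetchall()
print(rows)  # [(1, 10, 8), (1, 11, 3), (2, 10, 9)]
```

One thing to keep in mind: REPLACE deletes the conflicting row and inserts the new one, so the row gets a new rowid, and any column not supplied in the insert falls back to its default instead of keeping its old value.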
Please, let me know if you have any questions, and thank you!
