I need to sync a large table (~500 million rows) without a primary key between SQL Server and MySQL. The table has only a clustered, non-unique composite index.
I do have an ODBC connection between the servers, but an import of ~8 million rows took around 45 minutes, so I believe a single larger import would be unreasonable, as interruptions may occur at any point. I can't change the existing table structure, but I can add other tables. After further reading, OFFSET/FETCH is not an option for large tables, and "SELECT ... WHERE x BETWEEN ... AND ..." is not an option since I don't have a unique key.
How can I export the table in batches that are guaranteed to contain all rows? My problem is that since the clustered key is not unique, ordering by it would not guarantee that the rows keep the same order between consecutive queries, and ordering by all columns would take too long. And how would you recommend migrating the batches, through ODBC or CSV files?
- Will this be a repeating (usual) operation or a one-time operation? – Bogdan Bogdanov, Jan 29, 2016 at 12:45
- The initial export will be a one-time operation; syncing changes like new records or updates should be repetitive. CDC is not an option, but I will investigate further after the initial migration. – no one, Jan 29, 2016 at 13:02
- I think that to receive help on this you have to explain the whole process in more detail (it looks like you have a very complex problem). – Bogdan Bogdanov, Jan 29, 2016 at 13:42
- You note "since the clustered key is not unique, ordering after it would not guarantee the physical rows have the same order between consecutive queries". Since row order is not preserved (unless you have some sequence data), you cannot rely on getting the same physical row order. The order of rows does not default to insertion order or index order; it is defined by the ORDER BY clause. – RLF, Jan 29, 2016 at 13:56
- Yes, RLF, I agree. The columns are all ints: A, B, C, D, E. The clustered key is on ABC. The combination ABC is not unique, and neither is ABCD. Would ORDER BY on non-unique column(s) allow me to export the entire table in batches? And Bogdan Bogdanov, the Stack platform discourages overly broad problems; it's better just to address the question: how to export the complete large table as fast as possible, in batches, without loss of rows? – no one, Jan 29, 2016 at 15:47
1 Answer
Assuming you don't have updates or deletes against the source table, you can try the following:
1. Make a copy of the existing table using CTAS-style syntax (for SQL Server it's SELECT * INTO source_table_copy FROM source_table). Such an operation is very fast even for huge tables.
2. Add an AFTER INSERT trigger on source_table that copies new records to source_table_copy (a sketch follows this list).
3. Now all new records in source_table go to source_table_copy as well, and you can move data from the copied table to MySQL in batches.
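A minimal sketch of the step 2 trigger, assuming the column layout described in the comments (int columns A, B, C, D, E) and the source_table / source_table_copy names used above; the trigger name is just illustrative:

CREATE TRIGGER trg_copy_new_rows
ON source_table
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- copy every newly inserted row into the staging copy
    INSERT INTO source_table_copy (A, B, C, D, E)
    SELECT A, B, C, D, E
    FROM inserted;
END;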
For instance, if you have a link between the two servers, everything can be done within the body of a T-SQL stored procedure. E.g., a piece of code that moves up to 20 records to the new server might look like:
-- declare a table variable to keep deleted records until they are delivered to the target host
-- (column list matches the int columns A..E described in the comments)
DECLARE @Table_Var TABLE (A int, B int, C int, D int, E int);

BEGIN TRANSACTION;
DELETE TOP (20) FROM source_table_copy OUTPUT DELETED.* INTO @Table_Var;
-- insert data into the linked server, or into a csv file
COMMIT;
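One possible way to fill in that commented-out step, placed inside the transaction before the COMMIT, is an INSERT through OPENQUERY; the linked server name MYSQL_LINK and the remote table target_db.target_table are placeholders for however the MySQL ODBC linked server is actually configured:

-- push the batch held in @Table_Var to the MySQL side via the linked server
-- (MYSQL_LINK and target_db.target_table are hypothetical names)
INSERT INTO OPENQUERY(MYSQL_LINK, 'SELECT A, B, C, D, E FROM target_db.target_table')
SELECT A, B, C, D, E
FROM @Table_Var;

Keeping this inside the transaction means a failed push rolls back the DELETE, so the batch stays in source_table_copy and can be retried.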
It's also possible to use a CURSOR to read data and then delete with the WHERE CURRENT OF clause.
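A rough sketch of that cursor variant, under the same assumed column names (and assuming a default updatable cursor, which WHERE CURRENT OF requires):

DECLARE @A int, @B int, @C int, @D int, @E int;

DECLARE batch_cur CURSOR LOCAL FORWARD_ONLY FOR
SELECT A, B, C, D, E FROM source_table_copy;

OPEN batch_cur;
FETCH NEXT FROM batch_cur INTO @A, @B, @C, @D, @E;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- send the current row to the linked server or append it to a csv file here
    DELETE FROM source_table_copy WHERE CURRENT OF batch_cur;
    FETCH NEXT FROM batch_cur INTO @A, @B, @C, @D, @E;
END;

CLOSE batch_cur;
DEALLOCATE batch_cur;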
Ideally you need to prevent applications from inserting data into source_table during step 1. If that's absolutely impossible, I'd go with an AFTER INSERT trigger, added right before step 1 and removed right after it's done, which copies data to some other table that I can later merge with source_table_copy.
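That temporary trigger would have the same shape as the one sketched above, just writing to a separate holding table; the table and trigger names below are hypothetical:

-- holding table for rows inserted while the SELECT ... INTO copy is running
CREATE TABLE source_table_inserts_during_copy (A int, B int, C int, D int, E int);
GO

CREATE TRIGGER trg_capture_during_copy
ON source_table
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO source_table_inserts_during_copy (A, B, C, D, E)
    SELECT A, B, C, D, E FROM inserted;
END;
GO

-- once step 1 has finished: drop the temporary trigger and fold the captured
-- rows into the copy (a plain INSERT ... SELECT, since there is no unique key to merge on)
DROP TRIGGER trg_capture_during_copy;
INSERT INTO source_table_copy (A, B, C, D, E)
SELECT A, B, C, D, E FROM source_table_inserts_during_copy;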
- Thank you for the solution; I was trying something similar, but with a normal insert. I'll try the CTAS syntax to see if it speeds things up. A follow-up question, if you don't mind: would the AFTER INSERT trigger affect performance? – no one, Jan 30, 2016 at 12:25
- Since the trigger body is very simple (it just inserts data into another table), the performance impact will be minimal. – a1ex07, Jan 30, 2016 at 22:00