I have an INNODB table that's > 93 million rows. A lot of the data is considered "temp" data and is governed by an "is_active" flag of 1/0. When a user updates a form, new data is written with an "is_active=1" and the previous active records are updated to "is_active=0".
We wanted to move the data to a new table to clean things up, so we ran a statement like:
INSERT INTO tblNew (a, b, c)
SELECT a, b, c FROM tblOld WHERE is_active=1
This ran overnight, and when I looked in the morning I noticed there were a bunch of processes backed up in SHOW PROCESSLIST, so I did a KILL on the process ID, which started the ROLLBACK and brought the server down for another 10 hours... on a production box, of course.
I've been reading a lot on how you can try to repair, etc., and have been doing that all day, but I'm wondering: is there any kind of option I could have added to avoid the need for a rollback on failure? Or is there a strategy to commit or flush every X number of rows, etc.?
I was trying this...
INSERT INTO tblNew (a, b, c)
SELECT a, b, c FROM tblOld WHERE is_active=1 AND pkID > 0 AND pkID < 1000000
Here pkID is the primary key. I would run it in groups of 550k - 1M rows and raise the PK number range each run. There's an index on the PK and on is_active, yet I noticed run times increased each run, from 30 seconds to over 5 minutes a run by the time it was in the 20M range. Any idea why this would take longer each run when it's the same number of rows of work?
So, in summary, two questions:
1. Can I do something to keep a huge rollback from happening if I stop the process?
2. Why did inserting the same number of rows, chunked on the PK (an indexed column), take progressively longer per run?
1 Answer
What percentage of the table is is_active normally?
The following should avoid having to do massive updates to flip that flag:
- Consider abandoning the is_active flag in the main table. Instead, have a separate table with a PRIMARY KEY matching the main table, plus (optionally) a timestamp (I'll get back to that in a minute).
- Instead of setting is_active, INSERT a row in the new table.
- Instead of turning off is_active, DELETE the row.
- To check for [in]active, use LEFT JOIN plus WHERE id IS [NOT] NULL (see the sketch below).
The TIMESTAMP is to deal with programming errors that insert a row but somehow forget about it. Also add a user_id column for debugging, if relevant.
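A minimal sketch of that layout, assuming hypothetical table and column names (tblMain, tblActive, activated_at are illustrative, not from the answer):

-- Side table whose mere presence of a row means "active".
CREATE TABLE tblActive (
    id INT UNSIGNED NOT NULL,       -- matches tblMain's PRIMARY KEY
    activated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    user_id INT UNSIGNED NULL,      -- optional, for debugging
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- "Activate": INSERT a row instead of UPDATE ... SET is_active=1
INSERT INTO tblActive (id, user_id) VALUES (12345, 7);

-- "Deactivate": DELETE the row instead of UPDATE ... SET is_active=0
DELETE FROM tblActive WHERE id = 12345;

-- Active rows: a plain JOIN
SELECT m.* FROM tblMain m JOIN tblActive a ON a.id = m.id;

-- Inactive rows: the LEFT JOIN ... IS NULL pattern
SELECT m.*
FROM tblMain m
LEFT JOIN tblActive a ON a.id = m.id
WHERE a.id IS NULL;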
- About 30% are currently is_active=0. We do a cleanup periodically where I DELETE FROM all the inactive rows. I moved to INNODB a few months ago so I could do this without locking the table for selects. I also partition based on another ID: IDs of 0 are in their own partition, then I go up by 100k for each additional partition. This table stores the responses (form fields) for individual form entries. So if we have 300k forms submitted, we'd have unfinished forms with ID 0 and is_active 0, and the other partitions would contain the various form entries. The delete-on-edit is what we may do. – Don (May 15, 2015 at 23:10)
- If you are walking through based on PRIMARY KEY, do it only 1000 ids at a time. But there are probably lots of gaps? If so, this blog discusses how to do 1000 at a time efficiently, even with gaps. – Rick James (May 16, 2015 at 2:45)
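The gap-tolerant chunking looks roughly like this (a sketch of the general technique the comment refers to, reusing the poster's tblOld/tblNew/pkID names; the @start and @stop variables are assumptions). Rather than stepping the PK range by a fixed amount, find where the next 1000 real ids end:

-- @start is the lower bound of the current chunk (first run: 0).
-- The 1001st id at or after @start is the exclusive upper bound, so each
-- chunk covers ~1000 existing rows no matter how many gaps there are.
SELECT pkID INTO @stop
FROM tblOld
WHERE pkID >= @start
ORDER BY pkID
LIMIT 1000, 1;
-- (The final, partial chunk needs special handling: this SELECT returns no row.)

INSERT INTO tblNew (a, b, c)
SELECT a, b, c
FROM tblOld
WHERE is_active = 1
  AND pkID >= @start
  AND pkID < @stop;

SET @start = @stop;   -- next chunk picks up where this one ended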
- And be sure to COMMIT after each chunk. This will keep from gathering up a huge rollback log. – Rick James (May 16, 2015 at 2:45)
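Putting the two comments together, the copy loop might be wrapped in a stored procedure along these lines (a hedged sketch, not from the thread; the procedure name is made up, and a NULL stop_id is treated as the signal that only a final partial chunk remains):

DELIMITER //
CREATE PROCEDURE copy_active_rows()
BEGIN
  DECLARE start_id BIGINT DEFAULT 0;
  DECLARE stop_id BIGINT;

  copy_loop: LOOP
    SET stop_id = NULL;
    -- Exclusive upper bound of the next ~1000-row chunk (stays NULL once rows run out).
    SELECT pkID INTO stop_id
    FROM tblOld
    WHERE pkID >= start_id
    ORDER BY pkID
    LIMIT 1000, 1;

    IF stop_id IS NULL THEN
      -- Final partial chunk: copy whatever is left, commit, and stop.
      INSERT INTO tblNew (a, b, c)
      SELECT a, b, c FROM tblOld
      WHERE is_active = 1 AND pkID >= start_id;
      COMMIT;
      LEAVE copy_loop;
    END IF;

    INSERT INTO tblNew (a, b, c)
    SELECT a, b, c FROM tblOld
    WHERE is_active = 1 AND pkID >= start_id AND pkID < stop_id;
    COMMIT;   -- close out this chunk's undo log before starting the next
    SET start_id = stop_id;
  END LOOP;
END//
DELIMITER ;

CALL copy_active_rows();

Each COMMIT finalizes that chunk, so a KILL mid-run would roll back at most ~1000 rows instead of tens of millions.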