My new company stores an intermediate data for our loading process in a table using the memory engine. However a quite simple update (set a.id=b.id) is taking something like six hours to run.
I've checked how big the tables are and in this import they have a about 4.3 million rows of data in them. I'm certain the problem is the memory engine and the lack of index on id (there is a PK but its across 5 other attributes) based on HASH.
I'm 95% certain I need to change the table engine and index type but I want to make sure before I push for the change.
EDIT: Here is the statement that is is so slow:
select @Extraction_id := id from extraction;
update items as a
set a.extraction_id=@Extraction_id;
Edit 2: create statement of table
CREATE TABLE `items` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`extraction_id` char(12) NOT NULL,
`a` char(6) NOT NULL,
`b` char(10) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`c` date NOT NULL,
`d` varchar(100) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`e` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`f` decimal(17,6) DEFAULT NULL,
`g` decimal(17,6) DEFAULT NULL,
`h` varchar(150) DEFAULT NULL,
`i` varchar(150) DEFAULT NULL,
PRIMARY KEY (`id`,`extraction_id`,`a`,`b`,`c,`d`),
KEY `fk_extraction` (`extraction_id`),
) ENGINE=MEMORY AUTO_INCREMENT=2343054016 DEFAULT CHARSET=latin1;
1 Answer 1
ENGINE=MEMORY
is not designed for big tables. InnoDB
is well designed for virtually all use cases; switch to it.
It sounds like you need more than an ENGINE
and index type change. Please provide SHOW CREATE TABLE
for the relevant tables, and the statement that is so slow.
set a.id=b.id
?? -- By convention, DBAs call the PRIMARY KEY
id
. It is very strange to be setting the PK of one table to the PK of another.
Chunking
One approach to UPDATEing
4M rows will less impact is to "chunk" up the problem. I discuss details here.
-
"set a.id=b.id" is part of the processes clash checking where if the ID has already been used before it creates a new one and applies to to the various import tables.Exostrike– Exostrike2018年01月22日 12:38:01 +00:00Commented Jan 22, 2018 at 12:38
-
I've tested using InnoDB on a test database using 1m rows and InnoDB takes even longer, so I'm not sure thats going to save us.Exostrike– Exostrike2018年01月22日 13:12:17 +00:00Commented Jan 22, 2018 at 13:12
-
"ENGINE=MEMORY is not designed for big tables" - in mos serious scenarios you would not consider this a large table. Is MySql really that limited? In Memory Storage is used with hundreds of gigabytes RAM by competitors.TomTom– TomTom2018年01月22日 13:38:54 +00:00Commented Jan 22, 2018 at 13:38
-
1"It is very strange to be setting the PK of one table to the PK of another." - no. Horizontal partitioning (splitting fields of a logical table to multiple database tables) is one valid scenario. They obviously all share the same ID key. One of them will be leading (mandatory). I use this quite regularly.TomTom– TomTom2018年01月22日 13:39:00 +00:00Commented Jan 22, 2018 at 13:39
-
@TomTom - And I recommend it -- but only rarely. I am wondering if this is an "over-simplification" of the real problem. And one of those
id's
is not really spelledid
.Rick James– Rick James2018年01月23日 02:30:40 +00:00Commented Jan 23, 2018 at 2:30
Explore related questions
See similar questions with these tags.
@Extraction_id
value from the previous select. Just use it where you need it, don't update million of rows with it.extraction_id
in thatitems
table at all. This looks more like a design issue, as Rick has pointed.