2

I have searched in Google, in StackOverflow, I have tested many methods, but I really don't understand why the query is extremely slow.

I have a CSV file with this structure with 94K rows:

name;email;employer;city;state;born;active;group;

with two load data queries. One for group (with dummy columns and with name in the last) and other query to import the columns ignoring the last.

--- FIRST (It is running for 12 hours and did not finish yet.)
LOAD DATA INFILE 'C:/path/file.csv'
 REPLACE INTO TABLE group 
 FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' 
 LINES TERMINATED BY '\r\n' IGNORE 1 LINES
 (@dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, name)
 SET id = NULL, active = 1;
--- SECOND (not begin because the first not ended)
LOAD DATA INFILE 'C:/path/file.csv'
 REPLACE INTO TABLE subscriber 
 FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' 
 LINES TERMINATED BY '\r\n' IGNORE 1 LINES
 (name, email, employer, city, state, born, active, @dummy)
 SET id = NULL;

Ok, I don't have the best computer, but the file not is too large. I have a Windows 64bits with 4GB RAM, using XAMPP;

In my.ini I have this settings:

innodb_buffer_pool_size = 2G
innodb_log_file_size = 256M

Please, can someone help me? Sorry my english and thank you!

Edit

CREATE TABLE group (
 id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, 
 name VARCHAR(80) NOT NULL, 
 created TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP, 
 active TINYINT(1) UNSIGNED DEFAULT "1", 
 PRIMARY KEY (id), 
 UNIQUE KEY name_UNIQUE (name)
) ENGINE=INNODB DEFAULT CHARSET=utf8 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC 
Rick James
80.7k5 gold badges52 silver badges119 bronze badges
asked Oct 11, 2016 at 13:51
5
  • Smells like there is a missing index. Let's see SHOW CREATE TABLE. Commented Oct 18, 2016 at 3:25
  • Are email, etc, also in that table?? Commented Oct 19, 2016 at 17:34
  • Does subscriber have the same layout? Commented Oct 19, 2016 at 17:36
  • Email is used only in the second script to insert into subscriber table, ignoring the last column from the first script. The subscriber table have the same layout but the unique key is email Commented Oct 21, 2016 at 13:01
  • Please add SHOW CREATE TABLE subscriber to the question. Commented Oct 21, 2016 at 17:05

1 Answer 1

1

REPLACE is DELETE any or all rows that match, then INSERT the one row.

For REPLACE to be useful, you need an index, preferrably UNIQUE or PRIMARY so that it knows what to DELETE.

Do you have an index on group; that is what the first REPLACE needs. Or am I confused about the columns?

Also, ... Consider whether it would be better to use INSERT IGNORE instead of REPLACE. The semantics is similar, but different.

If those suggestions do not work, then I recommend, LOADing the CSV files into temp tables, then perform queries to copy the data into the 'real' tables.

answered Oct 18, 2016 at 3:30
2
  • Thanks for reply. I have created a dynamically MyISAM table (not TEMPORARY) to both and insert. After I inserted into "official" table (InnoDB) using ON DUPLICATE KEY UPDATE. Total: 0.944 sec for the first statement and 1.765 sec for the second Commented Oct 19, 2016 at 12:43
  • So, you are happy with those timings? And the solution was to have a staging table, plus change from REPLACE to IODKU? Commented Oct 19, 2016 at 17:38

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.