2

This query takes ~90 seconds:

CREATE TABLE filtered
(
 SELECT DISTINCT
 f2.cityCode, 
 f2.productCode, 
 f2.productDesc, 
 f2.inventoryDate, 
 f2.inventoryLevel
 FROM 
 etm
 INNER JOIN f2avail AS f2 ON 
 etm.cityCode = f2.cityCode AND 
 etm.tvlDate = f2.inventoryDate
);

but the subquery on its own takes ~0.07 seconds:

-- CREATE TABLE filtered
-- (
 SELECT DISTINCT
 f2.cityCode, 
 f2.productCode, 
 f2.productDesc, 
 f2.inventoryDate, 
 f2.inventoryLevel
 FROM 
 etm
 INNER JOIN f2avail AS f2 ON 
 etm.cityCode = f2.cityCode AND 
 etm.tvlDate = f2.inventoryDate
-- );

Why is that?

Following this solution, I ran OPTIMIZE TABLE f2avail (the one with ~246,000 rows) and the create query ran about 25% faster, but still slow.

How can I speed up saving the subquery results? This is just one step in a series of filtering stages in an internal DB for reporting. Both tables are InnoDB.


Subquery explain results:

explain results

Update

The solution was to use composite indices on both tables.

-- Add the composite indices
ALTER TABLE `etm` ADD INDEX (`cityCode`, `tvlDate`);
ALTER TABLE `f2avail` ADD INDEX (`cityCode`, `inventoryDate`);

The query went from ~90s to about 6s.

asked Feb 23, 2017 at 1:37
3
  • Does filtered end up with about ~246,000 rows? Commented Feb 23, 2017 at 2:44
  • On explain extended? With the solution above it shows 100.0. Commented Feb 23, 2017 at 6:26
  • I'm referring to your table filtered. Commented Feb 23, 2017 at 16:12

2 Answers 2

1

Your screen images leave out details that would show up in SHOW CREATE TABLE. But I will make some guesses.

You need, but don't have, a "composite" INDEX(cityCode, tvlDate) on etm and/or a similar index on the other table.

Other issues:

  • Comparing a DATE to a DATETIME may not work as you expect. If the DATETIME has any time other than 00:00:00, it will not match any DATE.
  • Why mention etm when all the data comes from the other table?
  • Be consistent on definitions -- I'm looking inconsistent lengths on citycode.

Back to the question... Perhaps nothing was cached in RAM when the 90s run was done. Try each query twice.

answered Feb 23, 2017 at 2:25
6
  • yes! forgot write about DATE and DATETIME Commented Feb 23, 2017 at 2:28
  • Thank you for replying. Let me address your good points. Only the etm table was created "by design". The other table - f2avail - was itself a result of a similar CREATE ... SELECT query, which was itself the result of yet another query on even bigger tables. Seven other similar chained queries take only about ~2s. It's only this one that is slow. All these tables will be deleted. In fact, they will be temporary tables once the speed issue is better addressed. The ~90s is consistent. When I run the optimize query, it then becomes ~60s consistently. These numbers are for the OP query only. Commented Feb 23, 2017 at 2:32
  • Good point about DATE and DATETIME. The schema information you see in the screenshot is a result of a previous CREATE ... SELECT query similar to the one in my question. I have not manually defined the columns as these are to be temporary tables. Commented Feb 23, 2017 at 2:34
  • I ALTERED the tables so both are using DATE, and the column widths are consistent. I also increased SQL memory limits just to be safe. Only a few seconds were shaved. Commented Feb 23, 2017 at 6:12
  • Your comment stating "you need a composite index" on both tables rang a bell. Once I added those to both tables, the query ran in about 6s. Thanks for your help friend. Commented Feb 23, 2017 at 6:13
2

Looking on Your pictures, I can guess - for run subquery You also use MySQLWorkbench. Most of software - not return to You back all rows from dataset, by default it filter up to 1000 records (some less)

So select without ORDER BY start work and immediately stop after first XXX rows.

In case of create table:

  1. The subquery return all rows
  2. Insert operation always longer than SELECT ... much, much longer

Possible or not increase speed of insert? it depends from Your current server settings and configuration - may yes, may be not

Also it depend from other server loading:

  • Select it non-locking operations, it start work immediately
  • INSERT FROM SELECT (other form of CREATE from SELECT) - lock all data, so if other transactions running on this time, it spend some time for lock tables

Also You can test query with combined indexes on both tables: (cityCode,tvlDate) and (cityCode,inventoryDate)

answered Feb 23, 2017 at 2:18
3
  • Thank you for your reply. You are right about how Workbench limits results to 1000 rows. Good catch. I added LIMIT 300000 to return all the results and the SELECT query took only ~10s from ~0.07s. Commented Feb 23, 2017 at 2:39
  • 10s vs 90s might be the difference between writing and not writing. Commented Feb 23, 2017 at 2:45
  • You are right about the composite indices on both tables. That was the key factor. Thank you for your help. I appreciate it as well. Commented Feb 23, 2017 at 6:14

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.