6

I have a browse category query that im trying to optimize. Im ending up with Using temporary; Using filesort in the explain and the query is slow on a category with say 60,000 rows. If i remove the Order By clauses the query runs very fast .05 seconds to do 60,000 rows. With the Order By clauses its very slow around 5 seconds. Parts contains some 500,000 rows as does Parts_Category.

I have a group index on Parts (status, level, warehouse, updated) called sort_index

At the top of the explain I have | ALL | Using temporary; Using filesort

All the other indexes are showing OK. Can somebody please tell me what the problem might be? Im out of ideas. Maybe i should rearrange this query so i can get better performance perhaps?

query.

SELECT Parts.*, Image.type, Image.width, Image.height,
(SELECT name FROM Location_State WHERE id = Parts.state_id) AS state, 
(SELECT name FROM Location_Region WHERE id = Parts.region_id) AS region, 
(SELECT start_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_start_date, 
(SELECT end_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_end_date 
FROM ( SELECT parts_id FROM Parts_Category WHERE Parts_Category.category_id = '40' 
UNION SELECT parts_id FROM Parts_Category WHERE Parts_Category.main_category_id = '40') cid 
LEFT JOIN Image ON Parts.image_id = Image.id
JOIN Parts ON Parts.id = cid.parts_id AND Parts.status = 'A'
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Table structure for table Parts
Field Type Null Default
id int(11) No auto_increment
image_id int(11) Yes 0
gallery_id int(11) Yes 0
image_count int(3) Yes 0
promotion_id int(11) Yes 0
country_id int(11) Yes NULL
state_id int(11) Yes NULL
region_id int(11) Yes NULL
city_id int(11) Yes NULL
area_id int(11) Yes NULL
updated datetime Yes 0000-00-00 00:00:00
entered datetime Yes 0000-00-00 00:00:00
renewal_date date Yes 0000-00-00
discount_id varchar(10) Yes NULL
title varchar(100) Yes 
search_title varchar(255) Yes 
warehouse varchar(50) Yes 
url varchar(255) Yes 
display_url varchar(255) Yes 
friendly_url varchar(100) Yes NULL
description varchar(255) Yes 
keywords varchar(1000) Yes NULL
attachment_file varchar(255) Yes 
attachment_caption varchar(255) Yes 
status char(1) Yes 
level tinyint(3) Yes 0
worldwide tinyint(1) Yes 0
random_number int(11) Yes NULL
reminder tinyint(4) Yes NULL
category_search varchar(1000) Yes 
video_snippet varchar(1000) Yes 
importID int(11) Yes 0
Indexes
PRIMARY 518623 id
random_number INDEX 32201 random_number
country_id INDEX 1 country_id
state_id INDEX 8 state_id
region_id INDEX 5 region_id
renewal_date INDEX 1 renewal_date
worldwide INDEX 1 worldwide
friendly_url INDEX 518623 friendly_url
promotion_id INDEX 1 promotion_id
city_id INDEX 1 city_id
area_id INDEX 1 area_id
zip_code INDEX 2790 zip_code
importID INDEX 518623 importID
image_id INDEX 10 image_id
--------------
index_browse_category INDEX 52 
level
status
warehouse 
updated
-----------------
keywords FULLTEXT 1 
description
keywords
category_search 
Parts_Category
id int(11) No auto_increment 
parts_id int(11) No 0 
category_id int(11) No 0 
main_category_id int(10) No 0 
Index
PRIMARY PRIMARY 519330 id
category_id INDEX 519330 category_id
parts_id
main_category_id INDEX 519330 main_category_id
parts_id
asked Jul 22, 2009 at 14:52
4
  • John, I've given a preliminary answer below. In order to fully optimize your query though, please provide the items requested in my Answer. Commented Jul 22, 2009 at 15:29
  • Don't forget the manual! MySQL has a specific page dealing with ORDER BY optimizing: dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html Commented Jul 22, 2009 at 15:38
  • @John: check your composite index. It should be created on (status, level, warehouse, updated), in this order (it's important). Seems it's created wrong way now (first level, then status), and seems it's the case why my query is slow. When posting your query structure, please just run SHOW CREATE TABLE parts and post its output: it will output the CREATE TABLE statement which is easy to copy and paste. Commented Jul 22, 2009 at 20:41
  • @John: to create indexes, don't rely on tools, I've seen cases when the variables get into the index in the wrong order. Just run CREATE INDEX ix_parts_order ON parts (status, level, warehouse, updated) Commented Jul 22, 2009 at 20:43

2 Answers 2

42

Try rewriting your query as this:

SELECT p.*, i.type, i.width, i.height,
 (SELECT name FROM Location_State WHERE id = p.state_id) AS state, 
 (SELECT name FROM Location_Region WHERE id = p.region_id) AS region, 
 (SELECT start_date FROM Promotion WHERE id = p.promotion_id) AS promotion_start_date, 
 (SELECT end_date FROM Promotion WHERE id = p.promotion_id) AS promotion_end_date 
FROM parts p
LEFT JOIN
 image i
ON i.id = p.image_id
WHERE EXISTS (
 SELECT NULL
 FROM Parts_Category pc
 WHERE pc.category_id = '40'
 AND pc.parts_id = p.id
 UNION ALL
 SELECT NULL
 FROM Parts_Category pc
 WHERE pc.main_category_id = '40'
 AND pc.parts_id = p.id
 )
 AND p.status = 'A'
ORDER BY
 p.status DESC, p.level DESC, p.warehouse DESC, p.updated DESC
LIMIT 15

You need the following indexes for this to work efficiently:

parts (status, level, warehouse, updated) -- this one you have
parts_category (category_id, parts_id)
parts_category (main_category_id, parts_id)

Update:

I just created the tables as this:

DROP TABLE IF EXISTS `test`.`image`;
CREATE TABLE `test`.`image` (
 `id` int(11) NOT NULL,
 `type` int(11) NOT NULL,
 `width` int(11) NOT NULL,
 `height` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_region`;
CREATE TABLE `test`.`location_region` (
 `id` int(11) NOT NULL,
 `name` varchar(20) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_state`;
CREATE TABLE `test`.`location_state` (
 `id` int(11) NOT NULL,
 `name` varchar(20) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts`;
CREATE TABLE `test`.`parts` (
 `id` int(11) NOT NULL,
 `status` char(1) NOT NULL,
 `level` int(11) NOT NULL,
 `warehouse` int(11) NOT NULL,
 `updated` int(11) NOT NULL,
 `state_id` int(11) NOT NULL,
 `region_id` int(11) NOT NULL,
 `promotion_id` int(11) NOT NULL,
 `image_id` int(11) NOT NULL DEFAULT '1',
 PRIMARY KEY (`id`),
 KEY `status` (`status`,`level`,`warehouse`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts_category`;
CREATE TABLE `test`.`parts_category` (
 `id` int(11) NOT NULL,
 `parts_id` int(11) NOT NULL,
 `category_id` int(11) NOT NULL,
 `main_category_id` int(11) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `ix_pc_cat_parts` (`category_id`,`parts_id`),
 KEY `ix_pc_main_parts` (`main_category_id`,`parts_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`promotion`;
CREATE TABLE `test`.`promotion` (
 `id` int(11) NOT NULL,
 `start_date` datetime NOT NULL,
 `end_date` datetime NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

and filled them with sample data:

INSERT
INTO parts
SELECT id,
 CASE WHEN RAND() < 0.1 THEN 'A' ELSE 'B' END,
 RAND() * 100,
 RAND() * 100,
 RAND() * 100,
 RAND() * 50,
 RAND() * 50,
 RAND() * 50,
 RAND() * 50
FROM t_source
LIMIT 500000;
INSERT
INTO parts_category
SELECT id,
 id,
 RAND() * 100,
 RAND() * 100
FROM t_source
LIMIT 500000;
INSERT
INTO location_state
SELECT id, CONCAT('State ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO location_region
SELECT id, CONCAT('Region ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO promotion
SELECT id,
 '2009-07-22' - INTERVAL RAND() * 5 - 20 DAY,
 '2009-07-22' - INTERVAL RAND() * 5 DAY
FROM t_source
LIMIT 1000;

The query above runs for 30 milliseconds and yields the following plan:

1, 'PRIMARY', 'p', 'ref', 'status', 'status', '3', 'const', 107408, 'Using where'
1, 'PRIMARY', 'i', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.image_id', 1, ''
6, 'DEPENDENT SUBQUERY', 'pc', 'ref', 'ix_pc_cat_parts', 'ix_pc_cat_parts', '8', 'const,test.p.id', 1, 'Using index'
7, 'DEPENDENT UNION', 'pc', 'ref', 'ix_pc_main_parts', 'ix_pc_main_parts', '8', 'const,test.p.id', 1, 'Using index'
, 'UNION RESULT', '<union6,7>', 'ALL', '', '', '', '', , ''
5, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
4, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
3, 'DEPENDENT SUBQUERY', 'Location_Region', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.region_id', 1, ''
2, 'DEPENDENT SUBQUERY', 'Location_State', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.state_id', 1, ''

As you can see, no temporary, no filesort, everything's very fast.

To help you anymore, I just need to see how your tables are defined.

answered Jul 22, 2009 at 15:04
28
  • Thanks again Quassnoi. Seem to be getting syntax error near Left Join on LEFT JOIN image i ? Commented Jul 22, 2009 at 15:49
  • Could you please provide your exact table definitions, so that I would be able to check syntax? Commented Jul 22, 2009 at 15:53
  • @John: meanwhile, try now, I messed the clause order indeed :) Commented Jul 22, 2009 at 15:54
  • Parts Table Col is just id not parts_id so I changed this line to AND pc.parts_id = p.id, now its working. Changing indexes takes a while so ill be back soon. Commented Jul 22, 2009 at 16:13
  • Explain | PRIMARY | p | ALL | 520,000 | Using where; Using filesort. Query is slower however - seems to be scanning all rows - before it was only scanning the amount equal to categories returned? Commented Jul 22, 2009 at 16:37
1

John, the problem is your query is constructed such that it is selecting from a derived table. The derived table cannot benefit from your indices. Try updating your query as follows:

SELECT
 Parts.*,
 Image.type, Image.width, Image.height,
 Location_State.name AS state,
 Location_Region.name AS region,
 Promotion.start_date AS promotion_start_date,
 Promotion.end_date AS promotion_end_date
FROM Parts
LEFT JOIN Image ON Parts.image_id = Image.id
LEFT JOIN Location_State ON Parts.state_id = Location_State.id
LEFT JOIN Location_Region ON Parts.state_id = Location_Region.id
LEFT JOIN Promotion ON Parts.promotion_id = Promotion.id
INNER JOIN Parts_Category ON (Parts_Category.category_id = 40 OR Parts_Category.main_category_id = 40)
WHERE Parts.status = 'A'
GROUP BY Parts.id
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15

Note, if you don't need LEFT JOIN's for your Location_State, Location_Region, Promotion tables, then use an INNER JOIN instead. It will likely perform better.

To further assist in optimizing this query, please provide the following:

SHOW CREATE TABLE Parts;

If the rewritten query I provided works the same as your example (it should), then also provide:

EXPLAIN <my query here>\G
answered Jul 22, 2009 at 15:28
7
  • 2
    DESC clause is ignored by MySQL. All indexed values are stored in ascending order. Commented Jul 22, 2009 at 15:49
  • Seems you are correct. I never noticed that portion of the MySQL documentation. Thanks Quassnoi. Commented Jul 22, 2009 at 15:53
  • Thanks hobodave, If i use OR for category i would need to then add Group By clause on parts id because the part may be listed in multiple categories so we dont want them showing multiple times in the same results. Group By is evil whne mixed with order by hence the reason we used UNION. Not Union ALL Commented Jul 22, 2009 at 15:54
  • You are correct with the GROUP BY, but can you provide any substance to your statement that "GROUP BY is evil"? The query you presented is inefficient, and highly unlikely to perform better than what I suggest (with an added GROUP BY). Using EXPLAIN on your query will show a terrible mess of DEPENDENT SUBQUERY, derived tables, etc. Commented Jul 22, 2009 at 15:58
  • Hobodave, with this i get a Sytnax error somewhere near * on second line. Commented Jul 22, 2009 at 16:49

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.