I have a browse category query that im trying to optimize. Im ending up with Using temporary; Using filesort in the explain and the query is slow on a category with say 60,000 rows. If i remove the Order By clauses the query runs very fast .05 seconds to do 60,000 rows. With the Order By clauses its very slow around 5 seconds. Parts contains some 500,000 rows as does Parts_Category.
I have a group index on Parts (status, level, warehouse, updated) called sort_index
At the top of the explain I have | ALL | Using temporary; Using filesort
All the other indexes are showing OK. Can somebody please tell me what the problem might be? Im out of ideas. Maybe i should rearrange this query so i can get better performance perhaps?
query.
SELECT Parts.*, Image.type, Image.width, Image.height,
(SELECT name FROM Location_State WHERE id = Parts.state_id) AS state,
(SELECT name FROM Location_Region WHERE id = Parts.region_id) AS region,
(SELECT start_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_start_date,
(SELECT end_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_end_date
FROM ( SELECT parts_id FROM Parts_Category WHERE Parts_Category.category_id = '40'
UNION SELECT parts_id FROM Parts_Category WHERE Parts_Category.main_category_id = '40') cid
LEFT JOIN Image ON Parts.image_id = Image.id
JOIN Parts ON Parts.id = cid.parts_id AND Parts.status = 'A'
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Table structure for table Parts Field Type Null Default id int(11) No auto_increment image_id int(11) Yes 0 gallery_id int(11) Yes 0 image_count int(3) Yes 0 promotion_id int(11) Yes 0 country_id int(11) Yes NULL state_id int(11) Yes NULL region_id int(11) Yes NULL city_id int(11) Yes NULL area_id int(11) Yes NULL updated datetime Yes 0000-00-00 00:00:00 entered datetime Yes 0000-00-00 00:00:00 renewal_date date Yes 0000-00-00 discount_id varchar(10) Yes NULL title varchar(100) Yes search_title varchar(255) Yes warehouse varchar(50) Yes url varchar(255) Yes display_url varchar(255) Yes friendly_url varchar(100) Yes NULL description varchar(255) Yes keywords varchar(1000) Yes NULL attachment_file varchar(255) Yes attachment_caption varchar(255) Yes status char(1) Yes level tinyint(3) Yes 0 worldwide tinyint(1) Yes 0 random_number int(11) Yes NULL reminder tinyint(4) Yes NULL category_search varchar(1000) Yes video_snippet varchar(1000) Yes importID int(11) Yes 0 Indexes PRIMARY 518623 id random_number INDEX 32201 random_number country_id INDEX 1 country_id state_id INDEX 8 state_id region_id INDEX 5 region_id renewal_date INDEX 1 renewal_date worldwide INDEX 1 worldwide friendly_url INDEX 518623 friendly_url promotion_id INDEX 1 promotion_id city_id INDEX 1 city_id area_id INDEX 1 area_id zip_code INDEX 2790 zip_code importID INDEX 518623 importID image_id INDEX 10 image_id -------------- index_browse_category INDEX 52 level status warehouse updated ----------------- keywords FULLTEXT 1 description keywords category_search Parts_Category id int(11) No auto_increment parts_id int(11) No 0 category_id int(11) No 0 main_category_id int(10) No 0 Index PRIMARY PRIMARY 519330 id category_id INDEX 519330 category_id parts_id main_category_id INDEX 519330 main_category_id parts_id
2 Answers 2
Try rewriting your query as this:
SELECT p.*, i.type, i.width, i.height,
(SELECT name FROM Location_State WHERE id = p.state_id) AS state,
(SELECT name FROM Location_Region WHERE id = p.region_id) AS region,
(SELECT start_date FROM Promotion WHERE id = p.promotion_id) AS promotion_start_date,
(SELECT end_date FROM Promotion WHERE id = p.promotion_id) AS promotion_end_date
FROM parts p
LEFT JOIN
image i
ON i.id = p.image_id
WHERE EXISTS (
SELECT NULL
FROM Parts_Category pc
WHERE pc.category_id = '40'
AND pc.parts_id = p.id
UNION ALL
SELECT NULL
FROM Parts_Category pc
WHERE pc.main_category_id = '40'
AND pc.parts_id = p.id
)
AND p.status = 'A'
ORDER BY
p.status DESC, p.level DESC, p.warehouse DESC, p.updated DESC
LIMIT 15
You need the following indexes for this to work efficiently:
parts (status, level, warehouse, updated) -- this one you have
parts_category (category_id, parts_id)
parts_category (main_category_id, parts_id)
Update:
I just created the tables as this:
DROP TABLE IF EXISTS `test`.`image`;
CREATE TABLE `test`.`image` (
`id` int(11) NOT NULL,
`type` int(11) NOT NULL,
`width` int(11) NOT NULL,
`height` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_region`;
CREATE TABLE `test`.`location_region` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`location_state`;
CREATE TABLE `test`.`location_state` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts`;
CREATE TABLE `test`.`parts` (
`id` int(11) NOT NULL,
`status` char(1) NOT NULL,
`level` int(11) NOT NULL,
`warehouse` int(11) NOT NULL,
`updated` int(11) NOT NULL,
`state_id` int(11) NOT NULL,
`region_id` int(11) NOT NULL,
`promotion_id` int(11) NOT NULL,
`image_id` int(11) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`),
KEY `status` (`status`,`level`,`warehouse`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`parts_category`;
CREATE TABLE `test`.`parts_category` (
`id` int(11) NOT NULL,
`parts_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL,
`main_category_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `ix_pc_cat_parts` (`category_id`,`parts_id`),
KEY `ix_pc_main_parts` (`main_category_id`,`parts_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `test`.`promotion`;
CREATE TABLE `test`.`promotion` (
`id` int(11) NOT NULL,
`start_date` datetime NOT NULL,
`end_date` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and filled them with sample data:
INSERT
INTO parts
SELECT id,
CASE WHEN RAND() < 0.1 THEN 'A' ELSE 'B' END,
RAND() * 100,
RAND() * 100,
RAND() * 100,
RAND() * 50,
RAND() * 50,
RAND() * 50,
RAND() * 50
FROM t_source
LIMIT 500000;
INSERT
INTO parts_category
SELECT id,
id,
RAND() * 100,
RAND() * 100
FROM t_source
LIMIT 500000;
INSERT
INTO location_state
SELECT id, CONCAT('State ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO location_region
SELECT id, CONCAT('Region ', id)
FROM t_source
LIMIT 1000;
INSERT
INTO promotion
SELECT id,
'2009-07-22' - INTERVAL RAND() * 5 - 20 DAY,
'2009-07-22' - INTERVAL RAND() * 5 DAY
FROM t_source
LIMIT 1000;
The query above runs for 30 milliseconds
and yields the following plan:
1, 'PRIMARY', 'p', 'ref', 'status', 'status', '3', 'const', 107408, 'Using where'
1, 'PRIMARY', 'i', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.image_id', 1, ''
6, 'DEPENDENT SUBQUERY', 'pc', 'ref', 'ix_pc_cat_parts', 'ix_pc_cat_parts', '8', 'const,test.p.id', 1, 'Using index'
7, 'DEPENDENT UNION', 'pc', 'ref', 'ix_pc_main_parts', 'ix_pc_main_parts', '8', 'const,test.p.id', 1, 'Using index'
, 'UNION RESULT', '<union6,7>', 'ALL', '', '', '', '', , ''
5, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
4, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
3, 'DEPENDENT SUBQUERY', 'Location_Region', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.region_id', 1, ''
2, 'DEPENDENT SUBQUERY', 'Location_State', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.state_id', 1, ''
As you can see, no temporary
, no filesort
, everything's very fast.
To help you anymore, I just need to see how your tables are defined.
-
Thanks again Quassnoi. Seem to be getting syntax error near Left Join on LEFT JOIN image i ?gus– gus07/22/2009 15:49:55Commented Jul 22, 2009 at 15:49
-
Could you please provide your exact table definitions, so that I would be able to check syntax?Quassnoi– Quassnoi07/22/2009 15:53:26Commented Jul 22, 2009 at 15:53
-
@John: meanwhile, try now, I messed the clause order indeed :)Quassnoi– Quassnoi07/22/2009 15:54:39Commented Jul 22, 2009 at 15:54
-
Parts Table Col is just id not parts_id so I changed this line to AND pc.parts_id = p.id, now its working. Changing indexes takes a while so ill be back soon.gus– gus07/22/2009 16:13:16Commented Jul 22, 2009 at 16:13
-
Explain | PRIMARY | p | ALL | 520,000 | Using where; Using filesort. Query is slower however - seems to be scanning all rows - before it was only scanning the amount equal to categories returned?gus– gus07/22/2009 16:37:39Commented Jul 22, 2009 at 16:37
John, the problem is your query is constructed such that it is selecting from a derived table. The derived table cannot benefit from your indices. Try updating your query as follows:
SELECT
Parts.*,
Image.type, Image.width, Image.height,
Location_State.name AS state,
Location_Region.name AS region,
Promotion.start_date AS promotion_start_date,
Promotion.end_date AS promotion_end_date
FROM Parts
LEFT JOIN Image ON Parts.image_id = Image.id
LEFT JOIN Location_State ON Parts.state_id = Location_State.id
LEFT JOIN Location_Region ON Parts.state_id = Location_Region.id
LEFT JOIN Promotion ON Parts.promotion_id = Promotion.id
INNER JOIN Parts_Category ON (Parts_Category.category_id = 40 OR Parts_Category.main_category_id = 40)
WHERE Parts.status = 'A'
GROUP BY Parts.id
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Note, if you don't need LEFT JOIN's for your Location_State, Location_Region, Promotion tables, then use an INNER JOIN instead. It will likely perform better.
To further assist in optimizing this query, please provide the following:
SHOW CREATE TABLE Parts;
If the rewritten query I provided works the same as your example (it should), then also provide:
EXPLAIN <my query here>\G
-
2
DESC
clause is ignored byMySQL
. All indexed values are stored in ascending order.Quassnoi– Quassnoi07/22/2009 15:49:26Commented Jul 22, 2009 at 15:49 -
Seems you are correct. I never noticed that portion of the MySQL documentation. Thanks Quassnoi.hobodave– hobodave07/22/2009 15:53:58Commented Jul 22, 2009 at 15:53
-
Thanks hobodave, If i use OR for category i would need to then add Group By clause on parts id because the part may be listed in multiple categories so we dont want them showing multiple times in the same results. Group By is evil whne mixed with order by hence the reason we used UNION. Not Union ALLgus– gus07/22/2009 15:54:16Commented Jul 22, 2009 at 15:54
-
You are correct with the GROUP BY, but can you provide any substance to your statement that "GROUP BY is evil"? The query you presented is inefficient, and highly unlikely to perform better than what I suggest (with an added GROUP BY). Using EXPLAIN on your query will show a terrible mess of DEPENDENT SUBQUERY, derived tables, etc.hobodave– hobodave07/22/2009 15:58:40Commented Jul 22, 2009 at 15:58
-
Hobodave, with this i get a Sytnax error somewhere near * on second line.gus– gus07/22/2009 16:49:22Commented Jul 22, 2009 at 16:49
Explore related questions
See similar questions with these tags.
(status, level, warehouse, updated)
, in this order (it's important). Seems it's created wrong way now (firstlevel
, thenstatus
), and seems it's the case why my query is slow. When posting your query structure, please just runSHOW CREATE TABLE parts
and post its output: it will output theCREATE TABLE
statement which is easy to copy and paste.CREATE INDEX ix_parts_order ON parts (status, level, warehouse, updated)