I have an INNODB table levels
:
+--------------------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------------------+--------------+------+-----+---------+-------+ | id | int(9) | NO | PRI | NULL | | | level_name | varchar(20) | NO | | NULL | | | user_id | int(10) | NO | | NULL | | | user_name | varchar(45) | NO | | NULL | | | rating | decimal(5,4) | NO | | 0.0000 | | | votes | int(5) | NO | | 0 | | | plays | int(5) | NO | | 0 | | | date_published | date | NO | MUL | NULL | | | user_comment | varchar(255) | NO | | NULL | | | playable_character | int(2) | NO | | 1 | | | is_featured | tinyint(1) | NO | MUL | 0 | | +--------------------+--------------+------+-----+---------+-------+
There are ~4 million rows. Because of the front-end functionality, I need to query this table with a variety of filters and sorts. They are on playable_character
, rating
, plays
, and date_published
. The date_published
can be filtered to show by the last day, week, month, or anytime(last 3 years). There's also paging. So, depending on the user choices, the queries can look, for example, like one of these:
SELECT * FROM levels
WHERE playable_character = 0 AND
date_published BETWEEN date_sub(now(), INTERVAL 3 YEAR) AND now()
ORDER BY date_published DESC
LIMIT 0, 1000;
SELECT * FROM levels
WHERE playable_character = 4 AND
date_published BETWEEN date_sub(now(), INTERVAL 1 WEEK) AND now()
ORDER BY rating DESC
LIMIT 4000, 1000;
SELECT * FROM levels
WHERE playable_character = 5 AND
date_published BETWEEN date_sub(now(), INTERVAL 1 MONTH) AND now()
ORDER BY plays DESC
LIMIT 1000, 1000;
I should add that rating
and plays
are always queried as DESC
. Only date_published
may be either DESC
or ASC
.
I started out with an index idx_date_char(date_published, playable_character)
that worked great on the first example query here. Based on some other answers, I changed to two other indexes (date_published, playable_character, plays) and (date_published, playable_character, rating).
The first query still runs very fast, however there's some unusual things happening in EXPLAIN, when player_character = x exceeds a certain number of rows (~700,000): the USING WHERE pops on in EXPLAIN.
So, first question is are there any improvements in the query or indexes possible, and, second, what MySQL settings should get altered to allow for the large result sets.
Any suggestions greatly appreciated. TIA.
-
I sure would like to see the "explains" and "show indexes from levels" output when the unusual things happen.. looks like the deep scanning (larger start value in LIMIT) triggers the use off an temporary table and using filesort.Raymond Nijland– Raymond Nijland2013年10月18日 17:39:04 +00:00Commented Oct 18, 2013 at 17:39
-
Also how slow run other queries? How much time is needed to get the results?Mindaugas Riauba– Mindaugas Riauba2013年10月18日 17:57:38 +00:00Commented Oct 18, 2013 at 17:57
-
1Isn't this a duplicate of this question?: stackoverflow.com/questions/19385610/…ypercubeᵀᴹ– ypercubeᵀᴹ2013年10月18日 17:59:09 +00:00Commented Oct 18, 2013 at 17:59
-
@ypercube sure looks like an duplicate i thought i had an Déjà vu but i could not find the topic on stackoverflow annymoreRaymond Nijland– Raymond Nijland2013年10月18日 18:11:08 +00:00Commented Oct 18, 2013 at 18:11
-
@Hal50000 If your explain shows "using filesort" i think this blogs.oracle.com/realneel/entry/... could be your problem note that article is old so most likly the filesort will be hinted now with explain hints to get an estimate_rows_upper_bound() see sqlfiddle.com/#!2/4f56a/2, i think filesort gets 30 as rows there.. i also think i need to track down the estimate_rows_upper_bound() in the source code to get an better explainment this is thoery.Raymond Nijland– Raymond Nijland2013年10月19日 09:48:31 +00:00Commented Oct 19, 2013 at 9:48
3 Answers 3
WHERE playable_character = 0 AND
date_published BETWEEN date_sub(now(), INTERVAL 3 YEAR) AND now()
Start with the "=" item, then do the range:
INDEX(playable_character, date_published);
"Pagination", a la ORDER BY rating DESC LIMIT 4000, 1000;
is best done by remember where you "left off". That way, you don't have scan over the 4000 records that you don't need.
create both indexes for better performance if not much of insert or update query:
ADD INDEX (playable_character, date_published, rating desc)
ADD INDEX (playable_character, date_published, plays desc)
otherwise if insert and update query are more using on this table than create index:
ADD INDEX (playable_character, date_published)
or don't create any index
Everybody here are missing one point - SQLs in question are retrieving 1000 rows. And one cannot retrieve 1000 rows fast unless most of the data is cached. If data is not cached one has to do 1000 random reads sequentially to retrieve the data.
Good disk based storage with fast disks will give you up to ~200 reads per second. Common disks even in RAID I doubt that manage even 100. That means 10+ seconds to get results even with the best indexes.
So in the longer run such data model and queries won't work. Now the queries run fast when you hit cached data.
-
Even with covering indexes?ypercubeᵀᴹ– ypercubeᵀᴹ2013年10月18日 19:23:23 +00:00Commented Oct 18, 2013 at 19:23
-
That's one of the possible ideas how to reorganize the data.Mindaugas Riauba– Mindaugas Riauba2013年10月18日 19:26:15 +00:00Commented Oct 18, 2013 at 19:26
-
Covering indexes also give best results when they are cached in the buffer pool. Otherwise, you have to expect to wait for disk I/O.Bill Karwin– Bill Karwin2013年11月17日 21:46:24 +00:00Commented Nov 17, 2013 at 21:46
-
And for what it's worth, SSD-based systems with thousands of IOPS are becoming mainstream for RDBMS servers.Bill Karwin– Bill Karwin2013年11月17日 21:47:21 +00:00Commented Nov 17, 2013 at 21:47
-
1There are many flavors of good/bad indexing -- from a covering index to a table scan. In between are optimal non-covering index, sub-optimal index, range-scan, unnecessary joins, etc. Fetching 1000 rows may take zero I/O (good index and good caching), all the way to more than 1000 disk hits (huge table, no useful index).Rick James– Rick James2016年10月25日 17:48:12 +00:00Commented Oct 25, 2016 at 17:48
Explore related questions
See similar questions with these tags.