I got issues with some geo IP query. So basically here is create code of table:
CREATE TABLE `geo_ip_city` (
`id` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
`begin` BIGINT(20) NOT NULL DEFAULT '0',
`end` BIGINT(20) NOT NULL DEFAULT '0',
`code` VARCHAR(2) NOT NULL DEFAULT '',
`city` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
INDEX `begin` (`begin`),
INDEX `end` (`end`),
INDEX `code` (`code`)
)
And query:
SELECT * FROM `geo_ip_city` USE INDEX ( `end` ) WHERE `begin` <= 2523596988 AND `end` >= 2523596988 LIMIT 1
Table size is about 4.5m records.
This query runs when all other criteria doesn't work, but still on peak times quite often.
In worst case scenario (where nothing found) it executes for 1.9444 seconds.
Here is explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE geo_ip_city range end end 8 NULL 2297506 Using index condition; Using where
So when i have alot of requests for this feature, my servers going crazy.
Question:
Can I do something with this table or query to increase performance (maybe partitioning or complex keys)? Or should I watch in some other way?
3 Answers 3
If you are 100% sure that the intervals (begin, end)
are never going to be overlapping, you can use this query, which only needs an index on (begin)
or (begin, end)
and will be much more efficient than what you have:
SELECT t.*
FROM
( SELECT g.*
FROM geo_ip_city AS g
WHERE g.begin <= 2523596988
ORDER BY g.begin DESC
-- ORDER BY g.begin DESC, g.end DESC
LIMIT 1
) AS t
WHERE t.end >= 2523596988 ;
The only problem with the above is that this constraint (non-overlapping intervals) is not enforced by the database. Postgres has a nice EXCLUDE
feature which can be used for exactly such a constraint - but MySQL cannot do this via DDL alone. Your application or procedures have to enforce it.
So, if by accident two or more overlapping intervals have been inserted into the table, the query may return 0 rows - while a match actually exists.
-
Wow. Thats incredible fast. Lemme check on real data. Intervals can't overlap, and if they will it's issue anyway. So data consistency won't suffer.ineersa– ineersa2014年11月24日 13:09:56 +00:00Commented Nov 24, 2014 at 13:09
Your conditions are searching for ranges. Now imagine you're looking in a dictionary and looking up all words where the first letter is "greater" than A
. Of how much use is the index? You want to narrow the search range down as much as possible. MySQL most of the time can only use one index per table. Combine those begin
and end
indexes.
CREATE TABLE `geo_ip_city` (
`id` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
`begin` BIGINT(20) NOT NULL DEFAULT '0',
`end` BIGINT(20) NOT NULL DEFAULT '0',
`code` VARCHAR(2) NOT NULL DEFAULT '',
`city` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
INDEX `idx_begin_end` (`begin`, `end`)
);
I also removed the index on code
because it's also pretty much useless. It consists of two letters. Assuming there are no special characters/symbols included, this leaves you with 26*26=676 possible values for this column. That's a selectivity of 676 / 4.500.000 = 0.0001. You want to have the selectivity as near to 1 as possible.
-
I will try it, actually i thought about it. Indexing will take alot, so i needed to check this. I will try to do it and will response here after that. Thank you.ineersa– ineersa2014年11月24日 12:21:46 +00:00Commented Nov 24, 2014 at 12:21
-
Tried, added complex index. Query now executes about 2 seconds, so no win.ineersa– ineersa2014年11月24日 13:08:31 +00:00Commented Nov 24, 2014 at 13:08
if you trying to use the same query then it could be optimized adding composite index to geo_ip_city
the new index could be
ALTER TABLE geo_ip_city
ADD INDEX ind_begin_end (begin, end)
with this you would need to remove you use index in the select statment to be
SELECT * FROM geo_ip_city WHERE begin <= 2523596988 AND end >= 2523596988 LIMIT 1
Explore related questions
See similar questions with these tags.
LIMIT 1
? If you remove it, is there a chance that you get more than 1 row?(begin, end)
overlapping or not?LIMIT 1
here to prevent scanning table if result already found.