1

I got issues with some geo IP query. So basically here is create code of table:

CREATE TABLE `geo_ip_city` (
 `id` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
 `begin` BIGINT(20) NOT NULL DEFAULT '0',
 `end` BIGINT(20) NOT NULL DEFAULT '0',
 `code` VARCHAR(2) NOT NULL DEFAULT '',
 `city` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
 PRIMARY KEY (`id`),
 INDEX `begin` (`begin`),
 INDEX `end` (`end`),
 INDEX `code` (`code`)
)

And query:

SELECT * FROM `geo_ip_city` USE INDEX ( `end` ) WHERE `begin` <= 2523596988 AND `end` >= 2523596988 LIMIT 1

Table size is about 4.5m records.

This query runs when all other criteria doesn't work, but still on peak times quite often.

In worst case scenario (where nothing found) it executes for 1.9444 seconds.

Here is explain:

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE geo_ip_city range end end 8 NULL 2297506 Using index condition; Using where

So when i have alot of requests for this feature, my servers going crazy.

Question:

Can I do something with this table or query to increase performance (maybe partitioning or complex keys)? Or should I watch in some other way?

asked Nov 24, 2014 at 10:48
3
  • Can I ask why you have LIMIT 1? If you remove it, is there a chance that you get more than 1 row? Commented Nov 24, 2014 at 11:50
  • In other words, are the intervals (begin, end) overlapping or not? Commented Nov 24, 2014 at 11:55
  • No they aren't. LIMIT 1 here to prevent scanning table if result already found. Commented Nov 24, 2014 at 12:19

3 Answers 3

2

If you are 100% sure that the intervals (begin, end) are never going to be overlapping, you can use this query, which only needs an index on (begin) or (begin, end) and will be much more efficient than what you have:

SELECT t.*
FROM 
 ( SELECT g.* 
 FROM geo_ip_city AS g
 WHERE g.begin <= 2523596988 
 ORDER BY g.begin DESC 
 -- ORDER BY g.begin DESC, g.end DESC
 LIMIT 1
 ) AS t
WHERE t.end >= 2523596988 ;

The only problem with the above is that this constraint (non-overlapping intervals) is not enforced by the database. Postgres has a nice EXCLUDE feature which can be used for exactly such a constraint - but MySQL cannot do this via DDL alone. Your application or procedures have to enforce it.

So, if by accident two or more overlapping intervals have been inserted into the table, the query may return 0 rows - while a match actually exists.

answered Nov 24, 2014 at 12:42
1
  • Wow. Thats incredible fast. Lemme check on real data. Intervals can't overlap, and if they will it's issue anyway. So data consistency won't suffer. Commented Nov 24, 2014 at 13:09
2

Your conditions are searching for ranges. Now imagine you're looking in a dictionary and looking up all words where the first letter is "greater" than A. Of how much use is the index? You want to narrow the search range down as much as possible. MySQL most of the time can only use one index per table. Combine those begin and end indexes.

CREATE TABLE `geo_ip_city` (
 `id` INT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
 `begin` BIGINT(20) NOT NULL DEFAULT '0',
 `end` BIGINT(20) NOT NULL DEFAULT '0',
 `code` VARCHAR(2) NOT NULL DEFAULT '',
 `city` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
 PRIMARY KEY (`id`),
 INDEX `idx_begin_end` (`begin`, `end`)
);

I also removed the index on code because it's also pretty much useless. It consists of two letters. Assuming there are no special characters/symbols included, this leaves you with 26*26=676 possible values for this column. That's a selectivity of 676 / 4.500.000 = 0.0001. You want to have the selectivity as near to 1 as possible.

answered Nov 24, 2014 at 11:13
2
  • I will try it, actually i thought about it. Indexing will take alot, so i needed to check this. I will try to do it and will response here after that. Thank you. Commented Nov 24, 2014 at 12:21
  • Tried, added complex index. Query now executes about 2 seconds, so no win. Commented Nov 24, 2014 at 13:08
1

if you trying to use the same query then it could be optimized adding composite index to geo_ip_city

the new index could be

ALTER TABLE geo_ip_city ADD INDEX ind_begin_end (begin, end)

with this you would need to remove you use index in the select statment to be

SELECT * FROM geo_ip_city WHERE begin <= 2523596988 AND end >= 2523596988 LIMIT 1

chx
4275 silver badges14 bronze badges
answered Nov 24, 2014 at 11:34
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.