I have a table with several columns, say a, b, c, d, that should be searchable. The problem arises when I need to search a, b, c separately from d (and vice versa). AFAIK, there's no way to achieve this using one composite fulltext index on all columns, so I create two separate indexes like this:
CREATE FULLTEXT INDEX idx1 ON content (a, b, c);
CREATE FULLTEXT INDEX idx2 ON content (d);
Now I can search either one successfully. To search both at once, I would use the following query:
SELECT * FROM content
WHERE MATCH(a, b, c) AGAINST ('keyword')
AND MATCH(d) AGAINST ('keyword');
EXPLAIN tells me this:
+----+-------------+---------+----------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+----------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | content | fulltext | idx1,idx2 | idx1 | 0 | | 1 | Using where |
+----+-------------+---------+----------+---------------+------+---------+------+------+-------------+
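Great! So the indexes are in play (both are listed under possible_keys, with idx1 chosen as the key), but the query returns only rows where keyword is present in both column groups, and I need rows where it is in either one. So I change AND to OR, and now EXPLAIN says: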
+----+-------------+---------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | content | ALL | NULL | NULL | NULL | NULL | 128 | Using where |
+----+-------------+---------+------+---------------+------+---------+------+------+-------------+
What? Suddenly, it's doing a full table scan. Why is this happening? What would be the best way to avoid this?
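For reference, the OR variant is not shown in the post. Assuming it is simply the earlier query with the operator swapped, it would look like this reconstruction:

```sql
-- Reconstructed OR variant (assumed: identical to the AND query except for the operator).
-- This is the form for which EXPLAIN reports type = ALL, i.e. a full table scan.
SELECT * FROM content
WHERE MATCH(a, b, c) AGAINST ('keyword')
   OR MATCH(d) AGAINST ('keyword');
```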
2 Answers
Unfortunately, this is how the MySQL Query Optimizer treats FULLTEXT indexes. When a MATCH clause is the only clause in the WHERE, the index will be used. When it is combined with other conditions (with AND or, as in your query, OR), the index may easily get overlooked.
I wrote about this behavior before in "Mysql fulltext search my.cnf optimization".
SUGGESTION: Rewrite the query as the union of two FULLTEXT searches:
SELECT * FROM content
WHERE MATCH(a, b, c) AGAINST ('keyword')
UNION
SELECT * FROM content
WHERE MATCH(d) AGAINST ('keyword');
GIVE IT A TRY !!!
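One possible variation on the UNION above is to union only a key column and join back for the full rows, so that de-duplication compares small values rather than whole rows. This is a sketch only: it assumes the table has a primary key named id, which the question does not state, and the alias names t and hits are illustrative.

```sql
-- Assumption: content has a primary key column named id (hypothetical).
-- Each branch contains a single MATCH, so each can use its own FULLTEXT index.
SELECT t.*
FROM content AS t
JOIN (
    SELECT id FROM content WHERE MATCH(a, b, c) AGAINST ('keyword')
    UNION  -- UNION (not UNION ALL) drops ids matched by both searches
    SELECT id FROM content WHERE MATCH(d) AGAINST ('keyword')
) AS hits ON hits.id = t.id;
```

With UNION ALL instead of UNION, a row matching in both column groups would be returned twice.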
Thanks! Indeed, this gives me the desired result. However, when run with EXPLAIN, it now shows 3 actions instead of 1. Does this mean it is actually doing more operations, or does it come down to one lookup internally? – bobo, Jul 5, 2015 at 20:12
Yes it does. It should be 3 operations: 1 resultset from MATCH(a, b, c), 1 resultset from MATCH(d), and a merger of the two resultsets. – RolandoMySQLDBA, Jul 5, 2015 at 20:22
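The three EXPLAIN rows discussed in the comments above can be reproduced by prefixing the UNION query with EXPLAIN. The select_type values in the comments below are what MySQL normally reports for a two-branch UNION. Which index each branch picks is not shown here, since that depends on the actual table.

```sql
-- EXPLAIN on the UNION form normally lists three rows:
--   select_type PRIMARY      : the MATCH(a, b, c) search
--   select_type UNION        : the MATCH(d) search
--   select_type UNION RESULT : merging and de-duplicating the two result sets
EXPLAIN
SELECT * FROM content
WHERE MATCH(a, b, c) AGAINST ('keyword')
UNION
SELECT * FROM content
WHERE MATCH(d) AGAINST ('keyword');
```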
It's unclear what you want to do.
If you want to see rows with keyword in both d and somewhere in a, b, c, then your AND is a good way to go.
But if you want keyword to be in any of a, b, c, d, then add a third index
FULLTEXT(a,b,c,d)
and change to
MATCH(a,b,c,d) AGAINST('keyword')
For further discussion please specify whether you are using InnoDB or MyISAM; they work differently.
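Spelled out as statements, the third-index suggestion might look like the sketch below. The index name idx3 is illustrative, chosen only to follow the idx1/idx2 naming from the question.

```sql
-- Hypothetical third index covering all four columns (the name idx3 is illustrative).
CREATE FULLTEXT INDEX idx3 ON content (a, b, c, d);

-- The MATCH column list names exactly the columns of that index,
-- so one FULLTEXT lookup covers "keyword in any of a, b, c, d".
SELECT * FROM content
WHERE MATCH(a, b, c, d) AGAINST ('keyword');
```

The cost is a third index to maintain. The benefit is a single MATCH in the WHERE clause, which is the case the optimizer handles with an index.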
This way I'll have 3 indexes, one of which is completely redundant. Plus, on every INSERT, MySQL would have to rebuild all three of them. That seems like too many extra operations for a simple search feature. Using MyISAM. – bobo, Jul 7, 2015 at 17:48
MySQL does not "rebuild" indexes on each INSERT; it just augments the indexes. This is cheap enough not to be a big issue. I think MyISAM does not need the third FULLTEXT, but InnoDB does, hence my question. – Rick James, Jul 7, 2015 at 19:05
That's nice to know, thanks. It doesn't absolutely need the third FULLTEXT, but in that case it's doing a full table scan, which I'm trying to avoid. My original question was why MySQL treats AND so differently from OR in a single query, when internally it should be implemented the same way. There doesn't seem to be much information online about this behavior, except for the question here, which basically says "OR is not optimized, while AND is". – bobo, Jul 7, 2015 at 22:39
Yes. AND can often be optimized, often with a different index. OR is rarely optimizable. Often UNION is a trick to optimize it, as Rolando points out. – Rick James, Jul 8, 2015 at 3:44