1

There is a table have 2 indexes, called state and CATEGORY, which has a very low cardinality respectively 4 and 24 within 7,110,590 rows in the table.

When running query of select statements that includes above 2 indexes in where clause, an optimizer try to index merge, Using intersect(state,CATEGORY), that is less efficient than even table scan, takes about 20 sec.

Is there any way to except a index if the cardinality of the index is lower than a specific directed number when an optimizer is about to make a execution plan?

The database is MariaDB-1:10.6.11 Community version

But other databases which has lower version than previous mentioned database server, 10.3.28-MariaDB, handles the same query in 5ms. Servers of older version run a table scan using high cardinality index(date). All mentioned servers are same replicas of a source database with slightly different versions. Depends on difference of their version, their optimizers create query that have a lot of difference in performance.

What factors could bring that difference?


Full explain extended in newer version having low performance

id 1
select_type SIMPLE
table that_table
type index_merge
possible_keys CATEGORY,state
key state,CATEGORY
key_len 6,152
ref NULL
rows 1778388
filtered 100
Extra
Using intersect(state,CATEGORY); Using where; Using filesort

Full explain extended in old version having good performance

id 1
select_type SIMPLE
table that_table
type index
possible_keys CATEGORY,state
key DATE
key_len 62
ref NULL
rows 18
filtered 100
Extra Using where

Below is indexes on the table

# cardinality: 24, `CATEGORY` VARCHAR(50) DEFAULT '' NOT 
CREATE INDEX CATEGORY ON that_table (CATEGORY) ; 
# cardinality: 4, `state` VARCHAR(1) DEFAULT 'Y' NULL;
CREATE INDEX state ON that_table (state);
# cardinality: 7,110,590, `DATE` VARCHAR(20) DEFAULT '' NOT NULL;
CREATE INDEX DATE ON that_table (DATE);

The problematic query:

EXPLAIN EXTENDED
SELECT *
FROM that_table
WHERE category = 'blabla'
 AND state = 'Y'
ORDER BY date DESC
LIMIT 9;

A definition of table

CREATE TABLE `that_table` (
 `UID` int(10) NOT NULL AUTO_INCREMENT,
 `state` varchar(1) DEFAULT 'Y',
 `CATEGORY` varchar(50) NOT NULL DEFAULT '',
 `DATE` varchar(20) NOT NULL DEFAULT '',
 # .... other columns
 
 PRIMARY KEY (`UID`),
 KEY `DATE` (`DATE`),
 KEY `CATEGORY` (`CATEGORY`),
 KEY `state` (`state`),
 # ... other indexes
);
asked Feb 2, 2023 at 9:27
6
  • "which has a very low cardinality(4,24) with 7110590 rows" - What does this mean?... cardinality is the number of rows, in this context in the index. Are you saying the cardinality is 7,110,590 rows (which is not exactly small) or are you saying the cardinality is 4 rows for one index and 24 rows for the other index? Commented Feb 2, 2023 at 13:14
  • A latter is what I meant Editing a question, because it is confusing and ambiguous Commented Feb 2, 2023 at 15:06
  • "When running query of select statements that includes above 2 indexes in where clause, an optimizer try to index merge, Using intersect(state,CATEGORY), that is less efficient than even table scan, takes about 20 sec." - Yea, that's a sign something bigger is likely wrong with either your query, database, MariaDB instance, or the server itself. An index merge between two indexes that are that tiny should never take more than 1 second, let alone 20 seconds, even on the worst hardware. I doubt your problem is the operation the optimizer is choosing, itself. Commented Feb 2, 2023 at 16:20
  • Btw you should add the index definitions to your Post. How do you know what the cardinality is of each index? Commented Feb 2, 2023 at 16:22
  • 1
    It has been added definitions of indexes Commented Feb 3, 2023 at 1:29

1 Answer 1

0

"Using intersect(state,CATEGORY)" almost always inplies that you have single-column INDEXes on each of those two columns.

Replace INDEX(state) with INDEX(state, category).

That will get rid of that Using intersect and speed up the query. (Using union is a different story.)

If you need further discussion, please provide SHOW CREATE TABLE and the full EXPLAIN SELECT ...

Meanwhile, don't bother to turn off "index merge" or use "ignore index"; they are usually better than nothing.

More

WHERE category = 'blabla'
 AND state = 'Y'
ORDER BY date DESC
LIMIT 9;

That would benefit even more from

INDEX(category, state, -- either order is OK
 date) -- last

More More

Different servers (even with the same MariaDB version) will sometimes happen to have different "statistics". This can lead to different query plans, hence the differences you are seeing.

My 3-column index, plus removing any index that is a prefix of that (namely `KEY(category)) is very likely to be picked and used effectively on all Replicas.

It is risky to try to explicitly direct the Optimizer (or use trickery) into using one plan over another. With a different state or category value or date range, and with different statistics, the attempt could backfire.

answered Feb 2, 2023 at 19:28
3
  • Thx It could be helpful, just added for more information Commented Feb 3, 2023 at 3:11
  • @soredive - I added more. Commented Feb 3, 2023 at 5:03
  • thank you - I added more questions in the post, Could you kindly give an opinion about that Commented Feb 3, 2023 at 6:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.