I've created a view:
create view dh_hits_new_all_old
as
select * from dh_hits_new_201806
union all
select * from dh_hits_new_201807;
When I select something with a where clause containing a field which has an index on the individual tables, MySQL doesn't use any indexes. See the explain outputs:
mysql> explain select count(*) from dh_hits_new_201806 where id_hits_url = 130442;
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | dh_hits_new_201806 | NULL | ref | dh_idx-hits_new-id_hits_url | dh_idx-hits_new-id_hits_url | 4 | const | 4453 | 100.00 | Using index |
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0,00 sec)
mysql> explain select count(*) from dh_hits_new_201807 where id_hits_url = 130442;
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | dh_hits_new_201807 | NULL | ref | dh_idx-hits_new-id_hits_url | dh_idx-hits_new-id_hits_url | 4 | const | 4009 | 100.00 | Using index |
+----+-------------+--------------------+------------+------+-----------------------------+-----------------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0,00 sec)
mysql> explain select count(*) from dh_hits_new_all_old where id_hits_url = 130442;
+----+-------------+--------------------+------------+------+---------------+-------------+---------+-------+-------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------+------------+------+---------------+-------------+---------+-------+-------+----------+-------+
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 4 | const | 10 | 100.00 | NULL |
| 2 | DERIVED | dh_hits_new_201806 | NULL | ALL | NULL | NULL | NULL | NULL | 65658 | 100.00 | NULL |
| 3 | UNION | dh_hits_new_201807 | NULL | ALL | NULL | NULL | NULL | NULL | 61552 | 100.00 | NULL |
+----+-------------+--------------------+------------+------+---------------+-------------+---------+-------+-------+----------+-------+
3 rows in set, 1 warning (0,00 sec)
I do not understand why. This behavior is reflected of course also in the execution times:
mysql> select count(*) from dh_hits_new_201806 where id_hits_url = 130442;
+----------+
| count(*) |
+----------+
| 4453 |
+----------+
1 row in set (0,01 sec)
mysql> select count(*) from dh_hits_new_201807 where id_hits_url = 130442;
+----------+
| count(*) |
+----------+
| 4009 |
+----------+
1 row in set (0,00 sec)
mysql> select count(*) from dh_hits_new_all_old where id_hits_url = 130442;
+----------+
| count(*) |
+----------+
| 8462 |
+----------+
1 row in set (2,33 sec)
2 Answers 2
VIEWs
are syntactic sugar. They probably never improve performance and sometimes hurt performance.
Your query is effectively:
SELECT COUNT(*)
FROM (
select * from dh_hits_new_201806
union all
select * from dh_hits_new_201807
) AS x
WHERE id_hits_url = 130442;
The Optimizer is not very smart; it cannot transform that into this, which would run faster:
SELECT
( SELECT COUNT(*) from dh_hits_new_201806 WHERE id_hits_url = 130442 ) +
( SELECT COUNT(*) from dh_hits_new_201807 WHERE id_hits_url = 130442 );
Perhaps a better solution is to avoid the clumsy pattern wherein you have multiple tables that logically combine to make a single one. That is, have only one table dh_hits_new
. Optionally, that table could be partitioned. But, again, do not expect performance gains -- except for deleting 'old' data via DROP PARTITION
.
-
Yes, of course I know, that to query the individual tables could be a solution. However to select
count(*)
was only an example to demonstrate the behavior and the individual tables are not usable if the query is embedded into a data provider in a framework to perform sorting and paging. As I see, I have to go back to use only 1 really huge archive table.Tibor Nagy– Tibor Nagy2020年11月17日 08:56:08 +00:00Commented Nov 17, 2020 at 8:56 -
@TiborNagy - Write a feature request (bugs.mysql.com) requesting enhancements Views of Unions -- to push the outside Where clause into the individual Selects.Rick James– Rick James2020年11月17日 18:10:52 +00:00Commented Nov 17, 2020 at 18:10
-
@TiborNagy - We can discuss (in a differently-focused Question) how to deal with "huge archive tables" and "Summary tables" and "Data Wherehousing" in general. Here is some general reading: mysql.rjweb.org/doc.php/datawarehouseRick James– Rick James2020年11月17日 18:12:35 +00:00Commented Nov 17, 2020 at 18:12
Because it is a kind of bug in the old MySQL version. UNION
effect using indexes.
In the versions older than the 5.7.3 version of MySQL, UNION
statement works with a temporary table. It means, first of all, your data move to the temporary table and read them from the temporary table before that implement filtering.
The server no longer uses a temporary table for UNION statements that meet certain qualifications. Instead, it retains from temporary table creation only the data structures necessary to perform result column typecasting. The table is not fully instantiated and no rows are written to or read from it; rows are sent directly to the client. The result is reduced memory and disk requirements, and smaller delay before the first row is sent to the client because the server need not wait until the last query block is executed. EXPLAIN and optimizer trace output will change: The UNION RESULT query block will not be present because that block is the part that reads from the temporary table.
Check out version 5.7.3 optimizer fixes.
-
I'm using a never version, 5.7.32 on Ubuntu 16 now. However it was also newer before (5.7.27). This is not the problem.Tibor Nagy– Tibor Nagy2020年11月17日 08:46:32 +00:00Commented Nov 17, 2020 at 8:46
-
@TiborNagy - I agree. That improvement in
UNION
has only a small impact on your query, and probably no impact on optimizing queries involvingVIEWs
.lRick James– Rick James2020年11月17日 18:08:58 +00:00Commented Nov 17, 2020 at 18:08 -
A missing optimization is not a bug though.ypercubeᵀᴹ– ypercubeᵀᴹ2022年11月26日 22:05:04 +00:00Commented Nov 26, 2022 at 22:05