Hello MySQL Community,

I wanted to share some observations regarding the performance differences between MySQL 5.7 and MySQL 8.0, specifically related to join algorithms.

In a recent project, I noticed a significant performance difference between MySQL 5.7 and MySQL 8.0 when running a complex query with multiple joins whose join conditions use regular expressions. The join is not an equality join such as JOIN TableA ON (TableX.table_a_id = TableA.id); instead it looks like JOIN TableA ON (TableX.details REGEXP CONCAT("\\table_a_id[[:space:]]*:[[:space:]]*", TableA.id, "\\b")). The same query ran noticeably faster on MySQL 5.7 than on MySQL 8.0.

Upon investigating, I found that the EXPLAIN output showed that MySQL 5.7 was using the Block Nested Loop algorithm for joins, while MySQL 8.0 was using the hash join algorithm. This difference in join algorithms seems to be the primary reason for the performance difference.
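
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the join strategy can be read from the plan. It assumes the table and column names used above, and MySQL 8.0.18 or later for the last two statements; the exact plan text varies by version, so the strings in the comments are what one would typically expect rather than guaranteed output.

```sql
-- MySQL 5.7: the Extra column for the joined table typically contains
-- "Using join buffer (Block Nested Loop)".
-- MySQL 8.0.20+: the same column typically contains
-- "Using join buffer (hash join)".
EXPLAIN
SELECT *
FROM TableX
JOIN TableA
  ON (TableX.details REGEXP CONCAT("\\table_a_id[[:space:]]*:[[:space:]]*", TableA.id, "\\b"));

-- MySQL 8.0 only: the tree format names the join operator explicitly.
-- With no equality condition, expect a Filter node on top of an
-- "Inner hash join (no condition)" node.
EXPLAIN FORMAT=TREE
SELECT *
FROM TableX
JOIN TableA
  ON (TableX.details REGEXP CONCAT("\\table_a_id[[:space:]]*:[[:space:]]*", TableA.id, "\\b"));

-- MySQL 8.0.18+: EXPLAIN ANALYZE executes the query and reports
-- actual row counts and timings per operator.
EXPLAIN ANALYZE
SELECT *
FROM TableX
JOIN TableA
  ON (TableX.details REGEXP CONCAT("\\table_a_id[[:space:]]*:[[:space:]]*", TableA.id, "\\b"));
```

Because the REGEXP predicate is not an equality, a hash join has no key to hash on, so the regular expression is still evaluated for every row combination either way.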

While MySQL 8.0 has many improvements and new features, including security enhancements and roles for better access control, it appears that for this specific type of query, MySQL 5.7's Block Nested Loop algorithm performs better.

It's important to note that the performance of SQL queries can be highly dependent on the specific data and query structure, and this observation may not apply to all scenarios. However, for those who are running complex queries involving multiple joins and regular expressions, it may be worth testing the performance on both MySQL 5.7 and MySQL 8.0.

I'm sharing this in the hope that it might be helpful to others who are considering upgrading to MySQL 8.0 or who are experiencing performance issues after upgrading. As always, thorough testing and performance analysis are recommended before making any major changes to your database system.

If anyone has experienced similar issues or has any insights into the differences in join algorithms between MySQL 5.7 and MySQL 8.0, I would love to hear your thoughts.

Best Regards,

asked Mar 14, 2024 at 20:35

1 Answer

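-- MySQL 8.0.20+ removed Block Nested Loop and ignores the hash_join flag,
-- so BNL cannot be forced there. block_nested_loop=off (or the NO_BNL hint)
-- disables hash join; the join then runs as a plain nested loop.
SET optimizer_switch = 'block_nested_loop=off';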
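-- Example of preprocessing to avoid REGEXP in the join.
-- TableX stores the id inside details, so it has to be extracted here.
-- 'pattern' and 'regex_group' are placeholders; 'regex_group' must match only the digits.
CREATE TABLE PreprocessedDetails AS
SELECT id, CAST(REGEXP_SUBSTR(details, 'regex_group') AS UNSIGNED) AS table_a_id
FROM TableX
WHERE details REGEXP 'pattern';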
-- Optimized equality-based join
SELECT *
FROM PreprocessedDetails
JOIN TableA ON PreprocessedDetails.table_a_id = TableA.id;
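-- Add a generated column for pattern matching (REGEXP_SUBSTR needs MySQL 8.0+).
-- 'pattern' and 'regex_group' are placeholders; 'regex_group' must match only the digits.
ALTER TABLE TableX ADD COLUMN extracted_id INT GENERATED ALWAYS AS (
 CASE 
 WHEN details REGEXP 'pattern' THEN REGEXP_SUBSTR(details, 'regex_group')
 ELSE NULL
 END
);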
-- Create an index on the computed column
CREATE INDEX idx_extracted_id ON TableX(extracted_id);
-- Use the computed column for optimized join
SELECT *
FROM TableX
JOIN TableA ON TableX.extracted_id = TableA.id;
-- Profile the query execution plan
EXPLAIN SELECT * 
FROM TableX 
JOIN TableA 
ON (TableX.details REGEXP CONCAT("\\table_a_id[[:space:]]*:[[:space:]]*", TableA.id, "\\b"));
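
As a more concrete version of the generated-column idea, here is a rough sketch for the pattern in the question. It is untested against the asker's schema and rests on several assumptions: each details value holds at most one "table_a_id: <digits>" entry, TableA.id is an unsigned integer, and the column and index names (table_a_id_extracted, idx_table_a_id_extracted) are made up for illustration.

```sql
-- Assumption: details contains text such as 'table_a_id: 42'.
-- The inner REGEXP_SUBSTR isolates "table_a_id : 42"; the outer one keeps
-- only the trailing digits. Rows without a match yield NULL.
ALTER TABLE TableX
  ADD COLUMN table_a_id_extracted BIGINT UNSIGNED
    GENERATED ALWAYS AS (
      CAST(
        REGEXP_SUBSTR(
          REGEXP_SUBSTR(details, 'table_a_id[[:space:]]*:[[:space:]]*[0-9]+'),
          '[0-9]+$'
        ) AS UNSIGNED
      )
    ) STORED,
  ADD INDEX idx_table_a_id_extracted (table_a_id_extracted);

-- The join becomes a plain equality on an indexed column, so the
-- optimizer can use an index lookup instead of testing every row pair.
SELECT TableX.*, TableA.*
FROM TableX
JOIN TableA ON TableX.table_a_id_extracted = TableA.id;
```

Since the whole run of digits is extracted and compared by equality, the trailing \\b from the original pattern is no longer needed to keep id 12 from matching 123. If details can contain several ids per row, this sketch only captures the first one and a separate mapping table would be a better fit.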
answered Dec 31, 2024 at 18:04
