What my query is doing:
I have two SQL tables and I want to copy datarows from the one table to the other table. I am doing this for statistics. The source table includes all current member-sessions/logins and with my statistics table I also want to see member-logins of the past. Thus I want to store these data into my bot_sessions_statistics table as well to make sure they are not being updated anymore:
The original query:
INSERT INTO bot_sessions_statistics (member_id, session_token, username, ip_address)
SELECT sessions.member_id, sessions.session_token, sessions.username, sessions.ip_address
FROM bot_sessions AS sessions
LEFT JOIN
(
SELECT stats.member_id, stats.session_token, stats.username, stats.ip_address
FROM bot_sessions_statistics as stats
WHERE date_active >= date(NOW())
) AS stats
ON sessions.member_id = stats.member_id
WHERE latest_renewal >= date(NOW())
AND stats.member_id IS NULL
The performance of the original query:
Query_time: 86.364613 Lock_time: 0.000085 Rows_sent: 0 Rows_examined: 1088312551
The table structures:
Table bot_sessions: bot_sessions table definition
Table bot_sessions_statistics: bot_sessions_statistics table definition
The performance of single queries:
SELECT sessions.member_id, sessions.session_token, sessions.username, sessions.ip_address
FROM bot_sessions AS sessions
WHERE latest_renewal >= date(NOW())
Returns 44.2k rows (Duration: 0.078s / Fetch: 1.607s)
SELECT stats.member_id, stats.session_token, stats.username, stats.ip_address
FROM bot_sessions_statistics as stats
WHERE date_active >= date(NOW())
Returns 23.3k rows (Duration: 0.047s / Fetch 0.920s)
1 Answer 1
I have much more experience with SQL Server than with MySQL, but I think that some things very similar:
LEFT JOIN
replacement - you are using onlymember_id
from stats inner query. You could just have a directLEFT JOIN
with thebot_sessions_statistics
table.Cluster index candidate -
bot_sessions_statistics.date_active
looks like it is assigned one time with current time. As time always goes up, it should not create re-orderings. This is particularly useful when fetching multiple columns (index has all the columns) with conditions like>= date(NOW())
(faster seeks).Possible use of partitioning - I have never used this in MySQL, but this functionality can provide increased performance when seeking data in very large tables. Partitioning by date columns might gain you some performance.
NOT EXISTS
instead ofLEFT JOIN
/IS NULL
- generally, it seems thatLEFT JOIN
withIS NULL
is slower thatNOT EXISTS
.