This is more of a "please confirm/correct my understanding" question.
Background I connect to MySQL using JDBC, the connection pool has around 250 connections, most of which seem persistently connected even after query termination - all use the same User account. The Stored Procedure that I wish to use a Temporary Table for is called by Java code and performs a transform on a JSON column to convert that into a row for a different table. The transform is relatively expensive, performed over potentially thousands of rows, so I don't wish to perform the transform more than once. To get around this, my proposed solution is to transform the data into a temporary table, perform my statistical queries around the now transformed data, and then insert this wholesale into the target table.
Question I have read that Temporary tables are bound to the User. Am I right in theorising that because the connections are seemingly never terminated and all use the same user account, that the temporary table created in the procedure would be shared across all connections and queries (and so the data could be accessed by other connections running the same procedure at the same time)?
Limitations What I would like to have done is either transmit the original data, or perform the transform and transmit that data back to Java. However, we're dealing with around a few hundred megabytes of data each time and the data transfer back to Java seems to lock the database until it's finished the transmission. (I'm not sure why, if anyone has a recommendation to fix this, please let me know)
So I'm left with the only other solution I could think of which is to not transmit the source data, but to perform the transform, collect the aggregate data I need, then insert the transformed data to the target without it ever leaving the database server.
Thanks in advance
-
FWIW, transmitting data from client to server, even a few hundred megabytes, should not lock the database. It may be creating gap locks for a specific table during the INSERT, but not the whole database.Bill Karwin– Bill Karwin2023年12月04日 18:14:25 +00:00Commented Dec 4, 2023 at 18:14
-
This is something I haven’t been able to figure out myself as it’s definitely locking on a select during the fetch (not the query). I’ll ask it as a separate question.DSZ– DSZ2023年12月06日 09:36:38 +00:00Commented Dec 6, 2023 at 9:36
3 Answers 3
Temporary tables are bound to a specific session, not a user. In MySQL, you may have many sessions concurrently active with the same user. Every one of these can have a temporary table of the same name, but none can see any of the others outside its own session. In this way, it's like a local variable in a function.
Connections are held open by the connection pool, but subsequent clients who acquire a given connection do not see temporary tables left by an earlier client.
When a client requests a new connection from a connection pool, by default the pool implementation resets the connection state. This means the MySQL session clears all session-related data, such as transactions, session variables, temp tables, and prepared statements.
protected synchronized Connection getConnection(boolean resetServerState, boolean forXa) throws SQLException {
...
if (resetServerState) {
this.physicalConn.resetServerState();
}
It makes sense that this would be a good thing to do, because otherwise one client requesting a connection from a connection pool could view data left by a prior client of that connection. This would create a risk of private data leaking from one client to another.
All connection pool implementations I know of have used this connection-reset feature for years. I believe some connectors failed to do that years ago, but have been fixed. If you're still using a connection pool that doesn't reset the connection, you should stop using it and switch to a more modern implementation.
-
Hi Bill, thanks very much for your extended answer; I will be moving the accepted answer to yours after I’ve finished writing this. The temporary table is used as part of a stored procedure and I want it to be as ephemeral as possible so this is exactly what I was hoping for. Just to be sure I drop the temp table at the end of the procedure. What I was unsure about was whether two connections of the same user had access to the temporary table if they executed the procedure at the same time. I’m using the latest connector for MySQL so it should be resetting the session.DSZ– DSZ2023年12月06日 09:32:16 +00:00Commented Dec 6, 2023 at 9:32
-
I can suggest you test it to remove your doubt. Open two windows with the MySQL client (the command-line
mysql
client will work for this). In one session, create a temporary table. In the second session, try to query from that table. You will find that it can't. Then in the second session create a temp table of the same name, and insert different data into each. Then query the temp table in each session to confirm that they can see only their own data in that temp table.Bill Karwin– Bill Karwin2023年12月06日 15:56:56 +00:00Commented Dec 6, 2023 at 15:56
around 250 connections, most of which seem persistently connected
That's the whole purpose of a connection pool: to keep connections open and allow applications to reuse them, instead of going through the expensive process of establishing new connections.
According to the documentation, "[a] TEMPORARY
table is visible only within the current session". This means that, no, a temporary table created by one connection will not be shared across all connections. However, when another application obtains a connection from the pool, it will see whatever temporary tables have been created so far in the session corresponding to that connection.
-
Excellent. Thanks for answering. That was my understanding of the Connection Pool also, but then I started to second-guess myself after some of them started dropping the connection after running their query.DSZ– DSZ2023年12月04日 13:24:09 +00:00Commented Dec 4, 2023 at 13:24
-
NOT SO FAST! I think Bill's answer is more correct. That is, you cannot hang onto a temp table via a pool. Please dig further and consider moving the "Accepted Answer checkmark" to the other Answer.Rick James– Rick James2023年12月04日 17:42:10 +00:00Commented Dec 4, 2023 at 17:42
CREATE TEMPORARY TABLE
is not safe for your task.
May I suggest creating a 'permanent' table to hold the processed data. Then devise some mechanism, possibly via another permanent table, to say which table has what data processed.
Some kind of checksum or hash could be used to uniquely identify the dataset. That could be the PRIMARY KEY
of data containing the processed data.
Explore related questions
See similar questions with these tags.