I have mysql table that holds log data for some products, this data are deleted and inserted by a service every 5 minutes.
This is the table:
| id | product_id | dev_stage_id | production_order | log | log_type | date
Right now id
is the auto-increment primary key but is in fact useless as we never use it. Every other column is not unique, so we cannot have a different composite primary key.
As I said before a service delete all rows by product_id
and dev_stage_id
every 5 minutes and then new data is created. This means id
gets very big very quickly.
The data are later used for display using a simple query like:
SELECT * FROM table where product_id = 1231 and dev_stage_id = 233
Is it ok in our case to just remove the id
column and have no primary key or should we have a different structure ?
-
You do not need in unique row identifying. You don't perform any operations which needs in this.So i think that you may remove this column without unpleasant consequences. service delete all rows by product_id and dev_stage_id every 5 minutes and then new data is created Service deletes the rows and you cannot influence this? if you may effect this deletion then use not DELETE but TRUNCATE - it resets AUTOINCREMENT attribute for the table.Akina– Akina2022年05月26日 12:45:39 +00:00Commented May 26, 2022 at 12:45
-
Anyway, for (default) InnoDB storage if no PK declared explicitly then a hidden column with autoincremented BIGINT value will be added to the table implicitly and will be used as PK. Therefore there are no advantages to omit PK column, at least for InnoDB tables.Kondybas– Kondybas2022年05月26日 13:06:37 +00:00Commented May 26, 2022 at 13:06
-
@Kondybas - Minor detail: It's a 6-byte number that is pseudo-shared by all tables without a PK.Rick James– Rick James2022年05月26日 20:16:33 +00:00Commented May 26, 2022 at 20:16
1 Answer 1
Short answer: Yes, you can simply and safely remove id
and the PK.
Long answer:
The Question is vague on how many rows are deleted every 5 minutes, so my answer is a bit vague. For one thing, if all the data is being tossed, then TRUNCATE
may be a better approach. Or maybe CREATE
a new table and swap. This allows the old data to survive until the new data is in place.
Also, the term "log" usually implies data that is kept "forever" (or at least a lot longer than 5 minutes).
I agree that id
is probably useless. But, a PK is important, even if it is a mostly-useless id
. Without an explicit PK, a hidden 6-byte number will be provided.
If there is only one entry per (product_id, dev_stage_id)
each 5 minutes, it seems like that pair could be the PK, and a simple UPDATE
would be better than a delete+insert. The update would change the value and the datetime.
How many product-stage pairs are there? (I want to estimate how soon an `INT UNSIGNED would overflow.)
If there are several entries for one product+stage in 5 minutes, then please provide more details.
You mention date
; is that actually a DATETIME
?
If you don't have PRIMARY KEY(product_id, dev_stage_id)
, then do have INDEX(product_id, dev_stage_id)
-- this will benefit the SELECT *
that you mentioned.