
I have read many articles saying that partitioning is not helpful in most cases, but I believe it would speed up the queries in my case. I have a table with this structure:

CREATE TABLE pages
(
page_id int(11) unsigned NOT NULL AUTO_INCREMENT,
category_id smallint(5) unsigned,
title varchar(255),
created datetime,
updated datetime,
FOREIGN KEY(category_id) REFERENCES categories(category_id) ON DELETE CASCADE,
UNIQUE INDEX (category_id,title),
INDEX(title),
PRIMARY KEY(page_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci ROW_FORMAT=COMPRESSED;

The table has close to 1 billion rows and between 200 and 1,000 distinct category_id values.

Almost all queries have category_id in them.

I am considering partitioning the table as follows:

PARTITION BY KEY(category_id)
PARTITIONS 40; -- somewhere between 20 and 50

Is it worth it?

asked Apr 1, 2019 at 3:30
  • Why don't you ask about your slow queries? After all, those are what you want to be faster, and you have read that partitioning is rarely helpful (whether it helps is pretty much impossible to determine without knowing your queries). Commented Apr 1, 2019 at 4:25

3 Answers


I have bad news with regard to this table: MySQL does not support partitioning a table that has foreign keys.

According to the MySQL 5.7 documentation on partitioning limitations:

InnoDB storage engine. InnoDB foreign keys and MySQL partitioning are not compatible. Partitioned InnoDB tables cannot have foreign key references, nor can they have columns referenced by foreign keys. InnoDB tables which have or which are referenced by foreign keys cannot be partitioned.
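
If partitioning were still wanted, the foreign key would have to go first. A minimal sketch, assuming the constraint carries InnoDB's auto-generated name `pages_ibfk_1` (an assumption; the real name appears in the `SHOW CREATE TABLE` output) and using 42 as a placeholder category id:

```sql
-- Find the actual constraint name; pages_ibfk_1 below is only a guess
-- based on InnoDB's default naming for unnamed foreign keys.
SHOW CREATE TABLE pages;

-- Drop the foreign key so that it no longer blocks partitioning.
ALTER TABLE pages DROP FOREIGN KEY pages_ibfk_1;

-- ON DELETE CASCADE is lost together with the constraint, so removing a
-- category now needs an explicit cleanup of its pages (42 is a placeholder).
DELETE FROM pages WHERE category_id = 42;
DELETE FROM categories WHERE category_id = 42;
```

Dropping the constraint moves referential integrity from the server to the application, which is a cost to weigh against any gain from partitioning.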

answered Sep 3, 2019 at 16:33

Not enough information. What are the queries against the table? Please show them in more detail.

Meanwhile, ...

What do you mean by "Almost all queries have category_id"? It makes a big difference as to whether you are doing

WHERE category_id = constant

versus

WHERE category_id IN (...)

For the former case, I would push for having category_id as the first column in the PRIMARY KEY:

PRIMARY KEY(category_id, title) -- helps with queries, and is UNIQUE
INDEX(page_id) -- to keep auto_inc happy
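
Written out as a full definition, the suggestion might look like the sketch below. The table name `pages_new` is hypothetical, and the `NOT NULL` on `category_id` and `title` is an addition, since primary key columns cannot be NULL; everything else is copied from the question.

```sql
CREATE TABLE pages_new
(
  page_id     int(11) unsigned     NOT NULL AUTO_INCREMENT,
  category_id smallint(5) unsigned NOT NULL,  -- NOT NULL added: PK column
  title       varchar(255)         NOT NULL,  -- NOT NULL added: PK column
  created     datetime,
  updated     datetime,
  PRIMARY KEY (category_id, title),  -- clusters each category's rows together
  INDEX (page_id),                   -- an AUTO_INCREMENT column must be indexed
  INDEX (title),
  FOREIGN KEY (category_id) REFERENCES categories (category_id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci ROW_FORMAT=COMPRESSED;
```

One trade-off to weigh: InnoDB stores the primary key in every secondary index entry, so a wide (category_id, title) key makes the index on page_id larger than it would be with a 4-byte key.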

On the other hand, are those all the columns in the table? If so, I am having trouble imagining more than a couple of different SELECTs, namely mapping category + title to page_id and vice versa. If that is all, then what you have is optimal. What I suggest is equally optimal, with the slight improvement of having one fewer UNIQUE constraint.
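
The two lookups mentioned above would look roughly like this. The literal values are placeholders, not data from the question:

```sql
-- Category + title to page_id: served by the (category_id, title) key.
SELECT page_id
FROM pages
WHERE category_id = 42
  AND title = 'Example title';

-- page_id back to category + title: served by the index on page_id.
SELECT category_id, title
FROM pages
WHERE page_id = 123456;
```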

If you have category_id IN (...) then (again) we need to see the rest of the WHERE.

answered Apr 19, 2019 at 17:14

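You will get an error if you try partitioning by category_id, because every column used in the partitioning expression must be part of every unique key on the table, including the primary key, and PRIMARY KEY(page_id) does not include category_id. So the short answer is no.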
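
For reference, a definition that MySQL would accept for this partitioning scheme needs category_id in every unique key, including the primary key, and no foreign key. A sketch, with `pages_partitioned` as a hypothetical table name and the partition count taken from the question:

```sql
CREATE TABLE pages_partitioned
(
  page_id     int(11) unsigned     NOT NULL AUTO_INCREMENT,
  category_id smallint(5) unsigned NOT NULL,  -- NOT NULL: now part of the PK
  title       varchar(255),
  created     datetime,
  updated     datetime,
  PRIMARY KEY (page_id, category_id),  -- widened to include the partition column
  UNIQUE INDEX (category_id, title),   -- already contains category_id
  INDEX (title)
  -- no FOREIGN KEY: partitioned InnoDB tables cannot have one
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci ROW_FORMAT=COMPRESSED
PARTITION BY KEY (category_id)
PARTITIONS 40;
```

Note that the widened primary key no longer guarantees by itself that page_id is unique; only the AUTO_INCREMENT behaviour keeps the values distinct in practice.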

Very large tables typically become problematic either because you want to delete old data (which partitioning helps with on time-series tables) or because of slow queries. So share a query with us and let's see if we can fix it.
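
When sharing a slow query, the table definition and the query plan usually matter as much as the query text. A sketch of what to post, where the SELECT is an invented placeholder rather than a query from the question:

```sql
-- The exact table definition, including index and constraint names.
SHOW CREATE TABLE pages;

-- The optimizer's plan for one slow query; this SELECT is a made-up
-- placeholder, to be replaced with a real slow query.
EXPLAIN
SELECT page_id, title
FROM pages
WHERE category_id = 42
ORDER BY updated DESC
LIMIT 20;
```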

answered Sep 3, 2019 at 15:27
