
I asked a previous question, "Proper technique for storing users event data", and the correct answer in my opinion was to create a database partition. From what I have read, there are different ways to partition, but for this question let's assume we are doing a horizontal partition using a date field as the partitioning key, on an RDBMS such as MySQL (if you have an objection or argument to this, by all means contribute).
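For concreteness, here is a minimal sketch of what that kind of partitioning looks like in MySQL (5.1+), using RANGE partitioning on the date column. The table and column names (user_events, created_at, and so on) are just assumptions for illustration, not something from the original question:

    CREATE TABLE user_events (
        event_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        user_id    INT UNSIGNED NOT NULL,
        event_type VARCHAR(50) NOT NULL,
        created_at DATETIME NOT NULL,
        -- MySQL requires the partitioning column to be part of every unique key
        PRIMARY KEY (event_id, created_at)
    )
    PARTITION BY RANGE (TO_DAYS(created_at)) (
        PARTITION p2011_07 VALUES LESS THAN (TO_DAYS('2011-08-01')),
        PARTITION p2011_08 VALUES LESS THAN (TO_DAYS('2011-09-01')),
        PARTITION p_future VALUES LESS THAN MAXVALUE
    );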

The base question is how do you know how many partitions to create?

I understand this is a pretty open question, since the answer also depends heavily on the hardware you are running on. Still, there should be some guidelines that point toward better performance, a proper way of doing it, or at least a way to judge it. Most of the documentation I have found uses terms such as "large", "big", and "a lot". What do these terms mean in relation to access speed, row count, efficiency versus storage, or required hardware? Is it something you settle by trial and error or observed performance, where if things start getting a little rough you just add a partition or two?
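If you do end up adding partitions incrementally like that, the operation itself is straightforward in MySQL. A hedged sketch, continuing with the hypothetical user_events table above: because it has a p_future MAXVALUE catch-all, new ranges have to be split out of that partition rather than simply appended:

    -- Split a new monthly range out of the MAXVALUE catch-all partition
    ALTER TABLE user_events
        REORGANIZE PARTITION p_future INTO (
            PARTITION p2011_09 VALUES LESS THAN (TO_DAYS('2011-10-01')),
            PARTITION p_future VALUES LESS THAN MAXVALUE
        );

(Without a MAXVALUE partition, a plain ALTER TABLE ... ADD PARTITION would do.)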

I'm very interested in opinions on, and contradictions to, this seemingly common hurdle in large DB schemas.

Thanks

asked Jul 26, 2011 at 19:38
  • I like the question; I've never had anything 'large' enough for this yet, though I've considered it on a couple of activity logging tables that are over 100 million rows. Commented Jul 27, 2011 at 14:43
  • Yes @DTest, I have never had anything 'large' enough either... but with some of the databases I have designed, this could become a problem if I don't start taking it into consideration now. Commented Jul 27, 2011 at 16:53

1 Answer


I will tell you my experience with defining the terms "large", "big", and "a lot":

  • a lot is a database that takes around 400 GB for a full month of data (custom logging information from all our web apps)

  • big is a table in this database that accounts for much of that space :-).. roughly half of it (about 200 GB for a single table, from what I remember)

  • large is the data for a single day (about 12-15 GB), which corresponds to one partition of that magic table

This is not a rule or a best practice, but when you feel that your current indexing strategy has hit its limits and nothing seems to make your queries faster, I believe it's time to consider partitioning.
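One way to judge whether the partitions you create are actually earning their keep is to look at how big each one is and whether your typical queries prune down to only one or two of them. A minimal sketch, again assuming the hypothetical user_events table and a schema named logging (neither comes from the answer above):

    -- Size and row count per partition
    SELECT PARTITION_NAME, TABLE_ROWS,
           ROUND(DATA_LENGTH / 1024 / 1024) AS data_mb
    FROM information_schema.PARTITIONS
    WHERE TABLE_SCHEMA = 'logging' AND TABLE_NAME = 'user_events';

    -- Check that a date-bounded query only touches the expected partitions
    EXPLAIN PARTITIONS
    SELECT COUNT(*) FROM user_events
    WHERE created_at >= '2011-07-26' AND created_at < '2011-07-27';

If the EXPLAIN output lists most of the partitions instead of one or two, the partitioning scheme is probably not helping those queries.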

answered Jul 26, 2011 at 22:31
  • Thanks @Marian, it sounds like you have quite a bit of experience with 'large' databases. The systems you were involved with were on dedicated hardware with, I would assume, SAN or NAS storage, correct? What kind of processing power did those systems have, and how much 'more' could they handle? Commented Jul 27, 2011 at 16:56
  • @CenterOrbit: that machine was not very powerful, 4 CPUs, 8 GB of RAM, and a SAN box for storage. But it was good enough for its logging purpose and error reports. Commented Aug 2, 2011 at 7:19
