We are using MySQL.
My situation
I have a large number of tables with millions of rows each. Those tables are updated every second and are used both for adding info and retrieving info. Each table can be 5GB or 10GB or even more.
I have one table in which I keep sums of information (something like a summary table of the data I need), but this is starting to get big as well.
My limitations
At the moment I cannot change the database, for various reasons (mainly lack of knowledge, time, and budget).
All the extra power we add to the server goes to other resources that are needed, so I cannot run very heavy queries.
Temporary ways I have considered for scaling
Having these things in mind I am trying to think of ways to scale with what I have:
For the tables with millions of rows, I have thought of keeping separate databases (which could make my life easier for backups / exports / changes): keep my main data in one database and all the peripherals (the huge tables) in other databases. Let's say, a different database for each different need.
For the table that I really need regularly and that is growing fast, I was thinking of splitting it into XX tables. It could be one table per user (which might be too much) or one table per XXX users.
Are these ideas totally crazy and really bad DB design?
If yes... any suggestions other than changing everything at once?
2 Answers
PARTITIONing does not save space. Each partition has 4M-7M of "free" space; that can add up. Partitioning can, however, be used for removing "old" data from the server (or moving it to another server). More discussion: http://mysql.rjweb.org/doc.php/partitionmaint
Nor is any ad hoc partitioning likely to help. That is, splitting tables up or moving them between databases (I'm talking about CREATE DATABASE, not a separate server) will not change performance or space. On the other hand, "easier for backups / exports" may be a valid reason. Please provide details so we can discuss.
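As a sketch of the kind of partition maintenance the link above describes (the table and column names here are hypothetical, not from the question), time-based RANGE partitioning lets you remove old data as a cheap metadata operation:

```sql
-- Hypothetical log table partitioned by month.
-- Note: the partitioning column must be part of every unique key,
-- hence the composite primary key.
CREATE TABLE access_log (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    logged_at DATETIME NOT NULL,
    user_id   INT UNSIGNED NOT NULL,
    PRIMARY KEY (id, logged_at)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(logged_at)) (
    PARTITION p202004 VALUES LESS THAN (TO_DAYS('2020-05-01')),
    PARTITION p202005 VALUES LESS THAN (TO_DAYS('2020-06-01')),
    PARTITION pfuture VALUES LESS THAN MAXVALUE
);

-- Removing all of April's data is near-instant, unlike a DELETE:
ALTER TABLE access_log DROP PARTITION p202004;
```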
"Sharding" is complex, but it is the way to split data across multiple servers. (Note: I am using "server" to mean a separate physical machine.)
A small fix is to shrink the datatypes in your tables. Are you using BIGINT (8 bytes) when MEDIUMINT (3 bytes) would suffice? Etc., etc.
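For example (the table and column names are made up for illustration), shrinking an 8-byte counter that will never exceed a few million saves 5 bytes per row, times every row and every index that contains the column:

```sql
-- MEDIUMINT UNSIGNED tops out at 16,777,215 -- check the data first:
SELECT MAX(view_count) FROM page_stats;

-- If the current max (plus expected growth) fits, shrink the column:
ALTER TABLE page_stats MODIFY view_count MEDIUMINT UNSIGNED NOT NULL;
```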
Summary tables are an excellent way to save space and speed up "reports". But perhaps they are not as small as they could be? Let's see SHOW CREATE TABLE.
"growing fast I was thinking into splitting it into XX tables" -- NO, NO, NO! The only exception is when "sharding".
How do you populate the summary table(s)? Some tips: http://mysql.rjweb.org/doc.php/summarytables I assume your summary tables are updated either in real time or nightly, and only incrementally, not by starting over?
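One common incremental pattern from that link (the table and column names below are illustrative, not from the question): fold only the new detail rows into the summary with INSERT ... ON DUPLICATE KEY UPDATE, instead of recomputing everything from scratch:

```sql
-- Summary keyed by (user_id, day); assumes a detail table `events`.
CREATE TABLE daily_summary (
    user_id      INT UNSIGNED NOT NULL,
    day          DATE NOT NULL,
    event_count  INT UNSIGNED NOT NULL,
    total_amount DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (user_id, day)
);

-- Run periodically, covering only rows added since the last run:
INSERT INTO daily_summary (user_id, day, event_count, total_amount)
SELECT user_id, DATE(created_at), COUNT(*), SUM(amount)
FROM events
WHERE created_at >= @last_run AND created_at < @this_run
GROUP BY user_id, DATE(created_at)
ON DUPLICATE KEY UPDATE
    event_count  = event_count + VALUES(event_count),
    total_amount = total_amount + VALUES(total_amount);
```

Queries over the summary then touch thousands of rows instead of millions, which is why table size stops mattering for report speed.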
You say 1 insert per second? That is a very low rate. Perhaps it is faster than that?
When the data is properly indexed and/or summarized, the size of the table does not matter (for performance). Are you concerned about disk space or speed?
I think the first step to solving your problems is acceptance that you cannot achieve something with nothing. :-)
With that tautology out of the way, it isn't at all obvious what difference it would make if you rearranged tables between different databases. Why do you think this would help?
Same for the table you are looking at splitting into one table per user - why would this help? Are you doing full table scans? Partitioning the table would probably be a cleaner approach, but it is difficult to give specific advice based on the limited information you provided. A common solution to full table scans is summary tables, which you mentioned you are already using.
- Hi @gordan-bobic :) I am trying to avoid having tables with a) lots of data and b) a big size, so that I won't need lots of extra server resources to run my queries fast. I want to buy time to focus on building my product in other areas and not redo the entire database at the moment. In some cases I do full table scans, but when that got a bit slow, I created cron jobs to sum the data I need and place it in a different table. I will check out partitioning tables - Verace also mentioned this in the comments above. – Vicky Dallas, May 9, 2020 at 21:04
- Last but not least, I was thinking of different tables just to keep things organized / it was just an idea. It could all be in the same database, but I want to be flexible with backups etc. I do not want to end up with a database of 50 or 100 GB, which would make my life harder at the moment. You already helped a lot just by asking me those questions. – Vicky Dallas, May 9, 2020 at 21:04
- You mentioned keeping data small - I know I'm stating the obvious, but keeping the data small means keeping less data. That means pruning data that is no longer useful. Partitioning can make that much cheaper: ALTER TABLE ... DROP PARTITION is instantaneous; DELETE, on the other hand, can be very slow. – Gordan Bobić, May 9, 2020 at 21:09
-
Yes, I am sorry @gordan-bobic. By saying "keep data small" I mean: if I have a table with 10M rows vs. a table with 1M rows, wouldn't the queries run faster without having to add extra resources? At the moment all the data I have is useful; I cannot remove any of it. Whatever could be removed is already out. I will read more about partitions; I didn't know about them - maybe they are a good solution for me. – Vicky Dallas, May 9, 2020 at 21:10
-
@VickyDallas when you read about partitioning, just remember that it is not a "performance" feature. – NikitaSerbskiy, May 10, 2020 at 9:06
"updated every second and are used both for adding info and retrieving info" - when retrieving info, is this an OLTP (single record by single record) scenario, or are you running reports (selecting over large ranges of data) at the same time as your updates?