I have a purely theoretical question that would help me understand how exactly data is stored in the database.
Database settings:
MySQL 8.0.15, InnoDB
innodb_file_per_table = ON (default)
innodb_default_row_format = DYNAMIC (default)
innodb_page_size = 16384 (default)
For example, if we create a table with 198 VARCHAR(40) NOT NULL columns, it will be successfully created:
Example 1:
CREATE TABLE tab(
col1 varchar(40) NOT NULL,
...
col198 varchar(40) NOT NULL)
198 * (40 * 4 bytes (utf8mb4) + 1 byte (length prefix)) + (some unaccounted data) = 31,878 + (some unaccounted data) < 65,535
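(For anyone who wants to reproduce this without typing 198 column definitions, here is a minimal sketch that generates the DDL dynamically; the recursive CTE plus a prepared statement is just one way to do it, and all names besides tab are arbitrary.)
SET SESSION group_concat_max_len = 1000000;  -- the generated DDL is longer than the 1024-byte default
WITH RECURSIVE seq AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM seq WHERE n < 198
)
SELECT CONCAT('CREATE TABLE tab (',
              GROUP_CONCAT(CONCAT('col', n, ' VARCHAR(40) NOT NULL') SEPARATOR ', '),
              ') CHARACTER SET utf8mb4')
INTO @ddl
FROM seq;
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;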
If we try to create a table with 199 such columns (Example 2), there will be an error:
[42000][1118] Row size too large (> 8126). Changing some columns to TEXT or BLOB may help. In current row format, BLOB prefix of 0 bytes is stored inline.
Byte count:
199 * (40 * 4 bytes (utf8mb4) + 1 byte (length prefix)) + (some unaccounted data) = 32,039 + (some unaccounted data) < 65,535
At the same time, a table like this will also be created successfully:
Example 3:
CREATE TABLE tab(
col1 varchar(100) NOT NULL,
...
col163 varchar(100) NOT NULL)
Byte count:
163 * (100 * 4 bytes (utf8mb4) + 2 bytes (length prefix, needed because the maximum byte length, 400, exceeds 255)) + (some unaccounted data) = 65,526 + (some unaccounted data) < 65,535 (one more column would push the estimate to 65,928, over the limit)
My first theory was that this is related to the maximum row size for an InnoDB table (about half of innodb_page_size): https://dev.mysql.com/doc/refman/8.0/en/column-count-limit.html#row-size-limits
I thought that for a variable-length column declared with many characters, the engine would immediately conclude that the column's data must go to external off-page storage, leaving only a pointer in the row itself.
But that assumption does not explain the examples above.
Obviously, I do not fully understand these limits from the documentation:
The internal representation of a MySQL table has a maximum row size limit of 65,535 bytes
the maximum row size is slightly less than 8KB for the default 16KB InnoDB page size.
Because of this, I can't predict how a given table definition will behave.
(I read the documentation, but this didn't help.)
I would be glad for any explanation.
2 Answers
Off-record storage -- When the row size becomes "too long", InnoDB will attempt to move the contents of some columns to separate 16KB block(s). When it does so, it leaves behind a "20-byte pointer" (per moved column) so that the main block for the record can find the overflow. It does not bother to apply this technique if the column is less than 40 bytes. (Hence, it does not apply to your table?)
This off-record technique is automatic and is applied independently to each column of each row. And it applies to CHARACTER and BYTE columns.
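(A hedged illustration of the error message's own hint: with the DYNAMIC row format, a TEXT column can always be relegated to off-page storage and contributes only about 20 bytes to the row-size check, so the failing 199-column layout should go through if the columns are declared TEXT instead. This reuses the DDL-generator trick shown under Example 1.)
SET SESSION group_concat_max_len = 1000000;
WITH RECURSIVE seq AS (
  SELECT 1 AS n UNION ALL SELECT n + 1 FROM seq WHERE n < 199
)
SELECT CONCAT('CREATE TABLE tab_text (',
              GROUP_CONCAT(CONCAT('col', n, ' TEXT NOT NULL') SEPARATOR ', '),
              ') CHARACTER SET utf8mb4')
INTO @ddl
FROM seq;
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;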
199 Columns That is a huge number of columns. You cannot index more than a few of them, so it sounds like it is simply the 'obvious' database model for your data.
If they are "phone_1", "phone_2", etc, then consider having a separate table for "phone_numbers".
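(A minimal sketch of that normalization; the table and column names are invented for illustration.)
CREATE TABLE phone_numbers (
  entity_id INT NOT NULL,          -- hypothetical FK to the main table
  seq TINYINT UNSIGNED NOT NULL,   -- replaces the _1, _2, ... suffixes
  phone VARCHAR(40) NOT NULL,
  PRIMARY KEY (entity_id, seq)
);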
On the other hand, consider JSON (below) as a 'solution'.
JSON Having one large JSON column for the "miscellany" columns is a likely solution for your situation. Yes, it adds bulk (quotes, field names, etc.), but JSON can handle more than 199 items in TEXT CHARACTER SET utf8mb4. (Or MEDIUMTEXT.) See 'Compression', below.
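(A sketch of that shape, with invented names; TEXT or MEDIUMTEXT can hold the same document, while the JSON datatype additionally validates it.)
CREATE TABLE tab_json (
  id INT AUTO_INCREMENT PRIMARY KEY,
  misc JSON NOT NULL                 -- or TEXT/MEDIUMTEXT CHARACTER SET utf8mb4
);
INSERT INTO tab_json (misc)
VALUES ('{"col1": "value1", "col2": "value2"}');
SELECT misc->>'$.col1' FROM tab_json WHERE id = 1;  -- extract one "column" back out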
16K Block size This limit can be changed, but there are so many drawbacks that I do not recommend it. A 16KB block size limits each row to under 8KB; 32KB allows nearly 16KB per row, but 64KB is still limited to about 16KB per row. Also, the setting applies to all InnoDB tables in the server.
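(For reference, this is where that setting lives. innodb_page_size can only be set when the data directory is first initialized; it cannot be changed on an existing server, and it applies to the whole instance.)
# my.cnf
[mysqld]
innodb_page_size = 32768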
Compression
ROW_FORMAT=COMPRESSED. I do not recommend this; it has a lot of overhead and rarely gives you more than 2x compression on disk. Meanwhile, it costs CPU and buffer_pool space.
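(The syntax, shown only for reference given the caveats above; KEY_BLOCK_SIZE=8 is a typical choice with 16KB pages.)
CREATE TABLE t_compressed (
  id INT PRIMARY KEY,
  doc TEXT
) ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;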
Manually compressing selected TEXT/VARCHAR/BLOB columns can give you about 3x compression (for ordinary text), but your app must do the compression/decompression. And you must change the column type to some size of BLOB (or VARBINARY).
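(A sketch of the manual approach. In practice the application does the compression; MySQL's built-in COMPRESS()/UNCOMPRESS() stand in for it here, mainly to show the column-type change.)
CREATE TABLE t_manual (
  id INT PRIMARY KEY,
  doc BLOB                           -- was TEXT; now holds zlib-compressed bytes
);
INSERT INTO t_manual VALUES (1, COMPRESS(REPEAT('ordinary text ', 200)));
SELECT CONVERT(UNCOMPRESS(doc) USING utf8mb4) AS doc FROM t_manual WHERE id = 1;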
Change to BLOB You quoted something that implied a possible advantage of BLOB over VARCHAR. I can't think of one (other than what I mention here).
40*4byte VARCHAR(40) CHARSET utf8mb4 consumes 1 or 2 bytes for the length, plus up to 40 characters of text, each of which takes 1 to 4 bytes.
If all the strings are always 40 bytes, then use CHAR(40) to avoid the [trivial] length field. If the text is always ASCII, then change the charset. Neither of these comments will impact the table enough to matter much.
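(Both tweaks in one sketch; as noted, neither matters much here, and the comment on CHAR reflects Rick's point rather than a guarantee.)
CREATE TABLE t_tweaks (
  code CHAR(40) NOT NULL,                        -- fixed width, per Rick's note on the length field
  tag VARCHAR(40) CHARACTER SET ascii NOT NULL   -- 1 byte per character instead of up to 4
);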
Recommendations
- Give up on understanding the details of how records are laid out in InnoDB.
- If practical, decrease the number of columns in the table.
- Consider JSON, either as the JSON datatype, as TEXT, or as a user-compressed BLOB.
- Thank you, @Rick James. As I mentioned, it's just a theoretical question, and this huge number of columns is definitely bad architecture. And I understand how this can be solved with JSON, TEXT, or BLOB. But it's still not clear why a table with 198 VARCHAR(40) columns can be created while one with 199 can't. What limit is exceeded in this case? – Kontar Maryna, May 26 at 8:44
- I think the answer is "40" * "199" + overhead > 8K. The 40 was mentioned in my first paragraph. The 8K is the max size for a row before spilling to off-record storage, as mentioned in "16K Block size". My answer was part theory, part advice. (Other readers may get here with a real need to build such a table.) – Rick James, May 26 at 21:13
- So, to summarize: 1. Past the ~8126-byte limit, columns spill to off-page storage. 2. When selecting columns to store off-page, it picks columns longer than a certain limit (which I can't find in the documentation). In Example 2 it can't select any columns, because they are all at most ~160 bytes (perhaps below that limit), while for Example 3 it works, because the columns can be up to ~400 bytes. Is this assumption correct? – Kontar Maryna, May 28 at 9:03
- @KontarMaryna - I think "40 bytes" is the cutoff for storing off-page. It will start with the longest (in actual size) column, continuing with more columns until the 8126 limit is met. A VARCHAR(40) column can be up to 160 bytes. The actual size of the column in the relevant row is what matters. – Rick James, May 28 at 18:51
- @KontarMaryna - The 40 in VARCHAR(40) and the 40 I mention are not the same. 199 columns will fail even if the varchars hold empty strings; that is because of my "40", not the VARCHAR(40). With 198 columns and 40 emoji in each, all columns will spill to off-record storage. If the data in the 198 varchars varies in length, some will spill. – Rick James, May 28 at 21:58
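(A hedged back-of-envelope reading of this thread: take Rick's 40-byte worst-case figure for each movable column, add the 1-byte length prefix from the question's own arithmetic, and the 198/199 boundary falls out right around the 8126-byte limit. This is an interpretation of the comments, not the documented algorithm.)
-- 198 * (40 + 1) = 8118 -> just under 8126, CREATE succeeds
-- 199 * (40 + 1) = 8159 -> over 8126, "Row size too large"
SELECT 198 * (40 + 1) AS bytes_198,
       199 * (40 + 1) AS bytes_199;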
I tested creating a table:
CREATE TABLE tab(
id INT AUTO_INCREMENT PRIMARY KEY
,col1 varchar(40) not null default 'abcdefghijklmnopqrstuvwxyzabcdefghijklmn'
...
,col197 varchar(40) not null default 'abcdefghijklmnopqrstuvwxyzabcdefghijklm'
);
INSERT INTO tab () VALUES ();
Then I installed Jeremy Cole's innodb_ruby tools.
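(If you want to reproduce this: innodb_ruby is distributed as a Ruby gem, so, assuming a working Ruby installation, this one command should be all the setup needed.)
gem install innodb_ruby
With that installed, I took a look at what's in the tablespace: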
innodb_space -f test/tab.ibd space-page-type-regions
start end count type
0 0 1 FSP_HDR
1 1 1 IBUF_BITMAP
2 2 1 INODE
3 3 1 SDI
4 4 1 INDEX
5 6 2 FREE (ALLOCATED)
There are no LOB pages, which I think means none of the VARCHAR columns are overflowing off-page. The INDEX page (4) is the one with the data (InnoDB stores a table as an index; i.e., the clustered index is the table).
innodb_space -f test/tab.ibd -p 4 page-dump
...
sizes:
header 120
trailer 8
directory 4
free 8341
used 8043
record 7911
per record 7911.00
...
Looks like 197 columns brings the record close to the limit of 8126 bytes. If I try 198 columns, I get the error. I'm not sure why you can create a table of 198 columns but I can't; perhaps the extra id column in my version, or a slight difference in InnoDB storage, accounts for it, given that I'm testing on MySQL 8.4.4.
7911 bytes is approximately 40 bytes per column, which is what we'd expect for 40 single-byte characters.
Second test using VARCHAR(100) instead of VARCHAR(40). I was able to create a table with 163 columns of VARCHAR(100).
Here's what the page-dump command told me:
sizes:
header 120
trailer 8
directory 4
free 8141
used 8243
record 8111
per record 8111.00
8111 / 163 is 49.76 bytes per column on average. This time there are external pages allocated:
innodb_space -f test/tab.ibd space-page-type-regions
start end count type
0 0 1 FSP_HDR
1 1 1 IBUF_BITMAP
2 2 1 INODE
3 3 1 SDI
4 4 1 INDEX
5 36 32 LOB_FIRST
37 63 27 FREE (ALLOCATED)
64 137 74 LOB_FIRST
138 319 182 FREE (ALLOCATED)
So I believe some of the VARCHAR(100) values were long enough to be pushed onto additional pages. I'm not sure how to tell how many, or which columns, were treated this way.
What I conclude is that it's tricky to store very wide rows in InnoDB. There's a balancing act going on internally, trying to store columns on the main index page if possible, but if they're too long, some columns are moved to LOB pages, as needed. I don't know how InnoDB chooses which columns to move. And since a moved column leaves a 20-byte pointer on the index page, it's not like you save all the space for that column. You could still have a row that's so wide that it won't fit.
The bottom line is: don't store very wide rows in InnoDB. It's meant to be an OLTP engine for tables that are more or less normalized, and it's probably not right in a normalized data model to have hundreds of columns in any table.
- Thank you, Bill Karwin. Yes, it looks like there also exists a limit (or some dynamic limit) on the column length, beyond which a column can be chosen to move to off-page storage. But I can't find it in the documentation. – Kontar Maryna, May 28 at 9:11
- Ultimately you would have to study the source code to understand it to the degree you want to. Few people need to do that. – Bill Karwin, May 28 at 13:30