The .ibd file for one of my tables has grown very large. I expected to find a large amount of unallocated space that could be reclaimed by optimizing, but INFORMATION_SCHEMA.TABLES reports very little data_free (4 MiB, 0.4 %), and the numbers do not add up to the size of the file:
$ sudo ls -lh /var/lib/mysql/data/modnar.ibd
-rw-r----- 1 mysql mysql 7.0G Aug 23 09:52 /var/lib/mysql/data/modnar.ibd
mysql> select table_name, engine, data_length, max_data_length, index_length, data_free from information_schema.tables where table_name = "modnar";
+------------+--------+-------------+-----------------+--------------+-----------+
| table_name | engine | data_length | max_data_length | index_length | data_free |
+------------+--------+-------------+-----------------+--------------+-----------+
| modnar | InnoDB | 1172307968 | 0 | 0 | 4194304 |
+------------+--------+-------------+-----------------+--------------+-----------+
The file is 7 GiB, but adding the numbers from INFORMATION_SCHEMA.TABLES gives (1172307968+4194304)/1024/1024/1024, which is only 1.1 GiB.
Why is the file so large, 7 times the amount of data in the table?
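As a sanity check on the arithmetic above, here is a minimal shell sketch. `bytes_to_gib` is a hypothetical helper name introduced only for illustration, and the sketch assumes `awk` is available for the floating-point division.

```shell
# Hypothetical helper: convert a byte count to GiB with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
bytes_to_gib() {
  LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

data_length=1172307968   # data_length from INFORMATION_SCHEMA.TABLES
data_free=4194304        # data_free from INFORMATION_SCHEMA.TABLES

# Sum reported by INFORMATION_SCHEMA.TABLES, expressed in GiB.
bytes_to_gib "$((data_length + data_free))"
```

With the values from the query output this prints 1.10, far below the 7.0G that `ls -lh` shows for the file.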
-
Does this answer your question? MySQL: My table size using a query does not match the IBD's file size on disk – RolandoMySQLDBA, Aug 23, 2020 at 17:44
-
No, that link does not answer my question: my table had no fragmentation, yet INFORMATION_SCHEMA.TABLES still reported only about 1/7th of the file size. A lot of blogs and SO answers keep referring to INFORMATION_SCHEMA.TABLES even though, as per the Percona article, the information in that table is often stale. – Yves Dorfsman, Aug 23, 2020 at 18:01
1 Answer
It turns out the sizes in INFORMATION_SCHEMA.TABLES are rarely updated and can be off by a large factor. Percona has extensive details: https://www.percona.com/blog/2016/01/26/finding_mysql_table_size_on_disk/
INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES, by contrast, is always up to date, and indeed it shows the real file size:
mysql> select * from INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES where name = 'data/modnar';
+-------+-------------+------+-------------+------------+-----------+---------------+------------+---------------+------------+----------------+
| SPACE | NAME | FLAG | FILE_FORMAT | ROW_FORMAT | PAGE_SIZE | ZIP_PAGE_SIZE | SPACE_TYPE | FS_BLOCK_SIZE | FILE_SIZE | ALLOCATED_SIZE |
+-------+-------------+------+-------------+------------+-----------+---------------+------------+---------------+------------+----------------+
| 54 | data/modnar | 33 | Barracuda | Dynamic | 16384 | 0 | Single | 4096 | 7470055424 | 7470059520 |
+-------+-------------+------+-------------+------------+-----------+---------------+------------+---------------+------------+----------------+
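To put the two sources side by side, here is a small shell sketch that compares FILE_SIZE from the row above with data_length + data_free from the question. The helper names `bytes_to_gib` and `ratio` are hypothetical, and `awk` is assumed to be available for the floating-point math.

```shell
# Hypothetical helpers; LC_ALL=C keeps the decimal separator a dot.
bytes_to_gib() {
  LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}
ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

file_size=7470055424                     # FILE_SIZE from INNODB_SYS_TABLESPACES
tables_total=$((1172307968 + 4194304))   # data_length + data_free from TABLES

bytes_to_gib "$file_size"                # size according to INNODB_SYS_TABLESPACES
bytes_to_gib "$tables_total"             # size according to the stale TABLES row
ratio "$file_size" "$tables_total"       # how far apart the two sources are
```

For these values the sketch prints 6.96, 1.10 and 6.35, so the stale row understates the file by a factor of a little over six.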
Running OPTIMIZE on the modnar table did refresh INFORMATION_SCHEMA.TABLES, and its numbers now add up to 6.25 GiB.
A query to find which tables have a large amount of unallocated space, using up-to-date data:
select name, unallocated_KB, round(unallocated_KB/(fs_block_size/1024)) unallocated_blocks, 100 * unallocated_KB/(file_size/1024) unallocated_percent from (
select name, fs_block_size, file_size, (ALLOCATED_SIZE - FILE_SIZE)/1024 unallocated_KB
from INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES
where name not like 'mysql/%' and name not like 'sys/%'
order by unallocated_KB
) as tables;
-
6.25 GB is still not accurate but way closer. From the .ibd and INFORMATION_SCHEMA, the real fragmentation is 0.75 GB (7 - 6.25). See my old posts dba.stackexchange.com/questions/110996/… and dba.stackexchange.com/questions/264278/… – RolandoMySQLDBA, Aug 23, 2020 at 17:47
-
If data_length + index_length is 1.1 GB, then fragmentation is really 5.9 GB. This might have occurred after a mass DELETE or some large UPDATEs. You should run
pt-online-schema-change ... --alter "ENGINE=InnoDB"
and shrink that table. – RolandoMySQLDBA, Aug 23, 2020 at 17:50
-
That table is append-only: no updates, no deletions. It turned out that INFORMATION_SCHEMA.TABLES was not being updated, as explained in the Percona article. Also note that after I ran OPTIMIZE, data_length reflected the real value, because OPTIMIZE forces an update of INFORMATION_SCHEMA.TABLES. – Yves Dorfsman, Aug 23, 2020 at 18:04