Many apologies in advance, but I was not able to find this out myself:
What is the difference between live tuples and tuples inserted in postgres statistics?
There's this post, but it doesn't answer the question: pg_stat_get_live_tuples
are Representing the number of live [...] rows (tuples) in the table.
But what means "live"?
The Postgres documentation doesn't provide answers either:
pg_stat_get_live_tuples(oid)
Number of live rows in table
pg_stat_get_tuples_inserted(oid)
Number of rows inserted into table
Why do my stats show 213,324,422 inserted tuples, but only 124,510,280 live tuples? (3,087,919 dead tuples)
The PG Admin documentation is also unclear about this. It further says that Tuples Inserted displays the number of tuples inserted into the database within the last week.
I would like to know the total number of existing entries in my database. Is this live rows or inserted rows?
1 Answer 1
This nomenclature all comes down to knowing about MVCC, or Multi-Version Concurrency Control, which is how PostgreSQL handles transactions and concurrency.
When you update or delete a row, doesn't actually remove the row. When you do an UPDATE
or DELETE
, the row isn't actually physically deleted. For a DELETE
, the database simply marks the row as unavailable for future transactions, and for UPDATE
, under the hood it's a combined INSERT
then DELETE
, where the previous version of the row is marked unavailable. These new versions of rows are generally referred to as the "live" rows, and the older versions are referred to as "dead" rows.
The statistics that you are looking at, where it shows 213,324,422 inserted tuples
are the number of new data inserts that has been done on that table. The 124,510,280 live tuples
are the number of tuples that are up to date in the table and available to be read or modified in transactions. The 3,087,919 dead tuples
are the number of tuples that have been changed and are unavailable to be used in future transactions. The autovacuum daemon, or a manual vacuum will eventually come along and mark the space of those "dead" tuples available for future use, which means that new INSERTS
can overwrite the data in them.
Side note, since these are just statistics that are gathered by the statistics collector, they aren't going to be exact, but they're good enough to give you an idea of how active the table is, and how much bloat there is (live vs dead tuples), and how well the autovacuum daemon is keeping up with your workload.
Hopefully this makes it clearer. You can read more about this in detail in the PostgreSQL Manual in the chapter on Concurrency Control
-
1Fantastic, many thanks Kassandry for this full explanation! This is what I was seeking and now it is absolutely clear to me what dead, live and inserted means.Alex– Alex2017年12月17日 09:39:22 +00:00Commented Dec 17, 2017 at 9:39
-
Thank you! Very clear explanation.Reaz Masud– Reaz Masud2022年07月26日 21:21:48 +00:00Commented Jul 26, 2022 at 21:21
Tuples deleted
says6,710,495
- this does not make sense to me