Commit d45b2c0

committed

Day 45

1 parent d0a0d82 commit d45b2c0

File tree

2 files changed

+162

-0

lines changed

0045_how_to_monitor_xmin_horizon.md
README.md

`‎0045_how_to_monitor_xmin_horizon.md‎`

Lines changed: 161 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,161 @@`
	`1`	`+Originally from: [tweet](https://twitter.com/samokhvalov/status/1722916380239073399), [LinkedIn post]().`
	`2`	`+`
	`3`	`+---`
	`4`	`+`
	`5`	`+# How to monitor xmin horizon to prevent XID/MultiXID wraparound and high bloat`
	`6`	`+`
	`7`	`+> I post a new PostgreSQL "howto" article every day. Join me in this`
	`8`	`+> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!`
	`9`	`+`
	`10`	`+Previously, we discussed`
	`11`	`+[how to implement monitoring for the risks of XID (transaction ID) and MultiXID wraparound](0044_how_to_monitor_transaction_id_wraparound_risks.md).`
	`12`	`+That type of check is critical and a must-have in any monitoring.`
	`13`	`+`
	`14`	`+However, while it helps you understand the risk level, it doesn't reveal the root cause – something that you'll`
	`15`	`+definitely need for your XID wraparound postmortem, when applying the "Five Whys" method (just kidding, we're going to`
	`16`	`+improve our monitoring and keep autovacuum behavior under control, so none of us will ever experience an XID wraparound in`
	`17`	`+production).`
	`18`	`+`
	`19`	+This problem can be solved by monitoring the `xmin` horizon. The same check also helps in understanding
	`20`	`+the reasons for high bloat growth.`
	`21`	`+`
	`22`	`+## Four reasons for growing XID/MultiXID wraparound risks`
	`23`	`+`
	`24`	+XID/MultiXID wraparound occurs when `autovacuum` doesn't freeze tuples. Assuming that `autovacuum` is turned on,
	`25`	`+there might be four reasons for this:`
	`26`	`+`
	`27`	`+1. Long-running transaction on the primary.`
	`28`	`+2. Abandoned replication slot.`
	`29`	+3. Long-running transaction on a standby node with `hot_standby_feedback = 'on'` (including cases with cascaded
	`30`	`+ replication – the feedback can be propagated).`
	`31`	`+4. Abandoned prepared transaction.`
	`32`	`+`
	`33`	`+All four reasons need to be checked, as each of them can hold a transaction ID value for data that must still be preserved.`
	`34`	`+`
	`35`	`+A good post on this topic:`
	`36`	`+[VACUUM won't remove dead rows: 4 reasons why](https://cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/).`
	`37`	`+`
	`38`	`+Clarification (based on the analysis of mistakes various monitoring systems demonstrate):`
	`39`	`+`
	`40`	`+- It is never enough to monitor only long-running transactions – all four reasons have to be covered. In fact,`
	`41`	`+ long-running transaction monitoring alone is helpful for troubleshooting a different kind of problem: the risk of locking`
	`42`	`+ issues.`
	`43`	`+- It is not enough to monitor only slots – certain standbys may be configured without slots. Moreover, monitoring only`
	`44`	+ `pg_stat_replication` is not sufficient – it wouldn't cover abandoned replication slots.
	`45`	+ Both `pg_stat_replication` and `pg_replication_slots` should be checked.
	`46`	`+`
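
To illustrate the point about checking both views, here is a minimal sketch (not part of the original commit). It lists inactive slots that still pin an XID, and standbys that report an XID through `hot_standby_feedback`. The choice of columns and the ordering are assumptions made for readability.

```sql
-- Replication slots that are not in use but may still pin an XID
select
  slot_name,
  slot_type,
  active,
  xmin,
  age(xmin) as xmin_age,
  catalog_xmin,
  age(catalog_xmin) as catalog_xmin_age
from pg_replication_slots
where not active
order by greatest(age(xmin), age(catalog_xmin)) desc nulls last;

-- Connected standbys and the xmin they report via hot_standby_feedback
select
  application_name,
  client_addr,
  state,
  backend_xmin,
  age(backend_xmin) as backend_xmin_age
from pg_stat_replication
order by age(backend_xmin) desc nulls last;
```

A slot that shows up in the first result with a high age and no matching consumer is a candidate for being abandoned. Whether it can be dropped safely depends on the setup.
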
	`47`	`+## xmin horizon`
	`48`	`+`
	`49`	+The term "`xmin` horizon" is used in the Postgres documentation (for example, when describing
	`50`	+`pg_stat_activity.backend_xmin`),
	`51`	`+although it is never explicitly defined. It's also used in the source code, and there is a`
	`52`	`+[good comment](https://github.com/postgres/postgres/blob/6bf2efb38285626a9de3004dd1c23d9a85453372/src/backend/storage/ipc/procarray.c#L1662)`
	`53`	+on the function `ComputeXidHorizons()`, explaining the machinery.
	`54`	`+`
	`55`	+The `xmin` of a row indicates the transaction ID that inserted the row – every table has a hidden (system) column
	`56`	+`xmin` (try this: `select xmin, * from your_table;`).
	`57`	`+`
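
A small illustration of these hidden columns (not part of the original commit). The table name `xmin_demo` is hypothetical and exists only for this sketch.

```sql
-- Hypothetical scratch table, used only for illustration
create temp table xmin_demo as
select 1 as id;

select
  xmin,                   -- XID of the transaction that inserted this row version
  xmax,                   -- XID of the deleting or locking transaction, 0 if none
  age(xmin) as xmin_age,  -- how many XIDs ago the row version was inserted
  id
from xmin_demo;
```
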
	`58`	+The "`xmin` horizon" represents the XID of the oldest snapshot of data that must be preserved.
	`59`	`+`
	`60`	`+## What about bloat?`
	`61`	`+`
	`62`	+If the `xmin` horizon doesn't progress for a short period of time, preventing `autovacuum` from removing dead tuples, it is not
	`63`	`+a problem – this happens often in normal operation.`
	`64`	`+`
	`65`	+But if this happens for a long period of time, and the `xmin` horizon is far in the past, it can cause two big problems:
	`66`	`+`
	`67`	`+- XID/MultiXID wraparound, as discussed;`
	`68`	+- higher bloat growth: inability to delete dead tuples now leads to massive deletes of them later, when `xmin` horizon
	`69`	+ shifts – and this makes `autovacuum` a massive "dead tuple to bloat converter".
	`70`	`+`
	`71`	+That's why it's important to monitor `xmin` horizon and react when it's too far behind (`xmin` horizon age is high).
	`72`	`+`
	`73`	`+## How to monitor`
	`74`	`+`
	`75`	+There are two ways to monitor `xmin` horizon:
	`76`	`+`
	`77`	+1) Observing `autovacuum` logs.
	`78`	`+2) Querying 4 system views to cover the 4 reasons discussed above.`
	`79`	`+`
	`80`	`+## Log-based monitoring`
	`81`	`+`
	`82`	+The log-based approach doesn't help understand the reasons behind a non-progressing `xmin` horizon, but it can still be
	`83`	+helpful as it demonstrates `autovacuum` behavior in action.
	`84`	`+`
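
Autovacuum reports like the one in this section appear only when autovacuum logging is enabled. A minimal sketch of how that could be done follows (not part of the original commit). The threshold of `0` is an assumption chosen to log every run and may be too noisy on busy systems. The table `public.t` is the one from the log example.

```sql
-- Log every autovacuum run cluster-wide (0 = no duration threshold)
alter system set log_autovacuum_min_duration = 0;
select pg_reload_conf();

-- Or enable it for a single table only
alter table public.t set (log_autovacuum_min_duration = 0);

-- A manual run with verbose output reports similar details
vacuum (verbose) public.t;
```

The exact wording of the report varies between Postgres versions, so log parsers should be checked against the version in use.
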
	`85`	`+A log example and how to read it:`
	`86`	`+`
	`87`	+```
	`88`	`+2023-11-10 01:04:03.828 PST [56538] LOG: automatic vacuum of table "nik.public.t": index scans: 0`
	`89`	`+ pages: 0 removed, 4480 remain, 4480 scanned (100.00% of total)`
	`90`	`+ tuples: 0 removed, 1000000 remain, 666667 are dead but not yet removable`
	`91`	`+ removable cutoff: 784, which was 112449 XIDs old when operation ended`
	`92`	`+ frozen: 0 pages from table (0.00% of total) had 0 tuples frozen`
	`93`	`+ index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed`
	`94`	`+ avg read rate: 9.685 MB/s, avg write rate: 0.000 MB/s`
	`95`	`+ buffer usage: 7281 hits, 1698 misses, 0 dirtied`
	`96`	`+ WAL usage: 0 records, 0 full page images, 0 bytes`
	`97`	`+ system usage: CPU: user: 0.11 s, system: 0.02 s, elapsed: 1.36 s`
	`98`	+```
	`99`	`+`
	`100`	`+Here, the indicators of a problem are:`
	`101`	`+`
	`102`	+- `666667 are dead but not yet removable` – a lot of tuples are dead, but `autovacuum` cannot remove them because they
	`103`	+ became dead after the current `xmin` horizon (the transactions that deleted or updated them are not older than the `xmin`
	`104`	`+ horizon), so some snapshot may still need them`
	`105`	+- `removable cutoff: 784, which was 112449 XIDs old when operation ended` – this tells us that the `xmin` horizon is 784
	`106`	+ and its age is 112449 – so, the `xmin` horizon (the data version that is still considered needed) is more than 112k
	`107`	+ transactions behind, at the moment when `autovacuum` finished this processing attempt.
	`108`	`+`
	`109`	+This indicates that the `xmin` horizon is far behind the current moment, and something is holding it in the distant
	`110`	`+past. To understand what it is, we need to check several system views.`
	`111`	`+`
	`112`	`+## Monitoring using system views`
	`113`	`+`
	`114`	`+An example query:`
	`115`	`+`
	`116`	+```sql
	`117`	`+with bits as (`
	`118`	`+  select`
	`119`	`+    (`
	`120`	`+      select backend_xmin`
	`121`	`+      from pg_stat_activity`
	`122`	`+      order by age(backend_xmin) desc nulls last`
	`123`	`+      limit 1`
	`124`	`+    ) as xmin_pg_stat_activity,`
	`125`	`+    (`
	`126`	`+      select xmin`
	`127`	`+      from pg_replication_slots`
	`128`	`+      order by age(xmin) desc nulls last`
	`129`	`+      limit 1`
	`130`	`+    ) as xmin_pg_replication_slots,`
	`131`	`+    (`
	`132`	`+      select backend_xmin`
	`133`	`+      from pg_stat_replication`
	`134`	`+      order by age(backend_xmin) desc nulls last`
	`135`	`+      limit 1`
	`136`	`+    ) as xmin_pg_stat_replication,`
	`137`	`+    (`
	`138`	`+      select transaction`
	`139`	`+      from pg_prepared_xacts`
	`140`	`+      order by age(transaction) desc nulls last`
	`141`	`+      limit 1`
	`142`	`+    ) as xmin_pg_prepared_xacts`
	`143`	`+)`
	`144`	`+select`
	`145`	`+  *,`
	`146`	`+  age(xmin_pg_stat_activity) as xmin_pgsa_age,`
	`147`	`+  age(xmin_pg_replication_slots) as xmin_pgrs_age,`
	`148`	`+  age(xmin_pg_stat_replication) as xmin_pgsr_age,`
	`149`	`+  age(xmin_pg_prepared_xacts) as xmin_pgpx_age,`
	`150`	`+  greatest(`
	`151`	`+    age(xmin_pg_stat_activity),`
	`152`	`+    age(xmin_pg_replication_slots),`
	`153`	`+    age(xmin_pg_stat_replication),`
	`154`	`+    age(xmin_pg_prepared_xacts)`
	`155`	`+  ) as xmin_horizon_age`
	`156`	`+from bits;`
	`157`	+```
	`158`	`+`
	`159`	+Note that the `min(...)` function cannot be applied to XID values directly, because of their nature (they are 32-bit
	`160`	+and wrap around) – a cast from XID to `int` doesn't exist, for good reason. But the `age(XID)` function is helpful here. So
	`161`	+instead of comparing `xmin_horizon` values, we deal with `xmin_horizon_age`.
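
Once a high `xmin_horizon_age` is seen, the next step is usually to find which session or prepared transaction is responsible. The following is a minimal sketch of such a drill-down (not part of the original commit). The column selection, the `limit 5`, and the 80-character query preview are assumptions. Similar listings can be built for `pg_replication_slots` and `pg_stat_replication`.

```sql
-- Sessions holding the oldest snapshots
select
  pid,
  datname,
  usename,
  state,
  clock_timestamp() - xact_start as xact_duration,
  backend_xmin,
  age(backend_xmin) as backend_xmin_age,
  left(query, 80) as query_head
from pg_stat_activity
where backend_xmin is not null
order by age(backend_xmin) desc
limit 5;

-- Prepared transactions that were never committed or rolled back
select
  gid,
  prepared,
  owner,
  database,
  transaction,
  age(transaction) as xid_age
from pg_prepared_xacts
order by age(transaction) desc
limit 5;
```
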

`‎README.md‎`

Lines changed: 1 addition & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -71,6 +71,7 @@ As an example, first 2 rows:`
`71`	`71`	`- 0042 [How to analyze heavyweight locks, part 2: Lock trees (a.k.a. "lock queues", "wait queues", "blocking chains")](./0042_how_to_analyze_heavyweight_locks_part_2.md)`
`72`	`72`	`- 0043 [How to format SQL](./0043_how_to_format_sql.md)`
`73`	`73`	`- 0044 [How to monitor transaction ID wraparound risks](./0044_how_to_monitor_transaction_id_wraparound_risks.md)`
	`74`	`+- 0045 [How to monitor xmin horizon to prevent XID/MultiXID wraparound and high bloat](./0045_how_to_monitor_xmin_horizon.md)`
`74`	`75`	`- ...`
`75`	`76`
`76`	`77`	`## Contributors`

0 commit comments