According to user comments, PostgreSQL data checksums have very minimal runtime overhead (both CPU and storage) but would allow (among other things) using pg_rewind for point-in-time recovery (PITR). However, data checksums are not enabled by default, and enabling them on an already existing HA cluster is not possible without pretty significant downtime. (If I've understood correctly, you cannot enable checksums on the hot standby only and promote it as the new master once enabling the checksums has completed on the hot standby.)
Are there some little-known issues that would surface if data checksums were enabled by default? Or is the default state (checksums disabled) just due to historical reasons, even though enabling data checksums would make much more sense in all cases?
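For reference: since PostgreSQL 12 the offline pg_checksums tool can enable checksums on an existing cluster, but the server has to stay down for the whole rewrite, which is the downtime mentioned above. A minimal sketch (the data directory path is a placeholder):

    # the cluster must be cleanly shut down; pg_checksums refuses to run on a live server
    pg_ctl -D /var/lib/postgresql/data stop
    # rewrites every data block with its checksum; runtime scales with cluster size
    pg_checksums --enable --progress -D /var/lib/postgresql/data
    pg_ctl -D /var/lib/postgresql/data start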
- Interesting question (+1) - I'd say that it's something like "enable as little as possible by default - let your users decide", i.e. make the system as lean as possible by default. BTW, this is a complete guess on my part! – Vérace, Jun 8, 2023 at 20:35
2 Answers
Enabling data checksums is not free:

- you have to calculate the checksums frequently, which costs CPU
- data checksums require that hint bits are WAL-logged, which increases the amount of WAL that needs to be written
Also, data checksums don't offer any benefit unless you are using shoddy storage, where data could change between the time you wrote a block and the time you read it again.
If you want to use pg_rewind, you don't need data checksums. It is sufficient to enable the parameter wal_log_hints.
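In case it helps, a minimal sketch (host and path are placeholders). The parameter can only change at server start:

    ALTER SYSTEM SET wal_log_hints = on;  -- requires a restart to take effect

and pg_rewind is then run with the old primary cleanly stopped:

    pg_rewind --target-pgdata=/var/lib/postgresql/data \
              --source-server='host=new-primary user=postgres dbname=postgres'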
- How much would enabling wal_log_hints typically increase the WAL size? Is it closer to +1% or +100%? – Mikko Rantalainen, Jun 9, 2023 at 14:19
- The old answer: it depends. Could be virtually zero, but if you load the data, then have a checkpoint, then query the data, it could be 100%, since all the blocks are modified by the first reader, who sets the hint bits. – Laurenz Albe, Jun 9, 2023 at 14:27
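One way to measure this for your own workload rather than guess (this comparison workflow is my own suggestion, not from the answer): capture the WAL position before and after a representative batch of work, once with wal_log_hints off and once with it on, using psql variables:

    SELECT pg_current_wal_lsn() AS start_lsn \gset
    -- ... run the representative workload here ...
    SELECT pg_size_pretty(
             pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')
           ) AS wal_generated;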
I will quote some opinions from core PostgreSQL developers:
That's nice for us but I'm not sure that it's a benefit for users. I've seen little if any data to suggest that checksums actually catch enough problems to justify the extra CPU costs and the risk of false positives.
My problem is more that I'm not confident the checks are mature enough. The basebackup checks are atm not able to detect random data, and neither basebackup nor backend checks detect zeroed out files/file ranges.
I can believe that many users have shared_buffers set to its default value and that we are going to get complaints about "performance drop after upgrade to v12" if we switch data checksums to on by default.
data checksums can catch not so many actual data corruptions, are not free (especially with small sizes of shared_buffers - checksums are calculated when writing a page to disk or when reading from disk to shared_buffers) neither by CPU nor by WAL size.
No issues per se, but the general consensus so far is that there's not much benefit for end users to have it enabled by default.
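For anyone checking their own cluster, the current state is exposed as a read-only parameter:

    SHOW data_checksums;  -- 'on' or 'off', fixed at initdb time (or changed offline via pg_checksums)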
- I would have expected shared_buffers to be increased in all cases where the performance actually matters, because it's that critical. Maybe PostgreSQL should automatically adjust shared_buffers by default instead of hardcoding it to a really small default? For example, the default could be auto, and it would be interpreted as 10% of total system RAM. – Mikko Rantalainen, Jun 9, 2023 at 14:23
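For context on that comment: the shipped default is shared_buffers = 128MB, and raising it needs a server restart. A minimal sketch (the value is just an illustration; around 25% of RAM is the commonly cited starting point, not a rule):

    ALTER SYSTEM SET shared_buffers = '4GB';  -- takes effect after a restart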