git.postgresql.org Git - postgresql.git/commitdiff

git projects / postgresql.git / commitdiff

Fix nbtree deduplication README commentary.

author Peter Geoghegan <pg@bowt.ie>

2020年3月24日 21:58:27 +0000 (14:58 -0700)

committer Peter Geoghegan <pg@bowt.ie>

2020年3月24日 21:58:27 +0000 (14:58 -0700)

Descriptions of some aspects of how deduplication works were unclear in
a couple of places.

src/backend/access/nbtree/README patch | blob | blame | history

diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README

index 6499f5adb793f0f6cb8ed42de4577a7cf966c067..2d0f8f4b79ae5f20b0987f8bb091a7a09421ad08 100644 (file)

--- a/src/backend/access/nbtree/README

+++ b/src/backend/access/nbtree/README

@@ -780,20 +780,21 @@ order. Delaying deduplication minimizes page level fragmentation.

Deduplication in unique indexes

-------------------------------

-Very often, the range of values that can be placed on a given leaf page in

-a unique index is fixed and permanent. For example, a primary key on an

-identity column will usually only have page splits caused by the insertion

-of new logical rows within the rightmost leaf page. If there is a split

-of a non-rightmost leaf page, then the split must have been triggered by

-inserts associated with an UPDATE of an existing logical row. Splitting a

-leaf page purely to store multiple versions should be considered

-pathological, since it permanently degrades the index structure in order

-to absorb a temporary burst of duplicates. Deduplication in unique

-indexes helps to prevent these pathological page splits. Storing

-duplicates in a space efficient manner is not the goal, since in the long

-run there won't be any duplicates anyway. Rather, we're buying time for

-standard garbage collection mechanisms to run before a page split is

-needed.

+Very often, the number of distinct values that can ever be placed on

+almost any given leaf page in a unique index is fixed and permanent. For

+example, a primary key on an identity column will usually only have leaf

+page splits caused by the insertion of new logical rows within the

+rightmost leaf page. If there is a split of a non-rightmost leaf page,

+then the split must have been triggered by inserts associated with UPDATEs

+of existing logical rows. Splitting a leaf page purely to store multiple

+versions is a false economy. In effect, we're permanently degrading the

+index structure just to absorb a temporary burst of duplicates.

+Deduplication in unique indexes helps to prevent these pathological page

+splits. Storing duplicates in a space efficient manner is not the goal,

+since in the long run there won't be any duplicates anyway. Rather, we're

+buying time for standard garbage collection mechanisms to run before a

+page split is needed.

Unique index leaf pages only get a deduplication pass when an insertion

(that might have to split the page) observed an existing duplicate on the

@@ -838,13 +839,15 @@ list splits.

Only a few isolated extra steps are required to preserve the illusion that

the new item never overlapped with an existing posting list in the first

-place: the heap TID of the incoming tuple is swapped with the rightmost/max

-heap TID from the existing/originally overlapping posting list. Also, the

-posting-split-with-page-split case must generate a new high key based on

-an imaginary version of the original page that has both the final new item

-and the after-list-split posting tuple (page splits usually just operate

-against an imaginary version that contains the new item/item that won't

-fit).

+place: the heap TID of the incoming tuple has its TID replaced with the

+rightmost/max heap TID from the existing/originally overlapping posting

+list. Similarly, the original incoming item's TID is relocated to the

+appropriate offset in the posting list (we usually shift TIDs out of the

+way to make a hole for it). Finally, the posting-split-with-page-split

+case must generate a new high key based on an imaginary version of the

+original page that has both the final new item and the after-list-split

+posting tuple (page splits usually just operate against an imaginary

+version that contains the new item/item that won't fit).

This approach avoids inventing an "eager" atomic posting split operation

that splits the posting list without simultaneously finishing the insert

This is the main PostgreSQL git repository.

RSS Atom