
I am transitioning from SQL Server to Postgres, and one of the biggest things for me to digest is that Postgres has no "clustered key" that keeps the data physically sorted.

Can someone share their thoughts on how Postgres avoids the need for an internally sorted dataset, and how it works with large heap tables while still supplying exceptional performance?

MDCCL
asked Apr 2, 2019 at 20:24
  • SQL Server tables can be used as heaps, but the benefit of having a clustered index that automatically sorts the data far outweighs not having one. Even though Postgres does have CLUSTER, it doesn't maintain the order of the data, and that made me wonder how it performs under a significant workload. Commented Apr 2, 2019 at 20:57
  • Did you benchmark your database to see if not having a clustered index really slows things down? Commented Apr 3, 2019 at 6:00
  • I have mostly worked with multi-tenant data sets on SQL Server, and having each tenant's data stored in sorted order has allowed read-ahead to be more effective and to load more relevant/valid data into the buffer cache. Commented Apr 3, 2019 at 17:33

3 Answers


You can try the pg_repack extension to cluster online with less locking.
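
For example, a minimal sketch assuming pg_repack is installed on the host and in the database (the database, table, and column names here are hypothetical):

    -- One-time setup in the target database:
    CREATE EXTENSION pg_repack;

    -- Then, from the shell, rewrite the table ordered by a column,
    -- roughly an online equivalent of CLUSTER:
    --   pg_repack --dbname=mydb --table=orders --order-by=tenant_id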

answered Apr 3, 2019 at 19:45

PostgreSQL simply doesn't implement this feature. There is no trick to not implementing it; it is unimplemented in the straightforward, uncomplicated way of just not doing it. To use one bit of jargon, all B-tree indexes in PostgreSQL are "secondary indexes", not "primary indexes". Even the primary key's index is a "secondary index".

There are some cases where clustered keys (or index-organized tables, as another product calls them) are important, and in those cases PostgreSQL fails to "supply exceptional performance". You can argue about how common those cases are, of course, but they certainly do exist, and it is unfortunate that PostgreSQL doesn't offer a solution for them. There have been proposals to address this, but I don't think any of those efforts are currently active.

In some cases, you can ameliorate the problem by using the CLUSTER command, or by implementing partitioning, or by using covering indexes, but none of these is entirely satisfactory as an alternative to real clustering.
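
For illustration, a rough sketch of the first and last of those mitigations (the table, index, and column names are hypothetical, and INCLUDE requires PostgreSQL 11 or later):

    -- One-time physical reorder of the heap; takes an ACCESS EXCLUSIVE
    -- lock and is not maintained as new rows arrive:
    CLUSTER orders USING orders_tenant_id_idx;

    -- A covering index (PostgreSQL 11+) can answer queries via
    -- index-only scans without visiting the heap at all:
    CREATE INDEX orders_tenant_created_idx
        ON orders (tenant_id, created_at) INCLUDE (status);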

answered Apr 3, 2019 at 15:56
  • Hey @jjanes, thanks for your feedback. Using CLUSTER is definitely a possibility, but it requires an exclusive lock on the table, which is mostly a no-no. And yes, I have adopted declarative partitioning to keep the data sets more manageable. So from what you have mentioned, getting the correct indexes is probably more critical. Commented Apr 3, 2019 at 17:59

PostgreSQL doesn't do anything special to replace the "need" for a clustered index.

It just simply doesn't have that feature. (Some would say that isn't a great loss.)

You can manually perform a one-time cluster with CLUSTER or pg_repack.

There is also declarative partitioning (though it had a number of caveats before PostgreSQL 11). It isn't quite clustering, but it can be used to group rows into specified buckets.
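
A rough sketch of that grouping (PostgreSQL 10+ syntax; the table and column names are hypothetical):

    -- Rows are routed into per-range child tables, so each bucket is
    -- kept physically together even though rows within a partition
    -- remain unordered:
    CREATE TABLE measurements (
        logdate date NOT NULL,
        reading numeric
    ) PARTITION BY RANGE (logdate);

    CREATE TABLE measurements_2019 PARTITION OF measurements
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');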

answered Sep 3, 2019 at 22:57
