When to use sort_in_tempdb when rebuilding indexes?

Question 1

We are debating whether to use the SORT_IN_TEMPDB option for our DW tables. My understanding is that there are more writes when using this option, although they are more sequential. We have a SAN (which has been notoriously slow at times), so in our case we want to limit the number of writes as much as possible. I believe tempdb is on a separate LUN (set of disks).

We have plenty of disk space in our data file and on our tempdb file. In this case, would we benefit from using SORT_IN_TEMPDB?

One thing that struck me was this comment on another answer:

When rebuilding an index you would need twice the space of the index + 20% for the sorting. So in general, to rebuild every index in your DB you only need 120% of your biggest index in free space. If you use SORT_IN_TEMPDB, you only win 20%; you still need an additional 100% in your data file. Furthermore, using sort in tempdb increases your IO load drastically, since instead of writing the index one time to the data file, you now write it one time to tempdb and then write it to the data file. So that is not always ideal.

We definitely don't want to increase our IO load with our slow/possibly misconfigured SAN.

What would be the best way to test this? Simply rebuilding the table with and without the option and logging the times?

Edit: We have 8 tempdb files, each 15 GB. Trace flags 1117 and 1118 are set and IFI is enabled. We currently rebuild some indexes with the SORT_IN_TEMPDB option and some without it.

Thanks!

SQL Server 2012 Enterprise

Kin Shah · Accepted Answer · 2015-07-31 19:28:22Z

SORT_IN_TEMPDB means that SQL Server will use tempdb to allocate the temporary sort space, as opposed to allocating it in the user database whose index is being rebuilt. This means you will need less free space in your user database during an index rebuild operation and more free space in tempdb.

The benefit is greatest when tempdb is on a different set of disks (LUNs) from the user database.
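
For reference, the option is set per statement in the WITH clause of the rebuild. A minimal sketch, assuming a hypothetical table dbo.FactSales with an index named IX_FactSales_DateKey (neither name comes from the question):

```sql
-- Hypothetical object names; substitute your own table and index.
-- With the option ON, the intermediate sort runs are written to tempdb
-- instead of the destination filegroup.
ALTER INDEX IX_FactSales_DateKey
    ON dbo.FactSales
    REBUILD WITH (SORT_IN_TEMPDB = ON);

-- The same rebuild with the sort work kept in the user database (the default).
ALTER INDEX IX_FactSales_DateKey
    ON dbo.FactSales
    REBUILD WITH (SORT_IN_TEMPDB = OFF);
```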

From SORT_IN_TEMPDB Option (Books Online):

If the SORT_IN_TEMPDB option is set to ON and tempdb is on a separate set of disks from the destination filegroup, during the first phase, the reads of the data pages occur on a different disk from the writes to the sort work area in tempdb. This means the disk reads of the data keys generally continue more serially across the disk, and the writes to the tempdb disk also are generally serial, as do the writes to build the final index. Even if other users are using the database and accessing separate disk addresses, the overall pattern of reads and writes are more efficient when SORT_IN_TEMPDB is specified than when it is not.

Make sure you read the disk space requirements when SORT_IN_TEMPDB is ON.
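
As a rough starting point for that space planning, the current size of each index can be read from sys.dm_db_partition_stats. This is only a sketch: it reports sizes as they stand today, and how much extra room a rebuild needs in the user database versus tempdb should be taken from the disk space requirements mentioned above, not inferred from this query.

```sql
-- Current size of every index in the current database, largest first.
-- used_page_count is counted in 8 KB pages, so * 8 / 1024 gives MB.
SELECT  OBJECT_SCHEMA_NAME(ps.object_id)      AS schema_name,
        OBJECT_NAME(ps.object_id)             AS table_name,
        i.name                                AS index_name,   -- NULL for a heap
        SUM(ps.used_page_count) * 8 / 1024.0  AS used_mb
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id, i.index_id, i.name
ORDER BY used_mb DESC;
```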

slow/possibly misconfigured SAN

You know the pain point. Why don't you work with your SAN admin to fix it? A misconfigured or slow SAN will cause all sorts of performance problems.

Some important points to note:

Use MAXDOP=1 when doing an online index rebuild; otherwise the rebuild can increase fragmentation.
Intelligently reorganize or rebuild your indexes based on the fragmentation level as well as page fragments.
Read SQLFool's experience with Creating a 60 GB Index.
Make sure you have configured tempdb properly: TF 1117 (grow all files in a filegroup equally), TF 1118 (full extents only), and multiple tempdb data files.
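
The first two points can be sketched as follows. The query lists fragmentation, fragment count and page count per index. The thresholds in the comments (roughly 5% and 30% fragmentation, and skipping indexes under about 1000 pages) are commonly quoted starting points, not figures from this answer, and the object names in the ALTER INDEX statements are hypothetical.

```sql
-- Fragmentation and size per index in the current database.
-- 'LIMITED' is the cheapest scan mode.
SELECT  OBJECT_NAME(ips.object_id)        AS table_name,
        i.name                            AS index_name,
        ips.avg_fragmentation_in_percent,
        ips.fragment_count,
        ips.page_count
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN    sys.indexes AS i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
WHERE   ips.index_id > 0           -- skip heaps
  AND   ips.page_count > 1000      -- assumed cutoff for indexes worth maintaining
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Assumed rule of thumb: reorganize between about 5% and 30% fragmentation.
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REORGANIZE;

-- Above that, rebuild; here online and restricted to a single thread.
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales
    REBUILD WITH (ONLINE = ON, MAXDOP = 1, SORT_IN_TEMPDB = ON);
```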

What would be the best way to test this?

Yes, test it by analyzing the wait stats when you rebuild the index with and without SORT_IN_TEMPDB. Measure run time as well, and when doing this in PROD, make sure you do it during a maintenance window or a period of low server activity. Also check your read/write latency on the data and log files.
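
One way to run such a comparison is to snapshot sys.dm_os_wait_stats, time the rebuild, and then look at what changed. This is a sketch under assumptions: the table and index names are hypothetical, and the counters are instance-wide, so anything else running on the server shows up in the deltas too. Run it once with SORT_IN_TEMPDB = ON and once with OFF, on a server that is as quiet as possible.

```sql
-- Snapshot the instance-wide wait statistics before the rebuild.
IF OBJECT_ID('tempdb..#waits_before') IS NOT NULL
    DROP TABLE #waits_before;

SELECT  wait_type, waiting_tasks_count, wait_time_ms
INTO    #waits_before
FROM    sys.dm_os_wait_stats;

DECLARE @start datetime2(3) = SYSDATETIME();

-- Hypothetical object names; switch ON to OFF for the second run.
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales
    REBUILD WITH (SORT_IN_TEMPDB = ON);

SELECT  DATEDIFF(SECOND, @start, SYSDATETIME()) AS elapsed_seconds;

-- Waits accumulated while the rebuild ran, largest first.
SELECT  a.wait_type,
        a.waiting_tasks_count - b.waiting_tasks_count AS waits_delta,
        a.wait_time_ms        - b.wait_time_ms        AS wait_ms_delta
FROM    sys.dm_os_wait_stats AS a
JOIN    #waits_before        AS b ON b.wait_type = a.wait_type
WHERE   a.wait_time_ms - b.wait_time_ms > 0
ORDER BY wait_ms_delta DESC;
```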

I am not sure whether you have instant file initialization enabled, but it helps when restoring, during autogrowth of data files, and when creating a new database (just mentioning it for completeness).

I edited my comment with my tempdb configuration. Thanks, I did not know about the serial online rebuild tip. I'll do some more testing and try to get with the SAN admin, who unfortunately has been less than welcoming. Are there any specific wait stats I should be comparing (e.g. PAGEIOLATCH)? Our tempdb write latency is super high (4000 ms), which is horrendous; it is under 40 ms for the main DBs. That might be a question for another time though...!
@Gabe you should show your SAN admin hard facts that it is indeed a SAN problem: read/write latency from sys.dm_io_virtual_file_stats. Is your tempdb on a separate LUN?
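
A minimal sketch of the kind of query that comment points to, computing average read and write latency per database file. The numbers are cumulative since the instance last started, so for a specific window it is better to take two snapshots and compare them.

```sql
-- Average I/O latency per file, worst write latency first.
SELECT  DB_NAME(vfs.database_id)  AS database_name,
        mf.physical_name,
        mf.type_desc,             -- ROWS (data) or LOG
        vfs.num_of_reads,
        vfs.num_of_writes,
        CASE WHEN vfs.num_of_reads = 0 THEN 0
             ELSE vfs.io_stall_read_ms * 1.0 / vfs.num_of_reads
        END                       AS avg_read_latency_ms,
        CASE WHEN vfs.num_of_writes = 0 THEN 0
             ELSE vfs.io_stall_write_ms * 1.0 / vfs.num_of_writes
        END                       AS avg_write_latency_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
ORDER BY avg_write_latency_ms DESC;
```
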
Now that it's the mid-2020s and every DBA worth their salt(ed hashes) will be running their DB off NVMe SSDs - so concerns about smoothing-over spinning-rust disk reads are largely moot; but I'd like to know if MAXDOP=1 still makes sense when index rebuilds might not be IO-bound anymore? Also, what about Azure SQL? I note there's still no authoritative word on tempdb sorting in AzSQL where there's no support for putting tempdb in a different filegroup but we do have Hyperscale now - so I'd like to know if it's a good - or bad - idea before I start a 4-hour index maintenance job.
I would question whether index maintenance is worth it in an AG environment. I have been doing only statistics updates for many years and there have been no performance complaints.
