SQL Server hangs on index rebuild, works well after SQL service restart

Question 1

We have a SQL Server 2014 enterprise cluster. Server 1 and Server 2. SQL Server is limited to 64GB of memory.

We are using Ola Hallengren scripts for index rebuilding and optimizations once per week for all our databases.

Every time script triggers, it sometimes hangs on the same index

ALTER INDEX [SecretId] ON [someDB].[dbo].[Articles] REBUILD WITH (SORT_IN_TEMPDB = OFF, ONLINE = ON)

This is unique identifier field in the database. Table has approx 1.5 mil records.

When index rebuild hangs, we do a failover to another SQL server, restart service on first one, and failover back to instance one. After that, we re-run scripts for index rebuilding, and it passes well in a matter of minutes.

Why do we need to restart service in order to rebuild index successfully?

Question 2

What type of wait do you see when it "hangs"? Is it being blocked? What makes you determine that it's hung and not doing work? Also, can you clarify if this is a Failover Cluster Instance or an Availability Group? Your "Failover-restart-failback" description makes me think it's an AG...

Question 3

Is the column a UUID type column? That could potentially take a long time to rebuild. What does the index definition look like? Could you add that to the question? Thanks.

Question 4

I've come across this issue recently. In my case, Paul Williams' comment was spot on and it's to do with a complex nvarchar(max) column I had that stores JSON and HTML:

Is this a table with a large varbinary or nvarchar(max) column? I've seen unusual maintenance behaviors with these kinds tables.

I use the following query to narrow it down to the offending table, which is a slight variation to Ola's script at https://ola.hallengren.com/scripts/misc/CommandLogSelect.sql

The index with a null EndTime is where the crash is occurring, most likely in the last row returned.

use [master];
SELECT DatabaseName,
 SchemaName,
 ObjectName,
 CASE WHEN ObjectType = 'U' THEN 'USER_TABLE' WHEN ObjectType = 'V' THEN 'VIEW' END AS ObjectType,
 IndexName,
 CASE WHEN IndexType = 1 THEN 'CLUSTERED' WHEN IndexType = 2 THEN 'NONCLUSTERED' WHEN IndexType = 3 THEN 'XML' WHEN IndexType = 4 THEN 'SPATIAL' END AS IndexType,
 PartitionNumber,
 ExtendedInfo.value('(ExtendedInfo/PageCount)[1]','int') AS [PageCount],
 ExtendedInfo.value('(ExtendedInfo/Fragmentation)[1]','float') AS Fragmentation,
 CommandType,
 Command,
 StartTime,
 EndTime,
 DATEDIFF(ss,StartTime, (CASE WHEN EndTime IS NULL THEN StartTime ELSE EndTime END)) AS Duration,
 ErrorNumber,
 ErrorMessage
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
ORDER BY StartTime ASC;

Question 5

Check the dbo.CommandLog table to see what the error is (for IndexOptimize)?

If it's lock timeout (from blocking), then obviously failover would resolve it as there shouldn't be blocking after a restart

You can also run sp_whoisactive to find the Head blocker

In my work's cases, sometimes even retry after 1 minute or 10 minutes would work (no need to failover) as the BLOCKING session has cleared/released the lock

But sometimes it'd still fail after retrying for an hour (inside SQL Job)

Let us know how it turns out

Good luck

Meistro Meistro 111 bronze badge · Answer 1 · 2020-06-05 02:18:51Z

I've come across this issue recently. In my case, Paul Williams' comment was spot on and it's to do with a complex nvarchar(max) column I had that stores JSON and HTML:

Is this a table with a large varbinary or nvarchar(max) column? I've seen unusual maintenance behaviors with these kinds tables.

I use the following query to narrow it down to the offending table, which is a slight variation to Ola's script at https://ola.hallengren.com/scripts/misc/CommandLogSelect.sql

The index with a null EndTime is where the crash is occurring, most likely in the last row returned.

use [master];
SELECT DatabaseName,
 SchemaName,
 ObjectName,
 CASE WHEN ObjectType = 'U' THEN 'USER_TABLE' WHEN ObjectType = 'V' THEN 'VIEW' END AS ObjectType,
 IndexName,
 CASE WHEN IndexType = 1 THEN 'CLUSTERED' WHEN IndexType = 2 THEN 'NONCLUSTERED' WHEN IndexType = 3 THEN 'XML' WHEN IndexType = 4 THEN 'SPATIAL' END AS IndexType,
 PartitionNumber,
 ExtendedInfo.value('(ExtendedInfo/PageCount)[1]','int') AS [PageCount],
 ExtendedInfo.value('(ExtendedInfo/Fragmentation)[1]','float') AS Fragmentation,
 CommandType,
 Command,
 StartTime,
 EndTime,
 DATEDIFF(ss,StartTime, (CASE WHEN EndTime IS NULL THEN StartTime ELSE EndTime END)) AS Duration,
 ErrorNumber,
 ErrorMessage
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
ORDER BY StartTime ASC;

Jerry Hung Jerry Hung 1308 bronze badges · Answer 2 · 2018-11-05 02:41:47Z

Check the dbo.CommandLog table to see what the error is (for IndexOptimize)?

If it's lock timeout (from blocking), then obviously failover would resolve it as there shouldn't be blocking after a restart

You can also run sp_whoisactive to find the Head blocker

In my work's cases, sometimes even retry after 1 minute or 10 minutes would work (no need to failover) as the BLOCKING session has cleared/released the lock

But sometimes it'd still fail after retrying for an hour (inside SQL Job)

Let us know how it turns out

Good luck

Stack Exchange Network

SQL Server hangs on index rebuild, works well after SQL service restart

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

SQL Server hangs on index rebuild, works well after SQL service restart

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions