We have a SQL Server 2014 Enterprise cluster with two nodes, Server 1 and Server 2. SQL Server is limited to 64 GB of memory.

We use Ola Hallengren's scripts for index rebuilds and optimization, once per week, for all our databases.

When the script runs, it sometimes hangs, and always on the same index:

ALTER INDEX [SecretId] ON [someDB].[dbo].[Articles] REBUILD WITH (SORT_IN_TEMPDB = OFF, ONLINE = ON)

This is a unique identifier field in the database. The table has approximately 1.5 million rows.

When the index rebuild hangs, we fail over to the other SQL Server node, restart the service on the first one, and fail back to it. After that, we re-run the index rebuild scripts, and they complete in a matter of minutes.

Why do we need to restart the service for the index rebuild to succeed?

Paul White
asked Nov 4, 2018 at 11:20
  • 3
    What type of wait do you see when it "hangs"? Is it being blocked? What makes you determine that it's hung and not doing work? Also, can you clarify if this is a Failover Cluster Instance or an Availability Group? Your "Failover-restart-failback" description makes me think it's an AG... Commented Nov 4, 2018 at 12:10
  • Is the column a UUID type column? That could potentially take a long time to rebuild. What does the index definition look like? Could you add that to the question? Thanks. Commented Nov 9, 2020 at 10:46

2 Answers

I came across this issue recently. In my case, Paul Williams' comment was spot on: the problem was a complex nvarchar(max) column I had that stores JSON and HTML. The comment read:

"Is this a table with a large varbinary or nvarchar(max) column? I've seen unusual maintenance behaviors with these kinds of tables."

I use the following query to narrow it down to the offending table. It is a slight variation on Ola's script at https://ola.hallengren.com/scripts/misc/CommandLogSelect.sql

The index with a NULL EndTime is where the crash occurs. It is most likely the last row returned.

use [master];
SELECT DatabaseName,
 SchemaName,
 ObjectName,
 CASE WHEN ObjectType = 'U' THEN 'USER_TABLE' WHEN ObjectType = 'V' THEN 'VIEW' END AS ObjectType,
 IndexName,
 CASE WHEN IndexType = 1 THEN 'CLUSTERED' WHEN IndexType = 2 THEN 'NONCLUSTERED' WHEN IndexType = 3 THEN 'XML' WHEN IndexType = 4 THEN 'SPATIAL' END AS IndexType,
 PartitionNumber,
 ExtendedInfo.value('(ExtendedInfo/PageCount)[1]','int') AS [PageCount],
 ExtendedInfo.value('(ExtendedInfo/Fragmentation)[1]','float') AS Fragmentation,
 CommandType,
 Command,
 StartTime,
 EndTime,
 DATEDIFF(ss,StartTime, (CASE WHEN EndTime IS NULL THEN StartTime ELSE EndTime END)) AS Duration,
 ErrorNumber,
 ErrorMessage
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
ORDER BY StartTime ASC;
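
As a narrower variant of the query above, the following sketch returns only the ALTER INDEX commands that never logged an EndTime. It uses only the dbo.CommandLog columns already shown, and it assumes it is run in the database where CommandLog was installed. Rows left over from earlier interrupted runs also keep a NULL EndTime, so the newest row is the one to look at first.

```sql
-- Sketch: list index commands that started but never recorded an end time.
-- Assumes the current database is the one that holds dbo.CommandLog.
SELECT DatabaseName,
       SchemaName,
       ObjectName,
       IndexName,
       Command,
       StartTime,
       DATEDIFF(MINUTE, StartTime, SYSDATETIME()) AS MinutesSinceStart
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
  AND EndTime IS NULL
ORDER BY StartTime DESC;
```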
Paul White
answered Jun 5, 2020 at 2:18
Check the dbo.CommandLog table to see what error was logged for IndexOptimize.
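
A minimal sketch of that check, assuming the standard CommandLog columns and that the query runs in the database where the table was installed. The TOP (50) limit is an arbitrary choice for illustration. Error 1222 is SQL Server's "Lock request time out period exceeded" message, so seeing it here would point towards blocking.

```sql
-- Sketch: most recent index commands that logged an error.
SELECT TOP (50)
       DatabaseName,
       ObjectName,
       IndexName,
       StartTime,
       EndTime,
       ErrorNumber,
       ErrorMessage,
       Command
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
  AND ErrorNumber <> 0
ORDER BY StartTime DESC;
```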

If it's a lock timeout caused by blocking, then a failover would obviously resolve it, since the blocking session no longer exists after a restart.

You can also run sp_WhoIsActive to find the head blocker.
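
If sp_WhoIsActive is not installed, a rough equivalent can be sketched with the built-in DMVs. The query below is an illustration rather than a replacement for that procedure: it lists requests that are blocked or whose text contains ALTER INDEX, together with their current wait type and the session blocking them. A head blocker that is idle with an open transaction has no row in sys.dm_exec_requests, so it only shows up here as a blocking_session_id value.

```sql
-- Sketch: show what a running index rebuild is waiting on and who blocks it.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.wait_resource,
       r.blocking_session_id,
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID          -- exclude this diagnostic query
  AND (r.blocking_session_id <> 0     -- anything currently blocked
       OR t.text LIKE N'%ALTER INDEX%');
```

Running this while the rebuild appears hung would also show what type of wait it is stuck on, which one of the comments on the question asks about.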

In cases at my work, simply retrying after 1 or 10 minutes sometimes works, with no failover needed, because the blocking session has released its lock by then.

But sometimes it still fails after retrying for an hour (inside the SQL job).
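
If blocking does turn out to be the cause, which nothing in this thread confirms yet, one option on SQL Server 2014 and later is to let the online rebuild wait at low priority and give up on its own instead of hanging. This is a different technique from the retry approach above. The sketch reuses the index from the question; the 5 minute limit is an arbitrary value chosen for illustration.

```sql
-- Sketch: online rebuild that waits for its locks at low priority and
-- cancels itself (ABORT_AFTER_WAIT = SELF) if it cannot get them in time.
ALTER INDEX [SecretId] ON [someDB].[dbo].[Articles]
REBUILD WITH (
    SORT_IN_TEMPDB = OFF,
    ONLINE = ON (
        WAIT_AT_LOW_PRIORITY (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = SELF)
    )
);
```

With ABORT_AFTER_WAIT = SELF the rebuild fails with an error once the wait expires, so the job can retry later without a failover.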

Let us know how it turns out

Good luck

answered Nov 5, 2018 at 2:41
