Recreating Clustered Index Efficiently

Question 1

In a test database, I am looking to:

Drop clustered index (they are Primary Key Clustered Constraints on a row that is super useless for us.)
Create new Clustered index
re-create Primary Key constraint as nonclustered index
rebuild all other non-clustered indexes.

My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.

Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).

Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?

Thanks, Wes

DDL Statements:

TABLE STRUCTURE

 CREATE TABLE [dbo].[hist](
 [prrowid] [varchar](36) NOT NULL,
 [part] [varchar](30) NULL,
 [date] [datetime] NULL,
 [per_date] [datetime] NULL,
 [type] [varchar](80) NULL,
 [loc] [varchar](80) NULL,
 [loc_begin] [decimal](28, 10) NULL,
 [begin_qoh] [decimal](28, 10) NULL,
 [qty_req] [decimal](28, 10) NULL,
 [qty_chg] [decimal](28, 10) NULL,
 [qty_short] [decimal](28, 10) NULL,
 [um] [varchar](30) NULL,
 [last_date] [datetime] NULL,
 [nbr] [varchar](30) NULL,
 [so_job] [varchar](80) NULL,
 [ship_type] [varchar](30) NULL,
 [addr] [varchar](80) NULL,
 [rmks] [varchar](80) NULL,
 [xdr_acct] [varchar](80) NULL,
 [xcr_acct] [varchar](80) NULL,
 [mtl_std] [decimal](28, 10) NULL,
 [lbr_std] [decimal](28, 10) NULL,
 [bdn_std] [decimal](28, 10) NULL,
 [price] [decimal](28, 10) NULL,
 [trnbr] [int] NULL,
 [gl_amt] [decimal](28, 10) NULL,
 [xdr_cc] [varchar](30) NULL,
 [xcr_cc] [varchar](30) NULL,
 [lot] [varchar](80) NULL,
 [sub_std] [decimal](28, 10) NULL,
 [gl_date] [datetime] NULL,
 [qty_loc] [decimal](28, 10) NULL,
 [userid] [varchar](80) NULL,
 [serial] [varchar](50) NULL,
 [effdate] [datetime] NULL,
 [prod_line] [varchar](30) NULL,
 [xslspsn1] [varchar](80) NULL,
 [xslspsn2] [varchar](80) NULL,
 [xcr_proj] [varchar](80) NULL,
 [xdr_proj] [varchar](80) NULL,
 [line] [int] NULL,
 [user1] [varchar](80) NULL,
 [user2] [varchar](80) NULL,
 [curr] [varchar](30) NULL,
 [ex_rate] [decimal](28, 10) NULL,
 [rev] [varchar](30) NULL,
 [time] [int] NULL,
 [ovh_std] [decimal](28, 10) NULL,
 [site] [varchar](80) NULL,
 [status] [varchar](80) NULL,
 [grade] [varchar](30) NULL,
 [expire] [datetime] NULL,
 [assay] [decimal](28, 10) NULL,
 [xgl_ref] [varchar](30) NULL,
 [_chr01] [varchar](80) NULL,
 [_chr02] [varchar](80) NULL,
 [_chr03] [varchar](80) NULL,
 [_chr04] [varchar](80) NULL,
 [_chr05] [varchar](80) NULL,
 [_chr06] [varchar](80) NULL,
 [_chr07] [varchar](80) NULL,
 [_chr08] [varchar](80) NULL,
 [_chr09] [varchar](80) NULL,
 [_chr10] [varchar](80) NULL,
 [_chr11] [varchar](80) NULL,
 [_chr12] [varchar](80) NULL,
 [_chr13] [varchar](80) NULL,
 [_chr14] [varchar](80) NULL,
 [_chr15] [varchar](80) NULL,
 [_dte01] [datetime] NULL,
 [_dte02] [datetime] NULL,
 [_dte03] [datetime] NULL,
 [_dte04] [datetime] NULL,
 [_dte05] [datetime] NULL,
 [_dec01] [decimal](28, 10) NULL,
 [_dec02] [decimal](28, 10) NULL,
 [_dec03] [decimal](28, 10) NULL,
 [_dec04] [decimal](28, 10) NULL,
 [_dec05] [decimal](28, 10) NULL,
 [_log01] [bit] NULL,
 [_log02] [bit] NULL,
 [ref] [varchar](80) NULL,
 [msg] [int] NULL,
 [program] [varchar](30) NULL,
 [ord_rev] [int] NULL,
 [ref_site] [varchar](80) NULL,
 [rsn_code] [varchar](80) NULL,
 [vend_lot] [varchar](30) NULL,
 [vend_date] [datetime] NULL,
 [daycode] [varchar](80) NULL,
 [for] [varchar](30) NULL,
 [slspsn##1] [varchar](82) NULL,
 [slspsn##2] [varchar](82) NULL,
 [slspsn##3] [varchar](82) NULL,
 [slspsn##4] [varchar](82) NULL,
 [fsm_type] [varchar](80) NULL,
 [upd_isb] [bit] NULL,
 [auto_install] [bit] NULL,
 [ca_int_type] [varchar](80) NULL,
 [covered_amt] [decimal](28, 10) NULL,
 [fcg_code] [varchar](80) NULL,
 [batch] [varchar](30) NULL,
 [fsc_code] [varchar](80) NULL,
 [sa_nbr] [varchar](80) NULL,
 [sv_code] [varchar](80) NULL,
 [eng_area] [varchar](30) NULL,
 [sys_prod] [varchar](30) NULL,
 [svc_type] [varchar](30) NULL,
 [ca_opn_date] [datetime] NULL,
 [cprice] [decimal](28, 10) NULL,
 [eng_code] [varchar](80) NULL,
 [wod_op] [int] NULL,
 [enduser] [varchar](80) NULL,
 [ship_inv_mov] [varchar](80) NULL,
 [ship_date] [datetime] NULL,
 [ship_id] [varchar](30) NULL,
 [ex_rate2] [decimal](28, 10) NULL,
 [ex_ratetype] [varchar](80) NULL,
 [exru_seq] [int] NULL,
 [promise_date] [datetime] NULL,
 [fldchg_cmtindx] [int] NULL,
 [SrcPDB] [varchar](12) NULL,
 CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED 
 (
 [prrowid] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 ) ON [PRIMARY]

CURRENT INDEXES

ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##date_trn] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [nbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);

DESIRED INDEXES

CREATE UNIQUE CLUSTERED INDEX [hist##date_trn_CX] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY NONCLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [trnbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr_char] ON [dbo].[hist] ( [trnbr_char] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##vend_lot] ON [dbo].[hist] ( [vend_lot] ) WITH (FILLFACTOR=100);

-- NOTE -- The below answer worked perfectly for me. I did have to have a volume added. I created a second Filegroup and a datafile on the new drive. Additionally, another log file as well.

Question 2

Ideally you would do something like this:

Drop the existing primary key constraint but keep the clustered index.
Recreate the clustered index on the new columns with the DROP_EXISTING = ON option set.
Create the primary key constraint on a new nonclustered index.

That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.

When the primary key is deleted, the corresponding index is deleted.

In addition, BOL has this to say about changing a primary key with DROP_EXISTING = ON:

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:

You probably shouldn't use SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both the SELECT INTO and the creation of the clustered index are eligible for parallelism in SQL Server 2014.
Take advantage of minimal logging if your recovery model allows for it. Note that for INSERT INTO... SELECT you'll need a TABLOCK hint against the target table to get minimal logging.
Create your nonclustered indexes after all data is loaded.
When creating the nonclustered indexes use the SORT_IN_TEMPDB = ON option if tempdb is sized for it.
Check for foreign keys on other tables. If you're able to disable those that might help speed up things.

As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);
INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;
-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);

SQL Server Execution Times: CPU time = 31 ms, elapsed time = 51 ms.

SQL Server Execution Times: CPU time = 2406 ms, elapsed time = 3484 ms.

-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);

SQL Server Execution Times: CPU time = 2422 ms, elapsed time = 3411 ms.

Question 3

Holy Cow, it worked!... 48,937,299 rows in 31 mins. I did create a second log file and cap the original just in case it continues to grow (not worried about restores and what not in Dev.)

Joe Obbish Joe Obbish 33.1k4 gold badges76 silver badges155 bronze badges · Accepted Answer · 2017-04-27 01:12:07Z

Ideally you would do something like this:

Drop the existing primary key constraint but keep the clustered index.
Recreate the clustered index on the new columns with the DROP_EXISTING = ON option set.
Create the primary key constraint on a new nonclustered index.

That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.

When the primary key is deleted, the corresponding index is deleted.

In addition, BOL has this to say about changing a primary key with DROP_EXISTING = ON:

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:

You probably shouldn't use SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both the SELECT INTO and the creation of the clustered index are eligible for parallelism in SQL Server 2014.
Take advantage of minimal logging if your recovery model allows for it. Note that for INSERT INTO... SELECT you'll need a TABLOCK hint against the target table to get minimal logging.
Create your nonclustered indexes after all data is loaded.
When creating the nonclustered indexes use the SORT_IN_TEMPDB = ON option if tempdb is sized for it.
Check for foreign keys on other tables. If you're able to disable those that might help speed up things.

As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);
INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;
-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);

SQL Server Execution Times: CPU time = 31 ms, elapsed time = 51 ms.

SQL Server Execution Times: CPU time = 2406 ms, elapsed time = 3484 ms.

-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);

SQL Server Execution Times: CPU time = 2422 ms, elapsed time = 3411 ms.

Holy Cow, it worked!... 48,937,299 rows in 31 mins. I did create a second log file and cap the original just in case it continues to grow (not worried about restores and what not in Dev.)

Stack Exchange Network

Recreating Clustered Index Efficiently

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Recreating Clustered Index Efficiently

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions