We have an SQL Server 2014 Enterprise Edition and a database around 6 TB in size.
Just to give quick background of infra as Server is in Azure and having premium disk of 9 TB for datafiles.
Have a few large table that is already partitioned by an identifier (MonthlyDate integer Eg 01012016,... 31122016 ). We have monthly partition starting from 2014 to 2016 (01012014 , 01022014......till 31122016).
Now we are trying to create partition for 2017 and 2018 month-wise.
I tried to use the partition wizard, but couldn't find an option for what I wanted. As far as I can tell, I need to run something like:
Alter partition Scheme [PartPScheme_BIGTAB] Next Used [PartFileGrp_201701]
Alter partition Function [PartPFN_BIGTAB] () split range(20170131)
I have tried to run the abrove script multiple time , But it have taken more then 7 hours just only 1 gb of the data has been moved to above partition and finally we have to roll back the script.
I'm looking for someone to point me in the right direction as to best practice for this sort of thing. Not really sure how to approach this.
What could be the issue?
The current partition function is:
CREATE PARTITION FUNCTION [PartPFN_BIGTAB](int) AS RANGE LEFT FOR VALUES (
20141031
, 20141131
, 20141231
, 20150131
, 20150231
, 20150331
, 20150431
, 20150531
, 20150631
, 20150731
, 20150831
, 20150931
, 20151031
, 20151131
, 20151231
, 20160131
, 20160231
, 20160331
, 20160431
, 20160531
, 20160631
, 20160731
, 20160831
, 20160931
, 20161031
, 20161131
, 20161231
);
3 Answers 3
If your intention is to partition this data by year/month, I think your current partitioning column (which is an integer and not a 'true' date data type and not even in yyyymmdd format even as an integer) is all wrong for the job. I'm also assuming you don't have another column in the table that IS the MonthlyDate in a true Date data type.
In my opinion (and assuming you have extra disk space), you should:
- Create a new partition function that uses a
DATE
data type. It is recommended that this should beRANGE RIGHT
(see Dan Guzman's link at the bottom of this answer for explanations). Make sure you create it with ranges well below your minimum date as well as ranges well above your maximum date. Also, as time goes by, make sure you are splitting empty partitions well ahead of any data already being populated in them. - Create a new partition scheme that references the new partition function.
- Create a new 'replacement' table (that references the newly created partition scheme) by scripting a create of the current table, but add a computed/persisted DATE column (ex:PartitionedColumn DATE) that is derived/converted from your integer MonthlyDate column.
- Set up an insert statement, inserting records from the existing table into the new table, say a few thousand or a hundred thousand in a batch, within a transaction. When the transaction completes perform a log backup (if you're not in SIMPLE recovery). That'll keep you transaction log from running out of space. Then repeat the batch... keep doing that until all records are written to the new table with the persisted computed column.
- Then change the old table name, change the new table name to that of the old table
Keep in mind that all unique indexes that participate in a partitioning scenario, MUST have the partitioning column defined as part of the base index definition and not simply an included column.
Also, to enable partition switching (often used in a "sliding window"), all indexes on the table must be aligned. BOL has good information about this. Look for Special Guidelines for Partitioned Indexes.
Lastly, check out Dan Guzman's excellent post on Sql Server partitioning.
-
yes, this one is a real mess that will require a non-standard approach to remediate.Dan Guzman– Dan Guzman2017年06月28日 15:58:15 +00:00Commented Jun 28, 2017 at 15:58
You should have created 2017 partitions before the year began. Because now your last partition has all the data from 2016 December 01 till now. When splitting every row has to be moved physically to new partition.
Ref: https://learn.microsoft.com/en-us/sql/t-sql/statements/alter-partition-function-transact-sql
Always keep empty partitions at both ends of the partition range to guarantee that the partition split (before loading new data) and partition merge (after unloading old data) do not incur any data movement. Avoid splitting or merging populated partitions. This can be extremely inefficient, as this may cause as much as four times more log generation, and may also cause severe locking.
There are few different approach you can take to minimize the io impact. I find the solution describe in below article is one of the best one. It is also written by Microsoft's premier field engineer team.
Oops... I forgot to leave an empty SQL table partition, how can I split it with minimal IO impact?
Another good reading by Kendra Little which is somehow related to your case.
Adding Partitions to the Lower End of a Left Based Partition Function
Edit: Based on the comment of @ScottHodgin you might have a bigger problem with your partition column being an integer data type and not datetime or date type. You might want to fix this at the same time.
-
1I think the OP has much bigger problems. His partitioning column is an integer in ddmmyyyy format.Scott Hodgin - Retired– Scott Hodgin - Retired2017年06月28日 11:58:16 +00:00Commented Jun 28, 2017 at 11:58
-
@ScottHodgin good catch, seeing this now. He needs to check if things are in right place to start with.SqlWorldWide– SqlWorldWide2017年06月28日 12:01:04 +00:00Commented Jun 28, 2017 at 12:01
Based on the additional information in your comments, it seems your current partition function boundary values are YYYYMM31. Assuming the actual partitioning column values are in YYYYMMDD format, you should be able to use a staging table and SWITCH
along with a temporary partition function and scheme.
Below is an example script using this method. I agree with Scott and SqlWordWide that it's best to plan in advance as to avoid splitting non-empty partitions. Unless you had a legacy schema you couldn't change, the partitioning column should have been a date data type. This ensure only valid dates exist and allows more intuitive partition boundaries and maintenance scripts.
--create temporary partition function and scheme with desired boundaries and filegroup mapping
DECLARE
@StartMonth date = '20141001'
, @EndMonth date = '20181201'
, @Month date
, @SQL nvarchar(MAX);
CREATE PARTITION FUNCTION PartPFN_BIGTAB_Temp(int) AS RANGE LEFT FOR VALUES ();
CREATE PARTITION SCHEME PartPScheme_BIGTAB_Temp
AS PARTITION PartPFN_BIGTAB_Temp ALL TO ([DEFAULT]);
SET @Month = @StartMonth;
WHILE @Month <= @EndMonth
BEGIN
SET @SQL = REPLACE(
N'ALTER PARTITION SCHEME PartPScheme_BIGTAB_Temp NEXT USED PartFileGrp_$(YYYYMM);'
, N'$(YYYYMM)'
, LEFT(CONVERT(char(8), @Month, 112), 6)
);
PRINT @SQL;
EXEC sp_executesql @SQL;
ALTER PARTITION FUNCTION PartPFN_BIGTAB_Temp()
SPLIT RANGE(CAST(LEFT(CONVERT(char(8), @Month, 112), 6) + '31' AS int));
SET @Month = DATEADD(month, 1, @Month);
END;
GO
--create aligned temporary staging table with same schema and indexes on original partition scheme
CREATE TABLE dbo.YourPartitionedTable_Staging(
YYYYMMDDColumn int
, OtherColumn int
, CONSTRAINT PK_YourPartitionedTable_Staging
PRIMARY KEY (
OtherColumn
, YYYYMMDDColumn
) ON PartPScheme_BIGTAB(YYYYMMDDColumn)
) ON PartPScheme_BIGTAB(YYYYMMDDColumn);
--switch last partition ( first value > 20161231 ) into staging table
ALTER TABLE dbo.YourPartitionedTable SWITCH
PARTITION $PARTITION.PartPFN_BIGTAB(20161232)
TO dbo.YourPartitionedTable_Staging
PARTITION $PARTITION.PartPFN_BIGTAB(20161232);
--recreate each index on temporary scheme using CREATE...INDEX...DROP_EXISTING
CREATE UNIQUE CLUSTERED INDEX PK_YourPartitionedTable_Staging ON dbo.YourPartitionedTable_Staging
(
OtherColumn
, YYYYMMDDColumn
)
WITH (DROP_EXISTING=ON)
ON PartPScheme_BIGTAB_Temp(YYYYMMDDColumn);
GO
--create new boundaries in original partition function along with filegroups mappings
DECLARE
@StartMonth date = '20170101'
, @EndMonth date = '20181201'
, @Month date
, @SQL nvarchar(MAX);
SET @Month = @StartMonth;
WHILE @Month <= @EndMonth
BEGIN
SET @SQL = REPLACE(
N'ALTER PARTITION SCHEME PartPScheme_BIGTAB NEXT USED PartFileGrp_$(YYYYMM);'
, N'$(YYYYMM)'
, LEFT(CONVERT(char(8), @Month, 112), 6)
);
PRINT @SQL;
EXEC sp_executesql @SQL;
ALTER PARTITION FUNCTION PartPFN_BIGTAB()
SPLIT RANGE(CAST(LEFT(CONVERT(char(8), @Month, 112), 6) + '31' AS int));
SET @Month = DATEADD(month, 1, @Month);
END;
GO
--SWITCH data in new parttions bach into original table
DECLARE
@StartMonth date = '20171001'
, @EndMonth date = '20181201'
, @Month date;
SET @Month = @StartMonth;
WHILE @Month <= @EndMonth
BEGIN
ALTER TABLE dbo.YourPartitionedTable_Staging SWITCH
PARTITION $PARTITION.PartPFN_BIGTAB(CAST(LEFT(CONVERT(char(8), @Month, 112), 6) + '31' AS int))
TO dbo.YourPartitionedTable
PARTITION $PARTITION.PartPFN_BIGTAB(CAST(LEFT(CONVERT(char(8), @Month, 112), 6) + '31' AS int));
SET @Month = DATEADD(month, 1, @Month);
END;
GO
--generally a good practice to update stats after SWITCH, although may not be reqiired here because data did not change
UPDATE STATISTICS dbo.YourPartitionedTable;
GO
--remove temporary objects
DROP TABLE dbo.YourPartitionedTable_Staging
DROP PARTITION SCHEME PartPScheme_BIGTAB_Temp;
DROP PARTITION FUNCTION PartPFN_BIGTAB_Temp;
GO
Explore related questions
See similar questions with these tags.
28062017
, but your partition function code is given an integer in yyyymmdd format (same date would be20170628
). The second might not be advisable (I leave that up to those who've actually worked with partitions), but the format would represent a huge problem: the range representing January of 2017 would be01012017
to01312017
- which happens to include dates like Jan 15, 2016 (01152016
) and Jan 15, 2020 (01152020
).