I use the Ola Hallengren DatabaseBackup procedure to run backups. I have two jobs configured so the backups can run parallel as I have one large database that takes 45 minutes to back up and many smaller ones that collectively take 23 minutes - about half the time of the largest database.
Over the past week, there have been two days when the largest database got backed up twice leading to a disk space issue. At first I thought each of the two jobs was backing up the largest DB but it was actually just one job doing the backup twice in a row.
Has anyone seen this before? I can't see anything appears to be the trigger for the second backup. I can always throw more disk space at the problem so it's not a critical failure - I'm just wondering if anyone has any ideas why this might be happening.
Ken
[Edit] Here's the log file from a run with two backups. Normal runs appear the same but without the second backup.
Job 'DatabaseBackup - USER_DATABASES - FULL' : Step 1, 'DatabaseBackup - USER_DATABASES - FULL' : Began Executing 2022年01月02日 01:05:00
Date and time: 2022年01月02日 01:05:00 [SQLSTATE 01000]
Server: SRV-SQLP102 [SQLSTATE 01000]
Version: 13.0.5888.11 [SQLSTATE 01000]
Edition: Enterprise Edition: Core-based Licensing (64-bit) [SQLSTATE 01000]
Platform: Windows [SQLSTATE 01000]
Procedure: [DBA].[dbo].[DatabaseBackup] [SQLSTATE 01000]
Parameters: @Databases = 'USER_DATABASES,-APS_PlanvalTst1', @Directory = 'H:\MSSQLBackups', @BackupType = 'FULL', @Verify = 'Y', @CleanupTime = 120, @CleanupMode = 'AFTER_BACKUP', @Compress = 'Y', @CopyOnly = 'N', @ChangeBackupType = 'N', @BackupSoftware = 'LITESPEED', @CheckSum = 'Y', @BlockSize = NULL, @BufferCount = NULL, @MaxTransferSize = NULL, @NumberOfFiles = NULL, @MinBackupSizeForMultipleFiles = NULL, @MaxFileSize = NULL, @CompressionLevel = NULL, @Description = NULL, @Threads = NULL, @Throttle = NULL, @Encrypt = 'Y', @EncryptionAlgorithm = 'AES_256', @ServerCertificate = NULL, @ServerAsymmetricKey = NULL, @EncryptionKey = '<password>', @ReadWriteFileGroups = 'N', @OverrideBackupPreference = 'N', @NoRecovery = 'N', @URL = NULL, @Credential = NULL, @MirrorDirectory = NULL, @MirrorCleanupTime = NULL, @MirrorCleanupMode = 'AFTER_BACKUP', @MirrorURL = NULL, @AvailabilityGroups = NULL, @Updateability = 'ALL', @AdaptiveCompression = NULL, @ModificationLevel = NULL, @LogSizeSinceLastLogBackup = NULL, @TimeSinceLastLogBackup = NULL, @DataDomainBoostHost = NULL, @DataDomainBoostUser = NULL, @DataDomainBoostDevicePath = NULL, @DataDomainBoostLockboxPath = NULL, @DirectoryStructure = '{ServerName}${InstanceName}{DirectorySeparator}{DatabaseName}{DirectorySeparator}{BackupType}_{Partial}_{CopyOnly}', @AvailabilityGroupDirectoryStructure = '{ClusterName}${AvailabilityGroupName}{DirectorySeparator}{DatabaseName}{DirectorySeparator}{BackupType}_{Partial}_{CopyOnly}', @FileName = '{ServerName}${InstanceName}_{DatabaseName}_{BackupType}_{Partial}_{CopyOnly}_{Year}{Month}{Day}_{Hour}{Minute}{Second}_{FileNumber}.{FileExtension}', @AvailabilityGroupFileName = '{ClusterName}${AvailabilityGroupName}_{DatabaseName}_{BackupType}_{Partial}_{CopyOnly}_{Year}{Month}{Day}_{Hour}{Minute}{Second}_{FileNumber}.{FileExtension}', @FileExtensionFull = NULL, @FileExtensionDiff = NULL, @FileExtensionLog = NULL, @Init = 'N', @Format = 'N', @ObjectLevelRecoveryMap = 'Y', @ExcludeLogShippedFromLogBackup = 'Y', @StringDelimite...
Version: 2020年12月31日 18:58:56 [SQLSTATE 01000]
Source: https://ola.hallengren.com [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 01:05:01 [SQLSTATE 01000]
Database: [APS_Arieldb] [SQLSTATE 01000]
State: ONLINE [SQLSTATE 01000]
Standby: No [SQLSTATE 01000]
Updateability: READ_WRITE [SQLSTATE 01000]
User access: MULTI_USER [SQLSTATE 01000]
Recovery model: FULL [SQLSTATE 01000]
Encrypted: No [SQLSTATE 01000]
Is accessible: Yes [SQLSTATE 01000]
Log shipping role: PRIMARY [SQLSTATE 01000]
Differential base LSN: 171335000288621800250 [SQLSTATE 01000]
Last log backup LSN: 171335000319597800001 [SQLSTATE 01000]
Allocated extent page count: 111568608 (871629.750000 MB) [SQLSTATE 01000]
Modified extent page count: 115016 (898.562500 MB) [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 01:05:01 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_create_subdir N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL' IF @ReturnCode <> 0 RAISERROR('Error creating directory.', 16, 1) [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:00:00 [SQLSTATE 01000]
Date and time: 2022年01月02日 01:05:01 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 01:05:01 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_backup_database @database = N'APS_Arieldb', @filename = N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_010501.bak', @with = 'CHECKSUM', @olrmap = 1, @cryptlevel = 8, @encryptionkey = N'<password>' IF @ReturnCode <> 0 RAISERROR('Error performing LiteSpeed backup.', 16, 1) [SQLSTATE 01000]
SQL> Processed 914 pages for database 'APS_Arieldb', file 'Arielv81_log' on file 1. SQL> BACKUP DATABASE successfully processed 111593234 pages in 2898.516 seconds (300.782 MB/sec). BACKUP DATABASE successfully processed 111593234 pages in 2898.516 seconds (300.782 MB/sec). BACKUP ATTACHED FILES successfully processed 1 file(s) Backup added as file number: 1 Output File(s): H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_010501.bak Native Size: 871893.78 MB Backup Size: 108595.84 MB Attachments: 20.41 MB CPU Seconds: 2558.83 Environment: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz CPUs: 16 logical, 32 cores, 32 physical packages. [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:49:59 [SQLSTATE 01000]
Date and time: 2022年01月02日 01:55:00 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 01:55:00 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_restore_verifyonly @filename = N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_010501.bak', @with = 'CHECKSUM', @encryptionkey = N'<password>' IF @ReturnCode <> 0 RAISERROR('Error verifying LiteSpeed backup.', 16, 1) [SQLSTATE 01000]
LiteSpeed® for SQL Server® Version 8.9.5.213 (64bit) Copyright 2020 Quest Software Inc. ALL RIGHTS RESERVED. CPU Seconds: 1246.91 Environment: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz CPUs: 16 logical, 32 cores, 32 physical packages. [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:07:13 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:13 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:13 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_slssqlmaint N'-MAINTDEL -DELFOLDER "H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL" -DELEXTENSION "bak" -DELUNIT "7201" -DELUNITTYPE "minutes" -DELUSEAGE' IF @ReturnCode <> 0 RAISERROR('Error deleting LiteSpeed backup files.', 16, 1) [SQLSTATE 01000]
LiteSpeed SlsSqlMaint Utility. Logging on to SQL Server 'SRV-SQLP102' with Windows Authentication Logged on to SQL Server 'SRV-SQLP102' Starting maintenance plan 'All ad-hoc plans' on 1/2/2022 2:02:13 AM [1] Delete files... Deleting old file: "H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20211228_010501.bak" [1] Execution Time: 0 hrs, 0 mins, 0 secs. End of maintenance plan 'All ad-hoc plans' on 1/2/2022 2:02:14 AM (Success) D:\MSSQL13.MSSQLSERVER\MSSQL\Binn\slssqlmaint.exe Process Exit Code: 0 (Success) [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:00:01 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:14 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:14 [SQLSTATE 01000]
Database: [APS_Arieldb] [SQLSTATE 01000]
State: ONLINE [SQLSTATE 01000]
Standby: No [SQLSTATE 01000]
Updateability: READ_WRITE [SQLSTATE 01000]
User access: MULTI_USER [SQLSTATE 01000]
Recovery model: FULL [SQLSTATE 01000]
Encrypted: No [SQLSTATE 01000]
Is accessible: Yes [SQLSTATE 01000]
Log shipping role: PRIMARY [SQLSTATE 01000]
Differential base LSN: 171335000321564900080 [SQLSTATE 01000]
Last log backup LSN: 171335000319597800001 [SQLSTATE 01000]
Allocated extent page count: 111569000 (871632.812500 MB) [SQLSTATE 01000]
Modified extent page count: 14840 (115.937500 MB) [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:14 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_create_subdir N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL' IF @ReturnCode <> 0 RAISERROR('Error creating directory.', 16, 1) [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:00:00 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:14 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:02:14 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_backup_database @database = N'APS_Arieldb', @filename = N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_020214.bak', @with = 'CHECKSUM', @olrmap = 1, @cryptlevel = 8, @encryptionkey = N'<password>' IF @ReturnCode <> 0 RAISERROR('Error performing LiteSpeed backup.', 16, 1) [SQLSTATE 01000]
SQL> Processed 1160 pages for database 'APS_Arieldb', file 'Arielv81_log' on file 1. SQL> BACKUP DATABASE successfully processed 111593872 pages in 2624.690 seconds (332.163 MB/sec). BACKUP DATABASE successfully processed 111593872 pages in 2624.690 seconds (332.163 MB/sec). BACKUP ATTACHED FILES successfully processed 1 file(s) Backup added as file number: 1 Output File(s): H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_020214.bak Native Size: 871899.65 MB Backup Size: 108596.85 MB Attachments: 20.41 MB CPU Seconds: 2591.97 Environment: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz CPUs: 16 logical, 32 cores, 32 physical packages. [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:45:26 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:47:40 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:47:40 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_restore_verifyonly @filename = N'H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL\SRV-SQLP102_APS_Arieldb_FULL_20220102_020214.bak', @with = 'CHECKSUM', @encryptionkey = N'<password>' IF @ReturnCode <> 0 RAISERROR('Error verifying LiteSpeed backup.', 16, 1) [SQLSTATE 01000]
LiteSpeed® for SQL Server® Version 8.9.5.213 (64bit) Copyright 2020 Quest Software Inc. ALL RIGHTS RESERVED. CPU Seconds: 1192.36 Environment: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz CPUs: 16 logical, 32 cores, 32 physical packages. [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:05:33 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:53:13 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:53:13 [SQLSTATE 01000]
Database context: [master] [SQLSTATE 01000]
Command: DECLARE @ReturnCode int EXECUTE @ReturnCode = dbo.xp_slssqlmaint N'-MAINTDEL -DELFOLDER "H:\MSSQLBackups\SRV-SQLP102\APS_Arieldb\FULL" -DELEXTENSION "bak" -DELUNIT "7201" -DELUNITTYPE "minutes" -DELUSEAGE' IF @ReturnCode <> 0 RAISERROR('Error deleting LiteSpeed backup files.', 16, 1) [SQLSTATE 01000]
LiteSpeed SlsSqlMaint Utility. Logging on to SQL Server 'SRV-SQLP102' with Windows Authentication Logged on to SQL Server 'SRV-SQLP102' Starting maintenance plan 'All ad-hoc plans' on 1/2/2022 2:53:13 AM [1] Delete files... [1] Execution Time: 0 hrs, 0 mins, 0 secs. End of maintenance plan 'All ad-hoc plans' on 1/2/2022 2:53:13 AM (Success) D:\MSSQL13.MSSQLSERVER\MSSQL\Binn\slssqlmaint.exe Process Exit Code: 0 (Success) [SQLSTATE 01000]
Outcome: Succeeded [SQLSTATE 01000]
Duration: 00:00:01 [SQLSTATE 01000]
Date and time: 2022年01月02日 02:53:14 [SQLSTATE 01000]
[SQLSTATE 01000]
Date and time: 2022年01月02日 02:53:14 [SQLSTATE 01000]
[SQLSTATE 01000]
I have two jobs:
- DatabaseBackup - USER_DATABASES - FULL
- DatabaseBackup - USER_DATABASES - FULL_2
The jobs call Quest LiteSpeed to encrypt the backups. The jobs use the following code:
EXECUTE [dbo].[DatabaseBackup]
@Databases = 'USER_DATABASES',
@Directory = N'H:\MSSQLBackups',
@BackupType = 'FULL',
@BackupSoftware = 'LITESPEED',
@Compress = 'Y',
@Encrypt = 'Y',
@EncryptionAlgorithm = 'AES_256',
@EncryptionKey = '<password>',
@ObjectLevelRecoveryMap = 'Y',
@Verify = 'Y',
@CleanupTime = 120,
@CheckSum = 'Y',
@LogToTable = 'Y',
@DatabasesInParallel = 'Y'
2 Answers 2
Testing
I downloaded the newest version of Ola's Maintenance.sql script and created the objects in my local laptop instance.
I had to manually download the Queue.sql
file and the QueueDatabase.sql
file to create the two tables in my msdb
database, because my own OLA DatabaseBackup - USER_DATABASES - FULL
job failed with the parameter @DatabasesInParallel='Y'
because of the missing objects.
After that I scripted out the above mentioned job and created a new identical job by adding a PARALLEL TEST
to the job's name.
I now had two identical jobs:
OLA DatabaseBackup - USER_DATABASES - FULL
OLA DatabaseBackup - USER_DATABASES - FULL PARALLEL TEST
I then went ahead and started both jobs nearly simultaneously:
I quickly switched to the QueueDatabases
table and queried the content:
Queue Database Table displaying databases being backed up in sequence
So, it looked quite good. The databases were being backed up in parallel.
In the end the databases were backed up and each database only once.
Answering Your Question
I'm just wondering if anyone has any ideas why this might be happening.
There are possibly multiple reasons why this might be happening.
Reason 1
The first reason that springs to my mind, is that the @DatabasesInParallel
has been turned on (Y
) in one job, but not (N
) in the other.
Reason 2
The scripts of your two jobs aren't completely identical for some reason. Or you might have two scripts. One in the msdb
and the other in the master
database (default). This reason is a bit far fetched. I couldn't find a place in Ola's code where different job parameter's would result in multiple backups. But I only performed a quick scan of his code.
Reason 3
Another job that performs the TLOG backups has the parameter ChangeBackupType='Y'
set and the FULL Backup job has the parameter @CleanupMode='BEFORE_BACKUP'
set. This results in the FULL Backup deleting the FULL backup file before it backs up the database. During this time the TLOG backup kicks in and because of the missing FULL backup creates a new FULL backup.
Reason 4
You've found a bug. In that case open up a New Issue on Ola's Github Repository including everything you have posted here. You may even link this question to the Issue.
Reason 5
The UPDATE
statement that updates the QueueDatabase
table is unable to lock the record for update, because it is locked by another process. It then assumes the database hasn't been backed up and commences to do so.
Reason 6
Another reason for similar behaviour could be that you are running Ola's scripts in a dedicated database and that database has been set to READ_COMMITED_SNAPSHOT_ISOLATION
, which results in one process reading old data. Turn OFF READ_COMMITED_SNAPSHOT_ISOLATION
for this dedicated OLA MAINTENANCE database.
Reason 7
Any other non-default setting in the SQL Server's instance settings, or Availability Groups affecting the jobs.
Analysis During Runtime
In order to capture the issue, try querying the QueueDatabase
table during a FULL backup and see if you can observe any abnormal behaviour. As you can see in my example, the backup procedure, in general, seems to work just fine. Each job grabs a database and performs the action and duly notes when it is finished.
Good luck!
-
1
-
I was just querying the QueueDatabase table and found 200 entries going back about 1.5 years. Do you know if there's an online reference that explains the significance of the rows in this table? Should I be doing maintenance to clean out the rows in this table?Ken– Ken2022年01月10日 18:55:02 +00:00Commented Jan 10, 2022 at 18:55
-
I'm not currently at work. But I reckon the table should be cleaned up like OLA's CommanLog table and other things. You might be missing a job from OLA's maintenance script. But I'll have a look tomorrow. If the table is designed to be auto-incremented on the
QueueID
column, then I guess it is safe to manually delete anything older than a specific date or smaller than a specificQueueID
.John K. N.– John K. N.2022年01月10日 19:24:12 +00:00Commented Jan 10, 2022 at 19:24 -
I recreated all the jobs and schedules at work and none of the jobs clean up the
QueueDatabase
table. So you might want to clean up the table yourself.John K. N.– John K. N.2022年01月11日 10:35:21 +00:00Commented Jan 11, 2022 at 10:35 -
1Thanks everyone for the help. I'll continue to monitor the QueueDatabase table although it's now been a week since the last time I had duplicate backups. Without a reoccurrence I'm not sure how far I'll get in my investigation.Ken– Ken2022年01月12日 16:18:01 +00:00Commented Jan 12, 2022 at 16:18
I experienced the same behavior. Guess the problem is in the logic that determines the end of one backup and the start of the next one where "sys.dm_exec_requests" is queried at line 2524. The problem no longer occurs adding a small delay during the database loop at line 2513 (ie WAITFOR DELAY '00:01:00';). I'm still investigating. temporary fix
Explore related questions
See similar questions with these tags.