Major Discrepancy Between SQLIO IOPS/Throughput Stats and DBCC CHECKDB Operations

Question 1

I'm conducting time trials on a new SSD array running both SQLIO tests and a real-world workload of DB restores and DBCC CHECKDB calls. I'm seeing a major discrepancy between the IOPS and throughput generated with my SQLIO batches and what I'm observing with the workload, with the workload only requesting a fraction of what I was able to observe with SQLIO, usually in the 5,000 IOPS range and generating no more than 400 MB/s throughput.

Is there an inherent limitation as to how many resources DBCC CHECKDB will consume event if the hardware has more than sufficient capacity to handle the load? What settings can I experiment with to expand DBCC CHECKDBs usage of CPU and disk resources?

Here are the specifics...

From systeminfo

OS Name: Microsoft Windows Server 2012 R2 Standard OS Version: 6.3.9600 N/A Build 9600 System Manufacturer: HP System Model: ProLiant DL580 G7 System Type: x64-based PC Processor(s): 4 Processor(s) Installed. [01]: Intel64 Family 6 Model 46 Stepping 6 GenuineIntel ~1042 Mhz Total Physical Memory: 131,062 MB Network Card(s): 4 NIC(s) Installed. [01]: HP NC375i Integrated Quad Port Multifunction Gigabit Server Adapter

SQL Server Info

Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64) Jun 28 2012 08:36:30 Copyright (c) Microsoft Corporation Enterprise Evaluation Edition (64-bit) on Windows NT 6.2 (Build 9200: )

User DB volume on 3 TB SSD LUN (Tlogs on same volume but only because its a DBCC box)
System DBs (except tempdb) on C: volume using RAID 1 on 15k spindles
TempDB data files on 1 TB SSD LUN (32 files totaling 80 GB)
TempDB log file on 100 GB SSD LUN (one 10 GB file)

Test script using SQLIO where the param file is directed to 40 GB test file on a 3 TB XtremeIO Flash Array LUN

sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS -Fparam.txt sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS -Fparam.txt sqlio -kW -t8 -s120 -o8 -frandom -b8 -BH -LS -Fparam.txt sqlio -kR -t8 -s120 -o8 -frandom -b8 -BH -LS -Fparam.txt

The specs for the XtremeIO array XtremIO - 1 Brick Version: 2.2.3 build 25 Build id: 9585409:HEAD-release-2_2

The results for the SQLIO run C:\SQLIO>sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS -Fparam.txt sqlio v1.5.SG using system counter for latency timings, 2211143 counts per second parameter file used: param.txt file L:\testfile.dat with 8 threads (0-7) using mask 0x0 (0) 8 threads writing for 120 secs to file L:\testfile.dat using 64KB sequential IOs enabling multiple I/Os per thread with 8 outstanding buffering set to use hardware disk cache (but not file cache) using specified size: 40000 MB for file: L:\testfile.dat initialization done CUMULATIVE DATA: throughput metrics: IOs/sec: 23118.54 MBs/sec: 1444.90 latency metrics: Min_Latency(ms): 0 Avg_Latency(ms): 2 Max_Latency(ms): 9 histogram: ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+ %: 5 7 46 41 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


C:\SQLIO>sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS -Fparam.txt
sqlio v1.5.SG
using system counter for latency timings, 2211143 counts per second
parameter file used: param.txt
 file L:\testfile.dat with 8 threads (0-7) using mask 0x0 (0)
8 threads reading for 120 secs from file L:\testfile.dat
 using 64KB sequential IOs
 enabling multiple I/Os per thread with 8 outstanding
 buffering set to use hardware disk cache (but not file cache)
using specified size: 40000 MB for file: L:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 25160.07
MBs/sec: 1572.50
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 2
Max_Latency(ms): 8
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 24 33 12 7 7 9 6 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C:\SQLIO>sqlio -kW -t8 -s120 -o8 -frandom -b8 -BH -LS -Fparam.txt 
sqlio v1.5.SG
using system counter for latency timings, 2211143 counts per second
parameter file used: param.txt
 file L:\testfile.dat with 8 threads (0-7) using mask 0x0 (0)
8 threads writing for 120 secs to file L:\testfile.dat
 using 8KB random IOs
 enabling multiple I/Os per thread with 8 outstanding
 buffering set to use hardware disk cache (but not file cache)
using specified size: 40000 MB for file: L:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 153634.35
MBs/sec: 1200.26
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 1
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C:\SQLIO>sqlio -kR -t8 -s120 -o8 -frandom -b8 -BH -LS -Fparam.txt sqlio v1.5.SG using system counter for latency timings, 2211143 counts per second parameter file used: param.txt file L:\testfile.dat with 8 threads (0-7) using mask 0x0 (0) 8 threads reading for 120 secs from file L:\testfile.dat using 8KB random IOs enabling multiple I/Os per thread with 8 outstanding buffering set to use hardware disk cache (but not file cache) using specified size: 40000 MB for file: L:\testfile.dat initialization done CUMULATIVE DATA: throughput metrics: IOs/sec: 181107.89 MBs/sec: 1414.90 latency metrics: Min_Latency(ms): 0 Avg_Latency(ms): 0 Max_Latency(ms): 5 histogram: ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+ %: 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Question 2

How many DBCC CHECKDB commands can you run simultaneously? I expect DBCC CHECKDB runs in a mostly serial fashion, and as such you will never see the same IOps compared to SQLio.

Question 3

Paul Randal is a DBCC CHECKDB guru, I'd read his blog on DBCC CHECKDB. Here is one blog post, you could read and perhaps revise your tests -- DBCC CHECKDB scalability and performance benchmarking on SSDs

Question 4

Max, I did one system test restoring the same DB under two different names and running simultaneous DBCC CHECKDBs. Runtime was 50% longer (63 GB DB taking 6 minutes on a solo run, 9 minutes in parallel with another).

Question 5

Denis, this post from Paul is a good find. I ran a set of unit tests with MAXDOP settings from 1 through 64 and MAXDOP 4 generated the best results as Paul had discovered as well. I'm running a full portfolio test now. The question that comes up then is am I running on oversized hardware for the task and can I get the same results using a host with less cores?

Question 6

On a piece of code like CHECKDB that has so many bottleneck in the threading model, you can probably run quite a bit faster on a smaller box with a higher clock rate.

Question 7

DBCC CHECKDB isn't a good storage test. It does logical tests too, not just reads from disk - for example, it compares data between multiple indexes on the same table to make sure they all have the same values. These checks consume CPU cycles.

If you want a better pure storage test, consider setting an artificially low buffer pool number and running multiple simultaneous SELECT COUNT(*) queries across multiple large tables with no nonclustered indexes.

Question 8

Thank you, Brent. I'm looking at how I wrote my question and realized I should have indicated I'm testing a host dedicated to restoring full backups and running DBCC CHECKDB exclusively (testing backups for my entire portfolio). With such specific parameters, the hardware profile could be substantially different. I guess I'm wondering how many CPUs and concurrent DBCC CHECKDB statements can I run to make the most use of the SSD array I have without breaking the bank on SQL licensing.

Question 9

Ah, typically I see people doing this in their development, QA, or staging environments as part of a data refresh, so there's no licensing incurred.

Question 10

Yes, there are cases where the wait will be mostly 100% OLEDB and the hardware will seem idle. In my case, I tried to run a DBCC CHECKTABLE on a 26 GB table with a spatial index. It would run and run and run.... I moved it to my workstation (6 core Zeon, 16 GB, with 2 SDDs) with hopes of getting it to finish. It runs faster, but runs and runs.... I tried using SQL 2012, SP2, traces flags, etc. A DBCC on a normal table would finish about 7 times faster than in production and really worked my disks, so I know my hardware helps. The DBCC on the table with spatial index ran over a week before failing. (I didn't limit the memory and starved the OS. I had VMs and other stuff going too.) While it was running, my machine seem almost idle. I could not identify the bottle neck. It was not CPU or disk. I'm thinking to submit a bug report for this.

Perhaps you can use DBCC CHECKTABLE to see if there are a select group of tables that also have this behavior.

MSDN link for detail on my DBCC issue

Question 11

Randy, running DBCC CHECKTABLE(yourSpatialTable) WITH PHYSICAL_ONLY on the spatial table should serve as a workaround? I would raise the connect item though. Let us know how you get on.

Brent Ozar Brent Ozar 43.4k51 gold badges233 silver badges391 bronze badges · Answer 1 · 2014-07-16 07:02:31Z

3

DBCC CHECKDB isn't a good storage test. It does logical tests too, not just reads from disk - for example, it compares data between multiple indexes on the same table to make sure they all have the same values. These checks consume CPU cycles.

If you want a better pure storage test, consider setting an artificially low buffer pool number and running multiple simultaneous SELECT COUNT(*) queries across multiple large tables with no nonclustered indexes.

Share

Improve this answer

answered Jul 16, 2014 at 7:02

Brent Ozar's user avatar

Brent Ozar Brent Ozar

43.4k51 gold badges233 silver badges391 bronze badges

2

Thank you, Brent. I'm looking at how I wrote my question and realized I should have indicated I'm testing a host dedicated to restoring full backups and running DBCC CHECKDB exclusively (testing backups for my entire portfolio). With such specific parameters, the hardware profile could be substantially different. I guess I'm wondering how many CPUs and concurrent DBCC CHECKDB statements can I run to make the most use of the SSD array I have without breaking the bank on SQL licensing.

MattyZDBA
– MattyZDBA

2014年07月21日 19:28:14 +00:00
Commented Jul 21, 2014 at 19:28
Ah, typically I see people doing this in their development, QA, or staging environments as part of a data refresh, so there's no licensing incurred.

Brent Ozar
– Brent Ozar

2014年07月21日 20:41:31 +00:00
Commented Jul 21, 2014 at 20:41

Add a comment |

Randy in Marin Randy in Marin 1312 bronze badges · Answer 2 · 2014-07-01 23:28:48Z

Yes, there are cases where the wait will be mostly 100% OLEDB and the hardware will seem idle. In my case, I tried to run a DBCC CHECKTABLE on a 26 GB table with a spatial index. It would run and run and run.... I moved it to my workstation (6 core Zeon, 16 GB, with 2 SDDs) with hopes of getting it to finish. It runs faster, but runs and runs.... I tried using SQL 2012, SP2, traces flags, etc. A DBCC on a normal table would finish about 7 times faster than in production and really worked my disks, so I know my hardware helps. The DBCC on the table with spatial index ran over a week before failing. (I didn't limit the memory and starved the OS. I had VMs and other stuff going too.) While it was running, my machine seem almost idle. I could not identify the bottle neck. It was not CPU or disk. I'm thinking to submit a bug report for this.

Perhaps you can use DBCC CHECKTABLE to see if there are a select group of tables that also have this behavior.

MSDN link for detail on my DBCC issue

Randy, running DBCC CHECKTABLE(yourSpatialTable) WITH PHYSICAL_ONLY on the spatial table should serve as a workaround? I would raise the connect item though. Let us know how you get on.

Stack Exchange Network

Major Discrepancy Between SQLIO IOPS/Throughput Stats and DBCC CHECKDB Operations

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Major Discrepancy Between SQLIO IOPS/Throughput Stats and DBCC CHECKDB Operations

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions