Different performance of SQL Server instances on same server

Question 1

I have 4 development SQL Server 2016 instances on a virtual server (128 GB, 12 virtual processors). Each instance is configured the same (16 GB memory, automatic processor and I/O affinity masks). All instance parameters are the same. Each instance has a database called "operations" with a table called "ErrLog" with a clustered PK index; each instance's ErrLog currently has 600-1000 records. Each operations database has one datafile on E: drive, one tx log file on F: drive. Querying the ErrLog table in instances (b), (c), and (d) is sub-second. In instance (a), it consistently takes 2-4 seconds.

I do typical maintenance (Ola Hallengren's index and statistic maintenance once a week), but this is consistently done in each instance.

I have used sp_BlitzFirst and sp_Blitz, which return issues, but consistent issues between the instances. And so far I'm not noticing any issues returned by various checkdb operations.

Can anyone suggest other troubleshooting steps to determine why instance (a) is having different performance than others? (Querying ErrLog is just a proxy for honing in on the differences; other queries are also performing differently.)

Also, I restored a copy of a.Operations into b.Operations_test, and query b.Operations_test.ErrLog also takes 2-4 seconds. I restored a copy of b.Operations into a.Operations_test, and querying a.Operations_test is sub-second.

EDIT: Thanks for your replies...

Here is the structure of the table (same in each of the instances); just the one, clustered primary key index:

CREATE TABLE [dbo].[ErrLog](
 [ErrorID] [int] IDENTITY(1,1) NOT NULL,
 [TimestampCreate] [dbo].[TimestampCreate] NOT NULL,
 [ErrorNumber] [int] NOT NULL,
 [ErrType] [int] NOT NULL,
 [ErrLocation] [varchar](2000) NULL,
 [ErrContext] [text] NULL,
 [ErrSeverity] [int] NOT NULL,
 [MessageLong] [varchar](1024) NOT NULL,
 [ServerName] [dbo].[DescriptionLong] NULL,
 CONSTRAINT [PK_ErrLog] PRIMARY KEY CLUSTERED 
(
 [ErrorID] ASC
)

PasteAPlan results:

NP02 NP03

This might be a clue to the different performances: I find that in NP02, some of the records have much longer values in the ErrContext text field:

NP02: 668 rows; maximum length of "ErrContext" text field is 3,090,039
NP03: 648 rows; maximum length of "ErrContext" text field is 154,013
NP04: 411 rows; max length of "ErrContext" text field is 28,519
NP05: 46 rows; max length of "ErrContext" text field is 55,445

Top 10 Wait Statistics (pulled using Pinal Dave's query):

NP02:

Wait_Type	Wait_Time_Seconds	Waiting_Tasks_Count	Percentage_WaitTime
OLEDB	17951.078	1006510364	34.22289012
BACKUPBUFFER	5775.054	6016680	11.00987018
ASYNC_IO_COMPLETION	5491.216	3688	10.46874632
BACKUPIO	5441.588	5592972	10.37413286
IO_COMPLETION	3451.687	1192286	6.58047973
SOS_SCHEDULER_YIELD	3271.515	6502869	6.236990244
PAGEIOLATCH_SH	3191.304	2482244	6.084071726
BACKUPTHREAD	2224.763	229830	4.241406543
TRACEWRITE	1680.446	864	3.203691656

NP03:

Wait_Type	Wait_Time_Seconds	Waiting_Tasks_Count	Percentage_WaitTime
CXPACKET	60954.838000	24948490	20.506848003464389
BACKUPBUFFER	58801.396000	27540692	19.782372158277558
ASYNC_IO_COMPLETION	48562.138000	1528	16.337610194112273
BACKUPIO	41613.771000	28454440	13.999992531322523
OLEDB	27122.218000	1397826908	9.124644085557670
BACKUPTHREAD	22593.140000	1078602	7.600940353594105
ASYNC_NETWORK_IO	12203.635000	1545004	4.105631254975333
PAGEIOLATCH_SH	8875.883000	5384558	2.986085921145972
SOS_SCHEDULER_YIELD	5326.097000	43211882	1.791842374032848
IO_COMPLETION	3153.986000	1708478	1.061085774800265

Question 2

Please provide the slow query, any index definitions, and the actual execution plan which can be uploaded to Paste The Plan.

Question 3

Also compare the session wait stats. learn.microsoft.com/en-us/sql/relational-databases/…

Question 4

From the execution plans, we can see that the important difference is the time spent doing ASYNC_NETWORK_IO waits. For your slow scenario, you do over 14,000 of them but only 159 in your fast case.

This is saying that your slower execution is having to do a lot more back and forth trips to your database in order to send all the data to your client. The most obvious explanation is the text column. The bigger the size of this lob column, the more read requests the client is going to have to do to get to the end of it. Lobs aren’t as simple as other data types, the contents of them won’t necessarily fit in the same pages as the rest of your columns so a pointer is used instead. Your client receives this pointer and does it’s own lookups. This can lead to multiple lookups per row (as in your case) as further pointers must be followed. You should have a read of this post for further information on how lobs are handled https://aboutsqlserver.com/2013/11/05/sql-server-storage-engine-lob-storage/

Question 5

Agreed: when I exclude the ErrContext field (the text column) from the query, it returns sub-second as in the other instances.

Andrew Sayer Andrew Sayer 2,7771 gold badge7 silver badges9 bronze badges · Accepted Answer · 2023-01-30 07:00:02Z

From the execution plans, we can see that the important difference is the time spent doing ASYNC_NETWORK_IO waits. For your slow scenario, you do over 14,000 of them but only 159 in your fast case.

This is saying that your slower execution is having to do a lot more back and forth trips to your database in order to send all the data to your client. The most obvious explanation is the text column. The bigger the size of this lob column, the more read requests the client is going to have to do to get to the end of it. Lobs aren’t as simple as other data types, the contents of them won’t necessarily fit in the same pages as the rest of your columns so a pointer is used instead. Your client receives this pointer and does it’s own lookups. This can lead to multiple lookups per row (as in your case) as further pointers must be followed. You should have a read of this post for further information on how lobs are handled https://aboutsqlserver.com/2013/11/05/sql-server-storage-engine-lob-storage/

Agreed: when I exclude the ErrContext field (the text column) from the query, it returns sub-second as in the other instances.

Stack Exchange Network

Different performance of SQL Server instances on same server

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Different performance of SQL Server instances on same server

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions