3

I've been trying to figure out this linked server problem for about a week now with no luck. The set up I have is a linked server from an instance of SQL Server 2016 to an instance of SQL Server 2016 Express (exact versions below as retrieved with @@Version). This install is roughly 2 years old.

The standard install is a Data Warehouse that pulls in an Order table from a production line scheduling application every 10 minutes and merges the data into the data warehouse. It does this with a stored procedure and a dynamic SQL OPENQUERY.

This merge operation usually takes between 1 - 5 seconds to complete depending on the network at the time, however randomly, it just hangs indefinitely placing a lock on the Order table in the scheduling application, causing other applications downstream to choke because of the lock (running sp_who2 against the scheduling application confirms the remote OPENQUERY is locking this table).

I've run the SQL Server Profiler against both DBs and have indeed discovered that the OPENQUERY is what gets hung up. The profiler shows the open query is being received by the remote server, but it's almost like it just decides not to do anything about it.

More details:

  • This ETL job runs every 10 minutes and is scheduled with SQL Server Job agent

  • This exact same procedure is collecting data from 5 other locations perfectly fine. This of course makes me suspect the problem is with the remote server.

  • If I call the job more frequently (eg every 5 minutes) the lock will happen more often

  • I've set the query timeout in the driver to 0 to prove that it will hang indefinitely (it does)

  • When I set the timeout to something like 60 seconds in the driver, I get the following error randomly:

    Cannot fetch a row from OLE DB provider "SQLNCLI11" for linked server "MY_LINKED_SERVER".

    which seems more like a simple permissions problem, but it's not.

  • I've checked that the remote log in has the proper db_datareader role, and it does.

  • I've confirmed that another process is not blocking the remote table

Current linked server driver settings Allow In Process = TRUE everything else = FALSE

Versions:

Microsoft SQL Server 2016 (SP1) (KB3182545) - 13.0.4001.0 (X64) Oct 28 2016 18:17:30 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows Server 2016 Standard 6.3 <X64> (Build 14393: ) (Hypervisor)
Microsoft SQL Server 2016 (SP1) (KB3182545) - 13.0.4001.0 (X64) Oct 28 2016 18:17:30 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows Server 2016 Standard 6.3 <X64> (Build 14393: ) (Hypervisor)

Here is also a snip of the 'openquery'. I've removed a bunch of other code to keep it smaller. There is really nothing crazy going on here, but I know people will probably ask to see it.

Before people start talking about SQL injection and the dynamic SQL I'm composing not being safe, this code is not exposed to any data entry points, or the Internet at all.

 -- Bring in the new orders
 IF OBJECT_ID('tempdb..#tmpNewOrders') IS NOT NULL
 DROP TABLE #tmpNewOrders
 CREATE TABLE #tmpNewOrders
 (
 
 [OrderID] INT IDENTITY NOT NULL PRIMARY KEY, 
 [System] CHAR(1) NULL,
 [OrderNumber] INT NULL,
 [FacilityID] INT NULL,
 [ItemID] INT NULL, 
 [LineID] INT NULL,
 [CustomerID] INT NULL, 
 [ShortItemNumber] INT NULL, 
 [JDEWorkOrderID] INT NULL, 
 [ScheduledLotID] INT NULL, 
 [ScheduledLotOrder] INT NULL, 
 [ScheduledLineID] INT NULL, 
 [ScheduledDate] DATETIMEOFFSET (7) NULL, 
 [ScheduledRunOrder] SMALLINT NULL, 
 [OrderedQuantity] INT NULL, 
 [PackedQuantity] INT NULL, 
 [OrderRequestedShipDate] DATETIMEOFFSET(7) NULL,
 [SalesOrderNumber] VARCHAR(50),
 [EstimatedDuration] INT NULL, 
 [Memo] VARCHAR(MAX) NULL, 
 [JDEWorkOrderStatus] INT NULL, 
 [CompletedDateTime] DATETIMEOFFSET(7) NULL, 
 [IsCompleted] BIT NULL, 
 [IsRemoved] BIT NULL, 
 [TargetCasesperHour] INT NULL, 
 [TargetPoundsperHour] INT NULL, 
 [IsCreatedByScheduler] BIT NULL,
 [COOLCode] VARCHAR(30),
 [RecipeID] INT NULL,
 [LineRecipeID] INT NULL,
 [LongItemNumber] VARCHAR(30) NULL,
 [SalesOrderStatusCode] INT NULL
 )
-- set up what we want the insert to temp table to be
 SET @Insert =
 '
 INSERT INTO #tmpNewOrders(
 [OrderNumber] 
 ,[ScheduledLotID] 
 ,[ScheduledLotOrder] 
 ,[ScheduledLineID] 
 ,[ScheduledDate] 
 ,[ScheduledRunOrder] 
 ,[OrderedQuantity] 
 ,[PackedQuantity] 
 ,[EstimatedDuration] 
 ,[CustomerID] 
 ,[JDEWorkOrderID] 
 ,[ShortItemNumber] 
 ,[OrderRequestedShipDate] 
 ,[SalesOrderNumber]
 ,[Memo] 
 ,[JDEWorkOrderStatus] 
 ,[IsCompleted] 
 ,[IsRemoved] 
 ,[IsCreatedByScheduler] 
 ,[CompletedDateTime]
 ,[COOLCode]
 ,[RecipeID]
 ,[LineRecipeID]
 ,[TargetCasesperHour]
 ,[TargetPoundsperHour]
 ,[LongItemNumber]
 ,[SalesOrderStatusCode]
 )
 '
 -- set up the open query for doing replacement substitution
 SET @OpenQuery = '@Insert 
 SELECT 
 [OrderID] 
 ,[ScheduledLotID] 
 ,[ScheduledLotOrder] 
 ,[ScheduledLineID] 
 ,[ScheduledDate] 
 ,[ScheduledRunOrder] 
 ,[OrderedQuantity] 
 ,[PackedQuantity] 
 ,[EstimatedDuration] 
 ,[CustomerID]
 ,[JDEWorkOrderNumber]
 ,[ShortItemNumber] 
 ,[OrderRequestedShipDate]
 ,[SalesOrderNumber]
 ,[Memo] 
 ,[JDEWorkOrderStatusCode] 
 ,[IsCompleted] 
 ,[IsRemoved] 
 ,[IsCreatedByScheduler] 
 ,[CompletedDatetime] 
 ,[COOLCode] 
 ,[RecipeID]
 ,[LineRecipeID]
 ,[TargetCasesperHour]
 ,[TargetPoundsperHour]
 ,[LongItemNumber]
 ,[SalesOrderStatusCode]
 FROM OPENQUERY([@Server], @Query )'
 -- set the query that we want to pass to the OPENQUERY Statement
 -- very critical that quoted identifier is set to OFF
 SET @Query = "'SELECT o.[OrderID] 
 ,o.[ScheduledLotID] 
 ,o.[ScheduledLotOrder] 
 ,o.[ScheduledLineID] 
 ,o.[ScheduledDate] 
 ,o.[ScheduledRunOrder] 
 ,o.[OrderedQuantity] 
 ,o.[PackedQuantity] 
 ,o.[EstimatedDuration] 
 ,jo.[ShipToCustomerNumber] CustomerID
 ,o.[JDEWorkOrderNumber]
 ,jo.[ShortItemNumber] 
 ,jo.[OrderRequestedShipDate]
 ,jo.[SalesOrderNumber]
 ,o.[Memo] 
 ,o.[JDEWorkOrderStatusCode] 
 ,o.[CompletedDatetime] 
 ,o.[IsCompleted] 
 ,o.[IsRemoved] 
 ,o.[IsCreatedByScheduler] 
 ,o.[Timestamp] 
 ,o.[CreatedSource] 
 ,o.[CreatedDTS] 
 ,o.[CreatedBy] 
 ,o.[UpdateSource] 
 ,o.[UpdateDTS] 
 ,o.[UpdateBy] 
 ,o.[COOLCode]
 ,jo.[LongItemNumber]
 ,jo.[SalesOrderStatusCode]
 ,@DBName.dbo.GetRecipeID(jo.[LongItemNumber], jo.[ShipToCustomerNumber]) AS RecipeID
 ,lr.[LineRecipeID]
 ,lr.[TargetCasesperHour]
 ,lr.[TargetPoundsperHour]
 FROM @DBName.dbo.[Order] o
 INNER JOIN @DBName.dbo.[jdeorder] jo ON o.[JDEWorkOrderNumber] = jo.[WorkOrderNumber]
 LEFT JOIN @DBName.dbo.[LineRecipe] lr ON lr.RecipeID = @DBName.dbo.GetRecipeID(jo.[LongItemNumber], jo.[ShipToCustomerNumber]) AND lr.LineID = ScheduledLineID
 WHERE jo.LongItemNumber IS NOT NULL
 AND o.[UpdateDTS] >= ''@StartTime'''"
 -- set the DB name for our Query
 SET @Query = REPLACE(@Query, '@DBName', @DbName)
 -- Sub out our start time
 SET @Query = REPLACE(@Query, '@StartTime', @StartTime)
 -- Sub out our server name
 SET @OpenQuery = REPLACE(@OpenQuery, '@Server', @Server)
 -- sub out for our query
 SET @OpenQuery = REPLACE(@OpenQUery, '@Query', @Query)
 -- sub out our insert
 SET @OpenQuery = REPLACE(@OpenQuery, '@Insert', @Insert)
 -- now execute our query
 EXEC sp_executesql @OpenQuery

I've installed sp_whoisactive and it's revealed some interesting things. After my remote query is finished, (it takes about 1 - 5 seconds) it seems the spid status is set to DORMANT, which according to the SQL Server docs means SQL Server is resetting the session.

This seems to take in the range of 4 - 5 minutes before this spid is cleaned up by SQL Server. However the next time I call the ETL job, it seems to try and use the same spid, which is why I think the hanging is random.

Paul White
95.3k30 gold badges439 silver badges689 bronze badges
asked Sep 18, 2018 at 16:05
1
  • Is DBName.dbo.GetRecipeID() doing anything that may be blocked ot take time to finish? Commented Apr 7, 2024 at 1:55

1 Answer 1

0

I've experienced a similar problem and found a solution how to un-hang the oledb driver. This is what worked for me:

  • You need to get the Process Explorer from the Sysinternals Suite.
  • Open it and find the dllhost.exe.
  • Hover with the mouse over each one and select the one where the popup says OLE DB Core Services
  • Kill it - it'll turn red but you can re-run the query - it should work now

enter image description here

answered Nov 5, 2018 at 10:51

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.