0

In an attempt to make some reports while linking a MySQL and MSSQL server, the performance is poor under some circumstances. There are two unique test cases, one with a "YR_CDE" of 1617 and another with 1718.

Case 1:

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM NAME_MASTER NM
 INNER JOIN SHU_STUD_TERM_SUM_DIV STS ON NM.ID_NUM = STS.ID_NUM AND STS.YR_CDE = '1617' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE NM.ID_NUM not in (
 SELECT su.exid
 FROM [MYSQL1]...[myshu.shu_user] su
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
)
order by nm.LAST_NAME, nm.FIRST_NAME

Case 2:

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM NAME_MASTER NM
 INNER JOIN SHU_STUD_TERM_SUM_DIV STS ON NM.ID_NUM = STS.ID_NUM AND STS.YR_CDE = '1718' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE NM.ID_NUM not in (
 SELECT su.exid
 FROM [MYSQL1]...[myshu.shu_user] su
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
)
order by nm.LAST_NAME, nm.FIRST_NAME

Case 1 returns results in about 4 seconds while case 2 takes nearly 4 minutes. After digging through execution plans, we noticed one major difference, case 1 has a Table Spools.

Case 1: Case 1 image of MySQL joins

Case 2: Case 2 image of MySQL joins

The other interesting thing here is that the MySQL queries are executing for every single row in the MS SQL side.

Case 1 has 2418 rows from the base MS SQL query. Case 2 has 965 rows from the base MS SQL query. It appears that the MySQL queries are executing once for every single row in the base query.

It seems like case 1 is cache the data in a temp table while case 2 is not. Does anyone have suggestions on how to do this better or how we can force the second case to be cached as well?

--

Edit 1:

Thanks for the suggestions RMathis.

Performance is still an issue unfortunately. This query returned in 3:01. The execution plan looks similar to the original.

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERM_SUM_DIV STS
 INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.id_num
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE STS.YR_CDE = '1718' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0 AND NOT EXISTS (
 SELECT 1
 FROM [MYSQL1]...[myshu.shu_user] su
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
 WHERE su.exid = NM.ID_NUM
)
order by nm.LAST_NAME, nm.FIRST_NAME

The second suggestion took 3:35 to execute. The two parts separately return in less than 5 seconds combined.

 SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERM_SUM_DIV STS
INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.ID_NUM
INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE STS.YR_CDE = '1718' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0 and sts.DIV_CDE = dh.DIV_CDE
EXCEPT
SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERM_SUM_DIV STS
 INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.ID_NUM
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
 INNER JOIN [MYSQL1]...[myshu.shu_user] su ON su.exid = NM.ID_NUM
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
WHERE STS.YR_CDE = '1718' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0 and sts.DIV_CDE = dh.DIV_CDE
order by nm.LAST_NAME, nm.FIRST_NAME;
asked Apr 3, 2017 at 18:32

2 Answers 2

1

Try rewriting the query in different ways to see how the execution plan and times change. Changing the order in which tables are accessed, using set operators, indexing, etc... can make a huge difference.

The two approaches suggested below are using the MINUS set operator and using NOT EXISTS instead of NOT IN. In both cases, the queries are driven by the SHU_STUD_TERM_SUM_DIV table. This will reduce the number of rows joined to NAME_MASTER.

The MINUS set operator works by filtering the results of the first query with he results of the second. This allows the MySQL tables to be joined directly.

The NOT EXISTS will behave like the NOT IN (each row will query the MySQL tables) but does have the potential of comparing a smaller set of values (only those that directly match NM.ID_NUM).

Using MINUS:

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERMS_SUM_DIV STS
 INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.ID_NUM
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE STS.YR_CDE = '1617' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
MINUS
SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERMS_SUM_DIV STS
 INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.ID_NUM
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
 INNER JOIN [MYSQL1]...[myshu.shu_user] su ON su.exid = NM.ID_NUM
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
WHERE STS.YR_CDE = '1617' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
order by nm.LAST_NAME, nm.FIRST_NAME;

Using EXISTS:

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM SHU_STUD_TERMS_SUM_DIV STS
 INNER JOIN NAME_MASTER NM ON NM.ID_NUM = STS.ID_NUM
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE STS.YR_CDE = '1617' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
AND NOT EXISTS (
 SELECT 1
 FROM [MYSQL1]...[myshu.shu_user] su
 INNER JOIN [MYSQL1]...[myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [MYSQL1]...[myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'
 WHERE su.exid = NM.ID_NUM
)
order by nm.LAST_NAME, nm.FIRST_NAME
answered Apr 3, 2017 at 19:41
3
  • I modified my original post after trying your suggestions, thanks for your help! Commented Apr 4, 2017 at 13:33
  • 1
    I am really puzzled: is MINUS possible to be used? I know it's an Oracle keyword. The standard (and SQL Server) equivalent is EXCEPT. MySQL doesn't have it implemented but I guess you might be able to use that if the query runs on SQL Server side. Commented Apr 4, 2017 at 13:39
  • You are correct - EXCEPT is exactly what I meant. I've worked with Oracle for many years - even though I looked up the T-SQL syntax and recognized EXCEPT as the proper syntax, it still came out minus. The value of EXCEPT is removing rows returned in the second query - it lets you join to MySQL tables and should have faster response than NOT IN or NOT EXISTS. Commented Apr 4, 2017 at 16:28
0

Your execution plan suggests there is a Remote Scan for each table within the NOT IN clause, with SQL Server doing the joins locally.

I'd recommend creating a view in MySQL:

CREATE VIEW myview
AS
 SELECT su.exid
 FROM [myshu.shu_user] su
 INNER JOIN [myshu.demographics_core] dc on su.id = dc.user_id
 INNER JOIN [myshu.demographics_term] dt on dt.id = dc.term_id and dt.description = 'Spring 2017'

Then when you do the SELECT, use the NOT IN against the view. This should enforce a single Remote Scan which may speed things up.

SELECT NM.ID_NUM, NM.FIRST_NAME, NM.LAST_NAME, BM.ENTRANCE_CDE, SM.CURRENT_CLASS_CDE, STS.TUITION_CDE, DH.MAJOR_1
FROM NAME_MASTER NM
 INNER JOIN SHU_STUD_TERM_SUM_DIV STS ON NM.ID_NUM = STS.ID_NUM AND STS.YR_CDE = '1617' AND STS.TRM_CDE = 'FA' AND STS.TRANSACTION_STS IN ('C', 'H') AND STS.HRS_ENROLLED > 0
 INNER JOIN BIOGRAPH_MASTER BM ON NM.ID_NUM = BM.ID_NUM
 INNER JOIN STUDENT_MASTER SM ON NM.ID_NUM = SM.ID_NUM
 INNER JOIN DEGREE_HISTORY DH ON NM.ID_NUM = DH.ID_NUM AND DH.CUR_DEGREE = 'Y'
WHERE NM.ID_NUM not in (
 SELECT exid FROM [MYSQL1]...[myview]
)
order by nm.LAST_NAME, nm.FIRST_NAME

You could also pre-cache the results of this view into a temporary table (assuming you're writing a Stored Procedure).

answered Apr 4, 2017 at 13:51

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.