0

I have a somewhat complicated query consisting of many joins on an issue tracker/project management database in MySQL.
My issue is that my query was doing OK while I was in the process of building it up, but after the addition of one final join, the query performance (duration) is no longer tolerable.

The query below ran fine with a duration of ~15 seconds, which is acceptable...

SELECT distinct p.name, (i_A.hours + i_B.hours + i_C.hours),
 (COALESCE(cv_db2_A.value,0) + COALESCE(cv_db2_B.value,0) + COALESCE(cv_db2_C.value,0)),
 (COALESCE(cv_db3_A.value,0) + COALESCE(cv_db3_B.value,0) + COALESCE(cv_db3_C.value,0)),
 i_A.hours, if (cv_db2_A.value='',NULL,cv_db2_A.value), if (cv_db3_A.value='',NULL,cv_db3_A.value),
 i_B.hours, if (cv_db2_B.value='',NULL,cv_db2_B.value), if (cv_db3_B.value='',NULL,cv_db3_B.value),
 i_C.hours, if (cv_db2_C.value='',NULL,cv_db2_C.value), if (cv_db3_C.value='',NULL,cv_db3_C.value),
 i_date.due_date
 FROM projects p, issues i_team_1, issue_statuses ist, 
 issues i_A, issues i_B, issues i_C, 
 custom_values cv_db2_A, custom_values cv_db3_A, 
 custom_values cv_db2_B, custom_values cv_db3_B,
 custom_values cv_db2_C, custom_values cv_db3_C,
 issues i_date
 WHERE i_date.project_id = p.id and
 ist.id = i_team_1.status_id and 
 i_team_1.project_id = p.id and
 i_team_1.tracker_id = 23 and i_team_1.subject like 'Group1' and
 i_A.project_id = p.id and i_B.project_id = p.id and i_C.project_id = p.id and
 i_A.subject = 'A Hours' and i_B.subject = 'B Hours' and i_C.subject = 'C Hours' and 
 cv_db2_A.custom_field_id = 172 and cv_db2_A.customized_id = i_A.id and 
 cv_db3_A.custom_field_id = 173 and cv_db3_A.customized_id = i_A.id and 
 cv_db2_B.custom_field_id = 172 and cv_db2_B.customized_id = i_B.id and 
 cv_db3_B.custom_field_id = 173 and cv_db3_B.customized_id = i_B.id and
 cv_db2_C.custom_field_id = 172 and cv_db2_C.customized_id = i_C.id and 
 cv_db3_C.custom_field_id = 173 and cv_db3_C.customized_id = i_C.id and
 i_date.tracker_id = 4 and i_date.subject like 'Date' and
 i_date.due_date BETWEEN CAST('2014-01-01' AS DATE) AND CAST('2014-12-11' AS DATE) and
 p.name NOT LIKE '# Temp%';

I then proceeded to add one more join into the mix by:
adding in one more "issues i_team_2" and 3 related "WHERE" conditions.

SELECT p.name, (i_A.hours + i_B.hours + i_C.hours),
 (COALESCE(cv_db2_A.value,0) + COALESCE(cv_db2_B.value,0) + COALESCE(cv_db2_C.value,0)),
 (COALESCE(cv_db3_A.value,0) + COALESCE(cv_db3_B.value,0) + COALESCE(cv_db3_C.value,0)),
 i_A.hours, if (cv_db2_A.value='',NULL,cv_db2_A.value), if (cv_db3_A.value='',NULL,cv_db3_A.value),
 i_B.hours, if (cv_db2_B.value='',NULL,cv_db2_B.value), if (cv_db3_B.value='',NULL,cv_db3_B.value),
 i_C.hours, if (cv_db2_C.value='',NULL,cv_db2_C.value), if (cv_db3_C.value='',NULL,cv_db3_C.value),
 i_date.due_date
 FROM projects p, issues i_team_1, issues i_team_2, issue_statuses ist, 
 issues i_A, issues i_B, issues i_C, 
 custom_values cv_db2_A, custom_values cv_db3_A, 
 custom_values cv_db2_B, custom_values cv_db3_B,
 custom_values cv_db2_C, custom_values cv_db3_C,
 issues i_date
 WHERE i_date.project_id = p.id and
 ist.id = i_team_1.status_id and 
 i_team_1.project_id = p.id and i_team_2.project_id = p.id and 
 i_team_1.tracker_id = 23 and i_team_1.subject like 'Group1' and
 i_team_2.tracker_id = 23 and i_team_2.subject like 'Group2' and
 i_A.project_id = p.id and i_B.project_id = p.id and i_C.project_id = p.id and
 i_A.subject = 'A Hours' and i_B.subject = 'B Hours' and i_C.subject = 'C Hours' and 
 cv_db2_A.custom_field_id = 172 and cv_db2_A.customized_id = i_A.id and 
 cv_db3_A.custom_field_id = 173 and cv_db3_A.customized_id = i_A.id and 
 cv_db2_B.custom_field_id = 172 and cv_db2_B.customized_id = i_B.id and 
 cv_db3_B.custom_field_id = 173 and cv_db3_B.customized_id = i_B.id and
 cv_db2_C.custom_field_id = 172 and cv_db2_C.customized_id = i_C.id and 
 cv_db3_C.custom_field_id = 173 and cv_db3_C.customized_id = i_C.id and
 i_date.tracker_id = 4 and i_date.subject like 'Date' and
 i_date.due_date BETWEEN CAST('2014-01-01' AS DATE) AND CAST('2014-12-11' AS DATE) and
 p.name NOT LIKE '# Temp%';

This one additional join causes my query time to increase 10x from ~15 seconds to ~160 seconds.

  • I have tried omitting the "WHERE" conditions for "i_team_2" and it made the query time go down to about ~110 seconds... which is still not acceptable.


The EXPLAIN function shows 14 "SIMPLE" select type actions. Each one is using an index/key and the number of rows checked for each action does not show a number higher than 150 for any of them.

 EXPLAIN: BEFORE
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cv_db3_A ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 163 Using temporary
1 SIMPLE i_A eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_A.customized_id 1 Using where
1 SIMPLE p eq_ref PRIMARY PRIMARY 4 i_A.project_id 1 Using where
1 SIMPLE i_team_1 ref issues_project_id,index_issues_on_status_id,index_issues_on_tracker_id issues_project_id 4 i_A.project_id 9 Using where
1 SIMPLE ist eq_ref PRIMARY PRIMARY 4 i_team_1.status_id 1 Using index
1 SIMPLE cv_db3_B ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 163 
1 SIMPLE i_B eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_B.customized_id 1 Using where
1 SIMPLE cv_db3_C ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 163 
1 SIMPLE i_C eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_C.customized_id 1 Using where
1 SIMPLE i_date ref issues_project_id,index_issues_on_tracker_id issues_project_id 4 i_A.project_id 9 Using where
1 SIMPLE cv_db2_A ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 164 Using where
1 SIMPLE cv_db2_B ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 164 Using where
1 SIMPLE cv_db2_C ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 164 Using where
 EXPLAIN: AFTER
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cv_db3_A ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 144 Using temporary
1 SIMPLE i_A eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_A.customized_id 1 Using where
1 SIMPLE p eq_ref PRIMARY PRIMARY 4 i_A.project_id 1 Using where
1 SIMPLE i_team_1 ref issues_project_id,index_issues_on_status_id,index_issues_on_tracker_id issues_project_id 4 i_A.project_id 10 Using where
1 SIMPLE ist eq_ref PRIMARY PRIMARY 4 i_team_1.status_id 1 Using index
1 SIMPLE cv_db3_B ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 144 
1 SIMPLE i_B eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_B.customized_id 1 Using where
1 SIMPLE cv_db3_C ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 144 
1 SIMPLE i_C eq_ref PRIMARY,issues_project_id PRIMARY 4 cv_db3_C.customized_id 1 Using where
1 SIMPLE i_team_2 ref issues_project_id,index_issues_on_tracker_id issues_project_id 4 i_A.project_id 10 Using where
1 SIMPLE i_date ref issues_project_id,index_issues_on_tracker_id issues_project_id 4 i_A.project_id 10 Using where
1 SIMPLE cv_db2_A ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 145 Using where; Distinct
1 SIMPLE cv_db2_B ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 145 Using where; Distinct
1 SIMPLE cv_db2_C ref index_custom_values_on_custom_field_id index_custom_values_on_custom_field_id 4 const 145 Using where; Distinct

Note: The "BEFORE" EXPLAIN has higher number of rows due to additions to the data and was taken after the "AFTER" EXPLAIN.


The Sense of the Query:

I am trying to grab information (i.e. kinds of hours worked ("db2" or "db3") divided per subject area/field (3 different fields: "A","B","C")) for particular project types (i.e. "Group1" or "Group2" project types) that has a certain kind of date fall within a given date range.

In this particular query, I am grabbing the 6 different information fields for project types that include both "Group1" and "Group2" only (intersect, not union).

Note: I accidentally forgot to include the actual data/columns I am retrieving in the SELECT statements above (I pasted my debugging version). However, testing with or without that portion has not given me any advantage in performance and, in conclusion, is not where the problem resides.


Question

  • Am I doing anything wrong?
  • How can I shorten this query's duration to a more manageable amount?
  • Should I split the query up?
11
  • Do you have an index on custom_values (custom_field_id, customized_id)? What indexes are there on issues and on projects? Adding the SHOW CREATE TABLE outputs of all the tables and the EXPLAIN plan would help. Commented Sep 15, 2014 at 18:34
  • I'd try adding ...and i_team_1.tracker_id = i_team_2.tracker_id... to the WHERE clause. Commented Sep 15, 2014 at 19:05
  • custom_values has custom_values_customized & index_custom_values_on_custom_field_id. Commented Sep 15, 2014 at 19:47
  • @mustaccio add "i_team_1.tracker_id = i_team_2.tracker_id" and leave my other i_team_2.tracker_id where condition?? or replace the i_team_2.tracker_id condition? Commented Sep 15, 2014 at 19:58
  • Do you have the before EXPLAIN plan too? Commented Sep 15, 2014 at 21:38

2 Answers 2

2

You have forgotten the join criteria and ist.id = i_team_2.status_id.

If you had used the modern JOIN syntax this would have stood out like a sore thumb.

answered Sep 17, 2014 at 0:04
2
  • has nothing to do with the performance of the query or my question really so I did not care to correct it (i say this from both assuming and empiracal perspectives, as that table "ist" contains 10 rows). was actually in the process of taking that join OUT when I ran into my actual problem. thanks. Commented Sep 18, 2014 at 23:57
  • as you might or might not be able to guess, I am trying to demonstrate in my question that the ONLY complexity I am adding to the query is the i_team_2 join and how it affects my query, and no other constraints or added complexity. Commented Sep 19, 2014 at 0:04
0

Looking at the EXPLAIN and examining the indices did not help me (maybe I did not try hard (or correctly) enough).

My final solution was to not JOIN together so many instances of large tables (in this case, the "issues" tables). In effect, I split up the query into two queries and now my querying performance is much less than 1 second.

answered Sep 24, 2014 at 22:23

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.