I haven't done too many nested statements in MySQL and I was hoping to have the below SQL looked at and let me know if there is a better/more efficient way of doing what I am trying to accomplish.
Through all of my tests it appears that everything is working properly, I just want to make sure I am meeting best practices, etc.
SELECT DISTINCT
-- Fab A with EMC
-- Conditions: Anything with an EMC
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '1'
AND `auth_status` = 'ACTIVE') as `a_count_e`,
-- Fab A Without EMC
-- Conditions: NO EMC, Contract price greater than 10,000, NOT "Time and Material", NOT Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0'
AND replace(replace(`contract_price`, ',', ''), '$', '') >= 10000
AND `auth_status` = 'ACTIVE'
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER') as `a_count`,
-- Fab B
-- Conditions: NO EMC, Contract price between 1500 and 9999, NOT "Time and Material", NOT Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0'
AND replace(replace(`contract_price`, ',', ''), '$', '') >= 1500
AND replace(replace(`contract_price`, ',', ''), '$', '') <= 9999
AND `auth_status` = 'ACTIVE'
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER') as `b_count`,
-- Small
-- Conditions: NO EMC, Contract price between 600 and 1499, NOT "Time and Material", NOT Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0' AND `auth_status` = 'ACTIVE'
AND `wo_type` <> 'SERVICE ORDER'
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND replace(replace(`contract_price`, ',', ''), '$', '') >= 600
AND replace(replace(`contract_price`, ',', ''), '$', '') <= 1499) as `sm`,
-- XX
-- Conditions: NO EMC, Contract price less than 600, NOT "Time and Material", NOT Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0' AND `auth_status` = 'ACTIVE'
AND `wo_type` <> 'SERVICE ORDER'
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND replace(replace(`contract_price`, ',', ''), '$', '') <= 599) as `xx`,
-- TM
-- Conditions: NO EMC, NOT Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0'
AND `auth_status` = 'ACTIVE'
AND `contract_install` = '0'
AND `wo_type` <> 'SERVICE ORDER'
AND `terms` = 'TIME AND MATERIAL') as `tm`,
-- Contract Install
-- Conditions: NO EMC, IS Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0'
AND `auth_status` = 'ACTIVE'
AND `contract_install` = '1') as `ci`,
-- Service
-- All NO EMC, IS Service Order, NOT Contract Install
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`emc` = '0'
AND `wo_type` = 'SERVICE ORDER'
AND `auth_status` = 'ACTIVE'
AND `contract_install` = '0') as `service`,
-- Total count
-- All Active orders
(SELECT
count(`wo_num`)
FROM
`work_orders`
WHERE
`auth_status` = 'ACTIVE') as `total`
FROM
`work_orders`
WHERE
`auth_status` = 'ACTIVE'
Results:
Side Note:
Forgive me for the
`replace(replace(`contract_price`, ',', ''), '$', '')`
When the application was originally written the contract prices were stored as dollar amounts with $
and ,
, so I have to remove them for numerical comparisons.
-
\$\begingroup\$ Do you need 1 row that is returned or multiple rows may also? \$\endgroup\$chillworld– chillworld2014年05月13日 05:09:37 +00:00Commented May 13, 2014 at 5:09
2 Answers 2
The first thing to comment on, requires bold (sorry for shouting....) store numbers as numbers, not strings!!!
@Malachi is right that replace(replace(contract_price, ',', ''), '$', '')
is inefficient, and should not be repeated, but it would be even better for it to not exist at all.... your contract_price
column should be an integer, or decimal value if you have cents.
If you do have cents, then you should know that you may be missing values. The statements like:
AND replace(replace(`contract_price`, ',', ''), '$', '') >= 600
Will convert both sides of the comparison to floating-point values, and, there may be values with contract_price
of 599ドル.50
... right, and that will not be >= 600
. We expect 599ドル.50
to fall in to the bucket below that, but that bucket has the condition:
AND replace(replace(`contract_price`, ',', ''), '$', '') <= 599
and 599ドル.50
is not <= 599
....
To get the comparisons right, you need to compare to < 600
. THat way, you won't have gaps in your buckets.
Then, the select statement you are running is very inefficient. You can reduce it to a single query, with conditional values in the select. Consider the following:
SELECT
-- Fab A with EMC
-- Conditions: Anything with an EMC
SUM(case when `emc` = 1 then 1 else 0 end) as `a_count_e`,
-- Fab A Without EMC
-- Conditions: NO EMC, Contract price greater than 10,000, NOT "Time and Material", NOT Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `contract_price` >= 10000
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER'
then 1 else 0 end) as `a_count`,
-- Fab B
-- Conditions: NO EMC, Contract price between 1500 and 9999, NOT "Time and Material", NOT Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `contract_price` >= 1500
AND `contract_price` < 10000
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER'
then 1 else 0 end) as `b_count`,
-- Small
-- Conditions: NO EMC, Contract price between 600 and 1499, NOT "Time and Material", NOT Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `contract_price` >= 600
AND `contract_price` < 1500
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER'
then 1 else 0 end) as `sm`,
-- XX
-- Conditions: NO EMC, Contract price less than 600, NOT "Time and Material", NOT Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `contract_price` < 600
AND `contract_install` = '0'
AND `terms` <> 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER'
then 1 else 0 end) as `xx`,
-- TM
-- Conditions: NO EMC, NOT Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `contract_install` = '0'
AND `terms` = 'TIME AND MATERIAL'
AND `wo_type` <> 'SERVICE ORDER'
then 1 else 0 end) as `tm`,
-- Contract Install
-- Conditions: NO EMC, IS Contract Install
SUM(case when `emc` = '0'
AND `contract_install` = '1'
then 1 else 0 end) as `ci`,
-- Service
-- All NO EMC, IS Service Order, NOT Contract Install
SUM(case when `emc` = '0'
AND `wo_type` = 'SERVICE ORDER'
AND `contract_install` = '0'
then 1 else 0 end) as `service`,
-- Total count
-- All Active orders
COUNT(*) as `total`
FROM (select Cast(replace(replace(`contract_price`, ',', ''), '$', ''), Decimal(10,2)) as `contract_price`,
`emc`,
`contract_install`,
`wo_type`,
`terms`
from `work_orders`
WHERE `auth_status` = 'ACTIVE')
Since changing trhe data type in the column is not feasible, the above query has been edited to do the conversion in just one pace...
Here's the from/where clause if the table had numeric contract_price
....
(Replace the original from and where clause )
FROM
`work_orders`
WHERE
`auth_status` = 'ACTIVE'
-
\$\begingroup\$ I am VERY aware of never storing numerical values as Strings, inheriting projects is not always friendly. I would love to run a quick little script to turn every entry's contract price into a numerical value but I am not being given the time to work on this project in order to make sure the front end shows it properly and remove and server side concoction the other dev may or may not have used \$\endgroup\$jnthnjns– jnthnjns2014年05月13日 12:11:04 +00:00Commented May 13, 2014 at 12:11
-
\$\begingroup\$ @Asok, What data type is
contract_price
? looks like you can change the column type without losing data. you just have to make sure that you don't have any null values hiding in there \$\endgroup\$Malachi– Malachi2014年05月13日 13:42:44 +00:00Commented May 13, 2014 at 13:42 -
\$\begingroup\$ Need to remove all instances of
CASE
afterEND
, otherwise it throws a1064
-..MySQL server version for the right syntax to use near 'case) as a_count_e,..
. I ran your version (with contract price as a string still) vs my/Malachi's nested SQL on the full data pool (not restricting to active orders) in MySQL workbench and it resulted in 0.109 vs 0.110 duration, maybe my data set isn't large enough to see a difference or I am not properly bench-marking? 9000 records. \$\endgroup\$jnthnjns– jnthnjns2014年05月13日 13:44:51 +00:00Commented May 13, 2014 at 13:44 -
\$\begingroup\$ @malachi Unfortunately
contract_price
is stored as avarchar(100)
with dollar signs and commas, I can't change it at this point. \$\endgroup\$jnthnjns– jnthnjns2014年05月13日 13:47:20 +00:00Commented May 13, 2014 at 13:47 -
1\$\begingroup\$ @malachi Changing the table would take minutes for me. The problem is that the previous programmer hated OOP and has
contract_price
values manipulated, displayed and everything else you can think of in a ton of different places. Luckily I did convince my boss to let me re-write a good chunk of it but I have to finish this reporting project and a couple others first. I will get around to it, trust me. I am OCD about consistency and best practices, which is partly why I posted this question. \$\endgroup\$jnthnjns– jnthnjns2014年05月13日 14:04:22 +00:00Commented May 13, 2014 at 14:04
Get rid of your Commented Code it's messy
This bit of Code might be shortened
AND replace(replace(`contract_price`, ',', ''), '$', '') >= 1500
AND replace(replace(`contract_price`, ',', ''), '$', '') <= 9999
to something like this, (削除) (if it works the way I think it should) (削除ここまで) it does work like I think it should.
AND REPLACE(REPLACE(`contract_price, ',',''),'$','') BETWEEN 1499 AND 10000 --(or 1500 AND 9999)
I know this works with Dates in SQL Server but I haven't really tried it in MySQL as I don't play with it much.
this would help performance in that you would only have to run the REPLACE
function twice here instead of 4 times
EDIT:
my logic is a little flawed. the end points are semi ambiguous,
the statement should be something like the following:
AND REPLACE(REPLACE(`contract_price, ',',''),'$','') BETWEEN 1500 AND 9999
and if you aren't holding Cents for this specific operation then:
AND REPLACE(REPLACE(`contract_price, ',',''),'$','') BETWEEN 1501 AND 9999
And the previous one would look like this
AND REPLACE(REPLACE(`contract_price, ',',''),'$','') BETWEEN 600 AND 1499.99
OR with out the Cents:
AND REPLACE(REPLACE(`contract_price, ',',''),'$','') BETWEEN 600 AND 1499
Where do you define column auth_status
, you use it in the WHERE
clause of the outside SELECT
statement but it doesn't exist as a column in the outer table that is built?
-
1\$\begingroup\$ Exactly what I was looking for. Thank you. Comments are not in my actual code, they are there to break apart the SQL in the OP to (what I had hoped) read easier. \$\endgroup\$jnthnjns– jnthnjns2014年05月12日 17:51:12 +00:00Commented May 12, 2014 at 17:51
-
1\$\begingroup\$ Regarding
auth_status
, that is a good question, that is a column in thework_orders
table, this entire SQL statement is pulling from the single table. I don't know if I need that since it is in the inner sql statements? \$\endgroup\$jnthnjns– jnthnjns2014年05月12日 17:54:14 +00:00Commented May 12, 2014 at 17:54 -
1\$\begingroup\$ doing a
WHERE
when you don't need to can occasionally have adverse side-effects to the performance of the query. \$\endgroup\$Malachi– Malachi2014年05月12日 17:57:38 +00:00Commented May 12, 2014 at 17:57 -
1\$\begingroup\$ This is great information, thank you. I removed the external
WHERE
and it works great. I am going to try your suggestions on thecontract_price
comparison. Thanks again. Marking as accepted. Any other suggestions is greatly appreciated. \$\endgroup\$jnthnjns– jnthnjns2014年05月12日 18:00:03 +00:00Commented May 12, 2014 at 18:00