This is not "real life" code. I'm trying to expand upon the well known Sakila
sample database for MySQL to make it more complex. Step 7 (or 6) is running surprisingly slow.
PS: Note these are all separate queries executed against the same database in the order specified.
Add columns to
sakila.customer
table:USE sakila; ALTER TABLE customer ADD COLUMN multiplier DECIMAL(3,2) AFTER active; ALTER TABLE customer ADD COLUMN cust_ranking VARCHAR(10) AFTER multiplier;
Duration: 0.289 sec
Create a proc to randomly distribute multiplier:
DROP PROCEDURE IF EXISTS sp_randCustMult; DELIMITER // CREATE PROCEDURE sp_randCustMult() BEGIN -- declare a counter SET @start = (SELECT MIN(customer_id) FROM customer); SET @stop = (SELECT MAX(customer_id) FROM customer); -- start while loop WHILE @start <= @stop DO -- select a random float variable SET @RAND = RAND(); -- update NULL field UPDATE customer SET multiplier = (SELECT (CASE WHEN @RAND <= 0.65 THEN 1.00 WHEN @RAND <= 0.90 THEN 0.85 WHEN @RAND <= 1.00 THEN 1.05 END)) WHERE customer_id = @start; -- tick counter one up SET @start = @start + 1; END WHILE; END// DELIMITER ;
Duration: 0.001 sec
Call the proc to populate the rows:
CALL sp_randCustMult;
Duration: 0.761 sec
With Safe Update Mode OFF, add human-friendly values to
cust_ranking
based on multiplier.UPDATE customer SET cust_ranking = 'Standard' WHERE multiplier = 1.00; UPDATE customer SET cust_ranking = 'Premium' WHERE multiplier = 0.85; UPDATE customer SET cust_ranking = 'Uplift' WHERE multiplier = 1.05;
Duration:
0.057 sec 0.037 sec 0.035 secAdd a
real_amount
column to tablepayment
:ALTER TABLE payment ADD COLUMN real_amount DECIMAL(5,2) AFTER amount;
Duration: 0.749 sec
Create another proc to populate
payment
with real amounts:DROP PROCEDURE IF EXISTS sp_RealAmtPayment; DELIMITER // CREATE PROCEDURE sp_RealAmtPayment() BEGIN -- declare a counter SET @start = (SELECT MIN(payment_id) FROM payment); SET @stop = (SELECT MAX(payment_id) FROM payment); WHILE @start <= @stop DO UPDATE payment AS p INNER JOIN customer AS c ON p.customer_id = c.customer_id SET real_payment = (p.amount * c.multiplier) WHERE p.payment_id = @start; SET @start = @start + 1; END WHILE; END// DELIMITER ;
Duration: 0.001 sec
CALL sp_RealAmtPayment;
Duration: 17.082 sec
SELECT * FROM payment;
1000 row(s) returned
0.002 sec / 0.002 sec
Step 7 seems extremely long considering the very small number of records. What am I missing? All comments/critiques welcome!
2 Answers 2
I've never played with mysql, so this may be completely wrong, but if I get it right the WHILE
loop would be the equivalent of a T-SQL CURSOR
, which is inherently slow.
You're essentially looping on payment_id
, incrementing at each iteration - this assumes the ID's are contiguous, which isn't a safe assumption to make with data: if records were deleted, you have more iterations than records:
SET @start = (SELECT MIN(payment_id) FROM payment); SET @stop = (SELECT MAX(payment_id) FROM payment); WHILE @start <= @stop DO UPDATE payment AS p INNER JOIN customer AS c ON p.customer_id = c.customer_id SET real_payment = (p.amount * c.multiplier) WHERE p.payment_id = @start; SET @start = @start + 1; END WHILE;
In pseudo-code, this can read as follows:
- For each
payment_id
inpayment
... - ... update the
real_payment
column top.amount*c.multiplier
I don't see why you need a loop to do this, I think this would be equivalent... and faster:
UPDATE payment AS p
INNER JOIN customer AS c
ON p.customer_id = c.customer_id
SET real_payment = (p.amount * c.multiplier)
Similar to Mat's answer, in stage 2, you are calculating a random value for each customer, and processing the customers one-at-a-time.
It would be faster to process them all together, but, the rand() becomes hard to do because it changes value each time you call it, and you need to change the 'obvious' odds of things as you go.
Your procedure (with a test SQLFiddle) can be reduced to:
DROP PROCEDURE IF EXISTS sp_randCustMult;
DELIMITER //
CREATE PROCEDURE sp_randCustMult()
BEGIN
SET @first = 0.65;
SET @second = (0.90 - @first)/(1.0 - @first);
-- update NULL field
UPDATE customer
SET multiplier = (SELECT (CASE
WHEN RAND() < @first then 1.0
WHEN RAND() < @second then 0.85
ELSE 1.05
END));
END//
DELIMITER ;
-
\$\begingroup\$ Cool I'll delete the data tonight and test those 2 scripts and post results. One thing I love about SQL is that the most elegant scripts are the simplest ones. \$\endgroup\$Phrancis– Phrancis2014年05月27日 21:07:53 +00:00Commented May 27, 2014 at 21:07
-
1\$\begingroup\$ Duration: 0.036 sec. Everything seems spiffy and in proportion. \$\endgroup\$Phrancis– Phrancis2014年05月28日 04:32:45 +00:00Commented May 28, 2014 at 4:32