I'd like to know if there's any room to improve the following query which is to find distinct values from multiple tables when someone searches for a model name. Two of the tables are storing phones' series attributes and model attributes. The model attr
table is there to deal with special cases, like a series generally use black cases, but it also has released a special edition that uses golden cases.
Here's the query that I have come up. Basically I'm using UNION
to unpivot table cpu
and find distinct values across cpu
,model_attr
and series_attr
Is there any need for improvement?
Here's a fiddle example.
Query:
SELECT attr_name,attr_val
FROM
(SELECT 'CPU_country' attr_name, c.country attr_val
FROM cpu c INNER JOIN
series s ON s.cpu_id = c.cpu_id
INNER JOIN model m ON m.series_id = s.series_id
WHERE m.model_name RLIKE 'C79'
UNION
SELECT 'CPU_hours' attr_name, c.hours attr_val
FROM cpu c INNER JOIN
series s ON s.cpu_id = c.cpu_id
INNER JOIN model m ON m.series_id = s.series_id
WHERE m.model_name RLIKE 'C79'
UNION
SELECT a.attr_name attr_name,a.attr_value attr_val
FROM model m INNER JOIN
series s
on m.series_id = s.series_id LEFT JOIN
series_attr sa
ON sa.series_id = s.series_id LEFT JOIN
attr saa
on saa.attr_id = sa.attr_id LEFT JOIN
model_attr ma
ON ma.model_id = m.model_id LEFT JOIN
attr a
ON a.attr_id = ma.attr_id
WHERE m.model_name RLIKE 'C79' AND a.attr_value IS NOT NULL
UNION
SELECT saa.attr_name attr_name,saa.attr_value attr_val
FROM model m INNER JOIN
series s
on m.series_id = s.series_id LEFT JOIN
series_attr sa
ON sa.series_id = s.series_id LEFT JOIN
attr saa
on saa.attr_id = sa.attr_id LEFT JOIN
model_attr ma
ON ma.model_id = m.model_id LEFT JOIN
attr a
ON a.attr_id = ma.attr_id
WHERE m.model_name RLIKE 'C79' AND saa.attr_value IS NOT NULL
)k
GROUP BY attr_name,attr_val
ORDER BY attr_name
Table Schema (InnoDB):
CREATE TABLE series
(`series_id` int, `series_name` varchar(20),`cpu_id` INT ,`ram`int)
;
INSERT INTO series
(`series_id`,`series_name`,`cpu_id`,`ram`)
VALUES
(1,'Nokia Series',1,512),
(2,'Sony Series',2,1024)
;
CREATE TABLE model
(`model_id` int, `model_name` varchar(20),`series_id` int)
;
INSERT INTO model
(`model_id`,`model_name`,`series_id`)
VALUES
(1,'A6578',1),
(2, 'B2345',1),
(3, 'C7906',2),
(4, 'D3544',2)
;
CREATE TABLE attr
(`attr_id` int, `attr_name` varchar(20),`attr_value` varchar(20))
;
INSERT INTO attr
(`attr_id`,`attr_name`,`attr_value`)
VALUES
(1, 'material','Gold'),
(2, 'material','Plastic'),
(3, 'color','Grey'),
(4, 'color','Black'),
(5, 'color','Green'),
(6, 'color','White')
;
CREATE TABLE series_attr
(`series_id` int, `attr_id` int )
;
INSERT INTO series_attr
(`series_id`,`attr_id`)
VALUES
(1,2),
(1,5),
(2,2),
(2,3)
;
CREATE TABLE model_attr
(`model_id` int, `attr_id` int)
;
INSERT INTO model_attr
(`model_id`,`attr_id`)
VALUES
(2,1),
(2,4),
(4,6)
;
CREATE TABLE cpu
(`cpu_id` int,`cpu_name` varchar(20), `country` varchar(20),`hours`int,`frequency` int)
;
INSERT INTO cpu
(`cpu_id`,`cpu_name`,`country`,`hours`,`frequency`)
VALUES
(1,'CPU A','China',40,24000),
(2,'CPU B','US',80,30000),
(3,'CPU C','Japan',100,35000)
;
1 Answer 1
This is a pet peeve of mine, and others have disagreed with me about it, so please take it with a grain of salt. I really just can't stand single letter aliases. I mean, is cpu
really so long that you can't just use the table name? Do you really need to shorten it to c
?
It's lazy and it's the worst kind of lazy. It's the kind of lazy that creates more work. Now, instead of plainly seeing series_attr.attr_name
, I have to keep in my mind that saa
means "series attribute attribute", and ma
means "just plain model attribute" and a
is a "model attribute attribute" and... Do you see where this is going? Imagine you're brand new to this code base. Would you be able to easily map the alias names to useful meanings? No. You wouldn't, but you shouldn't have to because alias names should be meaningful.
That said, I do like that you've consistently formatted your code. I especially like that you put some whitespace around your UNION
statements. However, I really prefer to see subqueries indented one level. It just makes it easier to follow the logic.
You've also hardcoded C79
into your query. What if I want to find info on D82
? I have to change the query in at least 4 places. You should be using a variable for this. Better yet, wrap the query in a stored procedure and use a parameter.