Basically I have taken over a project and found a table that has the following video and video title fields in it (why they didn't create a separate linked table is beyond me):
[VIDEOURL]
[VIDEOTITLE]
[COUKVIDEO1URL]
[COUKVIDEO2URL]
[COUKVIDEO3URL]
[COUKVIDEO4URL]
[COUKVIDEO1TITLE]
[COUKVIDEO2TITLE]
[COUKVIDEO3TITLE]
[COUKVIDEO4TITLE]
Can this query be improved to get a single list of titles and urls from the above fields:
SELECT
t.VIDEOURL,
t.VIDEOTITLE
FROM
(
SELECT VIDEOURL, VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
UNION
SELECT [COUKVIDEO1URL] AS VIDEOURL, [COUKVIDEO1TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
UNION
SELECT [COUKVIDEO2URL] AS VIDEOURL, [COUKVIDEO2TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
UNION
SELECT [COUKVIDEO3URL] AS VIDEOURL, [COUKVIDEO3TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
UNION
SELECT [COUKVIDEO4URL] AS VIDEOURL, [COUKVIDEO4TITLE] AS VIDEOTITLE FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
) t
WHERE t.VIDEOURL != ''
AND t.VIDEOTITLE != ''
-
3\$\begingroup\$ this schema makes me shiver.. \$\endgroup\$Vogel612– Vogel6122014年02月27日 16:34:18 +00:00Commented Feb 27, 2014 at 16:34
-
\$\begingroup\$ Can you modify or add a new table? I'm not good in SQL, but I really feel that the real problem is not the query but the table. I know that sometime you can't modify existing table or have some limitations of some kinds. \$\endgroup\$Marc-Andre– Marc-Andre2014年02月27日 18:39:56 +00:00Commented Feb 27, 2014 at 18:39
-
1\$\begingroup\$ @Marc-Andre unfortunately not, the data is being replicated from some internal systems and I am unable to touch that or the db server \$\endgroup\$Pete– Pete2014年02月28日 08:46:25 +00:00Commented Feb 28, 2014 at 8:46
1 Answer 1
There is no telling how the database will optimize the query plan for this, but I would expect that the best query plan would involve applying the predicates to the tables before the union is done. Doing a UNION of multiple small result sets would be more efficient than doing a union on large sets and then filtering the results.
The UNION operation requires a lot of scanning because UNION is also a DISTINCT process (you only have one record of each value in the result).
So, two things.
If you are sure that the values in in a column-pair will not also be in any of the other column-pairs, then you can replace the UNION
with the faster UNION ALL
.
If you are sure that a column-pair is unique, or empty, then you can avoid the DISTINCT process completely.
I would write the query as:
SELECT VIDEOURL,
VIDEOTITLE
FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
WHERE [VIDEOURL] != ''
AND [VIDEOTITLE] != ''
UNION ALL
SELECT [COUKVIDEO1URL] AS VIDEOURL,
[COUKVIDEO1TITLE] AS VIDEOTITLE
FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
WHERE [COUKVIDEO1URL] != ''
AND [COUKVIDEO1TITLE] != ''
UNION ALL
SELECT [COUKVIDEO2URL] AS VIDEOURL,
[COUKVIDEO2TITLE] AS VIDEOTITLE
FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
WHERE [COUKVIDEO2URL] != ''
AND [COUKVIDEO2TITLE] != ''
UNION ALL
SELECT [COUKVIDEO3URL] AS VIDEOURL,
[COUKVIDEO3TITLE] AS VIDEOTITLE
FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
WHERE [COUKVIDEO3URL] != ''
AND [COUKVIDEO3TITLE] != ''
UNION ALL
SELECT [COUKVIDEO4URL] AS VIDEOURL,
[COUKVIDEO4TITLE] AS VIDEOTITLE
FROM [VincentV5].[dbo].[MARCMSITEMPRESENTATION]
WHERE [COUKVIDEO4URL] != ''
AND [COUKVIDEO4TITLE] != ''
If values in one column-pair can appear in other column pairs, then change the UNION
to UNION ALL
.
If the values can be duplicated in a column-pair, then add DISTINCT
to each SELECT
.
Do both DISTINCT
and UNION ALL
if you need to.
The idea is to make the data as small as possible before you start doing the big merging processes.