1

So I essentially have two tables in a database. One which is the base table of all of our media data, and another that has potential override data for spend. Not everything in the display table will need to get overridden, but some rows will. Also, the only grouping is done at the date AND pid level, and does not include creative. So if there are date AND pid combinations that have more than one creative, the spend will need to get distributed based on the proportion of impressions for that date AND pid combination. I would like to create a view that does this for me, but I'm not quite sure where to begin here or if it's even possible. I have a python script that can do the work, but I wanted to see if it was possible with just SQL. SQL Fiddle with basic schema and data

display table:

date pid creative impressions spend
1/1/16 1234 a 10 100ドル
1/1/16 1234 b 20 200ドル
1/1/16 1235 a 20 300ドル
1/1/16 1236 a 20 300ドル
1/2/16 1234 b 15 150ドル
1/2/16 1235 a 20 200ドル
1/2/16 1236 a 10 150ドル
1/2/16 1236 b 20 200ドル
1/2/16 1237 a 10 100ドル
1/2/16 1237 b 15 150ドル

spend table:

date pid spend
1/1/16 1234 400ドル
1/1/16 1235 600ドル
1/2/16 1235 1,200ドル
1/2/16 1236 500ドル
1/2/16 1237 1,000ドル

expected output:

date pid creative impressions spend
1/1/16 1234 a 10 133ドル.33
1/1/16 1234 b 20 266ドル.67
1/1/16 1235 a 20 600ドル
1/1/16 1236 a 20 300ドル
1/2/16 1234 b 15 150ドル
1/2/16 1235 a 20 1,200ドル
1/2/16 1236 a 10 166ドル.67
1/2/16 1236 b 20 333ドル.33
1/2/16 1237 a 10 400ドル
1/2/16 1237 b 15 600ドル

Answer

SELECT
 D.date,
 D.pid,
 D.creative,
 D.impressions,
 COALESCE(S.spend * D.impressions / SUM(D.impressions)
 OVER (PARTITION BY D.date, D.pid), D.spend) AS "spend"
FROM display D
LEFT JOIN spend S
 ON D.date = S.date
 AND D.pid = S.pid

So I finally got around to having a good chunk of real world data to play with and this is giving me issues. This:

SELECT D.rep_date, D.placement_id, D.creative
 , COALESCE(S.media_cost * D.impressions /
 SUM(D.impressions) OVER (PARTITION BY D.rep_date, D.placement_id)
 , D.media_cost
 ) AS "spend"
FROM dc3.display D
LEFT JOIN dc3.display_spend S ON D.rep_date = S.rep_date AND D.placement_id = S.placement_id

returns exactly half of what I would expect for spend. This:

SELECT D.rep_date, D.placement_id, D.creative
 , SUM(COALESCE(S.media_cost * D.impressions / 
 (SELECT SUM(D2.impressions)
 FROM dc3.display D2
 WHERE D.rep_date = D2.rep_date AND D.placement_id = D2.placement_id)
 , D.media_cost
 )) AS "spend"
FROM dc3.display D
LEFT JOIN dc3.display_spend S ON D.rep_date = S.rep_date AND D.placement_id = S.placement_id
GROUP BY 1, 2, 3

returns exactly what I would expect, but takes a long time to run. Any thoughts on what the issue could be?

asked Feb 13, 2016 at 13:47

1 Answer 1

2

A Window Function can be used (SUM): Display.impressions * Spend.spend / SUM(...) OVER(...)

Query:

SELECT D.date, D.pid, D.creative
 , COALESCE(
 D.impressions * S.spend 
 / SUM(D.impressions) OVER(PARTITION BY D.date, D.pid)
 , D.spend
 )
FROM display D 
LEFT JOIN spend S ON D.date = S.date AND D.pid = S.pid;

SQL Fiddle

Output:

date pid creative spend
January, 01 2016 00:00:00 1234 a 133.33
January, 01 2016 00:00:00 1234 b 266.66
January, 01 2016 00:00:00 1235 a 600
January, 01 2016 00:00:00 1236 a 300
January, 02 2016 00:00:00 1234 b 150
January, 02 2016 00:00:00 1235 a 1200
January, 02 2016 00:00:00 1236 a 166.66
January, 02 2016 00:00:00 1236 b 333.33
January, 02 2016 00:00:00 1237 a 400
January, 02 2016 00:00:00 1237 b 600
answered Feb 13, 2016 at 15:32
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.