5
\$\begingroup\$

In this Data Explorer query I am trying to do the following:

  • For each tag:

    • Compute sum of answer scores in this tag ("S")
    • Compute count of answers in this tag ("A")

    • For each tag class (Bronze, Silver, Gold)
      add two columns:

      • S divided by this class's score goal
      • A divided by this class's answer goal

I wanted to do this in the most general way possible, allowing more tag classes/goals to be added later. I came up with this:

-- Predefined tag badge goals
... TagBadges as (
 select * from
 (values
 (1, 'Bronze', 100, 20),
 (2, 'Silver', 400, 80),
 (3, 'Gold', 1000, 200))
 as Badge(Idx, Class, Score, Answers)
),
-- Progress per tag, per badge class
TypeProgress as (
 select
 RawData.TagName,
 format(iif(RawData.Score > TagBadges.Score, 1,
 cast(RawData.Score as float)/TagBadges.Score), '#0.#%') as Score,
 format(iif(RawData.Answers > TagBadges.Answers, 1,
 cast(RawData.Answers as float)/TagBadges.Answers), '#0.#%') as Answers,
 TagBadges.Class
 from RawData cross join TagBadges
),
-- Combine class & type columns
AllProgress as (
 select TagName, Progress, Class+' '+Type as Category
 from TypeProgress
 unpivot (Progress for Type in (Score, Answers)) p
) ...

But in the end I still had to list all the cases (2 ×ばつ 3 = 6) explicitly:

select *
from AllProgress
pivot (
 max(Progress) for Category in
 ([Bronze Score], [Bronze Answers],
 [Silver Score], [Silver Answers],
 [Gold Score], [Gold Answers])
) q

Is there a better way of doing this?

Phrancis
20.5k6 gold badges69 silver badges155 bronze badges
asked Aug 8, 2014 at 21:10
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$

There is a more general way to do it, but it involves a dynamic query..... which is messy, and may not be better than what you have. Any time you involve a pivot, you are going in to the 'ugly' territory.

First though, some other comments:

  • you do not include the Rawdata CTE with your question, and it has some small issues I can see in SEDE
  • You select your data out as formatted strings, and you should only do the formatting as the last step, if possible.

Other than that, I found your query relatively easy to read, and since I understand the SEDE dataset quite well, I could follow the logic nicely.

Basically, your query is good, and I would 'accept it' in a review.

Ugly things of note though are:

  • max(Progress) is doing character-based max on a single-value set
  • the order by does a convert-to-float of a character value

I would like those things to be commented on, but, since the data volumes for those operations are small, I think it is OK, given the circumstances.

Out of interest, I processed your query using a more general syntax, that avoids the unpivot, and makes the pivot dynamic. As you will see, it is just as ugly, if not worse.

  • it does not do the order-by
  • it does not format the values as string at all.

See it for yourself

Note the multiple temp tables, and the variables for the pivot, and the query. Ugly, but it may give you some ideas.

answered Aug 8, 2014 at 22:18
\$\endgroup\$
3
  • \$\begingroup\$ Are there good alternatives to max(Progress)? I couldn't find an aggregate function that could be used when you know there is only a single value, like first() or something... \$\endgroup\$ Commented Aug 9, 2014 at 0:47
  • \$\begingroup\$ Also, you mentioned the RawData CTE — what problems do you think it has? \$\endgroup\$ Commented Aug 9, 2014 at 0:51
  • \$\begingroup\$ @jtbandes - Just SQL 'basics' ... I believe the Posts table has an index on PostTypeId, and you should restrict the Answers to PostTypeId 2, and Questions to PostTypeId 1. Relying on the ParentId relationship is technically OK, but more information for the DB optimizer is typically better than less. \$\endgroup\$ Commented Aug 9, 2014 at 1:23

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.