I have the following database schema. My goal is to obtain a result set that lists the total badge points earned by each user. The badges might be earned in different courses. I want to include the courseid and the average score per course.
Users table:
| id | username |
|----|----------|
| 1 | user1 |
| 2 | user2 |
| 3 | user3 |
Badges table:
| id | badgename | points | courseid |
|----|-----------|--------|----------|
| 1 | badge a | 15 | 1 |
| 2 | badge b | 10 | 1 |
| 3 | badge c | 20 | 1 |
| 4 | badge d | 15 | 2 |
| 5 | badge e | 10 | 2 |
| 6 | badge f | 25 | 2 |
BadgeAssignments table:
| userid | badgeid |
|--------|---------|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 1 | 4 |
| 2 | 5 |
| 3 | 6 |
| 1 | 5 |
| 2 | 4 |
| 3 | 3 |
Courses table:
| courseid | coursename |
|----------|------------|
| 1 | course 1 |
| 2 | course 2 |
I have ended up with the following code, and I think it works fine:
SELECT userid, SUM(b.points), courseid,
(SELECT AVG(points) FROM Badges WHERE courseid = b.courseid) as courseAVG
FROM
Badges b INNER JOIN BadgeAssignments ba ON b.id = ba.badgeid
INNER JOIN Users u ON ba.userid = u.id
GROUP BY userid, b.courseid
which brings me this result:
| userid | SUM(b.points) | courseid | courseAVG |
|--------|---------------|----------|-----------|
| 1 | 15 | 1 | 15 |
| 1 | 25 | 2 | 16.6667 |
| 2 | 10 | 1 | 15 |
| 2 | 25 | 2 | 16.6667 |
| 3 | 40 | 1 | 15 |
| 3 | 25 | 2 | 16.6667 |
The numbers seem to be correct. Does my query make sense, and does it need to be revised?
Here is the SQL Fiddle.
1 Answer
Given the current table definitions, I agree that there doesn't seem to be a better way to query the needed results.
However, performance might be improved with a slightly different approach, assuming that, most likely:
- the number of courses and badges is small compared to the number of users
- courses and badges have relatively stable content (they don't change daily!)
On the other hand, note that in the current approach `courseAVG` is recomputed every time the query runs (moreover, I'm not sure the optimizer computes it only once rather than for each user/course pair!).
So an alternative approach might be:
- add a `pointsavg` field to the `Courses` table
- compute it in a separate `UPDATE` query (could be launched through a trigger applying `AFTER INSERT` and `AFTER UPDATE` on `Badges` and `Courses`)
Here it is:
UPDATE Courses c
SET pointsavg = (
SELECT AVG(points)
FROM Badges b
WHERE b.courseid = c.courseid
GROUP BY courseid
);
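The new column and the trigger idea mentioned above could be sketched roughly as follows. This is an untested sketch that assumes MySQL; the column type and the trigger names (`badges_after_insert`, `badges_after_update`) are illustrative choices, not part of the original schema. Only the `Badges` side is covered here.

```sql
-- Assumption: MySQL. DECIMAL(10,4) is an illustrative type for the stored average.
ALTER TABLE Courses ADD COLUMN pointsavg DECIMAL(10,4) NULL;

-- Refresh the affected course's average whenever a badge is added.
CREATE TRIGGER badges_after_insert
AFTER INSERT ON Badges
FOR EACH ROW
  UPDATE Courses c
  SET c.pointsavg = (SELECT AVG(b.points)
                     FROM Badges b
                     WHERE b.courseid = c.courseid)
  WHERE c.courseid = NEW.courseid;

-- Refresh both the old and the new course when a badge's points or course change.
CREATE TRIGGER badges_after_update
AFTER UPDATE ON Badges
FOR EACH ROW
  UPDATE Courses c
  SET c.pointsavg = (SELECT AVG(b.points)
                     FROM Badges b
                     WHERE b.courseid = c.courseid)
  WHERE c.courseid IN (OLD.courseid, NEW.courseid);
```

If badges can also be deleted, a matching `AFTER DELETE` trigger would presumably be needed as well, otherwise the stored average would go stale.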
Then we can simply use this `pointsavg` in the main query:
SELECT
u.username,
SUM(b.points) AS userCoursePoints,
c.coursename,
c.pointsavg AS courseAVG
FROM
Users u,
BadgeAssignments ba,
Badges b,
Courses c
WHERE ba.userid = u.id
AND b.id = ba.badgeid
AND c.courseid = b.courseid
GROUP BY u.username, b.courseid
Note that I took the opportunity to output `username` and `coursename` instead of the bare ids, assuming that is closer to what is generally needed.
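For comparison, the same query can be written with explicit `JOIN` syntax and every non-aggregated column listed in `GROUP BY`, which stricter SQL modes and other engines tend to require. This is a sketch that assumes the `pointsavg` column has already been added to `Courses`; it also groups by the ids, which is a small change from the query above.

```sql
SELECT
    u.username,
    SUM(b.points) AS userCoursePoints,
    c.coursename,
    c.pointsavg   AS courseAVG
FROM Users u
INNER JOIN BadgeAssignments ba ON ba.userid = u.id
INNER JOIN Badges b            ON b.id = ba.badgeid
INNER JOIN Courses c           ON c.courseid = b.courseid
-- Grouping by the ids as well keeps two users (or courses) that share a name apart.
GROUP BY u.id, u.username, c.courseid, c.coursename, c.pointsavg;
```

If usernames and course names are guaranteed to be unique, the result should match the query above.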
Here is the SQL fiddle:
`Badges` twice to get the individual and the average points. Two remarks: you should add an alias to `SUM(b.points)`, and there's no primary key in the `BadgeAssignments` table. Can the same user get the same badge more than once?
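The duplicate-assignment question raised in this comment can be checked directly. The query below is a minimal sketch that lists every user/badge pair assigned more than once; the alias `timesAssigned` is an illustrative name.

```sql
-- List user/badge pairs that appear more than once in BadgeAssignments.
SELECT userid, badgeid, COUNT(*) AS timesAssigned
FROM BadgeAssignments
GROUP BY userid, badgeid
HAVING COUNT(*) > 1;
-- With the sample data from the question this returns one row:
-- userid 3, badgeid 3, assigned twice (which is why user 3 totals 40 points in course 1).

-- Only if a badge should be earned at most once per user, and only after
-- existing duplicates are resolved, a composite key could enforce it:
-- ALTER TABLE BadgeAssignments ADD PRIMARY KEY (userid, badgeid);
```

Whether the repeated assignment is intended is a design decision for the schema owner; the constraint is left commented out for that reason.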