0

The current pandemic is making it very difficult to contact my teachers for assistance, so I'm hoping I can describe my problem well enough to try getting some help here.

The database used is the sakila example database
(I tried to create a fiddle, but the tables are just too big)

My data will be grouped by each store (only 2 video stores), and I will be getting

  1. The total sales of each store (teachers hint: store table has inventory. inventory can be rented, rental table has payments.
  2. count of customers per store
  3. inventory count per store
  4. count of movies per store
  5. count of movies rented that are not yet returned

Here is my incorrect output:

--------------------------------------------------------------------------------------------------------------------------------
| Jordan_Rasmussen | store_id | total_sales | num_customers | count_inventory | count_titles | inventory_cost | num_rentals_out |
--------------------------------------------------------------------------------------------------------------------------------
| (date&time) | 1 | 68359569.18 | 326 | 2270 | 759 | 952923.30 | 29992 |
----------------------------------------------------------------------------------------------------------------------------
| (date&time) | 2 | 56966647.92 | 273 | 2311 | 762 | 970134.69 | 24843 | |
--------------------------------------------------------------------------------------------------------------------------------

Here is the correct output:

--------------------------------------------------------------------------------------------------------------------------------
| First_and_last N | store_id | total_sales | num_customers | count_inventory | count_titles | inventory_cost | num_rentals_out |
--------------------------------------------------------------------------------------------------------------------------------
| (date&time) | 1 | 209691.93 | 326 | 2270 | 759 | 46205.30 | 92 |
--------------------------------------------------------------------------------------------------------------------------------
| (date&time) | 2 | 208669.04 | 273 | 2311 | 762 | 46415.89 | 91 |
--------------------------------------------------------------------------------------------------------------------------------

Here is my code for trying to get the above output:

SELECT NOW() AS 'Jordan_Rasmussen',
s.store_id,
-- Get the total sales
SUM(p.amount) AS total_sales,
-- Get the number of customers
COUNT(DISTINCT c.customer_id) AS num_customers,
-- Get the inventory count
COUNT(DISTINCT i.inventory_id) AS inventory_count,
-- Get the number of movie titles
COUNT(DISTINCT f.title) AS num_titles,
-- Get the inventory value 
SUM(DISTINCT f.replacement_cost) * COUNT(DISTINCT i.inventory_id) AS inventory_value,
-- Get the number of movies rented that have not yet been returned 
COUNT(r.rental_date) AS num_rentals_out 
FROM store AS s 
LEFT JOIN inventory AS i ON s.store_id = i.store_id 
LEFT JOIN customer AS c ON s.store_id = c.store_id 
INNER JOIN rental AS r ON i.inventory_id = r.inventory_id
INNER JOIN payment AS p ON r.rental_id = p.rental_id
INNER JOIN film AS f ON i.film_id = f.film_id 
GROUP BY store_id;

I can get the correct results by themselves, but I notice the more I JOIN tables together, the more the results change. I've been scratching my head about this for awhile now, but I'm just not sure what it is I'm missing.

Should I be using subqueries? Or am I just joining improperly?

Sorry for such a big question, but I'm clueless right now.

asked Mar 29, 2020 at 0:57
2
  • Take a look here it explains how joining works and especially when to use aggregate functions Commented Mar 29, 2020 at 1:49
  • Thanks @nbk , I'll check it out! Commented Mar 29, 2020 at 5:30

1 Answer 1

0

The problem is you are joining unrelated tables. If you don't join them, as in

SELECT NOW() AS 'Dorjan Masrussen',
 s.store_id,
 (select sum(p.amount)
 from payment p
 where p.rental_id in (
 select r.rental_id
 from rental r
 join inventory i
 on i.inventory_id = r.inventory_id
 where i.store_id = s.store_id)) total_sales,
 (select count(*)
 from customer c
 where c.store_id = s.store_id) num_customers,
 (select count(*)
 from inventory i
 where i.store_id = s.store_id) count_inventory,
 (select count(distinct film_id)
 from inventory i
 where i.store_id = s.store_id) count_titles,
 (select sum(f.replacement_cost)
 from inventory i
 join film f on f.film_id = i.film_id
 where i.store_id = s.store_id) inventory_cost,
 (select count(*)
 from rental r
 join inventory i
 on i.inventory_id = r.inventory_id
 where i.store_id = s.store_id) num_rentals_out
 from store as s

you get:

+---------------------+----------+-------------+---------------+-----------------+--------------+----------------+-----------------+
| Dorjan Masrussen | store_id | total_sales | num_customers | count_inventory | count_titles | inventory_cost | num_rentals_out |
+---------------------+----------+-------------+---------------+-----------------+--------------+----------------+-----------------+
| 2020年03月31日 17:32:18 | 1 | 33679.79 | 326 | 2270 | 759 | 46205.30 | 7923 |
| 2020年03月31日 17:32:18 | 2 | 33726.77 | 273 | 2311 | 762 | 46415.89 | 8121 |
+---------------------+----------+-------------+---------------+-----------------+--------------+----------------+-----------------+

Total sales and num_rentals_out differ from your correct output. I wonder who is right. The Sakila structure diagram, if I'm not mistaken, has it's crow's feet inverted.

The same result can be obtained, while querying inventory only once, using a CTE, the Common Table Expression "WITH", as follows:

WITH idata as (
 select i.store_id,
 count(*) num,
 count(distinct i.film_id) num_films,
 sum(f.replacement_cost) repl_cost,
 sum((select count(*)
 from rental r
 where r.inventory_id = i.inventory_id)) num_rentals,
 sum((select sum(p.amount)
 from payment p
 where p.rental_id in (
 select r.rental_id
 from rental r
 where r.inventory_id = i.inventory_id
 ))) sum_rentals
 from inventory i
 join film f on f.film_id = i.film_id
 group by i.store_id
)
SELECT NOW() AS 'Dorjan Masrussen',
 s.store_id,
 idata.sum_rentals total_sales,
 (select count(*)
 from customer c
 where c.store_id = s.store_id) num_customers,
 idata.num count_inventory,
 idata.num_films count_titles, 
 idata.repl_cost inventory_cost,
 idata.num_rentals num_rentals_out
 from store as s
 join idata on idata.store_id = s.store_id

Querying the store table can also be dropped, since all we need is the store_id, which we also find in inventory.

answered Mar 31, 2020 at 15:35
7
  • Thank you very much for the response. I tried your code in my DB, and I received the correct output. I'm wondering if my instructors tweaked the sakila database around for the class. That being said though, You helped me understand what I was doing wrong. I took a closer look at some of the tables I was joining and found that for example: joining the rental table onto the inventory table actually messed up the results because inventory had the proper set of inventory IDs, when the rentals inventory was specifically the inventory ID of rented movies. Commented Mar 31, 2020 at 20:43
  • also, the Dorjan Masrussen made me laugh, thanks for that too! Commented Mar 31, 2020 at 20:56
  • My answer was done in a hurry, and is certainly not the most performant. See how many times inventory is queried for the same rows. But I wanted to show how to avoid joining unrelated data. I'll try to provide a better query tomorrow. Commented Mar 31, 2020 at 22:38
  • I look forward to it! I appreciate the help, this whole pandemic is keeping me from sitting down with my teachers. Commented Mar 31, 2020 at 22:52
  • The teachers can't afford internet? Commented Mar 31, 2020 at 22:59

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.