This is my first post so I apologize if I am not concise enough. I am trying to come up with an SQL query to identify data quality issues.
Here's the sample table:
DeviceOS Bytes
Roku 10000
AppleTV -50000
SamsungTV -100000
Roku -100000
AppleTV 30000
Roku -90000
AppleTV -20000
AppleTV -10000
SamsungTV -100000
Output table:
DeviceOS Total Count bad_count
Roku 3 2
AppleTV 4 3
SamsungTV 2 2
The Total Count field counts all rows for each DeviceOS, and bad_count counts only the rows where the Bytes field is negative.
Essentially, I am trying to do this:
select DeviceOS, count(*) from table group by DeviceOS
select DeviceOS, count(*) from table where Bytes < 0 group by DeviceOS
How can I combine the above two queries and have the result of both of them displayed together similar to the output table?
2 Answers
For MySQL, a comparison evaluates to 1 or 0, so you can simply sum it:
SELECT DeviceOS,
COUNT(*) `Total Count`,
SUM(Bytes < 0) bad_count
FROM source_table
GROUP BY DeviceOS
For BigQuery, use COUNTIF:
SELECT DeviceOS,
COUNT(*) `Total Count`,
COUNTIF(Bytes < 0) bad_count
FROM source_table
GROUP BY DeviceOS
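If you need something that is not tied to one engine, the same idea can probably be written with a standard CASE expression. The following is only a sketch: it loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module so the query can be checked. The table name source_table comes from the queries above; the function name count_bad_rows and the column aliases total_count and bad_count are my own choices for illustration.

```python
import sqlite3

# Sample rows copied from the question (DeviceOS, Bytes).
ROWS = [
    ("Roku", 10000),
    ("AppleTV", -50000),
    ("SamsungTV", -100000),
    ("Roku", -100000),
    ("AppleTV", 30000),
    ("Roku", -90000),
    ("AppleTV", -20000),
    ("AppleTV", -10000),
    ("SamsungTV", -100000),
]

# Portable variant: CASE yields 1 for a negative row and 0 otherwise,
# and SUM adds those flags up per group.
QUERY = """
    SELECT DeviceOS,
           COUNT(*) AS total_count,
           SUM(CASE WHEN Bytes < 0 THEN 1 ELSE 0 END) AS bad_count
    FROM source_table
    GROUP BY DeviceOS
    ORDER BY DeviceOS
"""


def count_bad_rows(rows):
    """Return (DeviceOS, total_count, bad_count) tuples, sorted by DeviceOS."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE source_table (DeviceOS TEXT, Bytes INTEGER)")
        conn.executemany("INSERT INTO source_table VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for device, total, bad in count_bad_rows(ROWS):
        print(device, total, bad)
```

On the sample data this should give 4 rows with 3 bad for AppleTV, 3 rows with 2 bad for Roku, and 2 rows with 2 bad for SamsungTV.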
You can do the following for SQL Server:
SELECT DeviceOS AS DeviceOS
, COUNT(*) AS TotalCount
, SUM(IIF(Bytes < 0, 1, 0)) AS BadCount
FROM (
VALUES ('Roku', 10000)
, ('AppleTV', -50000)
, ('SamsungTV', -100000)
, ('Roku', -100000)
, ('AppleTV', 30000)
, ('Roku', -90000)
, ('AppleTV', -20000)
, ('AppleTV', -10000)
, ('SamsungTV', -100000)
) AS t1 (DeviceOS, Bytes)
GROUP BY DeviceOS
We group by the DeviceOS column and return the device name and the total row count for each group.
The only tricky part is using SUM(IIF(Bytes < 0, 1, 0)) AS BadCount to get the bad count.
For each row, this checks whether Bytes is less than 0 (meaning it is negative):
- if it is, then +1 is added to the sum
- if it is not, then +0 is added to the sum
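To make the +1/+0 step visible, here is a small sketch that prints the per-row flag before it is summed. It runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, so I am assuming a CASE expression as a stand-in for IIF; the table name t1 matches the alias above, and the column alias is_bad is made up for illustration. Only the three Roku rows from the question are used.

```python
import sqlite3

# The three Roku rows from the question.
ROKU_ROWS = [("Roku", 10000), ("Roku", -100000), ("Roku", -90000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (DeviceOS TEXT, Bytes INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", ROKU_ROWS)

# Step 1: the flag each row contributes (1 if negative, else 0).
flags = conn.execute(
    "SELECT Bytes, CASE WHEN Bytes < 0 THEN 1 ELSE 0 END AS is_bad "
    "FROM t1 ORDER BY rowid"
).fetchall()

# Step 2: adding the flags by hand gives the bad count.
bad_count_by_hand = sum(is_bad for _, is_bad in flags)

# Step 3: the same sum done inside SQL.
bad_count_in_sql = conn.execute(
    "SELECT SUM(CASE WHEN Bytes < 0 THEN 1 ELSE 0 END) FROM t1"
).fetchone()[0]

conn.close()

for bytes_value, is_bad in flags:
    print(bytes_value, "->", is_bad)
print("bad count:", bad_count_by_hand, bad_count_in_sql)
```

The positive row contributes 0 and each negative row contributes 1, so both ways of adding them up should agree.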