Copied to Clipboard
WHERE is that bouncer. It looks at each row, checks it against your rule, and only lets the matching ones into your result.
The basic rule: comparison operators
The simplest rule is "this column equals that value".
SELECT name, total
FROM orders
WHERE status = 'paid'; -- only rows where status is exactly 'paid'
The bouncer also understands other comparisons:
| Operator |
Means |
= |
equal |
<> |
not equal |
!= |
not equal (same thing) |
< |
less than |
> |
greater than |
<= |
less than or equal |
>= |
greater than or equal |
SELECT id, total
FROM orders
WHERE total > 100; -- only big orders
Combining rules: AND, OR, NOT
One rule is nice. Real life needs a few at once. The bouncer can check several things.
-
AND means both must pass.
-
OR means at least one must pass.
-
NOT flips a rule around.
SELECT id, total, status
FROM orders
WHERE status = 'paid' AND total > 100; -- paid AND big
SELECT id, status
FROM orders
WHERE status = 'paid' OR status = 'shipped'; -- either one is fine
Why parentheses matter (the wrong way first)
Here is a trap that bites almost everyone. Say you want orders that are paid or shipped, and over 100.
A junior often writes this:
-- WRONG: this does not mean what you think
SELECT id, total, status
FROM orders
WHERE status = 'paid' OR status = 'shipped' AND total > 100;
AND is stronger than OR, so the database reads it like this:
status = 'paid' OR (status = 'shipped' AND total > 100)
So you get every paid order, even a 5 dollar one, plus shipped orders over 100. Surprise! That is not what you asked for.
The fix is parentheses. They tell the bouncer how to group the rules:
-- RIGHT: group the OR, then apply the AND
SELECT id, total, status
FROM orders
WHERE (status = 'paid' OR status = 'shipped') AND total > 100;
When you mix AND and OR, always add parentheses. It costs you two characters and saves you an hour of confusion.
Handy shortcuts: IN and BETWEEN
When you check one column against a list of values, IN is cleaner than a pile of ORs.
-- instead of: status = 'paid' OR status = 'shipped' OR status = 'refunded'
SELECT id, status
FROM orders
WHERE status IN ('paid', 'shipped', 'refunded');
When you want a range, BETWEEN reads nicely (and it includes both ends):
SELECT id, total
FROM orders
WHERE total BETWEEN 10 AND 100; -- 10 and 100 are both included
Fuzzy matching: LIKE with % and _
Sometimes you do not know the exact value. You want "anything ending in @gmail.com". That is LIKE, with two wildcards:
-
% matches any number of characters (even zero).
-
_ matches exactly one character.
SELECT name, email
FROM users
WHERE email LIKE '%@gmail.com'; -- any email that ends in @gmail.com
SELECT email
FROM users
WHERE email LIKE 'a%'; -- any email that starts with the letter a
SELECT name
FROM users
WHERE name LIKE '_at'; -- 'cat', 'bat', 'hat'... three letters ending in 'at'
The NULL trap (read this twice)
NULL is not zero and not an empty string. NULL means "unknown". The value is simply not there.
Here is the classic mistake:
-- WRONG: this returns nothing, ever
SELECT name FROM users WHERE country = NULL;
Why does it return nothing? Think about it. You are asking "is this unknown thing equal to unknown?" The honest answer is "I do not know", and the bouncer only lets in rows that are clearly true. "I do not know" is not true, so the row is rejected. Every time.
You cannot use = with NULL. You need a special check:
-- RIGHT
SELECT name FROM users WHERE country IS NULL; -- has no country
SELECT name FROM users WHERE country IS NOT NULL; -- has some country
Remember: for NULL, use IS NULL and IS NOT NULL, never = or <>.
A real case
Your boss asks: "Show me all paid orders over 100 from users in Egypt."
That is three rules. The country lives on the users table, so we filter there, and the order rules live on orders.
SELECT orders.id, orders.total, users.name
FROM orders
JOIN users ON users.id = orders.user_id
WHERE users.country = 'EG'
AND orders.status = 'paid'
AND orders.total > 100;
Three rules joined with AND, so a row only gets in if it passes all three. That is the bouncer doing exactly what you told him.
Gotchas
-
Mixing AND and OR without parentheses.
AND binds tighter, so group your ORs in parentheses or you get surprise rows.
-
Using
= NULL. It never matches. Use IS NULL / IS NOT NULL.
-
Forgetting the quotes. Text values need single quotes (
'paid'). Numbers do not (100). status = paid will error because the database looks for a column named paid.
Recap
-
WHERE is a bouncer: it keeps only the rows that pass your rule.
- Compare with
=, <>, <, >, <=, >=.
- Combine rules with
AND, OR, NOT, and always parenthesize when you mix AND and OR.
-
IN checks a list, BETWEEN checks a range (both ends included), LIKE does fuzzy matching with % and _.
- For NULL, use
IS NULL and IS NOT NULL. = NULL never works.
Your turn
Write one query that finds users whose email is not a gmail address and who do have a country set. Two rules, joined with AND. If you can explain why you used IS NOT NULL instead of <> NULL, you have got it.
Next up: Part 04: ORDER BY and LIMIT, where we sort our rows and grab just the top few.