I would like to understand the difference in behavior between LIKE and =. Unfortunately, I cannot construct a simple example in which the two queries return different results:
SELECT 'ä' LIKE 'ae' COLLATE "de_DE.utf8";
SELECT 'ä' = 'ae' COLLATE "de_DE.utf8";
Both return false.
What am I doing wrong?
1 Answer
PostgreSQL does not support collation-aware equality for = or LIKE. Internally, index ordering relies on =, so even if the collation reports two strings as equal, PostgreSQL falls back to a byte-wise comparison. This is documented:
Note that while this system allows creating collations that "ignore case" or "ignore accents" or similar (using the ks key), PostgreSQL does not at the moment allow such collations to act in a truly case- or accent-insensitive manner. Any strings that compare equal according to the collation but are not byte-wise equal will be sorted according to their byte values.
PostgreSQL also doesn't support Unicode collation in character classes.
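For later versions, the picture is different: PostgreSQL 12 introduced nondeterministic collations, which let = treat strings as equal even when their bytes differ. The following is a minimal sketch, assuming PostgreSQL 12 or newer built with ICU support. The collation name ignore_accents is only an illustrative choice, and the locale string follows the example in the PostgreSQL collation documentation.

```sql
-- Assumption: PostgreSQL 12+ compiled with ICU support.
-- ks-level1 compares at primary strength (accents ignored);
-- kc-true keeps case differences significant.
CREATE COLLATION ignore_accents (
    provider = icu,
    locale = 'und-u-ks-level1-kc-true',
    deterministic = false
);

-- With this collation, = no longer falls back to byte-wise comparison.
SELECT 'ä' = 'a' COLLATE ignore_accents;  -- true
```

Note that in PostgreSQL 12, LIKE is not supported with nondeterministic collations and raises an error, so this sketch only covers the = case.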
This changed in PostgreSQL 12. See the updated documentation about "nondeterministic collations". – Peter Eisentraut, Sep 24, 2019 at 7:04
[…] de_DE.utf8 has any rules about the fact that ä and ae should be considered equivalent. If at all, this might be possible with an ICU collation (but I don't know): postgresql.org/docs/current/static/…
[…] like and = behave identical, if no wildcards are used?
[…] unaccent module.
unaccent() cannot help with 'ä' = 'ae'. It can help with 'ä' = 'a'.
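The unaccent suggestion can be tried with a short sketch like the one below. It assumes the unaccent contrib extension is available on the server and that its default rules file is in use; the result comments reflect that assumption.

```sql
-- Assumption: the unaccent contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- unaccent() strips the diacritic, so 'ä' becomes 'a'.
SELECT unaccent('ä') = 'a';   -- true with the default rules

-- It does not expand 'ä' to 'ae', so this comparison stays false.
SELECT unaccent('ä') = 'ae';  -- false with the default rules
```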