How would I make this test pass?
import icu  # PyICU

names = [
    "cote",
    "coté",
    "côte",
    "côté",
    "ReasonE",
    "Reason1",
    "ReasonĔ",
    "Reason Super",
    "ReasonÅ",
    "ReasonA",
    "Reasona",
    "Reasone",
    "death",
    "deluge",
    "de luge",
    "disílva John",
    "diSilva John",
    "di Silva Fred",
    "diSilva Fred",
    "disílva Fred",
    "di Silva John",
]
# alternate=shifted (ka-shifted) plus backwards secondary ordering (kb-true)
loc = icu.Locale("und-u-ka-shifted-kb-true")
c = icu.Collator.createInstance(loc)
assert sorted(names, key=c.getSortKey) == [
    "cote",
    "côte",
    "coté",
    "côté",
    "death",
    "deluge",
    "de luge",
    "di Silva Fred",
    "diSilva Fred",
    "disílva Fred",
    "di Silva John",
    "diSilva John",
    "disílva John",
    "Reason1",
    "Reasona",
    "ReasonA",
    "ReasonÅ",
    "Reasone",
    "ReasonE",
    "ReasonĔ",
    "Reason Super",
]
Background:
I'm trying to replicate the sorting behavior of a Postgres database. As best I can figure, it uses custom rules for spaces and punctuation based on the 'shift-trimmed' option, along with backwards accent ordering at level 2 (kb-true or [backwards 2]).
ICU doesn't appear to support shift-trimmed, and I'm not sure how else I could get these to sort "properly".
See collation settings and contextual sensitivity for more explanation.
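To show what I mean by backwards accent ordering, here is a toy sketch in plain Python (standard library only). It is not ICU's algorithm and it ignores case, spaces and punctuation entirely; the helper name accent_key is made up for illustration. It compares base letters first, then breaks ties on accents read either left to right or right to left:

```python
import unicodedata

def accent_key(word, backwards):
    """Toy two-level sort key: base letters first, then accents.

    Hypothetical helper for illustration only; not how ICU builds sort keys.
    """
    primary, secondary = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # A combining mark: record it as the accent weight of the
            # preceding base letter (only the last mark is kept).
            if secondary:
                secondary[-1] = ord(ch)
        else:
            primary.append(ch.lower())
            secondary.append(0)  # 0 means "no accent"
    if backwards:
        # Backwards level 2: compare accents starting from the end of the word.
        secondary.reverse()
    return (tuple(primary), tuple(secondary))

words = ["cote", "coté", "côte", "côté"]
forward = sorted(words, key=lambda w: accent_key(w, backwards=False))
backward = sorted(words, key=lambda w: accent_key(w, backwards=True))
print(forward)
print(backward)
```

With accents compared from the end of the word, "côte" sorts before "coté", which is the order the first four entries of the expected list need.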