Created on 2016-08-27 14:36 by revo, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Messages (2) | |||
|---|---|---|---|
| msg273782 | Author: mohammad (revo) | Date: 2016-08-27 14:36 | |
According to [UAX #29](http://unicode.org/reports/tr29), Unicode word boundaries (rule WB5a), an apostrophe includes both U+0027 ( ' ) APOSTROPHE and U+2019 ( ’ ) RIGHT SINGLE QUOTATION MARK (curly apostrophe). However, the regex module only handles U+0027; the second kind (U+2019) is missing:

    /* Break between apostrophe and vowels (French, Italian). */
    /* WB5a */
    if (pos_m1 >= 0 && char_at(state->text, pos_m1) == '\'' &&
      is_unicode_vowel(char_at(state->text, text_pos)))
        return TRUE;

[Source code](https://bitbucket.org/mrabarnett/mrab-regex/src/f21447bf288780d8dd9b1633820480484ce8f677/regex_3/regex/_regex.c?at=default&fileviewer=file-view-default#_regex.c-1657) |
|||
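For illustration only, the requested behaviour can be sketched in Python. This is not the regex module's code: `APOSTROPHES`, `is_vowel`, and `breaks_after_apostrophe` are hypothetical names, and the vowel test here (strip accents, compare against `aeiou`) is an assumption that may differ from what `is_unicode_vowel` does in `_regex.c`.

```python
import unicodedata

# Both characters named in the report: U+0027 and U+2019.
APOSTROPHES = {"\u0027", "\u2019"}


def is_vowel(ch: str) -> bool:
    """Assumed vowel test: drop any accent, then compare to a, e, i, o, u."""
    base = unicodedata.normalize("NFD", ch)[0]
    return base.lower() in "aeiou"


def breaks_after_apostrophe(text: str, pos: int) -> bool:
    """Return True if the character before `pos` is an apostrophe
    (straight or curly) and the character at `pos` is a vowel."""
    if pos < 1 or pos >= len(text):
        return False
    return text[pos - 1] in APOSTROPHES and is_vowel(text[pos])


if __name__ == "__main__":
    for sample in ("l'\u00e9cole", "l\u2019\u00e9cole", "don\u2019t"):
        print(repr(sample), breaks_after_apostrophe(sample, sample.index(sample[-1 - (len(sample) - 2 - 1)]) if False else 2))
```

With a check like this, both `l'école` and `l’école` would report a break at position 2, which is the behaviour the report asks for.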
| msg273783 | Author: SilentGhost (SilentGhost) * (Python triager) | Date: 2016-08-27 14:56 | |
The regex module is not in the standard library. On the latest 3.6 branch, the re module breaks on a curly apostrophe just fine. Perhaps try reporting this issue on the Bitbucket tracker? |
|||
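The standard-library behaviour mentioned above can be checked with a short script. This is only a sketch: `split_words` is a hypothetical helper, and it relies on `\w` in `re` not matching U+2019 (a punctuation character), so the curly apostrophe separates the two words.

```python
import re


def split_words(text: str) -> list:
    """Split text into runs of word characters using the stdlib re module."""
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    curly = "l\u2019\u00e9cole"   # l’école, with U+2019
    straight = "l'\u00e9cole"     # l'école, with U+0027
    print(split_words(curly))
    print(split_words(straight))
    # \b also finds a boundary right after the curly apostrophe.
    print(re.search(r"\b\u00e9cole", curly) is not None)
```

Both spellings split into the same two words, which is consistent with the triager's observation that the issue belongs to the third-party regex module rather than to `re`.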
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:35 | admin | set | github: 72065 |
| 2016-08-27 14:56:48 | SilentGhost | set | status: open -> closed; versions: - Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6; nosy: + SilentGhost; messages: + msg273783; resolution: not a bug; stage: resolved |
| 2016-08-27 14:36:09 | revo | create | |