| Author | pitrou |
|---|---|
| Recipients | gvanrossum, humitos, pitrou, sven.siegmund |
| Date | 2008-06-28 20:27:23 |
| Message-id | <1214684844.73.0.533760513068.issue2834@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content | |
|---|---|
Uh, actually, it works if you specify re.UNICODE. If you don't, the
getlower() function in _sre.c falls back to the plain ASCII algorithm.
>>> import re
>>> pat = re.compile('Á', re.IGNORECASE | re.UNICODE)
>>> pat.match('á')
<_sre.SRE_Match object at 0xb7c66c28>
>>> pat.match('Á')
<_sre.SRE_Match object at 0xb7c66cd0>
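To illustrate the fallback described above, here is a minimal Python sketch of the difference between an ASCII-only and a Unicode-aware lower-casing step. It is not the code from _sre.c; the names `ascii_lower`, `unicode_lower` and `chars_equal_ignorecase` are hypothetical and only mimic the idea that a case-insensitive comparison is only as good as the lower-casing function behind it.

```python
def ascii_lower(ch: str) -> str:
    """Hypothetical stand-in for an ASCII-only lowering: only A-Z are folded."""
    if 'A' <= ch <= 'Z':
        return chr(ord(ch) + 32)
    return ch  # anything outside A-Z is returned unchanged


def unicode_lower(ch: str) -> str:
    """Hypothetical stand-in for a Unicode-aware lowering."""
    return ch.lower()


def chars_equal_ignorecase(a: str, b: str, lower) -> bool:
    """Compare two single characters after lowering both with `lower`."""
    return lower(a) == lower(b)


# ASCII letters compare equal with either strategy.
ascii_pair_ascii = chars_equal_ignorecase('A', 'a', ascii_lower)
ascii_pair_unicode = chars_equal_ignorecase('A', 'a', unicode_lower)

# A non-ASCII letter only compares equal with the Unicode-aware strategy.
accented_pair_ascii = chars_equal_ignorecase('Á', 'á', ascii_lower)
accented_pair_unicode = chars_equal_ignorecase('Á', 'á', unicode_lower)
```

With the ASCII-only function, 'Á' is left untouched, so it never compares equal to 'á'; that matches the symptom when re.UNICODE is not given.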
I wonder if re.UNICODE shouldn't be the default in Py3k, at least when
the pattern is a string and not a bytes object. There could also be a
re.ASCII flag for those cases where people want to fall back to the old
behaviour.
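As a rough way to compare the two behaviours discussed above, the following sketch assumes a current Python 3 interpreter, where str patterns are Unicode-aware by default and a re.ASCII flag is available to opt back into ASCII-only matching. The variable names are made up for illustration.

```python
import re

# str pattern, no explicit re.UNICODE: Unicode-aware on Python 3 (assumption
# stated in the lead-in; this was not the case when the message was written).
pat_default = re.compile('Á', re.IGNORECASE)

# Same pattern, explicitly restricted to ASCII-only case folding.
pat_ascii = re.compile('Á', re.IGNORECASE | re.ASCII)

default_hit = pat_default.match('á') is not None  # case-insensitive match on a non-ASCII letter
ascii_hit = pat_ascii.match('á') is not None      # ASCII-only folding leaves 'Á' and 'á' distinct
exact_hit = pat_ascii.match('Á') is not None      # the identical character still matches
```

Under these assumptions, only the re.ASCII variant fails to match 'á', which is the "old behaviour" the flag would preserve.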