Created on 2012-05-31 22:05 by rurpy2, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (6) | |||
|---|---|---|---|
| msg162025 | Author: rurpy (rurpy2) | Date: 2012-05-31 22:05 | |
PEP 414 proposes restoring the "u" string prefix (semantically as a no-op) to make porting from Python 2 easier. I would like to propose that "ur"-strings also interpret embedded "\uxxxx" unicode literals in the Python 2 fashion (as a single unicode character) rather than in the Python 3.2 fashion (as 6 characters).
Many Python 2 programs use unicode literals in strings because they can be represented and displayed in source code with the ASCII character set. For example, I often write ur" \u3000\u3042\t" rather than ur" あ " because the former is much clearer in source code than the latter and does not require the viewer to have a Japanese font installed. However, such a string must be manually converted for Python 3 because the former string has a very different meaning in Python 3 than in Python 2. The equivalent in Python 3 is " \u3000\u3042\\t". AFAIK, 2to3 does not fix this.
Because there are no longer unicode literals in Python 3 raw strings, any string with a unicode literal *has* to be a non-raw string (AFAICT). This means that strings used as regexes, which have a lot of backslashes and also contain unicode literals, must have the backslashes doubled. Doubling the backslashes in the above example is trivial, but it is not trivial in more realistic regexes. Avoiding that doubling was, I thought, one of the main reasons for having raw strings in Python 2. It is unfortunate that one loses this ability (in the presence of unicode literals) in Python 3.
When I raised this issue on the Python users' list [*1], Terry Reedy suggested that, since the "u" string prefix is being reintroduced for Python 3.3, having the prefix also restore the Python 2 unicode literal handling would not introduce any incompatibilities and would greatly increase the ease of porting to Python 3 for some programs.[*2] He subsequently raised the issue on the dev list.[*3]
An argument might be made that this is an extra feature that would encourage use of the "u" prefix beyond easing porting from Python 2. Perhaps so, but there is currently a hole in Python's capability that is difficult to work around, and I've seen no other proposals to fix it. So it seems to me that the benefits of this proposal greatly outweigh that somewhat purist argument.
----
[*1] http://mail.python.org/pipermail/python-list/2012-May/1292870.html
[*2] http://mail.python.org/pipermail/python-list/2012-May/1292887.html
[*3] http://mail.python.org/pipermail/python-dev/2012-May/119760.html |
|||
| msg162051 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-06-01 05:38 | |
See issue3665. |
|||
| msg163045 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2012-06-17 09:33 | |
See also http://mail.python.org/pipermail/python-list/2012-June/625406.html |
|||
| msg163286 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2012-06-20 15:29 | |
Given the resolution of #15096, ISTM this will be a "wontfix", though I'll stop short of marking it as such myself. |
|||
| msg165744 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-07-18 06:17 | |
Should it be closed? |
|||
| msg166834 | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2012-07-29 23:08 | |
Following the resolution of #15096, I've closed this bug as "won't fix". |
|||
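
The behaviour described in the first message can be reproduced with a short Python 3 sketch. It is an illustration, not part of the thread: the variable names are made up for the example, and the last part relies on the documented fact that the `re` module itself interprets `\uXXXX` escapes from Python 3.3 onward, which may ease the regex case the reporter describes.

```python
import re

# In Python 3 a raw string leaves \uXXXX untouched: six characters, not one.
raw = r"\u3042"
cooked = "\u3042"
print(len(raw), len(cooked))

# The text the reporter wants to match: space, U+3000, U+3042, tab.
text = " \u3000\u3042\t"

# Non-raw spelling from the report: the regex backslash must be doubled.
doubled = " \u3000\u3042\\t"

# Raw spelling: from Python 3.3 on, re resolves \u3000 and \u3042 itself,
# so no doubling is needed when the string is used as a regex pattern.
raw_pattern = r" \u3000\u3042\t"

print(re.fullmatch(doubled, text) is not None)
print(re.fullmatch(raw_pattern, text) is not None)
```

This only helps when the string is consumed by `re`; a raw string used for any other purpose still contains the six literal characters.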
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:31 | admin | set | github: 59178 |
| 2012-07-29 23:08:00 | christian.heimes | set | status: open -> closed; nosy: + christian.heimes; messages: + msg166834; resolution: wont fix |
| 2012-07-28 08:57:04 | serhiy.storchaka | set | nosy: - serhiy.storchaka |
| 2012-07-18 06:17:46 | serhiy.storchaka | set | messages: + msg165744 |
| 2012-06-20 15:30:37 | vinay.sajip | set | messages: - msg163288 |
| 2012-06-20 15:30:11 | vinay.sajip | set | messages: + msg163288 |
| 2012-06-20 15:29:55 | vinay.sajip | set | messages: + msg163286 |
| 2012-06-19 16:19:25 | vinay.sajip | set | superseder: Drop support for the "ur" string prefix |
| 2012-06-17 09:33:35 | vinay.sajip | set | nosy: + aronacher, vinay.sajip; messages: + msg163045 |
| 2012-06-01 18:04:15 | eric.araujo | set | nosy: + eric.araujo |
| 2012-06-01 05:38:08 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg162051; title: restore python2 unicode literals in "ru" strings -> restore python2 unicode literals in "ur" strings |
| 2012-05-31 22:05:13 | rurpy2 | create | |