

classification
Title: Make PyUnicode_AsUTF8 return "const char *" rather than "char *"
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: ncoghlan, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016-11-22 07:55 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
PyUnicode_AsUTF8-const.patch serhiy.storchaka, 2016-11-22 07:55
PyUnicode_AsUTF8-const-2.patch serhiy.storchaka, 2016-12-21 09:46
Pull Requests
URL Status Linked
PR 1294 merged vstinner, 2017-04-26 10:00
Messages (10)
msg281439 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-22 07:55
PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8() return a reference to the cached read-only UTF-8 representation of a string. Changing the content of that representation is an error. The proposed patch makes these functions return "const char *" rather than "char *" to enforce this restriction.
This is a backward-incompatible change. Since PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8() can return an error, the result is more likely to be saved in a local variable than passed directly to another function. If the type of this variable is "char *" rather than "const char *", this would cause a compiler error. The fix is simple: add the const qualifier to the local variable declaration (preferred), or cast the result of PyUnicode_AsUTF8AndSize() or PyUnicode_AsUTF8() to "char *".
Neither function is part of the stable API.
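The two fixes described above can be sketched as follows. This is only an illustration: cached_utf8() is a hypothetical stand-in that mimics the proposed "const char *" return type and NULL-on-error convention, not the real CPython function, and the helper names are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PyUnicode_AsUTF8(): returns a pointer to a
 * cached, read-only buffer, or NULL on error. Not the real CPython API. */
static const char *cached_utf8(int fail)
{
    static const char cache[] = "spam";
    return fail ? NULL : cache;
}

/* Preferred fix: declare the local variable as "const char *". */
static size_t utf8_length(int fail)
{
    const char *s = cached_utf8(fail);   /* previously: char *s = ... */
    if (s == NULL) {
        return 0;                        /* error path */
    }
    return strlen(s);
}

/* Fallback fix: cast the const away explicitly where a "char *" is
 * required. The buffer must still never be written through. */
static char *utf8_as_mutable_pointer(int fail)
{
    return (char *)cached_utf8(fail);
}
```

The first form keeps the compiler checking that the cached buffer is never modified; the cast silences that check, so it is best kept for call sites that cannot be changed.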
msg281445 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-22 08:49
No opinion on whether this is a good change to make, but I left some review suggestions.
msg281447 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-22 08:59
Hum, I would like to discuss this topic on python-dev.
Changing PyUnicode_AsUTF8() alone is fine, but the issue with changing the return type is that the const has to be propagated to callers, and then to callers of callers, etc. For example, in your patch, you cast (const char*) to (char*) to call tp_getattr.
The question is: why doesn't tp_getattr use (const char*)?
I would prefer to make an overall decision for the C API, to decide whether it's OK to "propagate" const changes in various places of the C API.
About the stable API: in fact, it's more a stable *ABI*: PEP 384, "Defining a Stable ABI". At the ABI level, const no longer exists. So it's perfectly fine to add or remove const; we already did that in the past.
Obviously, such a change should only be made in Python 3.7.
For me, the main issue is with Python modules compiled with -Werror: if they upgrade to Python 3.7, compilation will fail because they implicitly convert (const char*) to (char*), which triggers a warning with -Wall -Wextra, and that warning is turned into a compilation error.
That's why I suggest having an overall discussion about const on python-dev ;-)
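The propagation problem can be illustrated with a minimal sketch. The callback type below is hypothetical, only loosely modeled on a slot such as tp_getattr that takes a non-const "char *"; all names are invented for the example.

```c
#include <string.h>

/* Hypothetical legacy callback type that takes a non-const name. */
typedef int (*legacy_lookup)(char *name);

/* Example callback: only reads the name. */
static int lookup_len(char *name)
{
    return (int)strlen(name);
}

/* Hypothetical stand-in for a function that now returns "const char *". */
static const char *cached_name(void)
{
    return "attr";
}

static int call_legacy(legacy_lookup fn)
{
    const char *name = cached_name();
    /* Writing fn(name) here would implicitly drop the const qualifier;
     * compilers typically warn about that, and -Werror turns the
     * warning into a build failure. An explicit cast keeps the build
     * clean until the callee's signature gains const itself. */
    return fn((char *)name);
}
```

Each such cast marks a place where const stops propagating; changing the callee's parameter to "const char *" would remove the need for it.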
msg281448 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-22 09:04
Hum, sorry, my opinion on const was not obvious in my previous message: I like const :-) I want to use const everywhere! I still "believe" (I don't know whether it's true or not) that const helps compilers a lot to optimize code.
I don't know if it helps for a single variable. Maybe it's more helpful on a whole structure and/or on pointers, to avoid complex heuristics on aliasing.
My first attempt at designing the _PyBytesWriter API was a big mistake: it was much slower (issue #17742). I understood that using a structure instead of multiple variables stresses the compiler, which doesn't know whether some optimizations are still safe. When in doubt, the compiler doesn't optimize, to avoid generating invalid code.
msg283547 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-18 09:25
Opened a topic on Python-Dev: https://mail.python.org/pipermail/python-dev/2016-December/147029.html.
msg283730 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-21 09:46
Addressed the comments and added the versionchanged directives; the code in _decimal.c is now more obvious.
msg286026 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-22 19:22
Stefan, what are your thoughts about this? The patch touches _decimal.c.
msg286029 - Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-22 20:25
For _decimal I'm happy with just the cast from the first patch: you have a one-line diff and it's easy to see the focus of the issue.
msg286030 - Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-22 21:07
New changeset 0d89212941f4 by Serhiy Storchaka in branch 'default':
Issue #28769: The result of PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8()
https://hg.python.org/cpython/rev/0d89212941f4 
msg292335 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-04-26 11:51
New changeset 6e676954de7c4f3f06dd5b56842c9a2c931a1cab by Victor Stinner in branch 'master':
timemodule.c: Cast PyUnicode_AsUTF8() to char* (#1294)
https://github.com/python/cpython/commit/6e676954de7c4f3f06dd5b56842c9a2c931a1cab
History
Date User Action Args
2022-04-11 14:58:39 admin set github: 72955
2017-04-26 11:51:50 vstinner set messages: + msg292335
2017-04-26 11:19:46 skrah set nosy: - skrah
2017-04-26 11:03:20 martin.panter set nosy: - martin.panter
2017-04-26 10:01:50 serhiy.storchaka set pull_requests: - pull_request1072
2017-04-26 10:00:35 vstinner set pull_requests: + pull_request1401
2017-03-31 16:36:34 dstufft set pull_requests: + pull_request1072
2017-01-22 21:08:02 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2017-01-22 21:07:27 python-dev set nosy: + python-dev; messages: + msg286030
2017-01-22 20:43:09 serhiy.storchaka set assignee: serhiy.storchaka
2017-01-22 20:25:23 skrah set messages: + msg286029
2017-01-22 19:22:44 serhiy.storchaka set nosy: + skrah; messages: + msg286026
2016-12-21 09:46:44 serhiy.storchaka set files: + PyUnicode_AsUTF8-const-2.patch; messages: + msg283730
2016-12-18 09:25:28 serhiy.storchaka set messages: + msg283547
2016-11-22 09:04:14 vstinner set messages: + msg281448
2016-11-22 08:59:10 vstinner set nosy: + vstinner; messages: + msg281447
2016-11-22 08:49:22 martin.panter set nosy: + martin.panter; messages: + msg281445
2016-11-22 07:55:24 serhiy.storchaka create
