Created on 2018-08-27 23:44 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Pull Requests | | |
|---|---|---|
| URL | Status | Linked |
| PR 8963 | merged | vstinner, 2018-08-28 00:12 |
| PR 8991 | merged | vstinner, 2018-08-29 11:33 |
| PR 8995 | merged | vstinner, 2018-08-29 15:43 |
| PR 8998 | merged | vstinner, 2018-08-29 16:52 |
| PR 9003 | merged | vstinner, 2018-08-29 20:55 |
| PR 9005 | merged | vstinner, 2018-08-29 22:08 |
| PR 10232 | merged | vstinner, 2018-10-30 11:24 |
| PR 10672 | merged | vstinner, 2018-11-23 11:23 |
| PR 10673 | merged | vstinner, 2018-11-23 12:12 |
| PR 10759 | merged | vstinner, 2018-11-28 02:22 |
| PR 10761 | merged | vstinner, 2018-11-28 10:34 |
| Messages (16) | |||
|---|---|---|---|
| msg324206 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-27 23:44 | |
Currently, the Python filesystem encoding is determined in the middle of Python initialization: initfsencoding() is only called once the import machinery is ready. Technically, only getting the Python codec name requires a working import machinery. We can get the locale encoding even before Py_Initialize(), and then update the encoding to the Python codec name (e.g. replace "ANSI_X3.4-1968" with "ascii") once the import machinery is ready.

With my change, an application embedding Python can now force the filesystem encoding using _PyCoreConfig.filesystem_encoding. If it is set, Python uses it and ignores the locale encoding.

The attached PR implements this idea and adds _PyCoreConfig.filesystem_encoding. The change moves all the code that selects the encoding and error handler into config_init_fs_encoding(), rather than relying on the Py_FileSystemDefaultEncoding constant (whose default value is set using #ifdef) and on initfsencoding() (to get the locale encoding at runtime).

With this change, it becomes more obvious that the interpreter core configuration is mutable: initfsencoding() modifies the encoding/errors during Python initialization, and sys._enablelegacywindowsfsencoding() even modifies them at runtime. Previously, I wanted to expose the full core_config at the Python level, as something like sys.flags. But since it's mutable, I'm no longer sure that it's a good idea. To *get* the filesystem encoding/errors, there are already sys.getfilesystemencoding() and sys.getfilesystemencodeerrors(). Modifying the filesystem encoding at runtime is a very bad idea: it was tried in the past and caused many issues.

In the long term, I would like to remove Py_HasFileSystemDefaultEncoding, since it's really an implementation detail and there is no need to expose it. More generally, I don't see the purpose of directly exposing the encoding and error handler at the C level: Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors. These symbols are constants; they cannot be modified when Python is embedded. But well, it's just a remark: technically, these 3 symbols don't cause any kind of trouble.

Commit message:

---

bpo-34485: Add _PyCoreConfig.filesystem_encoding

_PyCoreConfig_Read() is now responsible for choosing the filesystem encoding and error handler. Using Py_Main(), the encoding is now selected even before calling Py_Initialize().

_PyCoreConfig.filesystem_encoding is now the reference instead of Py_FileSystemDefaultEncoding for the Python filesystem encoding.

Changes:

* Add filesystem_encoding and filesystem_errors to _PyCoreConfig
* _PyCoreConfig_Read() now reads the locale encoding for the file system encoding. Temporarily coerce the locale or temporarily set the LC_CTYPE locale to the user preferred locale.
* PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize() now use the interpreter configuration rather than the Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors global configuration variables.
* Add _Py_SetFileSystemEncoding() and _Py_ClearFileSystemEncoding() private functions so that Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors are only modified in coreconfig.c.

--- |
|||
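To make the first message more concrete, here is a minimal Python-level sketch of the two things it describes: reading the filesystem encoding and error handler through the existing `sys` accessors, and normalizing a locale encoding name to a Python codec name. It is not the C code from the PR. `codec_name` is a hypothetical helper introduced only for illustration, and it assumes that `codecs.lookup()` is an acceptable stand-in for the normalization done during initialization.

```python
import codecs
import sys


def codec_name(locale_encoding: str) -> str:
    """Return the normalized Python codec name for an encoding name.

    Hypothetical helper for illustration only. codecs.lookup() imports
    the "encodings" package, which is why this step needs a working
    import machinery.
    """
    return codecs.lookup(locale_encoding).name


# Read-only accessors mentioned in the message.
fs_encoding = sys.getfilesystemencoding()
fs_errors = sys.getfilesystemencodeerrors()
print(f"filesystem encoding: {fs_encoding!r}, error handler: {fs_errors!r}")

# A C locale may report "ANSI_X3.4-1968"; Python knows it as an alias.
print(codec_name("ANSI_X3.4-1968"))
```

The split mirrors the point made in the message: the raw locale name can be obtained very early, while the lookup that maps it to a codec name has to wait until modules can be imported.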
| msg324233 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-28 10:21 | |
The compilation failed on my PR when running PCbuild\amd64\_freeze_importlib.exe:

ValueError: only 'strict' and 'surrogateescape' error handlers are supported, not 'surrogatepass'

The error comes from locale_error_handler(). Before my change, PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefault() used Py_FileSystemDefaultEncodeErrors, which is initialized to "surrogateescape" and only set to "surrogatepass" by initfsencoding(). With my change, the error handler is directly set to "surrogatepass", but currently unicode_encode_locale() and unicode_decode_locale() don't accept this error handler. |
|||
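For readers unfamiliar with the two error handlers named in the error above, the following sketch shows how they differ when used with Python's built-in UTF-8 codec. It only illustrates the handlers themselves, not the locale codec functions changed by the PRs. The function names and the sample bytes are hypothetical.

```python
def decode_escaped(raw: bytes) -> str:
    """Decode bytes, mapping each undecodable byte to a lone surrogate
    in the range U+DC80..U+DCFF (the "surrogateescape" handler)."""
    return raw.decode("utf-8", "surrogateescape")


def encode_escaped(text: str) -> bytes:
    """Encode text, turning escaped surrogates back into the original bytes."""
    return text.encode("utf-8", "surrogateescape")


def encode_passed(text: str) -> bytes:
    """Encode text, writing surrogate code points as ordinary 3-byte
    UTF-8 sequences (the "surrogatepass" handler)."""
    return text.encode("utf-8", "surrogatepass")


# Hypothetical sample: a name containing a byte that is not valid UTF-8.
raw = b"caf\xe9"
text = decode_escaped(raw)

print(ascii(text))           # the invalid byte became a lone surrogate
print(encode_escaped(text))  # round trip restores the original bytes
print(encode_passed(text))   # the surrogate is encoded as 3 bytes instead
```

With "surrogateescape" the round trip is lossless for arbitrary bytes, whereas "surrogatepass" preserves surrogate code points themselves, which is why the two handlers are not interchangeable.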
| msg324243 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-28 12:36 | |
> The compilation failed on my PR when running PCbuild\amd64\_freeze_importlib.exe: (...)

This issue should now be fixed. |
|||
| msg324249 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-28 13:01 | |
See also bpo-34485, which makes it possible to select (using _PyCoreConfig) the encoding and error handlers of standard streams like sys.stdout. |
|||
| msg324318 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 11:25 | |
New changeset b2457efc78b74a1d6d1b77d11a939e886b8a4e2c by Victor Stinner in branch 'master': bpo-34523: Add _PyCoreConfig.filesystem_encoding (GH-8963) https://github.com/python/cpython/commit/b2457efc78b74a1d6d1b77d11a939e886b8a4e2c |
|||
| msg324319 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 11:45 | |
New changeset 70fead25e503a742ad4c919b151b9b2b5facee36 by Victor Stinner in branch 'master': bpo-34523: Fix config_init_fs_encoding() (GH-8991) https://github.com/python/cpython/commit/70fead25e503a742ad4c919b151b9b2b5facee36 |
|||
| msg324339 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 17:32 | |
New changeset c5989cd87659acbfd4d19dc00dbe99c3a0fc9bd2 by Victor Stinner in branch 'master': bpo-34523: Py_DecodeLocale() use UTF-8 on Windows (GH-8998) https://github.com/python/cpython/commit/c5989cd87659acbfd4d19dc00dbe99c3a0fc9bd2 |
|||
| msg324347 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 20:21 | |
New changeset 3d4226a832cabc630402589cc671cc4035d504e5 by Victor Stinner in branch 'master': bpo-34523: Support surrogatepass in locale codecs (GH-8995) https://github.com/python/cpython/commit/3d4226a832cabc630402589cc671cc4035d504e5 |
|||
| msg324351 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 21:27 | |
New changeset de427556746aa41a8b5198924ce423021bc0c718 by Victor Stinner in branch 'master': bpo-34523: Py_FileSystemDefaultEncoding NULL by default (GH-9003) https://github.com/python/cpython/commit/de427556746aa41a8b5198924ce423021bc0c718 |
|||
| msg324353 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 22:50 | |
New changeset fbca90856d96273fd87c0b126f6e7966af7fbf7b by Victor Stinner in branch 'master': bpo-34523: Use _PyCoreConfig instead of globals (GH-9005) https://github.com/python/cpython/commit/fbca90856d96273fd87c0b126f6e7966af7fbf7b |
|||
| msg324354 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-08-29 22:51 | |
Implementing this idea was much more complicated than expected, but it seems to be working now! I'm closing the issue. |
|||
| msg328900 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-10-30 11:58 | |
New changeset 905f1ace5f7424e314ca7bed997868a2a3044839 by Victor Stinner in branch 'master': bpo-34523: Fix config_init_fs_encoding() for ASCII (GH-10232) https://github.com/python/cpython/commit/905f1ace5f7424e314ca7bed997868a2a3044839 |
|||
| msg330311 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-11-23 12:08 | |
New changeset 353933e712b6c7f7ba9a9a50bd5bd472db7c35d0 by Victor Stinner in branch 'master': bpo-34523: Fix C locale coercion on FreeBSD CURRENT (GH-10672) https://github.com/python/cpython/commit/353933e712b6c7f7ba9a9a50bd5bd472db7c35d0 |
|||
| msg330317 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-11-23 12:37 | |
New changeset f6e323ce322cf54b1a9e9252b13f93ebc28b5c24 by Victor Stinner in branch '3.7': bpo-34523: Fix C locale coercion on FreeBSD CURRENT (GH-10672) (GH-10673) https://github.com/python/cpython/commit/f6e323ce322cf54b1a9e9252b13f93ebc28b5c24 |
|||
| msg330586 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-11-28 09:26 | |
New changeset bde9d6bbb46ca59bcee5d5060adaa33c3ffee3a6 by Victor Stinner in branch 'master': bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) https://github.com/python/cpython/commit/bde9d6bbb46ca59bcee5d5060adaa33c3ffee3a6 |
|||
| msg330591 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-11-28 11:42 | |
New changeset 85ab974f78c0ebcfa611639864640d0273eb5466 by Victor Stinner in branch '3.7': bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) (GH-10761) https://github.com/python/cpython/commit/85ab974f78c0ebcfa611639864640d0273eb5466 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:05 | admin | set | github: 78704 |
| 2018-11-28 11:42:43 | vstinner | set | messages: + msg330591 |
| 2018-11-28 10:34:32 | vstinner | set | pull_requests: + pull_request10009 |
| 2018-11-28 09:26:24 | vstinner | set | messages: + msg330586 |
| 2018-11-28 02:22:52 | vstinner | set | pull_requests: + pull_request10006 |
| 2018-11-23 12:37:45 | vstinner | set | messages: + msg330317 |
| 2018-11-23 12:12:00 | vstinner | set | pull_requests: + pull_request9928 |
| 2018-11-23 12:08:31 | vstinner | set | messages: + msg330311 |
| 2018-11-23 11:23:01 | vstinner | set | pull_requests: + pull_request9926 |
| 2018-10-30 11:58:13 | vstinner | set | messages: + msg328900 |
| 2018-10-30 11:24:41 | vstinner | set | pull_requests: + pull_request9544 |
| 2018-08-29 22:51:48 | vstinner | set | status: open -> closed resolution: fixed messages: + msg324354 stage: patch review -> resolved |
| 2018-08-29 22:50:51 | vstinner | set | messages: + msg324353 |
| 2018-08-29 22:08:34 | vstinner | set | pull_requests: + pull_request8476 |
| 2018-08-29 21:27:00 | vstinner | set | messages: + msg324351 |
| 2018-08-29 20:55:09 | vstinner | set | pull_requests: + pull_request8474 |
| 2018-08-29 20:21:41 | vstinner | set | messages: + msg324347 |
| 2018-08-29 17:32:51 | vstinner | set | messages: + msg324339 |
| 2018-08-29 16:52:34 | vstinner | set | pull_requests: + pull_request8469 |
| 2018-08-29 15:43:15 | vstinner | set | pull_requests: + pull_request8466 |
| 2018-08-29 11:45:37 | vstinner | set | messages: + msg324319 |
| 2018-08-29 11:33:09 | vstinner | set | pull_requests: + pull_request8464 |
| 2018-08-29 11:25:40 | vstinner | set | messages: + msg324318 |
| 2018-08-28 13:01:29 | vstinner | set | messages: + msg324249 |
| 2018-08-28 12:36:10 | vstinner | set | messages: + msg324243 |
| 2018-08-28 10:21:43 | vstinner | set | messages: + msg324233 |
| 2018-08-28 00:13:07 | vstinner | set | title: Choose the filesystem encoding earlier in Python initialization (add _PyCoreConfig.filesystem_encoding) -> Choose the filesystem encoding before Python initialization (add _PyCoreConfig.filesystem_encoding) |
| 2018-08-28 00:12:53 | vstinner | set | keywords: + patch stage: patch review pull_requests: + pull_request8436 |
| 2018-08-27 23:44:03 | vstinner | create | |