
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: More correct string truncating in PyUnicode_FromFormat()
Type: enhancement Stage: needs patch
Components: Interpreter Core Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Drekin, ezio.melotti, gvanrossum, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2016-01-12 09:54 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin.

Messages (5)
msg258092 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-01-12 09:54
C code often uses %.<number><format> in PyUnicode_FromFormat(). %.200s guards against unbounded output when a broken pointer points to random data that is not null-terminated. %.200R is used to limit the size of human-readable messages.
In all these cases the formatted string can look well-formed with short data but malformed with long data (an unclosed quote, a truncated backslash escape, or a � decoded from a truncated UTF-8 sequence).
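The three kinds of malformed output can be reproduced at the Python level, which may make them easier to see. This is only an illustration: it uses Python's own %-formatting and bytes decoding as stand-ins for the C-level %R and %s handling, and the sample strings are made up.

```python
# Illustration only: Python-level analogues of the truncation problems.
text = "hello world"

# 1. Precision applied to %r cuts the repr and drops the closing quote.
unclosed = "%.8r" % text
print(unclosed)    # 'hello w

# 2. Cutting inside an escape sequence leaves a dangling backslash.
dangling = "%.4r" % "ab\n"
print(dangling)    # 'ab\

# 3. Cutting UTF-8 bytes inside a multi-byte character and then decoding
#    with the "replace" error handler yields U+FFFD.
cut = "caf\u00e9".encode("utf-8")[:4]    # b'caf\xc3', half of the e-acute
replaced = cut.decode("utf-8", errors="replace")
print(replaced)    # caf followed by the replacement character
```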
I propose making truncation in PyUnicode_FromFormat() smarter:
1. A truncated %R should keep at least the final character (the closing quote or ">").
2. Truncated output should include "..." or "[...]" as a truncation marker.
3. \c, \OOO, \xXX, \uXXXX, and \UXXXXXXXX should not be truncated. It is better to omit these sequences entirely (cut the string before them) than to output them truncated.
4. For %s, a UTF-8 sequence should not be truncated in the middle of a character.
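The four rules above could be prototyped in pure Python roughly as follows. This is a minimal sketch and not the proposed C implementation: the names truncate_repr and truncate_utf8, the "..." marker, and the way the length budget is split are all assumptions made for illustration, and truncate_utf8 assumes its input is valid UTF-8.

```python
import re

# One token is either a complete backslash escape or a single character,
# so an escape sequence is never split in two.
_TOKEN = re.compile(
    r"\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8}"
    r"|\\[0-7]{1,3}|\\.|.",
    re.DOTALL,
)


def truncate_repr(obj, limit, marker="..."):
    """Hypothetical helper: repr(obj) cut to at most `limit` characters."""
    text = repr(obj)
    if len(text) <= limit:
        return text
    closer = text[-1]                       # closing quote, ">" and so on
    budget = limit - len(marker) - len(closer)
    if budget <= 0:
        return text[:limit]                 # too short for smart truncation
    kept = []
    used = 0
    for match in _TOKEN.finditer(text):
        token = match.group()
        if used + len(token) > budget:
            break                           # omit the whole token (rule 3)
        kept.append(token)
        used += len(token)
    # Rule 2 adds the marker, rule 1 keeps the final character.
    return "".join(kept) + marker + closer


def truncate_utf8(data, limit):
    """Hypothetical helper: cut valid UTF-8 bytes on a character boundary."""
    if len(data) <= limit:
        return data
    end = limit
    # Continuation bytes look like 0b10xxxxxx; back up to the lead byte.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


print(truncate_repr("hello world", 10))     # 'hello...'
print(truncate_repr("a\nbcdef", 7))         # 'a...'
print(truncate_utf8("caf\u00e9!".encode("utf-8"), 4))    # b'caf'
```

In this sketch the result never exceeds the limit, at the cost of showing slightly less data than a plain cut would; whether the marker should count toward the limit is a design choice the proposal leaves open.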
msg258095 - Author: STINNER Victor (vstinner) (Python committer) Date: 2016-01-12 10:01
See my old issue #10833 which proposed to *remove* the arbitrary limit
on strings. It was rejected.
msg258108 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2016-01-12 16:29
Could we make this feature available at the Python level too? It sounds
really useful.
--Guido (mobile)
msg258111 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-01-12 17:49
I think we can make this feature available with classic formatting ('%.100r'), but with new-style formatting ('{0!r:.100}'), and especially with f-strings, this may not be so easy.
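For reference, all three Python-level spellings already accept a precision on a repr, and today each one simply cuts the text. The snippet below is only an illustration with a made-up sample value. Note that in new-style formatting the !r conversion is written before the format spec. One plausible reading of the difficulty is that repr() is applied first and the resulting str is then formatted on its own, so the code applying the precision has no way to tell that the text came from a repr.

```python
value = "hello world"

old_style = "%.8r" % value               # classic %-formatting
new_style = "{0!r:.8}".format(value)     # str.format: conversion, then spec
f_string = f"{value!r:.8}"               # f-string, same field syntax

# All three currently give the same naive cut, without a closing quote.
print(old_style)    # 'hello w
print(new_style)    # 'hello w
print(f_string)     # 'hello w
```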
msg258118 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2016-01-12 19:08
Well it seems a little odd to spend effort on a corner case of the C-level
error messages if we can't even replicate it in pure Python.
History
Date                 User              Action  Args
2022-04-11 14:58:26  admin             set     github: 70278
2016-06-21 07:22:59  Drekin            set     nosy: + Drekin
2016-01-12 19:08:39  gvanrossum        set     messages: + msg258118
2016-01-12 17:49:08  serhiy.storchaka  set     messages: + msg258111
2016-01-12 16:29:08  gvanrossum        set     messages: + msg258108
2016-01-12 10:01:46  vstinner          set     messages: + msg258095
2016-01-12 09:56:27  ezio.melotti      set     nosy: + ezio.melotti; stage: needs patch
2016-01-12 09:54:14  serhiy.storchaka  create
