Message201657
| Author |
vstinner |
| Recipients |
Arfrever, ezio.melotti, serhiy.storchaka, vstinner |
| Date |
2013-10-29.19:19:01 |
| Message-id |
<1383074341.74.0.754154238423.issue19424@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
> I don't see a benefit from this patch.
Oh, sorry, I forgot to explain the motivation. The performance of the warnings module is not critical. The motivation here is correctness: the patch avoids encoding strings to UTF-8. For example, _PyUnicode_AsString(filename) fails if the filename contains a surrogate character.
>>> warnings.warn_explicit("text", RuntimeError, "filename", 5)
filename:5: RuntimeError: text
>>> warnings.warn_explicit("text", RuntimeError, "filename\udc80", 5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 8: surrogates not allowed
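The failure above can be reproduced without the warnings module. The following is only a minimal sketch: the helper name can_encode_utf8 is hypothetical, and it assumes the lone surrogate comes from decoding an undecodable byte with the surrogateescape error handler, which is how such file names usually arise.

```python
def can_encode_utf8(text):
    """Return True if text can be encoded with the strict UTF-8 codec.

    Hypothetical helper, used only to illustrate the failure mode.
    """
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


# A byte that is not valid UTF-8 becomes a lone surrogate when decoded
# with the surrogateescape error handler.
raw = b"filename\x80"
decoded = raw.decode("utf-8", "surrogateescape")

print(ascii(decoded))                 # show the string with escapes
print(can_encode_utf8("filename"))    # plain ASCII name
print(can_encode_utf8(decoded))       # name with a lone surrogate

# The same error handler restores the original bytes.
restored = decoded.encode("utf-8", "surrogateescape")
```

Strict UTF-8 encoding rejects the surrogate, so any code path that encodes the file name this way raises UnicodeEncodeError, while code that keeps the str object untouched does not.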
Another example, where a string is encoded to UTF-8 and then decoded from UTF-8 a few instructions later:
PyObject *to_str = PyObject_Str(item);
err_str = _PyUnicode_AsString(to_str);
...
PyErr_Format(PyExc_RuntimeError, "...%s", err_str);
Using "%R" avoids any encoding conversion. |
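As a rough Python-level analogy of the two C code paths (a sketch only, not the code from the patch): format_via_utf8 and format_via_repr are hypothetical names, and the "bad item" message text is invented for illustration. Note that "%R" in PyErr_Format inserts the result of PyObject_Repr(), so the output is the repr of the object, not its str().

```python
def format_via_utf8(item):
    """Mimic the old path: str(item) -> UTF-8 bytes -> "%s" formatting."""
    err_bytes = str(item).encode("utf-8")   # like _PyUnicode_AsString(to_str)
    return "bad item: %s" % err_bytes.decode("utf-8")


def format_via_repr(item):
    """Mimic "%R": format repr(item) directly, with no encoding step."""
    return "bad item: %r" % (item,)


name = "name\udc80"  # contains a lone surrogate

try:
    format_via_utf8(name)
    utf8_path_failed = False
except UnicodeEncodeError:
    utf8_path_failed = True

print(utf8_path_failed)
print(format_via_repr(name))
```

The repr-based path never leaves the str domain, so it cannot fail on surrogates; the price is that the formatted value is quoted and escaped.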
|
History
|
|---|
| Date |
User |
Action |
Args |
| 2013-10-29 19:19:01 | vstinner | set | recipients:
+ vstinner, ezio.melotti, Arfrever, serhiy.storchaka |
| 2013-10-29 19:19:01 | vstinner | set | messageid: <1383074341.74.0.754154238423.issue19424@psf.upfronthosting.co.za> |
| 2013-10-29 19:19:01 | vstinner | link | issue19424 messages |
| 2013-10-29 19:19:01 | vstinner | create |
|