Message199462
| Author |
serhiy.storchaka |
| Recipients |
barry, christian.heimes, kristjan.jonsson, pitrou, serhiy.storchaka, vstinner |
| Date |
2013-10-11.11:57:42 |
| Message-id |
<1381492663.04.0.601157666373.issue19219@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
> - unmarshalling ASCII strings is faster: you can pass 127 to PyUnicode_New without scanning for non-ASCII chars
You should ensure that the loaded bytes are ASCII-only. Otherwise broken or malicious marshalled data will compromise your program. Decoding UTF-8 is as fast as decoding ASCII (with checks) and is almost as fast as memcpy.
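A minimal Python-level sketch of that check, for illustration only. The real marshal code is C, and the helper names `load_ascii_checked` and `load_utf8` are hypothetical. It shows a loader that rejects a payload marked as ASCII but containing bytes >= 0x80, next to a plain UTF-8 decode that accepts the same ASCII input without any special case:

```python
def load_ascii_checked(payload: bytes) -> str:
    """Hypothetical loader: decode a payload that claims to be ASCII-only.

    The ascii codec validates every byte, so corrupted or malicious data
    raises instead of producing a string with a wrong "max char" claim.
    """
    try:
        return payload.decode("ascii")
    except UnicodeDecodeError as exc:
        raise ValueError("payload marked as ASCII contains non-ASCII bytes") from exc


def load_utf8(payload: bytes) -> str:
    """Hypothetical loader: decode as UTF-8, which also covers pure ASCII."""
    return payload.decode("utf-8")


if __name__ == "__main__":
    print(load_ascii_checked(b"spam"))   # valid ASCII passes the check
    print(load_utf8(b"spam"))            # same bytes, no ASCII special case
    try:
        load_ascii_checked(b"sp\xe4m")   # byte 0xE4 is not ASCII
    except ValueError as err:
        print("rejected:", err)
```

Both loaders return the same string for valid ASCII input; the only difference is which invalid inputs they reject.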
As for output, we could use the cached UTF-8 representation of the string (it always exists for ASCII-only strings) before calling PyUnicode_AsUTF8String().
I'm fine with the buffering and with the codes for short strings and tuples (I have not examined the code closely yet), but special-casing ASCII does not look so good to me. |
|