Message167730
| Author |
vstinner |
| Recipients |
alexandre.vassalotti, pitrou, vstinner |
| Date |
2012-08-08.22:38:39 |
| SpamBayes Score |
-1.0 |
| Marked as misclassified |
Yes |
| Message-id |
<1344465522.38.0.375715302831.issue15596@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
Serialization of Unicode strings in the pickle module is suboptimal, especially for long strings.
The attached patch optimizes the serialization by taking advantage of the new properties of Unicode strings (PEP 393):
* text (protocol 0): avoid any temporary buffer if the string is an ASCII or latin1 string without any "\\" or "\n" character; otherwise use a single small buffer of 64 KB (instead of two buffers)
* binary (protocol 1, 2): avoid any temporary buffer if the string is an ASCII string or if the string is already available encoded as UTF-8
The current code for protocol 0 uses raw_unicode_escape(), which is really suboptimal: it writes the escaped string into a first buffer, and then copies it into a new temporary buffer of the right size (instead of just calling _PyBytes_Resize). |
|
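The fast-path conditions listed in the message can be sketched in Python. This is only an illustration of the checks as the message describes them, not the patch itself (which is C code in the pickle accelerator). The helper names `can_skip_buffer_text` and `can_skip_buffer_binary` are hypothetical, and the "already available encoded as UTF-8" case is an internal detail of the string object that cannot be observed from Python, so it is left out.

```python
import pickle


def can_skip_buffer_text(s: str) -> bool:
    """Protocol 0 condition from the message (hypothetical helper):
    every character fits in latin1 and none is a backslash or newline,
    so no character needs escaping."""
    is_latin1 = all(ord(ch) < 256 for ch in s)
    return is_latin1 and "\\" not in s and "\n" not in s


def can_skip_buffer_binary(s: str) -> bool:
    """Protocol 1/2 condition from the message (hypothetical helper):
    a pure ASCII string has a UTF-8 encoding identical to its code points.
    The cached-UTF-8 case mentioned in the message is not modelled here."""
    return all(ord(ch) < 128 for ch in s)


# When the text condition holds, the protocol 0 payload is simply the
# latin1 bytes of the string, placed after the UNICODE opcode b"V".
sample = "caf\xe9"
if can_skip_buffer_text(sample):
    assert pickle.dumps(sample, 0).startswith(b"V" + sample.encode("latin-1") + b"\n")

# Strings that need escaping still round-trip correctly.
assert pickle.loads(pickle.dumps("a\nb\\c", 0)) == "a\nb\\c"
```

Strings that fail these checks fall back to the escaping (protocol 0) or UTF-8 encoding (protocols 1 and 2) path, which is where the message proposes the single 64 KB buffer.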
History
|
|---|
| Date |
User |
Action |
Args |
| 2012-08-08 22:38:42 | vstinner | set | recipients:
+ vstinner, pitrou, alexandre.vassalotti |
| 2012-08-08 22:38:42 | vstinner | set | messageid: <1344465522.38.0.375715302831.issue15596@psf.upfronthosting.co.za> |
| 2012-08-08 22:38:41 | vstinner | link | issue15596 messages |
| 2012-08-08 22:38:41 | vstinner | create |
|