Message149695

| Field | Value |
|---|---|
| Author | vstinner |
| Recipients | ezio.melotti, pitrou, vstinner |
| Date | 2011-12-17 18:49:11 |
| SpamBayes Score | 3.4049632e-09 |
| Marked as misclassified | No |
| Message-id | <1324147751.99.0.957308589374.issue13624@psf.upfronthosting.co.za> |
| In-reply-to | |

Content:
The iobench benchmarking tool showed that the UTF-8 encoder is slower in Python 3.3 than in Python 3.2. The performance depends on the characters of the input string:
* 8x faster (!) for a string of 50,000 ASCII characters
* 1.5x slower for a string of 50,000 UCS-1 characters
* 2.5x slower for a string of 50,000 UCS-2 characters
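
The report cites iobench; as a hedged illustration only (this harness, its file name, and the loop counts are assumptions, not part of iobench), a standalone program along these lines could reproduce the comparison by embedding CPython and timing PyUnicode_AsUTF8String() on 50,000-character strings of each kind, once built against each interpreter:

```c
/* bench_utf8.c -- build with: cc bench_utf8.c $(python3-config --cflags --ldflags) */
#include <Python.h>
#include <stdio.h>
#include <time.h>

#define LENGTH 50000   /* characters per test string, as in the report */
#define LOOPS  1000

/* Build a string of `n` copies of the code point `ch`;
 * error checks omitted for brevity. */
static PyObject *
make_string(Py_UCS4 ch, Py_ssize_t n)
{
    PyObject *one = PyUnicode_FromOrdinal((int)ch);
    PyObject *s = PySequence_Repeat(one, n);
    Py_DECREF(one);
    return s;
}

static void
bench(const char *label, Py_UCS4 ch)
{
    PyObject *s = make_string(ch, LENGTH);
    clock_t t0 = clock();
    for (int i = 0; i < LOOPS; i++) {
        PyObject *b = PyUnicode_AsUTF8String(s);  /* the encoder under test */
        Py_DECREF(b);
    }
    printf("%-16s %.3f s\n", label, (double)(clock() - t0) / CLOCKS_PER_SEC);
    Py_DECREF(s);
}

int main(void)
{
    Py_Initialize();
    bench("ASCII ('a')", 'a');       /* 1-byte kind, pure ASCII */
    bench("UCS-1 (U+00E9)", 0xE9);   /* 1-byte kind, non-ASCII */
    bench("UCS-2 (U+20AC)", 0x20AC); /* 2-byte kind */
    Py_Finalize();
    return 0;
}
```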
The bottleneck appears to be the PyUnicode_READ() macro, which replaced direct indexing:
* Python 3.2: s[i++]
* Python 3.3: PyUnicode_READ(kind, data, i++)

Because encoding a string to UTF-8 is a very common operation, performance matters. Antoine suggests having a different version of the function for each Unicode kind (1, 2 or 4 bytes per character), as sketched below.
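
A minimal sketch of that kind-specialized approach, assuming a ready (PEP 393) string; the helper name and macro are hypothetical, not CPython's actual encoder, and error and surrogate handling are omitted:

```c
#include <Python.h>
#include <stdlib.h>

/* One tight loop per kind: the pointer type gives the compiler a fixed
 * element width, the moral equivalent of Python 3.2's s[i++]. */
#define ENCODE_KIND(TYPE, data, len, out)                              \
    do {                                                               \
        const TYPE *in = (const TYPE *)(data);                         \
        for (Py_ssize_t i = 0; i < (len); i++) {                       \
            Py_UCS4 ch = in[i];                                        \
            if (ch < 0x80)                                             \
                *out++ = (unsigned char)ch;                            \
            else if (ch < 0x800) {                                     \
                *out++ = (unsigned char)(0xC0 | (ch >> 6));            \
                *out++ = (unsigned char)(0x80 | (ch & 0x3F));          \
            }                                                          \
            else if (ch < 0x10000) {                                   \
                *out++ = (unsigned char)(0xE0 | (ch >> 12));           \
                *out++ = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));   \
                *out++ = (unsigned char)(0x80 | (ch & 0x3F));          \
            }                                                          \
            else {                                                     \
                *out++ = (unsigned char)(0xF0 | (ch >> 18));           \
                *out++ = (unsigned char)(0x80 | ((ch >> 12) & 0x3F));  \
                *out++ = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));   \
                *out++ = (unsigned char)(0x80 | (ch & 0x3F));          \
            }                                                          \
        }                                                              \
    } while (0)

/* Hypothetical helper: encode a string to UTF-8, dispatching on the
 * kind once instead of once per character via PyUnicode_READ().
 * Caller frees the returned buffer. */
static unsigned char *
utf8_encode_sketch(PyObject *unicode, Py_ssize_t *out_len)
{
    int kind = PyUnicode_KIND(unicode);
    void *data = PyUnicode_DATA(unicode);
    Py_ssize_t len = PyUnicode_GET_LENGTH(unicode);
    unsigned char *buf = malloc((size_t)len * 4);  /* worst case: 4 B/char */
    unsigned char *out = buf;

    if (buf == NULL)
        return NULL;
    switch (kind) {
    case PyUnicode_1BYTE_KIND: ENCODE_KIND(Py_UCS1, data, len, out); break;
    case PyUnicode_2BYTE_KIND: ENCODE_KIND(Py_UCS2, data, len, out); break;
    case PyUnicode_4BYTE_KIND: ENCODE_KIND(Py_UCS4, data, len, out); break;
    }
    *out_len = out - buf;
    return buf;
}
```

A real implementation would presumably also special-case ASCII-only 1-byte strings, whose UTF-8 form is a plain memcpy of the data, which would explain the 8x ASCII speedup reported above.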
History

| Date | User | Action | Args |
|---|---|---|---|
| 2011-12-17 18:49:12 | vstinner | set | recipients: + vstinner, pitrou, ezio.melotti |
| 2011-12-17 18:49:11 | vstinner | set | messageid: <1324147751.99.0.957308589374.issue13624@psf.upfronthosting.co.za> |
| 2011-12-17 18:49:11 | vstinner | link | issue13624 messages |
| 2011-12-17 18:49:11 | vstinner | create | |