Message105094
| Author |
vstinner |
| Recipients |
lars.gustaebel, loewis, vstinner |
| Date |
2010-05-05.22:14:49 |
| Message-id |
<1273097691.15.0.651636064492.issue8633@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
tarfile is unable to open a TAR archive in PAX format that contains an invalid filename (a filename not encoded in UTF-8, and therefore undecodable). The attached file is an example: it contains the file b'z/\xff', which is not decodable as UTF-8.
The PAX specification has an "invalid" option with 4 values: bypass (default), rename, UTF-8, write.
http://www.opengroup.org/onlinepubs/009695399/utilities/pax.html
As was done for the other formats in issue #8390, PAX could use Python's surrogateescape error handler to store undecodable bytes as Unicode surrogates.
I think that PAX should be strict by default, but have an option to enable surrogateescape mode. |
|
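For reference, a minimal sketch of how the surrogateescape handler treats the file name from the attached example. It only illustrates the codec behaviour, not tarfile's own code; the variable names are made up for the illustration.

```python
# File name from the attached archive: the byte 0xff is not valid UTF-8.
raw = b"z/\xff"

# Strict decoding (the proposed default) rejects the name.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate
# (0xff becomes U+DCFF), so the name can be held as a str.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes.
roundtrip = name.encode("utf-8", "surrogateescape")

print(strict_ok, ascii(name), roundtrip)
```

The round trip is lossless, which is why the handler is suitable for storing undecodable file names.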
History
| Date | User | Action | Args |
|---|---|---|---|
| 2010-05-05 22:14:51 | vstinner | set | recipients: + vstinner, loewis, lars.gustaebel |
| 2010-05-05 22:14:51 | vstinner | set | messageid: <1273097691.15.0.651636064492.issue8633@psf.upfronthosting.co.za> |
| 2010-05-05 22:14:49 | vstinner | link | issue8633 messages |
| 2010-05-05 22:14:49 | vstinner | create | |