This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2014-11-13 13:16 by fhoech, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Messages (7) | |||
|---|---|---|---|
| msg231110 - (view) | Author: Florian Höch (fhoech) * | Date: 2014-11-13 13:16 | |
If 'top' is a unicode directory name, os.listdir can still return non-unicode (byte string) filenames when they cannot be decoded. The Python 2.x standard library version of os.walk does not handle this case, so join(top, name) fails on such filenames with a UnicodeDecodeError. |
|||
| msg231111 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014-11-13 13:23 | |
What is your OS? |
|||
| msg231112 - (view) | Author: Florian Höch (fhoech) * | Date: 2014-11-13 13:30 | |
This problem only affects Linux as far as I know (in my case I'm using Fedora 21 Beta). |
|||
| msg231115 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014-11-13 14:40 | |
Your problem has two solutions. 1) Upgrade to Python 3, which handles your use case correctly (thanks to PEP 383 and its surrogateescape error handler). 2) Process filenames only as bytes, and encode/decode manually (so you can decide how to handle undecodable filenames). |
|||
| msg231117 - (view) | Author: Florian Höch (fhoech) * | Date: 2014-11-13 14:50 | |
1) Is not yet possible for me unfortunately, some libraries I require are not yet available for Python 3 (but in the long run, this would be my preferred solution) 2) Would necessitate too many changes in a carefully crafted, unicode-only application. I think I'll just override os.listdir and filter out filenames that are not decodable, or override os.walk and do something equivalent. |
|||
| msg231118 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014-11-13 14:57 | |
> 1) Is not yet possible for me unfortunately, some libraries I require are not yet available for Python 3 (but in the long run, this would be my preferred solution) I'm curious, which libraries? Oh, I forgot to say that it's not possible to fix this issue in Python 2. Backporting PEP 383 to Python 2 requires deep changes in the Unicode machinery, starting with the UTF-8 codec. Currently, the Python 2 UTF-8 encoder encodes surrogates, which violates the Unicode standard and makes it impossible to use this codec with the surrogateescape error handler. |
|||
| msg231120 - (view) | Author: Florian Höch (fhoech) * | Date: 2014-11-13 15:16 | |
> I'm curious, which libraries? wxPython and wexpect (wexpect I could probably port myself, so the problem is mainly with wx) > Oh, I forgot to say that it's not possible to fix this issue in Python 2. Backporting PEP 383 to Python 2 requires deep changes in the Unicode machinery, starting with the UTF-8 codec. OK, that's understandable, of course. |
|||
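
The two workarounds discussed in the messages above can be sketched roughly as follows. This is only an illustration written for Python 3, not code from the thread: `decode_or_none` and `walk_decodable` are hypothetical helper names, and UTF-8 is assumed as the filename encoding. The first part walks a tree as bytes and skips names that cannot be decoded (the filtering idea from msg231117); the last lines show the PEP 383 surrogateescape round trip mentioned in msg231115.

```python
import os


def decode_or_none(name, encoding="utf-8"):
    """Decode a bytes filename, returning None when it is not valid in `encoding`."""
    try:
        return name.decode(encoding)
    except UnicodeDecodeError:
        return None


def walk_decodable(top, encoding="utf-8"):
    """Walk the bytes path `top`, yielding (dirpath, filenames) as text.

    Hypothetical helper for illustration; undecodable names are skipped.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune undecodable directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if decode_or_none(d, encoding) is not None]
        text_dir = decode_or_none(dirpath, encoding)
        if text_dir is None:
            continue  # only possible when `top` itself is undecodable
        decoded = (decode_or_none(f, encoding) for f in filenames)
        yield text_dir, [f for f in decoded if f is not None]


# PEP 383 in Python 3: an undecodable byte becomes a lone surrogate and
# round-trips back to the original bytes when encoded again.
raw = b"caf\xe9.txt"  # Latin-1 encoded name, invalid as UTF-8
escaped = raw.decode("utf-8", "surrogateescape")
assert escaped.encode("utf-8", "surrogateescape") == raw
```

Under Python 2 the same filtering idea would apply to the `str` (byte string) names returned by os.listdir, but the surrogateescape handler is not available there, as explained in msg231118.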
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:10 | admin | set | github: 67051 |
| 2014-11-13 15:16:27 | fhoech | set | messages: + msg231120 |
| 2014-11-13 15:15:54 | r.david.murray | set | status: open -> closed resolution: wont fix stage: resolved |
| 2014-11-13 14:57:11 | vstinner | set | messages: + msg231118 |
| 2014-11-13 14:50:07 | fhoech | set | messages: + msg231117 |
| 2014-11-13 14:40:44 | vstinner | set | messages: + msg231115 |
| 2014-11-13 13:30:54 | fhoech | set | messages: + msg231112 |
| 2014-11-13 13:23:11 | vstinner | set | nosy: + ezio.melotti, vstinner messages: + msg231111 components: + Unicode |
| 2014-11-13 13:16:20 | fhoech | create | |