[Python-Dev] Python-3.0, unicode, and os.environ
Stephen J. Turnbull
stephen at xemacs.org
Fri Dec 12 09:57:20 CET 2008
Toshio Kuratomi writes:
> Adam Olsen wrote:
> > On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> >> Unfortunately, even programmers experienced in I18N like Martin, and
> >> those with intuition-that-has-the-force-of-law<wink> like Guido,
> >> express deliberate disbelief on this point. They say that filesystem
> >> names and environment variable values are text, which is true from the
> >> semantic viewpoint but can't be fully supported by any implementation.
> >
> > With all the focus on backup tools and file managers I think we've
> > lost perspective. They're an important use case, but hardly the
> > dominant one.
True.
> > Please, as a user, if your app is creating new files, do NOT use
> > bytes! You have no excuse for creating garbage, and garbage doesn't
> > help the user any. Get the encoding right, use the unicode APIs,
> > and don't pass the buck on to everything else.
> >
> Uhmmm.... That's good advice but doesn't solve any problems :-(.
Exactly. Furthermore, the problems *already exist*. My current
locale is UTF-8 and all files dated since about 2002 have UTF-8 names,
*except* in my MIME-bodies garbage can, where only recently have I got
around to coercing my MUA into doing the right thing. And of course
there are still legacy file names in EUC-JP, which I suppose I could
search for but since I only access a directory containing one once in
a pale blue moon, I'm not gonna bother.
It's just not reasonable to expect users or even sysadmins to go
around cleaning up legacy data.
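
To make the mixed-encoding situation concrete, here is a minimal sketch of how one might survey a directory for such names. It assumes Python 3, where os.listdir() returns bytes when given a bytes path. The helper names classify_names and scan_directory are hypothetical, and EUC-JP as the fallback is just the legacy encoding from my own case. A name that happens to decode as EUC-JP is only a guess, not proof of its encoding.

```python
import os


def classify_names(raw_names, legacy_encoding="euc_jp"):
    """Sort byte filenames into UTF-8, probably-legacy, and undecodable.

    Hypothetical helper: the legacy encoding is an assumption supplied
    by the caller, not something the filesystem records.
    """
    utf8, legacy, unknown = [], [], []
    for raw in raw_names:
        try:
            # Names written under a UTF-8 locale decode cleanly here.
            utf8.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            try:
                # A successful decode is only a plausible guess.
                legacy.append(raw.decode(legacy_encoding))
            except UnicodeDecodeError:
                # Keep the raw bytes; nothing sensible can be shown.
                unknown.append(raw)
    return utf8, legacy, unknown


def scan_directory(path=b"."):
    """List a directory as bytes so that no name is lost or mangled."""
    return classify_names(os.listdir(path))
```

Calling scan_directory(b".") returns the three lists for the current directory. The point of the bytes path is that the undecodable names stay reachable as raw bytes, so the program can still open or rename them even though it cannot display them as text.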