[Python-Dev] Removal of Win32 ANSI API

Sun Nov 14 01:06:55 CET 2010

On Saturday 13 November 2010 17:21:37 you wrote:
> On 12 November 2010 4:26, Victor Stinner wrote:
> > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
> >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> >> and only use Win32 WIDE API (ie: GetFileAttributesW)?
> >> Mainly in posixmodule.c.
> > 
> > Even if I hate the MBCS encoding, because it replaces undecodable
> > characters by similar glyphs by default, I'm not certain that it is a
> > good idea to drop the bytes API.
>
> On 12 November 2010 21:08, Victor Stinner wrote:
> > On Thursday 11 November 2010 23:01:32 you wrote:
> >>> Sure, it will divide the number of lines, of the code specific to
> >>> Windows, by two.
> >> 
> >> Can we get most of the code cleanup benefit without the backwards
> >> compatibility risk by doing the decode from 'mbcs' on our side of the
> >> fence?
> > 
> > I created PyUnicode_FSDecoder, a ParseTuple converter used to work on
> > unicode paths, instead of bytes paths. On Windows, this converter uses
> > mbcs encoding in strict mode, whereas Windows converter uses replace
> > error handler to decode, and ignore to encode. So I don't think that we
> > should use this converter on Windows.
> > 
> >> That is, have code that was the C equivalent of:
> >> 
> > >> arg_is_bytes = not isinstance(arg, str)
> > >>
> > >> if arg_is_bytes:
> > >>     val = _decode_mbcs(arg)
> > >>     # Decoding error checking here
> > >>
> > >> else:
> > >>     val = arg
> > >>
> > >> # Common processing using WIDE API
> > >>
> > >> if arg_is_bytes:
> > >>     result = _encode_mbcs(wide_result)
> > >>     # Encoding error checking here
> > >>
> > >> else:
> > >>     result = wide_result
> > 
> > This doesn't make the code shorter, it may be longer than the current
> > code, and it is less compliant with the Windows native API...
>
> Is it possible to implement a new PyArg_ParseTuple converter to use
>     PyUnicode_Decode(const char *s,
>                      Py_ssize_t size,
>                      const char *encoding, /* mbcs */
>                      const char *errors)   /* replace */
> and use it?

Yes, but how do you check whether the input argument is a bytes or a str
object with your PyArg_Parse converter? You would have to use the "O" format
and manually convert the argument to unicode, and then convert the result back
to bytes (if the input was bytes). I don't think that it makes the code
shorter.
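
To illustrate that round trip, here is a rough Python sketch. It is not the
posixmodule.c code: call_wide_only and wide_func are hypothetical names, and
cp1252 only stands in for 'mbcs' (which exists only on Windows) so that the
example runs anywhere. Strict error handling is an assumption too; the
Windows ANSI functions behave differently, as discussed above.

```python
def call_wide_only(path, wide_func, encoding="cp1252"):
    """Hypothetical helper: run a str-only function on a bytes or str path.

    `encoding` defaults to cp1252 as a portable stand-in for 'mbcs'.
    """
    was_bytes = isinstance(path, (bytes, bytearray))
    if was_bytes:
        # Strict decoding raises UnicodeDecodeError on undecodable bytes
        # instead of silently substituting a similar-looking character.
        text = bytes(path).decode(encoding, "strict")
    elif isinstance(path, str):
        text = path
    else:
        raise TypeError("path must be bytes or str, not %s"
                        % type(path).__name__)

    # Common processing on the unicode value (the "wide API" step).
    result = wide_func(text)

    if was_bytes:
        # Give bytes back to callers who passed bytes in.
        return result.encode(encoding, "strict")
    return result


if __name__ == "__main__":
    print(call_wide_only(b"C:\\Temp\\caf\xe9", str.upper))
    print(call_wide_only("C:\\Temp\\caf\xe9", str.upper))
```

Even in Python the type check, the decode step and the encode step surround
every call; in C each of them also needs its own error handling and reference
counting, which is why this is unlikely to be shorter than the current code.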

The code is currently working. The question is whether we have to drop the
ANSI API now, later or never. It looks like the decision is moving towards
"later" (deprecate in 3.2, remove in 3.3). I still think that dropping it now
wouldn't really hurt.

Victor