[Python-Dev] Re: Plan to remove Py_UNICODE APIs except PEP 623.

2020-07-01 16:39:08 -0700

On Thu, Jul 2, 2020 at 5:20 AM M.-A. Lemburg <[email protected]> wrote:
>
>
> The reasoning here is the same as for decoding: you have the original
> data you want to process available in some array and want to turn
> this into the Python object.
>
> The path Victor suggested requires always going via a Python Unicode
> object, but that is very expensive and not really an appropriate
> way to address the use case.
>
But the current PyUnicode_Encode* APIs call `PyUnicode_FromWideChar`
internally, so they are not a direct API even now.
Additionally, pyodbc, the only user of the encoder API, called
PyUnicode_EncodeUTF16(PyUnicode_AsUnicode(unicode), ...)
This is very inefficient: Unicode object -> Py_UNICODE* -> Unicode
object -> bytes object.
And as many others have already said, most of the C world uses UTF-8,
not wchar_t, to represent Unicode in C.
So I don't want to undeprecate the current API.
> As an example application, think of a database module which provides
> the Unicode data as Py_UNICODE buffer.
Py_UNICODE is deprecated. So I assume you are talking about wchar_t.
> You want to write this as UTF-8
> data to a file or a socket, so you have the PyUnicode_EncodeUTF8() API
> encode this for you into a bytes object which you can then write out
> using the Python C APIs for this.
PyUnicode_FromWideChar + PyUnicode_AsUTF8AndSize is better than
PyUnicode_EncodeUTF8.
PyUnicode_EncodeUTF8 allocates a temporary Unicode object anyway, so it
has to allocate both a Unicode object *and* a char* buffer for the UTF-8
data.
On the other hand, PyUnicode_AsUTF8AndSize can just expose the internal
data when the string is plain ASCII. Since ASCII strings are very common,
this is an effective optimization.
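
A minimal sketch of the PyUnicode_FromWideChar + PyUnicode_AsUTF8AndSize
route, driven from Python through ctypes so it can be tried without
compiling anything. The helper name wide_to_utf8 and the sample text are
made up for illustration; a C extension would make the same two calls
directly.

```python
import ctypes

api = ctypes.pythonapi  # the running interpreter's C API, called with the GIL held

# PyObject *PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

# const char *PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
api.PyUnicode_AsUTF8AndSize.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
api.PyUnicode_AsUTF8AndSize.restype = ctypes.c_void_p


def wide_to_utf8(wide_buf, length):
    """Hypothetical helper: wchar_t buffer -> str object -> UTF-8 bytes."""
    # Step 1: build the Unicode object from the wchar_t data.
    text = api.PyUnicode_FromWideChar(wide_buf, length)
    # Step 2: borrow the UTF-8 representation stored in that object.
    size = ctypes.c_ssize_t()
    ptr = api.PyUnicode_AsUTF8AndSize(text, ctypes.byref(size))
    if not ptr:
        raise RuntimeError("PyUnicode_AsUTF8AndSize failed")
    # The pointer is only valid while `text` is alive, so copy before returning.
    return ctypes.string_at(ptr, size.value)


# Stand-in for wchar_t data handed over by, say, a database driver.
buf = ctypes.create_unicode_buffer("h\u00e9llo")
print(wide_to_utf8(buf, len(buf) - 1))  # b'h\xc3\xa9llo'
```

The buffer returned by PyUnicode_AsUTF8AndSize belongs to the Unicode
object, so the caller must not free it and must not use it after the
object is gone.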
Regards,
-- 
Inada Naoki <[email protected]>
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/UYOPQDKLSNOVPFGPCR5BIW3GHYB3V3KZ/