Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

16 Nov 2018 15:13:47 -0800

On 2018-11-16, Brett Cannon wrote:
> I think part of the challenge here (and I believe it has been
> brought up elsewhere) is no one knows what kind of API is
> necessary for some faster VM other than PyPy.
I think we have some pretty good ideas as to what are the
problematic parts of the current API. Victor's C-API web site has
details[1]. We can ask other implementors which parts are hard to
support.
Here are my thoughts about some desired changes:
- We are *not* getting rid of refcounting for extension modules. That
 would require a whole new API. We might as well start from
 scratch with Python 4. No one wants that. However, it is likely
 different VMs use a different GC internally and only use
 refcounting for objects passed through the C-API. Using
 refcounted handles is the usual implementation approach. We can
 make some changes to make that easier. I think making PyObject an
 opaque pointer would help.
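 To make the handle idea concrete, here is a minimal sketch in plain
 C of what a VM with a tracing GC might do at the boundary. The names
 (Handle, handle_new, etc.) are hypothetical, not a proposed API:

```c
#include <stdlib.h>

/* Hypothetical sketch (not a real API): a VM with a tracing GC can
 * expose refcounted handles at the C-API boundary.  The handle, not
 * the VM object itself, carries the refcount that extensions see. */
typedef struct {
    void *vm_object;   /* managed internally by the VM's own GC */
    long refcnt;       /* refcount visible to extension modules */
} Handle;

static Handle *handle_new(void *vm_object) {
    Handle *h = malloc(sizeof(Handle));
    h->vm_object = vm_object;
    h->refcnt = 1;     /* caller owns one reference */
    return h;
}

static void handle_incref(Handle *h) {
    h->refcnt++;
}

static void handle_decref(Handle *h) {
    if (--h->refcnt == 0) {
        /* unpin vm_object so the VM's GC may reclaim it, then
         * release the handle itself */
        free(h);
    }
}
```

 With PyObject opaque, extensions only ever hold such handles and the
 VM is free to move or specialize the real object behind them.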
- Borrowed references are a problem. However, because they are so
 commonly used and because the source code changes needed to move
 to a non-borrowed API are non-trivial, I don't think we should try
 to change this. Maybe we could just discourage their use? For
 CPython, using a borrowed reference API is faster. For other
 Python implementations, it is likely slower and maybe much slower.
 So, if you are an extension module that wants to work well with
 other VMs, you should avoid those APIs.
- It would be nice to make PyTypeObject an opaque pointer as well.
 I think that's a lot more difficult than making PyObject opaque.
 So, I don't think we should attempt it in the near future. Maybe
 we could make a half-way step and discourage accessing ob_type
 directly. We would provide functions (probably inline) to do what
 you would otherwise do by using op->ob_type-><something>.
 One reason you want to discourage access to ob_type is that
 internally there is not necessarily one PyTypeObject structure for
 each Python level type. E.g. the VM might have specialized types
 for certain sub-domains. This is like the different flavours of
 strings, depending on the set of characters stored in them. Or,
 you could have different list types, e.g. one specialized for the
 case where all of the values are ints.
 Basically, with CPython op->ob_type is super fast. For other VMs,
 it could be a lot slower. By accessing ob_type you are saying
 "give me all possible type information for this object pointer".
 By using functions to get just what you need, you could be putting
 less burden on the VM. E.g. "is this object an instance of some
 type" is faster to compute.
- APIs that return pointers to the internals of objects are a
 problem. E.g. PySequence_Fast_ITEMS(). For CPython, this is
 really fast because it is just exposing the internal details of
 the layout that is already in the correct format. For other VMs,
 that API could be expensive to emulate. E.g. suppose a VM has a
 list type that stores only unboxed ints. If someone calls
 PySequence_Fast_ITEMS(), it has to create real PyObjects for all
 of the list elements.
- Reducing the size of the API seems helpful. E.g. we don't need
 PyObject_CallObject() *and* PyObject_Call(). Also, do we really
 need all the type-specific APIs, e.g. PyList_GetItem() vs
 PyObject_GetItem()? In some cases maybe we can justify the bigger
 API due to performance. To add a new API, someone should have a
 benchmark that shows a real speedup (not just that they imagine it
 makes a difference).
I don't think we should change CPython internals to try to use this
new API. E.g. we know that getting ob_type is fast so just leave
the code that does that alone. Maybe in the far distant future,
if we have successfully got extension modules to switch to using
the new API, we could consider changing CPython internals. There
would have to be a big benefit though to justify the code churn.
E.g. if my tagged pointers experiment shows significant performance
gains (it hasn't yet).
I like Nathaniel Smith's idea of doing the new API as a separate
project, outside the cpython repo. It is possible that in that
effort, we would like some minor changes to cpython in order to make
the new API more efficient, for example. Those should be pretty
limited changes because we are hoping that the new API will work on
top of old Python versions, e.g. 3.6.
One idea for hiding APIs that should not be exposed is to
re-organize the include files. However, that doesn't help for old
versions of
Python. So, I'm thinking that Dino's idea of just duplicating the
prototypes would be better. We would like a minimal API and so the
number of duplicated prototypes shouldn't be too large.
Victor's recent work in changing some macros to inline functions is
not really related to the new API project, IMHO. I don't think
there is a problem to leave an existing macro as a macro. If we
need to introduce new APIs, e.g. to help hide PyTypeObject, those
 APIs could use inline functions. That way, under CPython the new
 API would be just as fast as accessing ob_type
directly. You are getting an essentially zero cost abstraction.
For the limited API builds, maybe it would be okay to change the
inline functions into non-inlined versions (same function name).
If the new API is going to be successful, it needs to be relatively
easy to change extension source code to use it. E.g. replacing one
function with another is pretty easy (PyObject_GetItem vs
PyList_GetItem). If too many difficult changes are required,
extensions are never going to get ported. The ported extension
*must* be usable with older Python versions. That's a big mistake
we made with the Python 2 to 3 migration. Let's not repeat it.
Also, the extension module should not take a big performance hit.
So, you can't change all APIs that were macros into non-inlined
functions. People are not going to accept that and rightly so.
However, it could be that we introduce a new ifdef like
Py_LIMITED_API that gives a stable ABI. E.g. when that's enabled,
 almost everything would turn into non-inline functions. In exchange
for the performance hit, your extension would become ABI compatible
between a range of CPython releases. That would be a nice feature.
Basically a more useful version of Py_LIMITED_API.
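For reference, today's opt-in looks like this; the new flag would
work the same way but cover more of the API:

```c
/* Opting in to the current stable ABI: with this define, only the
 * limited API is visible, and the built extension binary loads
 * unchanged on CPython 3.6 and every later 3.x release. */
#define Py_LIMITED_API 0x03060000
#include <Python.h>
```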
Regards,
 Neil
1. https://pythoncapi.readthedocs.io/bad_api.html
_______________________________________________
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev