homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Propagate qualname from the compiler unit to code objects for finer grained profiling data
Type: enhancement Stage: resolved
Components: C API Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Gabriele Tornetta, Mark.Shannon, pablogsal, steve.dower
Priority: normal Keywords: patch

Created on 2021年06月28日 22:35 by Gabriele Tornetta, last changed 2022年04月11日 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
py_main.png Gabriele Tornetta, 2021年06月28日 22:35
Pull Requests
URL Status Linked Edit
PR 26941 merged Gabriele Tornetta, 2021年06月28日 22:47
PR 27052 merged pablogsal, 2021年07月07日 11:28
PR 29809 merged steve.dower, 2021年11月26日 23:28
Messages (15)
msg396667 - (view) Author: Gabriele N Tornetta (Gabriele Tornetta) * Date: 2021年06月28日 22:35
When dumping profiling data out of code objects using out-of-process tools like Austin (https://github.com/p403n1x87/austin) one has access only to file name, function name, and line number. Consider the flame graph generated by running the following script, and aggregating on function names
----
class Foo:
 def on_cpu(self, n):
 a = []
 for i in range(n):
 a.append(i)
class Bar:
 def on_cpu(self, n):
 a = []
 for i in range(n):
 a.append(i)
if __name__ == "__main__":
 f = Foo()
 b = Bar()
 f.on_cpu(1_000_000)
 b.on_cpu(5_000_000)
----
Without the extra information coming from the actual Python source, one would not be able to tell, by looking at the flame graph alone, that on_cpu has contributions from two different methods. By propagating the qualname information from the compiler to code objects, such names would be disambiguated and the resulting flame graph would be clearer.
I would like to propose adding the co_qualname field to the PyCodeObject structure, which is to be set to NULL except for when the code object is created by the compiler in compile.c.
msg396706 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021年06月29日 08:19
I think this is a worthwhile improvement.
A few comments on this issue and your PR.
1. We already have qualified names on functions and generators and those can be modified by @decorators.
In those cases, the code object and the function would disagree.
Code objects are immutable, so any thoughts on the usability implications of qualname differing between code object and function?
2. If we are adding a new attribute, it should be at the Python level. The internal layout of the code object is likely to change. (I have no problem with it being visible to tools like austin, just it won't be part of the API.)
3. As it stands, this is beneficial to only a small set of tools, but has a negative impact on memory use.
If we were to leverage the new field to improve the performance of function creation (important for closures) by omitting the qualname (and defaulting to the code object's) then it would benefit everyone.
msg396730 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年06月29日 14:17
This also has the side effect of making marshalled code objects bigger, and we keep adding to this in 3.11 with the new exception table and soon with PEP 657. Given that a lot of people are concerned about this and that this change only affects a very small set of tools, we need to analyze very carefully the balance.
msg396754 - (view) Author: Gabriele N Tornetta (Gabriele Tornetta) * Date: 2021年06月29日 20:16
Thanks for the feedback. I certainly appreciate all the concerns around this proposed change.
@Mark.Shannon
1. This is the desired behaviour as profiling data should be attached to the actual code objects that correlate with the actual source content. Besides, in this respect, given the immutability of code objects, qualname behaves like name.
2. At the Python level it is already possible to get the __qualname__ of a function (but not of a code object). As this is all I needed, I kept my initial implementation to the bare minimum.
3. I agree with this analysis. Whilst this increases the memory footprint, it might have the benefit of making the derivation of __qualname__ faster as this would now be cached on code objects.
However, I also appreciate @pablogsal's point of view of evaluating the actual benefits. The extra point in favour of this change that I can make is that it would resolve name clashes in profiling data from tools that have minimal impact on the runtime. Whether this is enough to justify the extra memory cost is certainly up for debate. From my side, I don't have the numbers to make this case more concrete.
As a next step, I'll look into exposing the new field to Python to see what it actually ends up looking like.
msg396927 - (view) Author: Gabriele N Tornetta (Gabriele Tornetta) * Date: 2021年07月03日 21:50
@pablogsal
Commit a0252ab9add7d48e9e0604ebf97342e46cf00419 exposes co_qualname to Python. I've added a test case to check that the result coincides with the current implementation of __qualname__. Is there a "standard" way for me to quantify the memory impact of these kinds of changes?
@Mark.Shannon
I'll look into making __qualname__ use __code__.co_qualname as a next step.
msg396930 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年07月03日 22:03
> Is there a "standard" way for me to quantify the memory impact of these kinds of changes?
That is a pointer size per code object. The most standard way is to calculate the size of all pyc files in the stdlib after compiling them all with "-m compileall -r 1000 Lib".
msg396932 - (view) Author: Gabriele N Tornetta (Gabriele Tornetta) * Date: 2021年07月03日 22:35
> That is a pointer size per code object. The most standard way is to calculate the size of all pyc files in the stdlib after compiling them all with "-m compileall -r 1000 Lib".
This yields the following numbers between what was main when I branched off and the tip of my branch:
# main (7569c0fe91): 25_059_438 bytes
# bpo-445303-code-qualname (a0252ab9ad): 25_511_492 bytes
So that seems to be about half a MB increase over 25 MB (about 2% relative increase).
This is potentially just an interim result, as at first sight it looks like MAKE_FUNCTION could require one less argument (the qualname), which can now be taken from the code object. So not quite the final picture yet.
msg396933 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年07月03日 22:43
> So that seems to be about half a MB increase over 25 MB (about 2% relative increase).
I personally think that is acceptable, so I would be supportive of the patch but for context, many folks have indicated that they are worried about this size getting bigger during previous discussions, so we need to still think collectively :)
msg396940 - (view) Author: Gabriele N Tornetta (Gabriele Tornetta) * Date: 2021年07月04日 10:19
With commit 7a12d31a8c22cae7a9c1a2239a1bb54adee33622 the new figures are
# main (7569c0fe91): 25_059_438 bytes
# bpo-445303-code-qualname (7a12d31a8c): 25_089_917 bytes
which is a 0.1% relative increase :). This is likely due to the fact that with the change MAKE_FUNCTION needs to pop one less value from the TOS, as the qualname now comes from the code object.
I think the PR is now good for a first review. I think I'd need some help with the documentation sources.
msg397071 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021年07月07日 11:16
I suspect that the 0.1% increase is noise.
The size of importlib.h etc show a small decrease, suggesting that the information content of the code object has *decreased*.
After all, the qualname has to stored somewhere and moving it from caller to callee should make no difference.
Organizing the code object so it wastes less memory is a separate issue.
msg397072 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年07月07日 11:21
New changeset 2f180ce2cb6e6a7e3c517495e0f4873d6aaf5f2f by Gabriele N. Tornetta in branch 'main':
bpo-44530: Add co_qualname field to PyCodeObject (GH-26941)
https://github.com/python/cpython/commit/2f180ce2cb6e6a7e3c517495e0f4873d6aaf5f2f
msg397079 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年07月07日 13:21
New changeset 8363c53369a582ff9ae4e797a80cdce12624a278 by Pablo Galindo in branch 'main':
bpo-44530: Document the new CodeObject.co_qualname attribute (GH-27052)
https://github.com/python/cpython/commit/8363c53369a582ff9ae4e797a80cdce12624a278
msg407105 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021年11月26日 23:22
This change modified the audit event 'code.__name__', which requires a deprecation period (all events are public API, as per PEP 578).
We need to revert that part of the change. I don't think we need to add a new event to report the qualname here, so just dropping the new argument seems fine.
msg407106 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021年11月26日 23:23
Correction: the event is `code.__new__`
msg407120 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021年11月27日 00:26
New changeset db55f3fabafc046e4fca907210ced4ce16bf58d6 by Steve Dower in branch 'main':
bpo-44530: Reverts a change to the 'code.__new__' audit event (GH-29809)
https://github.com/python/cpython/commit/db55f3fabafc046e4fca907210ced4ce16bf58d6
History
Date User Action Args
2022年04月11日 14:59:47adminsetgithub: 88696
2021年12月30日 00:20:44iritkatriellinkissue13672 superseder
2021年12月03日 19:50:26steve.dowersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021年11月27日 00:26:53steve.dowersetmessages: + msg407120
2021年11月26日 23:28:16steve.dowersetstage: resolved -> patch review
pull_requests: + pull_request28041
2021年11月26日 23:23:27steve.dowersetmessages: + msg407106
2021年11月26日 23:22:53steve.dowersetstatus: closed -> open

nosy: + steve.dower
messages: + msg407105

resolution: fixed -> (no value)
2021年07月07日 13:21:14pablogsalsetmessages: + msg397079
2021年07月07日 11:28:43pablogsalsetpull_requests: + pull_request25608
2021年07月07日 11:22:24pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021年07月07日 11:21:55pablogsalsetmessages: + msg397072
2021年07月07日 11:16:14Mark.Shannonsetmessages: + msg397071
2021年07月04日 10:19:44Gabriele Tornettasetmessages: + msg396940
2021年07月03日 22:43:56pablogsalsetmessages: + msg396933
2021年07月03日 22:35:33Gabriele Tornettasetmessages: + msg396932
2021年07月03日 22:03:05pablogsalsetmessages: + msg396930
2021年07月03日 21:50:33Gabriele Tornettasetmessages: + msg396927
2021年06月29日 20:16:24Gabriele Tornettasetmessages: + msg396754
2021年06月29日 14:17:13pablogsalsetnosy: + pablogsal
messages: + msg396730
2021年06月29日 08:19:43Mark.Shannonsetnosy: + Mark.Shannon
messages: + msg396706
2021年06月28日 22:47:55Gabriele Tornettasetkeywords: + patch
stage: patch review
pull_requests: + pull_request25508
2021年06月28日 22:35:33Gabriele Tornettacreate

AltStyle によって変換されたページ (->オリジナル) /