homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Enhanced cPython profiler with high-resolution timer
Type: performance Stage: resolved
Components: Extension Modules Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: MrJean1, alexandre.vassalotti, belopolsky, gregory.p.smith, kristjan.jonsson, methane, nirai, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2008年03月12日 21:56 by MrJean1, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
hires_lsprof.tgz MrJean1, 2008年03月12日 21:56
hires_lsprof2.tgz MrJean1, 2008年03月13日 20:23
_lsprof2.6.diff MrJean1, 2008年03月22日 18:01 diff vs _lsprof.c rev 59564 for 2.6a1
_lsprof3.0.diff MrJean1, 2008年03月22日 18:02 diff vs _lsprof.c rev 59564 for 3.0a3
hires_lsprof3.tgz MrJean1, 2009年05月17日 21:34 use hires timer also on MacOS X, Solaris, HP-UX
hires_lsprof4.tgz MrJean1, 2009年05月18日 20:09 use hires timer on Linux as well
hpTime.c MrJean1, 2009年05月18日 20:10 standalone test program
Messages (23)
msg63486 - (view) Author: Jean Brouwers (MrJean1) Date: 2008年03月12日 21:56
Attached is a modified version of the cPython profiler file 
Modules/_lsprof.c using a high-resolution timer where available.
The enhancement has been tested on 32- and 64-bit Linux (x86 and x86_64) 
and on 32-bit MacOS X Tiger (Intel) and Panther (PPC).
No other platforms have been tested but as before the profiler will 
fallback to using gettimeofday() on non-Windows version, except the 64-
bit PPC build will issue a compile-time warning.
msg63487 - (view) Author: Jean Brouwers (MrJean1) Date: 2008年03月12日 21:58
This enhancement applies to Python 2.5.2 only.
msg63511 - (view) Author: Jean Brouwers (MrJean1) Date: 2008年03月13日 20:23
Attached are 2 Modules/_lsprof.c files, one for Python 2.5.2 and 2.6a1 
and the other for Python 3.0a3. Discard the previously posted one.
Both contain the same enhancements to use the high-resolution timer 
where available.
These versions catch wrap around of the timer and clock and adjust 
accordingly. In the hpTimerUnit function only and not for profile 
times.
Lastly, instead of malloc and free functions PyObject_MALLOC and 
PyObject_FREE are called making profiler memory usage the same as for 
other objects created.
msg64314 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年03月22日 03:19
I don't think this should be added to 2.5. Only bugfixes are admissible
to the backporting process (see PEP 6).
Finally, could you post the diff of your changes as described at
http://www.python.org/dev/patches/. Thanks!
msg64332 - (view) Author: Jean Brouwers (MrJean1) Date: 2008年03月22日 18:01
Here are 2 forward diff files against _lspprof.c rev 59564. One 
_lsprof2.6.diff for Python 2.6a1 and _lsprof3.0.diff for 3.0.
msg83551 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009年03月14日 00:28
Interesting patch, but there are lots of spurious #if's and #ifdef's
which could be simplified. Also, some changes look slightly unrelated
(e.g. the switch from malloc/free to PyObject_Malloc/PyObject_Free).
msg88004 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月17日 21:34
Attached is another version of the Modules/_lsprof.c source file.
This _lsprof3.c uses the hires time base on Mac OS X, Solaris and HP-UX 
(plus MS Windows like before) and does not require any asm code.
Also, there are no changes other than the very top of the file. The 
forward diff with rev 66700 from the trunk is included.
msg88006 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月17日 21:49
Attached is yet another version of the Modules/_lsprof.c source, called 
_lsprof4.c.
This one adds support for the hires timer on 32- and 64-bit Intel and 
PowerPC processors iff compiled with GNU C.
With _lsprof4.c, the high precision profile is available on Linux, Mac 
OS X, Solaris, HP-UX and MS Windows.
Included is the forward diff with _lsprof.c rev 66700 from the trunk. 
The changes are only at the very top and in the initialization function 
at the bottom.
msg88012 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月18日 01:16
There were two problems with _lsprof4.c. Attached is the corrected file 
and the forward diff with _lsprof.c rev 6670.
msg88018 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月18日 06:57
Here a small, standalone test program of the hires timer code from 
_lsprof4.c.
The results from 10 different builds are included in the comment.
msg88047 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月18日 20:09
Both the hires_lsprof4.tgz and hpTime.c files have been updated with a 
change in the PowerPC asm code.
The previous implementation used a loop: label and that may cause 
duplicate symbols when compiled at optimization levels of -O3 and above 
with older GNU C compilers.
msg88048 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月18日 20:10
Another test build and result has been added to the hpTime.c comment.
msg88622 - (view) Author: Jean Brouwers (MrJean1) Date: 2009年05月31日 23:39
Another thought on the hires timer to make the hires time and hires time 
units available as 2 other functions in the time module.
For example, function time.ticks() returns the hires time stamp as an 
int. Function time.ticks2secs(t) converts a given number of ticks to 
seconds.
To avoid duplicating the hires time code in both the time and profile 
modules, it would be necessary to move the hpTimer and hpTimerUnit 
functions to some place inside the Python core accessible for the time 
and profile modules. Perhaps to a new file, say Python/gethptime.c.
That new file can handle other platform-specific idiosyncrasies with 
respect to hires time. In particular, it could implement a different 
(and better) way to determine the resolution of a hires tick, e.g. on 
Linux and BSD Unix.
msg95728 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009年11月25日 23:32
One issue with using timestamp counters is that their resolution varies
with CPU speed, which is not necessarily constant during a whole run
(because of power management). On the other hand I'm not sure it's
really a problem.
msg105966 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010年05月18日 11:34
Nir, would you be interested in looking at this?
msg105971 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010年05月18日 12:44
Regarding the calibrating loop when using ASM timestamp counters, there's one problem if the thread gets preempted in the middle of the loop: the calibration will return incorrect results. Perhaps a way of avoiding this would be to call nanosleep(1 ns) just before the loop, which should relinquish the current timeslice.
Another issue can appear when threads migrate from one CPU core to another, according to Wikipedia: "AMD processors up to the K8 core always incremented the time-stamp counter every clock cycle.[5] Thus, power management features were able to change the number of increments per second, and the values could get out of sync between different cores or processors in the same system."
Perhaps using clock_gettime(CLOCK_REALTIME or CLOCK_MONOTONIC) is better than using raw ASM timestamp counters, when available.
msg105972 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010年05月18日 13:09
Sorry, the calibration loop actually looks correct in case of preemption, since it simply divides the TSC delta by the elapsed physical time (and both continue increasing monotonously when another thread runs). It is still vulnerable to the thread migration problem, though.
msg105977 - (view) Author: Nir Aides (nirai) (Python triager) Date: 2010年05月18日 14:08
> Nir, would you be interested in looking at this?
yes, I'll take a look, but will take me a few days.
msg113842 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年08月13日 22:21
Can someone post a diff against current py3k? I would like to take a look, but the files attached to this issue seem to be more than a year old.
msg170942 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012年09月22日 00:46
Python 3.3 has a new time.perf_counter() function. It may be possible to expose this function in C and then use it in _lsprof. See the PEP for the list of (OS/hardware) clocks used by this function:
http://www.python.org/dev/peps/pep-0418/#time-perf-counter
http://www.python.org/dev/peps/pep-0418/#time-monotonic 
msg170944 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012年09月22日 00:54
Reading directly the CPU timestamp counter is not a good idea.
> One issue with using timestamp counters is that their resolution varies
> with CPU speed, which is not necessarily constant during a whole run
> (because of power management). On the other hand I'm not sure it's
> really a problem.
There is another major issue: you have to synchronize the counter of all CPU cores. Windows XP didn't synchronize the counters, and it was a major issue on QueryPerformanceCounter.
http://support.microsoft.com/?id=896256
It is really better to rely on the OS rather than querying directly the hardward. The OS (well, most OS except Windows) handles all these minor "nits" and provide the best possible resolution.
--
> Python 3.3 has a new time.perf_counter() function.
> It may be possible to expose this function in C ...
Oh, I now remember why I didn't do this before: when clock_gettime() function is used in perf_counter(), the program or library must be linked to librt on most Linux distro. It was easier to only handle this correctly for the time module. We can do something similar on the _lsprof module.
msg170975 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2012年09月22日 09:05
>There is another major issue: you have to synchronize the counter of all >CPU cores. Windows XP didn't synchronize the counters, and it was a major >issue on QueryPerformanceCounter.
>http://support.microsoft.com/?id=896256
I don't understand why you keep bashing Windows for this. It was actually a problem with CPU drivers. We encountered this problem for a number of our players and the fix was to get the latest intel/amd drivers. The issue you quote is a separate one.
This appears to be the correct one: http://support.microsoft.com/kb/895980 
msg339976 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019年04月11日 13:56
I think https://bugs.python.org/issue36575 fixed this.
History
Date User Action Args
2022年04月11日 14:56:31adminsetgithub: 46534
2019年04月11日 13:56:22methanesetstatus: open -> closed

nosy: + methane
messages: + msg339976

resolution: fixed
stage: patch review -> resolved
2012年09月22日 09:05:15kristjan.jonssonsetmessages: + msg170975
2012年09月22日 00:54:29vstinnersetmessages: + msg170944
2012年09月22日 00:46:14vstinnersetmessages: + msg170942
2010年11月03日 16:34:25pitrousetnosy: + kristjan.jonsson
2010年08月13日 22:21:55belopolskysetmessages: + msg113842
2010年08月13日 15:14:26pitrousetassignee: pitrou ->

nosy: + belopolsky
2010年05月20日 00:59:15gregory.p.smithsetassignee: gregory.p.smith -> pitrou
2010年05月18日 14:08:24niraisetmessages: + msg105977
2010年05月18日 13:09:10pitrousetmessages: + msg105972
2010年05月18日 13:00:55vstinnersetnosy: + vstinner
2010年05月18日 12:44:25pitrousetmessages: + msg105971
2010年05月18日 11:34:03pitrousetnosy: + nirai

messages: + msg105966
versions: - Python 2.7
2009年11月25日 23:32:48pitrousetmessages: + msg95728
versions: + Python 3.2, - Python 3.1
2009年05月31日 23:39:26MrJean1setmessages: + msg88622
2009年05月18日 20:11:33MrJean1setfiles: - hpTime.c
2009年05月18日 20:11:27MrJean1setfiles: - hires_lsprof4.tgz
2009年05月18日 20:10:52MrJean1setfiles: + hpTime.c

messages: + msg88048
2009年05月18日 20:09:07MrJean1setfiles: + hires_lsprof4.tgz

messages: + msg88047
2009年05月18日 06:57:21MrJean1setfiles: + hpTime.c

messages: + msg88018
2009年05月18日 01:16:51MrJean1setfiles: + hires_lsprof4.tgz

messages: + msg88012
2009年05月18日 01:13:30MrJean1setfiles: - hires_lsprof4.tgz
2009年05月17日 21:49:01MrJean1setfiles: + hires_lsprof4.tgz

messages: + msg88006
2009年05月17日 21:34:42MrJean1setfiles: + hires_lsprof3.tgz

messages: + msg88004
2009年03月14日 00:50:44gregory.p.smithsetassignee: gregory.p.smith

nosy: + gregory.p.smith
2009年03月14日 00:28:31pitrousetversions: + Python 3.1
nosy: + pitrou

messages: + msg83551

stage: patch review
2008年10月07日 17:47:28MrJean1setversions: + Python 2.7, - Python 2.5.3
2008年10月07日 17:42:07MrJean1setversions: + Python 2.5.3, - Python 3.0
2008年03月22日 18:02:49MrJean1setfiles: + _lsprof3.0.diff
versions: + Python 3.0, - Python 2.5
2008年03月22日 18:01:16MrJean1setfiles: + _lsprof2.6.diff
messages: + msg64332
2008年03月22日 03:19:40alexandre.vassalottisetnosy: + alexandre.vassalotti
messages: + msg64314
priority: normal
components: + Extension Modules, - None
keywords: + patch
type: performance
2008年03月13日 20:23:52MrJean1setfiles: + hires_lsprof2.tgz
messages: + msg63511
2008年03月12日 21:58:18MrJean1setmessages: + msg63487
components: + None
versions: + Python 2.5
2008年03月12日 21:56:47MrJean1create

AltStyle によって変換されたページ (->オリジナル) /