homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, brett.cannon, gvanrossum, python-dev, rhettinger, serhiy.storchaka, vstinner, ztane
Priority: normal Keywords: patch

Created on 2016年01月14日 09:12 by vstinner, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
lnotab.patch vstinner, 2016年01月14日 09:12 review
lnotab-2.patch vstinner, 2016年01月18日 10:53 review
lnotab-3.patch vstinner, 2016年01月18日 11:00 review
lnotab-4.patch vstinner, 2016年01月18日 22:31 review
Messages (22)
msg258189 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月14日 09:12
Python doesn't store the original line number in the .pyc file in the bytecode. Instead, an efficient table is used to find the line number from the current in the bytecode: code.co_lnotab.
Basically, it's a list of (offset_delta, line_number_delta) pairs where offset_delta and line_number_delta are unsigned 8 bits numbers. If an offset delta is larger than 255, (offset_delta % 255, line_number_delta) and (offset_delta // 255, 0) pairs are emited. Same for line_number_delta. (In fact, more than two pairs can be created.)
The format is described in Objects/lnotab_notes.txt.
I implemented an optimizer which can generate *negative* line number. For example, the loop:
 for i in range(2): # line 1
 print(i) # line 2
is replaced with:
 i = 0 # line 1
 print(i) # line 2
 i = 1 # line 1
 print(i) # line 2
The third instruction has a negative line number delta.
I'm not the first one hitting the issue, but it's just that no one proposed a patch before. Previous projects bitten by this issue:
* issue #10399: "AST Optimization: inlining of function calls"
* issue #11549: "Build-out an AST optimizer, moving some functionality out of the peephole optimizer"
Attached patch changes the type of line number delta from unsigned 8-bit integer to *signed* 8-bit integer. If a line number delta is smaller than -128 or larger than 127, multiple pairs are created (as before).
My code in Lib/dis.py is inefficient. Maybe unpack the full lnotab than *then* skip half of the bytes? (instead of calling struct.unpack times for each byte).
The patch adds also "assert(Py_REFCNT(lnotab_obj) == 1);" to PyCode_Optimize(). The assertion never fails, but it's just to be extra safe.
The patch renames variables in PyCode_Optimize() because I was confused between "offset" and "line numbers". IMHO variables were badly named.
I changed the MAGIC_NUMBER of importlib, but it was already changed for f-string:
# Python 3.6a0 3360 (add FORMAT_VALUE opcode #25483)
Is it worth to modify it again?
You may have to recompile Python/importlib_external.h if it's not recompiled automatically (just touch the file before running make).
Note: this issue is related to the PEP 511 (the PEP is not ready for a review, but it gives a better overview of the use cases.)
msg258190 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年01月14日 09:15
New changeset c4a826184937 by Victor Stinner in branch 'default':
PEP 511: add link to issue #26107
https://hg.python.org/peps/rev/c4a826184937 
msg258191 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月14日 09:20
> Attached patch changes the type of line number delta from unsigned 8-bit integer to *signed* 8-bit integer. If a line number delta is smaller than -128 or larger than 127, multiple pairs are created (as before).
The main visible change is that .pyc files can be a little bit larger and so use more disk space. Well... in fact only .pyc of files using line number delta larger than 127.
msg258197 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年01月14日 12:32
A patch was proposed in issue16956. And issue17611 is related.
msg258513 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 10:53
Patch version 2 to take Serhiy's review in account:
* complete Objects/lnotab_notes.txt update. Replace (300, 300) delta example with (300, 200) delta for better readability (300 is the bytecode offset delta, 200 is the line number delta)
* fix added assertion in peephole: don't check the reference counter for empty byte string (which can be the empty byte string singleton)
* restore addrmap name in peephole
* don't change importlib MAGIC: we only change it between Python minor versions, it was already changed for Python 3.6a0
* assemble_lnotab() divide with positive numbers to avoid undefined behaviour on C
* dis.py: use "if line_incr >= 0x80: line_incr -= 0x100" instead of struct.unpack() to convert unsigned to signed
Note: avoid also useless "if (x != NULL)" checks before calling PyMem_Free(). PyMem_Free(NULL) is well specified: do nothing.
msg258514 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 11:00
Patch version 3:
* When I reviewed issue #16956 patch, I noticed that I forgot to extract my changes on codeobject.c :-/ (FAT Python became a giant patch, it's hard to browse it!)
* Fix typo in dis.py (regression of patch 2)
msg258515 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 11:06
> A patch was proposed in issue16956. And issue17611 is related.
I don't see directly the link between this issue and the issue17611, but cool if it helps to implement new optimizations :-)
I compared my patch with issue16956 patch:
* my patch mentions also the change in Lib/importlib/_bootstrap_external.py
* my patch updates also dis.py
* my patch updates also Objects/lnotab_notes.txt
* issue16956 patch changes compiler_while(), this change is not directly related to making line number delta signed
Additionally, my patch uses better names in the peephole optimizer, but it's not directly related to the issue. By the way, this change should be commited in a separated patch.
I prefer to push my recent. By the way, it's up to date, whereas issue16956 patch requires a rebase.
msg258520 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年01月18日 11:57
Yes, you patch supersedes issue16956 patch.
Added new comments on Rietveld for lnotab_notes.txt.
I afraid this patch can cause problems with code tracing where it is assumed that lines are increased monotonically and *instr_lb <= frame->f_lasti < *instr_ub. We should carefully analyze the effect of the patch on the tracing.
Before committing you must ask Guido for approval. AFAIK his had objections against code transformations that make debugging harder.
msg258522 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 13:31
We have many unit test in the Python test suite which rely on exact line numbers. Examples:
* test_sys_settrace
* test_pdb
* test_dis
I know them because they were all broken when my fatoptimizer project had bugs related to line numbers :-)
With my patch, the full Python test suite pass whereas my patch doesn't modify any test.
> I afraid this patch can cause problems with code tracing where it is assumed that lines are increased monotonically and *instr_lb <= frame->f_lasti < *instr_ub. We should carefully analyze the effect of the patch on the tracing.
First, my patch has no impact on frame->f_lasti.
The trace module and test_sys_settrace use frame.f_lineno which PyFrame_GetLineNumber(). This function returns f->f_lineno if the frame has a trace function, or PyCode_Addr2Line(). PyCode_Addr2Line() and PyFrame_GetLineNumber() still work with my patch.
When you trace a program, "negative line delta" and "negative instruction offset" are not new in Python: it's a basic requirement to support loop, when you compare two instructions seen by the tracer.
To be clear, my patch does *not* introduce negative line number delta in co_lnotab. It only *adds support* for negative line number delta. If a tool decodes co_lnotab using 8-bit unsigned number for line number delta, the tool still works even with the patch. It only starts to return wrong line numbers if you start debugging a program which has negative line numbers.
If you use fatoptimizer, you get such negative delta. But if you use an optimizer, you should be prepared to some subtle differences. The good practice is to disable all optimizers when you debug code. It's also really hard (or impossible) to debug C code optimized with -O3. I always use gcc -O0 to debug CPython.
> Before committing you must ask Guido for approval. AFAIK his had objections against code transformations that make debugging harder.
Are you aware of tools decoding directly co_lnotab?
Oh, I forgot the old Misc/gdbinit script which *does* decode directly co_lnotab. Does anyone still use it? If yes, it should also be updated.
I failed to reproduce the bug with Misc/gdbinit, beacuse bug only occurs if you debug a program which uses negative line number, and CPython doesn't produce negative line number in co_lnotab by default... So it would be "nice" to also support negative line number in Misc/gdbinit, but maybe it's ok to let this old script dying? :-D
msg258523 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 13:38
> Are you aware of tools decoding directly co_lnotab?
Ah! I found Ned Batchelder's coverage project which has a _bytes_lines() method "adapted from dis.py in the standard library". The method uses directly co_lnotab to compute line numbers.
Ok, *this project* will have to be updated if it wants to support fatoptimizer and other code transformers producing negative line numbers.
Maybe I can contribute to it with a patch if my change to CPython 3.6 is accepted ;-)
msg258527 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016年01月18日 17:01
I just wanted to comment on "don't change importlib MAGIC: we only change it between Python minor versions": that's actually not true. Feel free to up the number whenever you make a change that affects eval.c or bytecode. Otherwise .pyc files won't be regenerated. And that number is cheap anyway and isn't about to run out, so don't worry about updating it multiple times before the code sees a public release.
msg258529 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 17:08
Brett Cannon added the comment:
> I just wanted to comment on "don't change importlib MAGIC: we only change it between Python minor versions": that's actually not true. Feel free to up the number whenever you make a change that affects eval.c or bytecode. Otherwise .pyc files won't be regenerated. And that number is cheap anyway and isn't about to run out, so don't worry about updating it multiple times before the code sees a public release.
Since my patch may break setup of multiple python developers, it can
be worth to increase this number, ok :-)
msg258547 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年01月18日 21:09
But there is no need to increase it by 10. I suppose the gap is added to allow updating bytecode in maintained releases, but in process of developing next version we don't need this.
msg258549 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016年01月18日 21:30
There's technically no need to worry about ranged values as the magic number is purely an equality check to see if the interpreter matches what the .pyc was created with. I guess there might be third-party code that does a range check, but that's bad as importlib checks the raw bytes only; using a number is mostly a convenience for changing it.
msg258552 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年01月18日 21:49
The launcher on Windows does a range check.
msg258556 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月18日 22:31
Patch version 4:
* finish to update Objects/lnotab_notes.txt
* update _PyCode_CheckLineNumber() in codeobject.c
* set importlib MAGIC to 3361
I don't expect my patch to be complete nor perfect. IMHO it's fine to adjust the code later if needed.
I would like to integrate FAT Python changes step by step. It looks like the general idea of AST optimizers is well accepted.
msg258628 - (view) Author: Antti Haapala (ztane) * Date: 2016年01月19日 22:26
Nice work, my issue21385 is also related. Basically, transforming non-Python code into Python meant that all line number information, which otherwise would have been useful for debugging, had to be discarded, or debug builds of Python would dump cores.
So, bye "assert(d_lineno >= 0);", you won't be missed.
msg258668 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年01月20日 11:31
New changeset 775b74e0e103 by Victor Stinner in branch 'default':
co_lnotab supports negative line number delta
https://hg.python.org/cpython/rev/775b74e0e103 
msg258669 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月20日 11:34
> Nice work, my issue21385 is also related. Basically, transforming non-Python code into Python meant that all line number information, which otherwise would have been useful for debugging, had to be discarded, or debug builds of Python would dump cores.
Ok, it looks like there are multiple use cases for negative line numbers, and the change doesn't really break anything in practice. I tried to explain exactly who is impacted and how to update the code in the Porting section of What's New in Python 3.6.
For each review Serhiy. I pushed the the change to Python 3.6.
msg258672 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年01月20日 11:37
I closed issues #16956 and #21385 as duplicates of this issue.
msg258760 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年01月21日 17:12
New changeset c6fb1651ea2e by Victor Stinner in branch 'default':
Issue #26107: Fix typo in Objects/lnotab_notes.txt
https://hg.python.org/cpython/rev/c6fb1651ea2e 
msg259015 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年01月27日 11:20
New changeset 16f60cd918e0 by Victor Stinner in branch 'default':
PEP 511
https://hg.python.org/peps/rev/16f60cd918e0 
History
Date User Action Args
2022年04月11日 14:58:26adminsetgithub: 70295
2016年01月27日 11:20:21python-devsetmessages: + msg259015
2016年01月21日 17:12:46python-devsetmessages: + msg258760
2016年01月20日 11:37:02vstinnersetmessages: + msg258672
2016年01月20日 11:36:19vstinnerlinkissue16956 superseder
2016年01月20日 11:35:39vstinnerlinkissue21385 superseder
2016年01月20日 11:34:01vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg258669
2016年01月20日 11:31:37python-devsetmessages: + msg258668
2016年01月19日 22:26:54ztanesetnosy: + ztane
messages: + msg258628
2016年01月18日 22:31:05vstinnersetfiles: + lnotab-4.patch

messages: + msg258556
2016年01月18日 21:49:57serhiy.storchakasetmessages: + msg258552
2016年01月18日 21:30:37brett.cannonsetmessages: + msg258549
2016年01月18日 21:09:58serhiy.storchakasetmessages: + msg258547
2016年01月18日 17:08:23vstinnersetmessages: + msg258529
2016年01月18日 17:01:20brett.cannonsetmessages: + msg258527
2016年01月18日 13:38:40vstinnersetmessages: + msg258523
2016年01月18日 13:31:27vstinnersetmessages: + msg258522
2016年01月18日 11:57:53serhiy.storchakasetmessages: + msg258520
2016年01月18日 11:06:32vstinnersetmessages: + msg258515
2016年01月18日 11:00:29vstinnersetfiles: + lnotab-3.patch

messages: + msg258514
2016年01月18日 10:53:47vstinnersetfiles: + lnotab-2.patch

messages: + msg258513
2016年01月18日 09:16:38vstinnersettitle: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer -> PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
2016年01月14日 15:20:30vstinnersetnosy: + Mark.Shannon
2016年01月14日 12:32:42serhiy.storchakasetnosy: + gvanrossum
messages: + msg258197
2016年01月14日 09:20:10vstinnersetmessages: + msg258191
2016年01月14日 09:16:09vstinnersettype: enhancement
components: + Interpreter Core
2016年01月14日 09:15:43python-devsetnosy: + python-dev
messages: + msg258190
2016年01月14日 09:12:43vstinnercreate

AltStyle によって変換されたページ (->オリジナル) /