
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: col_offset is -1 and lineno is wrong for multiline string expressions
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Aivar.Annamaa, Anthony Sottile, asottile, benjamin.peterson, carsten.klein@axn-software.de, karamanolev, methane, pablogsal, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2012-12-29 00:08 by carsten.klein@axn-software.de, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
issue16806.diff | carsten.klein@axn-software.de, 2012-12-29 10:59 | Test case and patch resolving the issue
python2.7.3.diff | carsten.klein@axn-software.de, 2012-12-30 21:25 | Patch for Python 2.7.3
Pull Requests
URL | Status | Linked
PR 1800 | | lukasz.langa, 2017-05-25 02:29
PR 10021 | merged | Anthony Sottile, 2018-10-21 01:08
PR 17582 | | lys.nikolaou, 2020-01-08 21:04
Messages (16)
msg178444 - (view) Author: Carsten Klein (carsten.klein@axn-software.de) Date: 2012-12-29 00:08
Given an input module such as
class klass(object):
    """multi line comment
    continued on this line
    """
    """single line comment"""
    """
    Another multi
    line
    comment"""
and implementing a custom ast.NodeVisitor such as
import ast
class CustomVisitor(ast.NodeVisitor):
    def visit_ClassDef(self, node):
        for childNode in node.body:
            self.visit(childNode)
    def visit_Expr(self, node):
        print(node.col_offset)
        print(node.value.col_offset)
and feeding it the compiled ast from the module above
f = open('./module.py')
source = f.read()
node = ast.parse(source, mode = 'exec')
visitor = CustomVisitor()
visitor.visit(node)
will yield -1/-1 for the docstring that is the first
child node expression of the classdef body.
It will, however, yield the correct col_offset of 4/4 for
the single line docstring following the first one.
The multi line docstring following that will again
yield a -1/-1 col_offset.
I believe that this behaviour is not correct and instead
the col_offset should be 4 for both the expression node
and its str value.
msg178445 - (view) Author: Carsten Klein (carsten.klein@axn-software.de) Date: 2012-12-29 00:10
Please note that, regardless of the indent level, the col_offset for multi line str expressions will always be -1.
msg178452 - (view) Author: Carsten Klein (carsten.klein@axn-software.de) Date: 2012-12-29 01:08
In addition, the reported lineno will be set to the last line of the multi line string instead of the first line, where the parser began parsing the string.
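The report above can be condensed into a self-contained check that needs no module file on disk. This is a minimal sketch, not the reporter's script: the helper name string_expr_positions is hypothetical, and it assumes an interpreter where string literals parse as ast.Constant (Python 3.8 and later).

```python
import ast
import textwrap

# The module from the report, embedded as a string (4-space indentation).
SOURCE = textwrap.dedent('''\
    class klass(object):
        """multi line comment
        continued on this line
        """
        """single line comment"""
        """
        Another multi
        line
        comment"""
''')


def string_expr_positions(source):
    """Collect (lineno, col_offset) for every bare string expression."""
    positions = []
    for node in ast.walk(ast.parse(source)):
        is_string_expr = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if is_string_expr:
            positions.append((node.value.lineno, node.value.col_offset))
    return positions


for lineno, col_offset in string_expr_positions(SOURCE):
    print(lineno, col_offset)
```

On an interpreter that includes the fix from this issue, each string should be reported at the line where it opens with a col_offset of 4. On the affected versions described in this thread, the multi line strings would instead show -1 and the line of the closing quotes.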
msg178483 - (view) Author: Carsten Klein (carsten.klein@axn-software.de) Date: 2012-12-29 10:50
Please see the attached patch that will resolve the issue. It also includes a test case in test_ast.py.
What the patch does is as follows:
- tok_state is extended by two fields, namely first_lineno
 and multi_line_start
- first_lineno will be set by tok_get as soon as the beginning
 of a STRING is detected and it will be set to the current line
 tok->lineno.
- multi_line_start is the beginning of the first line of a string
- in parsetok we now distinguish between STRING nodes and other
 nodes. In the case of STRING nodes, we will use the values of the
 above fields for determining the actual lineno and the col_offset,
 otherwise tok->col_offset and tok->lineno will be used when
 creating the token.
The included test case ensures that the col_offset and lineno of
multi line strings are calculated correctly.
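The distinction the patch draws between where a string token starts and where it ends can be observed from Python with the standard tokenize module. This is only an illustration under that assumption: it does not exercise the C tokenizer fields named above, and string_token_spans is a hypothetical helper name.

```python
import io
import tokenize


def string_token_spans(source):
    """Return (start, end) pairs for every STRING token in source.

    Each position is a (row, column) tuple; rows are 1-based and
    columns are 0-based, matching lineno and col_offset in the AST.
    """
    spans = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.STRING:
            spans.append((tok.start, tok.end))
    return spans


# The string opens on line 1 at column 4 and closes on line 2.
for start, end in string_token_spans("x = '''foo\n'''\n"):
    print("start:", start, "end:", end)
```

The start position is what an AST consumer expects in lineno and col_offset. Deriving the position from the end of the token is what produced the last-line lineno described earlier in this thread.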
msg178614 - (view) Author: Carsten Klein (carsten.klein@axn-software.de) Date: 2012-12-30 21:25
I have created a patch for Python 2.7.3 that fixes the issue for that release, too.
msg179096 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-01-05 01:24
If this is really an 'enhancement', it will only go in 3.4. If it is a bug/behavior issue, then it should be marked as such and 2.7,3.2,3.3 selected. I have not read the doc and messages well enough to know, so I leave that to you and Benjamin.
The patch includes a test. It needs a patch to Misc/ACKS to add Carsten Klein between Reid Kleckner and Bastian Kleineidam
msg179097 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-01-05 01:27
I left comments on Rietveld a few days ago.
msg235448 - (view) Author: Anthony Sottile (asottile) * Date: 2015-02-05 19:30
Any updates on this? I'm running into this as well (still a problem in 3.4)
```
$ python3.4
Python 3.4.2 (default, Oct 11 2014, 17:59:27) 
[GCC 4.4.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> ast.parse("""'''foo\n'''""").body[0].value.col_offset
-1
```
msg294339 - (view) Author: Ivailo Karamanolev (karamanolev) Date: 2017-05-24 09:41
What's the status on this? Anything preventing it getting fixed? Still the same in 3.6.1:
>>> import ast
>>> ast.parse("""'''foo\n'''""").body[0].value.col_offset
-1
msg298016 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2017-07-10 01:05
pypy seems to have this right (though I don't know enough about their internals to know if cpython can benefit from their patch)
$ venvpypy/bin/python
Python 2.7.10 (3260adbeba4a, Apr 19 2016, 17:42:20)
[PyPy 5.1.0 with GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> import ast, astpretty
>>>> astpretty.pprint(ast.parse('"""\n"""'))
Module(
    body=[
        Expr(
            lineno=1,
            col_offset=0,
            value=Str(lineno=1, col_offset=0, s='\n'),
        ),
    ],
)
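A similar view of the positions is available without the third-party astpretty package, using ast.dump with include_attributes=True from the standard library. This is a sketch for comparison only; the exact text of the dump varies between Python versions, so no particular output is shown.

```python
import ast

# Parse the same two-line string literal used in the PyPy session above.
tree = ast.parse('"""\n"""')

# include_attributes=True adds lineno and col_offset (and, on newer
# versions, end positions) to every node in the dump.
dumped = ast.dump(tree, include_attributes=True)
print(dumped)
```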
msg313126 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2018-03-02 04:18
Still a problem in 3.7:
$ python3.7
Python 3.7.0b2 (default, Feb 28 2018, 06:59:18) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> ast.parse("""x = '''foo\n'''""").body[-1].value
<_ast.Str object at 0x7fcde6898358>
>>> ast.parse("""x = '''foo\n'''""").body[-1].value.col_offset
-1
msg333537 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-01-13 04:04
Should we backport this to 3.7?
AST changes, even bug fixes, can affect existing software unexpectedly.
msg333538 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-01-13 04:05
New changeset 995d9b92979768125ced4da3a56f755bcdf80f6e by INADA Naoki (Anthony Sottile) in branch 'master':
bpo-16806: Fix `lineno` and `col_offset` for multi-line string tokens (GH-10021)
https://github.com/python/cpython/commit/995d9b92979768125ced4da3a56f755bcdf80f6e
msg333544 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2019-01-13 04:50
I agree -- probably safer to not backport to 3.7 in case someone is relying on this behaviour.
msg348010 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-16 10:04
commit 995d9b92979768125ced4da3a56f755bcdf80f6e introduced a regression: bpo-37603: parsetok(): Assertion `(intptr_t)(int)(a - line_start) == (a - line_start)' failed, when running get-pip.py.
msg359632 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-01-08 21:12
> commit 995d9b92979768125ced4da3a56f755bcdf80f6e introduced a regression: bpo-37603: parsetok(): Assertion `(intptr_t)(int)(a - line_start) == (a - line_start)' failed, when running get-pip.py.
Fixed in https://bugs.python.org/issue39209 
History
Date | User | Action | Args
2022-04-11 14:57:39 | admin | set | github: 61010
2020-01-08 21:12:49 | pablogsal | set | status: open -> closed; nosy: + pablogsal; messages: + msg359632; resolution: fixed; stage: patch review -> resolved
2020-01-08 21:04:49 | lys.nikolaou | set | pull_requests: + pull_request17320
2019-07-16 10:04:59 | vstinner | set | nosy: + vstinner; messages: + msg348010
2019-01-13 04:50:03 | Anthony Sottile | set | messages: + msg333544
2019-01-13 04:05:22 | methane | set | messages: + msg333538
2019-01-13 04:04:52 | methane | set | nosy: + methane; messages: + msg333537
2019-01-13 04:01:33 | methane | set | versions: - Python 3.6
2018-10-21 01:08:41 | Anthony Sottile | set | pull_requests: + pull_request9361
2018-03-02 04:18:50 | Anthony Sottile | set | messages: + msg313126; versions: + Python 3.6, Python 3.7, Python 3.8, - Python 3.4
2017-07-10 01:05:41 | Anthony Sottile | set | nosy: + Anthony Sottile; messages: + msg298016
2017-05-25 02:29:35 | lukasz.langa | set | pull_requests: + pull_request1890
2017-05-24 09:41:10 | karamanolev | set | nosy: + karamanolev; messages: + msg294339
2015-11-16 15:24:15 | serhiy.storchaka | link | issue24623 superseder
2015-02-05 19:30:51 | asottile | set | messages: + msg235448
2015-02-05 19:27:28 | asottile | set | nosy: + asottile
2014-04-18 11:02:30 | Aivar.Annamaa | set | nosy: + Aivar.Annamaa
2013-07-06 16:51:43 | brett.cannon | link | issue18370 superseder
2013-01-05 01:27:46 | benjamin.peterson | set | messages: + msg179097
2013-01-05 01:24:07 | terry.reedy | set | nosy: + terry.reedy; messages: + msg179096; stage: patch review
2012-12-31 21:15:26 | benjamin.peterson | set | nosy: + benjamin.peterson
2012-12-30 21:25:46 | carsten.klein@axn-software.de | set | files: + python2.7.3.diff; messages: + msg178614
2012-12-29 10:59:55 | carsten.klein@axn-software.de | set | files: + issue16806.diff
2012-12-29 10:59:17 | carsten.klein@axn-software.de | set | files: - issue1680.diff
2012-12-29 10:56:30 | carsten.klein@axn-software.de | set | title: col_offset is -1 for multiline string expressions resembling docstrings -> col_offset is -1 and lineno is wrong for multiline string expressions
2012-12-29 10:50:56 | carsten.klein@axn-software.de | set | files: + issue1680.diff; keywords: + patch; messages: + msg178483
2012-12-29 01:08:42 | carsten.klein@axn-software.de | set | messages: + msg178452
2012-12-29 00:10:11 | carsten.klein@axn-software.de | set | messages: + msg178445
2012-12-29 00:08:22 | carsten.klein@axn-software.de | create |
