
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: shlex not posix compliant when parsing "foo#bar"
Type: enhancement
Stage: patch review
Components: Library (Lib)
Versions: Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: cadf, eric.araujo, ferringb, jjdmol2, meador.inge, terry.reedy
Priority: normal
Keywords: patch

Created on 2009-12-31 08:42 by jjdmol2, last changed 2022-04-11 14:56 by admin.

Files
File name         Uploaded                    Description
lexer_test.py     jjdmol2, 2009-12-31 09:10   test to show shlex behaviour
shlex_posix.diff  cadf, 2010-01-02 04:18
Messages (7)
msg97081 - Author: Jan David Mol (jjdmol2) Date: 2009-12-31 08:42
The shlex parser parses "foo#bar" as "foo", discarding the rest as a
comment. This is actually one of the test cases, even in POSIX mode.
However, POSIX (see below) only allows comments to start at the
beginning of a token, so "foo#bar" has to result in a "foo#bar" token.
To easily see this, do "echo foo#bar" in bash, versus "echo foo #bar".
Fixing this might break some applications that rely on this broken
behaviour, even though they're not strictly POSIX compliant.
POSIX 2008, Rationale C.2.3 (which refers to Shell & Utilities 2.3(10)):
The (10) rule about '#' as the current character is the first in the
sequence in which a new token is being assembled. The '#' starts a
comment only when it is at the beginning of a token. This rule is also
written to indicate that the search for the end-of-comment does not
consider escaped <newline> specially, so that a comment cannot be
continued to the next line.
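The rule quoted above can be sketched in Python as a small pre-pass that removes comments before tokenizing. This is only an illustration and not the attached patch: the helper names strip_posix_comments and posix_split are hypothetical, and the scanner handles just enough quoting and backslash escaping to decide whether a '#' sits at the beginning of a token.

```python
import shlex


def strip_posix_comments(text):
    """Drop '#' comments, but only when '#' begins a new token.

    Hypothetical helper for illustration; not part of shlex.
    """
    out = []
    quote = None            # active quote character, or None
    at_token_start = True   # True right after whitespace or at the start
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if quote:
            out.append(ch)
            if ch == "\\" and quote == '"' and i + 1 < n:
                # Inside double quotes a backslash protects the next char.
                out.append(text[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch == "\\" and i + 1 < n:
            # An escaped character (even '#') is part of a token.
            out.append(ch)
            out.append(text[i + 1])
            i += 1
            at_token_start = False
        elif ch in "'\"":
            quote = ch
            out.append(ch)
            at_token_start = False
        elif ch.isspace():
            out.append(ch)
            at_token_start = True
        elif ch == "#" and at_token_start:
            # Comment: skip to end of line, leaving the newline in place.
            while i < n and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
            at_token_start = False
        i += 1
    return "".join(out)


def posix_split(text):
    """Tokenize after removing comments; shlex.split() itself leaves '#' alone."""
    return shlex.split(strip_posix_comments(text))


print(posix_split("foo#bar"))    # ['foo#bar']
print(posix_split("foo #bar"))   # ['foo']
```

Doing the comment removal in a separate pass keeps the sketch independent of the lexer's internal state machine; a real fix would more likely live inside shlex itself.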
msg97082 - Author: Jan David Mol (jjdmol2) Date: 2009-12-31 09:10
Attached a program which shows the relevant behaviour:
import shlex

tests = ["foo#bar", "foo #bar"]
for t in tests:
    # Tokenize in POSIX mode and show the resulting token list.
    print("%s -> %s" % (t, list(shlex.shlex(t, posix=True))))
results in
$ python lexer_test.py
foo#bar -> ['foo']
foo #bar -> ['foo']
(expected of course is ['foo#bar'] on the first line).
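For comparison, the same two inputs can be run through both the lexer class and shlex.split(), whose comments parameter defaults to False. The following is a minimal Python 3 sketch; the helper name compare is made up for this example, and the results in the comments describe the existing shlex behaviour reported in this issue.

```python
import shlex


def compare(text):
    """Return the tokens produced by three ways of calling shlex (illustrative helper)."""
    return {
        # Lexer class in POSIX mode: '#' is a comment character by default.
        "shlex": list(shlex.shlex(text, posix=True)),
        # split() disables comment handling unless asked otherwise.
        "split": shlex.split(text),
        # split() with comment handling switched back on.
        "split_comments": shlex.split(text, comments=True),
    }


for text in ("foo#bar", "foo #bar"):
    print(text, "->", compare(text))

# "foo#bar"  -> shlex: ['foo'], split: ['foo#bar'],     split_comments: ['foo']
# "foo #bar" -> shlex: ['foo'], split: ['foo', '#bar'], split_comments: ['foo']
```

So callers who never want comment handling already get a 'foo#bar' token from shlex.split(); the behaviour discussed here only appears when '#' is active as a comment character.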
msg97125 - Author: (cadf) Date: 2010-01-02 04:18
Here's a patch addressing the behavior described.
msg112740 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-04 03:06
Given that test_shlex.py tests for the current behavior, it is hard to call this a bug in the tracker sense of the term. I would only change this in a new version.
The manual just says "When operating in POSIX mode, shlex will try to be as close as possible to the POSIX shell parsing rules." but gives no reference to which authority it is following or what the rules are in either case. Manual section 23.2.2. Parsing Rules only discusses the differences between posix and non-posix rules, not the common rules.
I suspect this module was written well over a decade ago, maybe closer to two. Is it possible that earlier versions were different on this issue? Or is the 2008 version only cosmetically different from some 1990s version?
msg148270 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-24 16:11
> The manual just says "When operating in POSIX mode, shlex will try to be as close as
> possible to the POSIX shell parsing rules." but gives no reference to which authority it is
> following or what the rules are in either case.
I think it actually does: The POSIX specification defines the behavior of a compliant /bin/sh shell.
See also #1521950.
msg148292 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-11-25 00:22
The doc section has no reference, as in a live web link, to any version of the POSIX specification. This is unlike other doc sections that implement various RFCs (which also get updated). The docs also link to specific references for the Unicode version supported, which has changed from version to version.
The OP quotes (without giving a link) from the 2008 version. POSIX and shlex are much older than that, implying that shlex might conform to an earlier version, just as other modules implement older RFCs that have been superseded.
msg148456 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-11-27 18:27
Here are some of the relevant links from POSIX 2008:
 1. Shell Command Language - http://pubs.opengroup.org/onlinepubs/9699919799/idx/shell.html
 3. Shell Command Language Rationale - http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html
Sections 2.3 (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03) and 2.10 (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10) of [1] are particularly relevant.
History
Date                 User          Action  Args
2022-04-11 14:56:55  admin         set     github: 51860
2020-11-11 17:08:24  iritkatriel   set     versions: + Python 3.9, Python 3.10, - Python 3.2
2011-11-27 18:27:51  meador.inge   set     nosy: + meador.inge; messages: + msg148456
2011-11-25 00:22:04  terry.reedy   set     messages: + msg148292
2011-11-24 16:11:17  eric.araujo   set     nosy: + eric.araujo; messages: + msg148270
2010-08-04 03:06:41  terry.reedy   set     versions: - Python 2.6, Python 2.5, Python 3.1, Python 2.7; nosy: + terry.reedy; messages: + msg112740; type: behavior -> enhancement; stage: test needed -> patch review
2010-01-02 04:18:10  cadf          set     files: + shlex_posix.diff; nosy: + cadf; messages: + msg97125; keywords: + patch
2009-12-31 09:10:35  jjdmol2       set     files: + lexer_test.py; messages: + msg97082; versions: + Python 2.5
2009-12-31 08:48:25  ezio.melotti  set     priority: normal; nosy: + ferringb; versions: - Python 2.5; stage: test needed
2009-12-31 08:42:42  jjdmol2       create
