Note greediness of PEP 723 reference parser #1960

Open

Issue Description

While preparing a PR for PEP 723 support in pip, I noticed that the reference parser defined by the PEP and listed in the PyPA docs collates multiple adjacent /// TYPE blocks into a single match, even when they are separated by a comment line (which the spec calls a "content line"). Blocks separated by a blank line are still matched separately. This greedy collation is surprising and complicates distinguishing error cases, so I think it merits a warning in the docs if the specification itself cannot be updated.

I believe this quirk is caused by the last + in the reference regex (the quantifier on the content group) being greedy: it matches all the way to the final closing /// instead of stopping at the first available one. In my limited experimentation, replacing this quantifier with +? resolves the issue and produces the expected number of matches.
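A minimal sketch of that comparison, assuming the only change is turning the final `+` into `+?`. The names `GREEDY`, `LAZY`, `SCRIPT`, and `count_blocks` are illustrative and do not come from the PEP:

```python
import re

# Reference regex from PEP 723, plus a variant whose final quantifier is lazy.
GREEDY = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
LAZY = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+?)^# ///$"

# Two script blocks separated only by a bare "#" comment line.
SCRIPT = """\
# /// script
# data (1)
# ///
#
# /// script
# data (2)
# ///
"""


def count_blocks(pattern, text, name="script"):
    """Count the matches of `pattern` whose block type equals `name`."""
    return sum(1 for m in re.finditer(pattern, text) if m.group("type") == name)


greedy_count = count_blocks(GREEDY, SCRIPT)
lazy_count = count_blocks(LAZY, SCRIPT)
print(greedy_count)  # 1: both blocks are swallowed into one match
print(lazy_count)    # 2: each block is matched on its own
```

The lazy variant stops at the first closing `# ///` it can reach, so each block ends where a reader would expect. Whether this change is appropriate for the specification itself is a separate question.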

This shouldn't slip through anybody's code unnoticed, since the collated content is invalid TOML (the interior /// line is a syntax error), but it is a surprising enough edge case that I thought it worth reporting here.

Reproduction script:
import re
script_A = """
# /// script
# data (1)
# ///
#
# /// script
# data (2)
# ///
"""
script_B = """
# /// script
# data (1)
# ///

# /// script
# data (2)
# ///
"""
# These lines adapted from PEP 723's reference parser:
# https://peps.python.org/pep-0723/#reference-implementation
REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
name = "script"
matches_A = list(
    filter(lambda m: m.group("type") == name, re.finditer(REGEX, script_A))
)
matches_B = list(
    filter(lambda m: m.group("type") == name, re.finditer(REGEX, script_B))
)
# output:
# 1
# 2
print(len(matches_A))
print(len(matches_B))
