Commit 4535d77

chrisjsewell and ubitux authored
⬆️ Comply with Commonmark 0.31.2 (#362)
This PR ports markdown-it/markdown-it@cd24778, which in turn complies with https://spec.commonmark.org/0.31.2/changes.html:

- Unicode:

  ```diff
    A [Unicode punctuation character](@) is
  - an [ASCII punctuation character] or anything in
  - the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
  + a character in the Unicode `P` (punctuation) or `S` (symbol) general categories.
  ```

- HTML comments:

  ```diff
  - An HTML comment consists of `<!--` + *text* + `-->`,
  - where *text* does not start with `>` or `->`, does not end with `-`, and does not contain `--`.
  - (See the [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
  + An [HTML comment](@) consists of `<!-->`, `<!--->`, or `<!--`, a string of characters not including the string `-->`, and `-->`
  + (see the [HTML spec](https://html.spec.whatwg.org/multipage/parsing.html#markup-declaration-open-state)).
  ```

- HTML blocks:

  ```diff
    Start condition: line begins with the string < or </ followed by one of the strings (case-insensitive)
  - `section`, `source`, `summary`, `table`, `tbody`, `td`,
  + `search`, `section`, `summary`, `table`, `tbody`, `td`,
  ```

- Setext header:

  ```diff
  - If a line containing a single `-` can be interpreted as an
  - empty [list items], it should be interpreted this way
  - and not as a [setext heading underline].
  ```

Co-Authored-By: Clément Bœsch <34467+ubitux@users.noreply.github.com>
1 parent 8eb20ac commit 4535d77

9 files changed: +2459 -2436 lines changed
‎markdown_it/common/html_blocks.py

Lines changed: 2 additions & 1 deletion

@@ -2,6 +2,7 @@
 http://jgm.github.io/CommonMark/spec.html#html-blocks
 """
 
+# see https://spec.commonmark.org/0.31.2/#html-blocks
 block_names = [
     "address",
     "article",
@@ -52,8 +53,8 @@
     "option",
     "p",
     "param",
+    "search",
     "section",
-    "source",
     "summary",
     "table",
     "tbody",
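
To see what swapping `source` for `search` changes in practice, here is a minimal sketch of an HTML block start check built from the name list. The abbreviated list, the `START_CONDITION_6` pattern, and the `starts_html_block` helper are assumptions made for illustration, modelled on the spec wording quoted in the commit message rather than copied from this repository.

```python
import re

# Hypothetical, abbreviated subset of block_names as it stands after this commit.
block_names = ["option", "p", "param", "search", "section", "summary", "table", "tbody"]

# Assumed pattern: "<" or "</", one of the names (case-insensitive),
# then whitespace, ">" or "/>", or the end of the line.
START_CONDITION_6 = re.compile(
    r"^</?(" + "|".join(block_names) + r")(?=(\s|/?>|$))",
    re.IGNORECASE,
)


def starts_html_block(line: str) -> bool:
    """Return True if the line opens an HTML block under the assumed pattern."""
    return START_CONDITION_6.match(line) is not None


for line in ["<search>", "</SECTION>", "<source src='a.mp4'>", "<sectional>"]:
    print(f"{line!r}: {starts_html_block(line)}")
```

Under this sketch, `<search>` now opens an HTML block, while a line starting with `<source` no longer does so through this start condition.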

‎markdown_it/common/html_re.py

Lines changed: 2 additions & 2 deletions

@@ -15,9 +15,9 @@
 open_tag = "<[A-Za-z][A-Za-z0-9\\-]*" + attribute + "*\\s*\\/?>"
 
 close_tag = "<\\/[A-Za-z][A-Za-z0-9\\-]*\\s*>"
-comment = "<!---->|<!--(?:-?[^>-])(?:-?[^-])*-->"
+comment = "<!---?>|<!--(?:[^-]|-[^-]|--[^>])*-->"
 processing = "<[?][\\s\\S]*?[?]>"
-declaration = "<![A-Z]+\\s+[^>]*>"
+declaration = "<![A-Za-z][^>]*>"
 cdata = "<!\\[CDATA\\[[\\s\\S]*?\\]\\]>"
 
 HTML_TAG_RE = re.compile(
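
The effect of the two pattern changes can be checked in isolation. The sketch below compiles the old and new `comment` and `declaration` strings from the diff on their own (written as raw strings, so `\\s` becomes `\s`). The `OLD_*`/`NEW_*` names and the sample inputs are made up for illustration, and the real module combines these fragments into `HTML_TAG_RE` rather than using them standalone.

```python
import re

# Patterns taken from the diff above, compiled standalone for comparison.
OLD_COMMENT = re.compile(r"<!---->|<!--(?:-?[^>-])(?:-?[^-])*-->")
NEW_COMMENT = re.compile(r"<!---?>|<!--(?:[^-]|-[^-]|--[^>])*-->")
OLD_DECLARATION = re.compile(r"<![A-Z]+\s+[^>]*>")
NEW_DECLARATION = re.compile(r"<![A-Za-z][^>]*>")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# The degenerate comments and an inner "--" are only accepted by the new pattern.
for text in ["<!-- ok -->", "<!-->", "<!--->", "<!-- a -- b -->"]:
    print(f"{text!r}: old={accepts(OLD_COMMENT, text)} new={accepts(NEW_COMMENT, text)}")

# Lowercase declaration names are only accepted by the new pattern.
for text in ["<!DOCTYPE html>", "<!doctype html>"]:
    print(f"{text!r}: old={accepts(OLD_DECLARATION, text)} new={accepts(NEW_DECLARATION, text)}")
```

The new comment pattern accepts `<!-->`, `<!--->`, and comments containing `--`, in line with the revised spec wording. The new declaration pattern no longer requires an uppercase name followed by whitespace.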

‎markdown_it/common/utils.py

Lines changed: 2 additions & 6 deletions

@@ -5,6 +5,7 @@
 import re
 from re import Match
 from typing import TypeVar
+import unicodedata
 
 from .entities import entities
 
@@ -192,15 +193,10 @@ def isWhiteSpace(code: int) -> bool:
 
 # //////////////////////////////////////////////////////////////////////////////
 
-UNICODE_PUNCT_RE = re.compile(
-    r"[!-#%-\*,-\/:;\?@\[-\]_\{\}\xA1\xA7\xAB\xB6\xB7\xBB\xBF\u037E\u0387\u055A-\u055F\u0589\u058A\u05BE\u05C0\u05C3\u05C6\u05F3\u05F4\u0609\u060A\u060C\u060D\u061B\u061E\u061F\u066A-\u066D\u06D4\u0700-\u070D\u07F7-\u07F9\u0830-\u083E\u085E\u0964\u0965\u0970\u09FD\u0A76\u0AF0\u0C84\u0DF4\u0E4F\u0E5A\u0E5B\u0F04-\u0F12\u0F14\u0F3A-\u0F3D\u0F85\u0FD0-\u0FD4\u0FD9\u0FDA\u104A-\u104F\u10FB\u1360-\u1368\u1400\u166D\u166E\u169B\u169C\u16EB-\u16ED\u1735\u1736\u17D4-\u17D6\u17D8-\u17DA\u1800-\u180A\u1944\u1945\u1A1E\u1A1F\u1AA0-\u1AA6\u1AA8-\u1AAD\u1B5A-\u1B60\u1BFC-\u1BFF\u1C3B-\u1C3F\u1C7E\u1C7F\u1CC0-\u1CC7\u1CD3\u2010-\u2027\u2030-\u2043\u2045-\u2051\u2053-\u205E\u207D\u207E\u208D\u208E\u2308-\u230B\u2329\u232A\u2768-\u2775\u27C5\u27C6\u27E6-\u27EF\u2983-\u2998\u29D8-\u29DB\u29FC\u29FD\u2CF9-\u2CFC\u2CFE\u2CFF\u2D70\u2E00-\u2E2E\u2E30-\u2E4E\u3001-\u3003\u3008-\u3011\u3014-\u301F\u3030\u303D\u30A0\u30FB\uA4FE\uA4FF\uA60D-\uA60F\uA673\uA67E\uA6F2-\uA6F7\uA874-\uA877\uA8CE\uA8CF\uA8F8-\uA8FA\uA8FC\uA92E\uA92F\uA95F\uA9C1-\uA9CD\uA9DE\uA9DF\uAA5C-\uAA5F\uAADE\uAADF\uAAF0\uAAF1\uABEB\uFD3E\uFD3F\uFE10-\uFE19\uFE30-\uFE52\uFE54-\uFE61\uFE63\uFE68\uFE6A\uFE6B\uFF01-\uFF03\uFF05-\uFF0A\uFF0C-\uFF0F\uFF1A\uFF1B\uFF1F\uFF20\uFF3B-\uFF3D\uFF3F\uFF5B\uFF5D\uFF5F-\uFF65]|\uD800[\uDD00-\uDD02\uDF9F\uDFD0]|\uD801\uDD6F|\uD802[\uDC57\uDD1F\uDD3F\uDE50-\uDE58\uDE7F\uDEF0-\uDEF6\uDF39-\uDF3F\uDF99-\uDF9C]|\uD803[\uDF55-\uDF59]|\uD804[\uDC47-\uDC4D\uDCBB\uDCBC\uDCBE-\uDCC1\uDD40-\uDD43\uDD74\uDD75\uDDC5-\uDDC8\uDDCD\uDDDB\uDDDD-\uDDDF\uDE38-\uDE3D\uDEA9]|\uD805[\uDC4B-\uDC4F\uDC5B\uDC5D\uDCC6\uDDC1-\uDDD7\uDE41-\uDE43\uDE60-\uDE6C\uDF3C-\uDF3E]|\uD806[\uDC3B\uDE3F-\uDE46\uDE9A-\uDE9C\uDE9E-\uDEA2]|\uD807[\uDC41-\uDC45\uDC70\uDC71\uDEF7\uDEF8]|\uD809[\uDC70-\uDC74]|\uD81A[\uDE6E\uDE6F\uDEF5\uDF37-\uDF3B\uDF44]|\uD81B[\uDE97-\uDE9A]|\uD82F\uDC9F|\uD836[\uDE87-\uDE8B]|\uD83A[\uDD5E\uDD5F]"
-)
-
 
-# Currently without astral characters support.
 def isPunctChar(ch: str) -> bool:
     """Check if character is a punctuation character."""
-    return UNICODE_PUNCT_RE.search(ch) is not None
+    return unicodedata.category(ch).startswith(("P", "S"))
 
 
 MD_ASCII_PUNCT = {
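
The replacement one-liner relies on the standard library's `unicodedata.category`, which returns a two-letter general category for a single character. The sketch below reproduces the new check under a hypothetical name, `is_punct_char`, so that it runs outside the package, and prints the category of a few sample characters chosen for illustration.

```python
import unicodedata


def is_punct_char(ch: str) -> bool:
    """Standalone copy of the new check: general category P* or S*."""
    return unicodedata.category(ch).startswith(("P", "S"))


# Sample characters: ASCII punctuation, symbols, a letter, a space,
# and one astral (non-BMP) symbol.
for ch in ["!", "$", "+", "\u20ac", "a", " ", "\U0001F600"]:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {is_punct_char(ch)}")
```

Symbol characters such as `$` (`Sc`), `+` (`Sm`), and `€` (`Sc`) now count as punctuation, as do astral characters like U+1F600 (`So`), which the removed comment noted were not supported. `unicodedata.category` expects a single-character string, so callers need to pass one character at a time.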
