Issue 30455: Generate all tokens related code and docs from Grammar/Tokens



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/74640

classification

Title:	Generate all tokens related code and docs from Grammar/Tokens
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core, Library (Lib)	Versions:	Python 3.8

process

Dependencies:	Superseder:
Status:	closed	Resolution:	fixed
Assigned To:	serhiy.storchaka	Nosy List:	Albert-Jan Nijburg, benjamin.peterson, matrixise, meador.inge, r.david.murray, serhiy.storchaka, vstinner
Priority:	normal	Keywords:	patch

Created on 2017-05-24 12:21 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 1860	closed	serhiy.storchaka, 2017-05-30 12:52
PR 9343	emilyemorehouse, 2018-09-17 14:46
PR 10370	merged	serhiy.storchaka, 2018-11-06 19:14
PR 10497	emilyemorehouse, 2018-11-20 19:28

Messages (12)
msg294350 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2017-05-24 12:21
Currently Lib/token.py is generated from Include/token.h. This contradicts the common practice of generating C code from Python code (see, for example, opcode.py and sre_constants.py). In addition, the table in Parser/tokenizer.c has to be maintained manually to match Include/token.h. Generating Include/token.h and Parser/tokenizer.c from Lib/token.py would be simpler and more reliable.
msg294356 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2017-05-24 14:29
I like the idea.
msg294361 - (view)	Author: Stéphane Wirtel (matrixise) * (Python committer)	Date: 2017-05-24 15:08
I can work on it.
msg294363 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2017-05-24 15:20
I have already written a patch.
msg294753 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2017-05-30 13:02
PR 1860 makes the following files be generated from token.py:
* Include/token.h
* Parser/token.c. A new file containing the array of token names _PyParser_TokenNames and the functions PyToken_OneChar(), PyToken_TwoChars() and PyToken_ThreeChars(), moved from Parser/tokenizer.c.
* Doc/library/token-list.inc. A new file containing the list of token.py constants; it is included in Doc/library/token.rst.
A new Makefile target, regen-token, regenerates these files. The dict EXACT_TOKEN_TYPES, which maps operator strings to token names, is now generated automatically and has moved from tokenize.py to token.py. The tokens COMMENT, NL and ENCODING, used only in tokenize.py, are now added to token.py as in issue25324.
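Editor's illustration: a minimal sketch of how the relocated names can be inspected from Python. It assumes Python 3.8 or later, where token.EXACT_TOKEN_TYPES is available in the standard library; in that version the dict values are the integer token constants, which token.tok_name turns back into names. The variable names are illustrative only.

```python
import token

# Look up the token constant for an operator string.
# Assumes Python 3.8+, where EXACT_TOKEN_TYPES lives in token.py.
plus = token.EXACT_TOKEN_TYPES["+"]

# tok_name maps the integer constant back to its symbolic name.
plus_name = token.tok_name[plus]

# The tokens used only by tokenize.py are also exposed by token.py.
tokenize_only = [token.tok_name[t]
                 for t in (token.COMMENT, token.NL, token.ENCODING)]

print(plus_name, tokenize_only)
```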
msg294754 - (view)	Author: Albert-Jan Nijburg (Albert-Jan Nijburg) *	Date: 2017-05-30 13:14
I think this covers all the changes from PR #1608. It looks a lot nicer too, building it every time from the Makefile. You may want to add to the docs that token.py is now the source of the tokens.
msg294833 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2017-05-31 11:10
The regular expression tokenize.Funny can also be generated. There is not enough information to distinguish between Operator, Bracket and Special, but it seems this isn't needed. Some token names can be generated from Grammar/Grammar, but an additional mapping is needed for the relations between token strings and names ('+' <-> PLUS, etc.).
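Editor's illustration: a minimal sketch of the idea of deriving several artifacts from one list of token names and strings. The SPEC text, its "NAME 'string'" layout, and the helpers parse_spec, make_defines and make_operator_pattern are hypothetical and simplified; they are not the code from either PR. The regular expression is built longest-first so that '<=' is tried before '<', which is one possible way to generate something like tokenize.Funny.

```python
import ast
import re

# Hypothetical excerpt: one token per line, optionally followed by
# the quoted operator string it corresponds to.
SPEC = """\
ENDMARKER
NAME
NUMBER
LPAR '('
PLUS '+'
LESS '<'
LESSEQUAL '<='
"""

def parse_spec(text):
    """Return (names in order, mapping of operator string -> name)."""
    names = []
    string_to_name = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        names.append(fields[0])
        if len(fields) > 1:
            # The second field is a quoted Python string literal.
            string_to_name[ast.literal_eval(fields[1])] = fields[0]
    return names, string_to_name

def make_defines(names):
    """Render C #define lines, numbering tokens by position."""
    return [f"#define {name} {value}" for value, name in enumerate(names)]

def make_operator_pattern(strings):
    """Build an alternation that prefers longer operators."""
    ordered = sorted(strings, key=len, reverse=True)
    return "|".join(re.escape(s) for s in ordered)

names, string_to_name = parse_spec(SPEC)
defines = make_defines(names)
pattern = make_operator_pattern(string_to_name)
```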
msg329375 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2018-11-06 19:18
The alternative PR 10370 generates all files from a single file, Grammar/Tokens, using a single script, Tools/scripts/generate_token.py. In addition, the script doesn't write files when their content has not changed, so it can be used with read-only sources.
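Editor's illustration: a minimal sketch of the "don't write when the content is unchanged" behaviour described above. The helper write_if_changed is hypothetical and is not taken from Tools/scripts/generate_token.py; it only shows why such a check lets a generator run against read-only sources, since an up-to-date file is never opened for writing.

```python
import os
import tempfile

def write_if_changed(path, content):
    """Write content to path only if it differs from what is there.

    Returns True if the file was (re)written, False if it was left alone.
    """
    try:
        with open(path, encoding="utf-8") as f:
            if f.read() == content:
                return False  # up to date: never open for writing
    except FileNotFoundError:
        pass  # no file yet, so it has to be created
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True

# Small demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "generated.h")
    first = write_if_changed(target, "#define PLUS 14\n")
    second = write_if_changed(target, "#define PLUS 14\n")
    print(first, second)
```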
msg330053 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2018-11-18 17:25
Could anybody please review this? There are two alternative PRs: PR 1860 and PR 10370. The difference between them is that the former uses Lib/token.py as the source, while the latter uses Grammar/Tokens as the source and generates Lib/token.py too.
msg332195 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2018-12-20 07:54
If there are no objections, I am going to merge PR 10370 in a few days.
msg332205 - (view)	Author: STINNER Victor (vstinner) * (Python committer)	Date: 2018-12-20 09:57
> If there are no objections I am going to merge PR 10370 in few days.
LGTM. I guess that PR 9343 should be closed once PR 10370 is merged.
msg332459 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)	Date: 2018-12-24 14:48
New changeset 8ac658114dec4964479baecfbc439fceb40eaa79 by Serhiy Storchaka in branch 'master': bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370) https://github.com/python/cpython/commit/8ac658114dec4964479baecfbc439fceb40eaa79

History
Date	User	Action	Args
2022-04-11 14:58:46	admin	set	github: 74640
2018-12-24 14:48:06	serhiy.storchaka	set	messages: + msg332459
2018-12-22 09:26:57	serhiy.storchaka	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2018-12-20 09:57:27	vstinner	set	messages: + msg332205
2018-12-20 07:54:30	serhiy.storchaka	set	messages: + msg332195
2018-11-20 19:28:26	emilyemorehouse	set	pull_requests: + pull_request9865
2018-11-18 17:25:34	serhiy.storchaka	set	messages: + msg330053
2018-11-06 19:27:01	serhiy.storchaka	set	title: Generate C code from token.py and not vice versa -> Generate all tokens related code and docs from Grammar/Tokens versions: + Python 3.8, - Python 3.7
2018-11-06 19:18:30	serhiy.storchaka	set	messages: + msg329375
2018-11-06 19:14:43	serhiy.storchaka	set	pull_requests: + pull_request9671
2018-09-17 14:46:12	emilyemorehouse	set	pull_requests: + pull_request8783
2018-02-14 10:32:23	serhiy.storchaka	set	pull_requests: - pull_request5479
2018-02-14 10:31:47	zach.ware	set	keywords: + patch pull_requests: + pull_request5479
2017-05-31 11:10:44	serhiy.storchaka	set	messages: + msg294833
2017-05-30 13:14:53	Albert-Jan Nijburg	set	messages: + msg294754
2017-05-30 13:02:08	serhiy.storchaka	set	messages: + msg294753 stage: patch review
2017-05-30 12:52:07	serhiy.storchaka	set	pull_requests: + pull_request1943
2017-05-24 15:20:37	serhiy.storchaka	set	assignee: serhiy.storchaka messages: + msg294363
2017-05-24 15:08:17	matrixise	set	nosy: + matrixise messages: + msg294361
2017-05-24 14:29:18	vstinner	set	messages: + msg294356
2017-05-24 12:21:49	serhiy.storchaka	create
