This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python Developer Guide.
Created on 2022-04-06 17:23 by LiarPrincess, last changed 2022-04-11 14:59 by admin.
| Pull Requests | | |
|---|---|---|
| URL | Status | Linked |
| PR 32376 | open | LiarPrincess, 2022-04-06 17:28 |
| Messages (2) | | | |
|---|---|---|---|
| msg416889 | Author: LiarPrincess (LiarPrincess) * | Date: 2022-04-06 17:23 | |
This one is so tiny that I'm not really sure we want to merge it...

=== Problem ===

`Objects/unicodetype_db.h` starts in the following way:

```c
/* a list of unique character type descriptors */
const _PyUnicode_TypeRecord _PyUnicode_TypeRecords[] = {
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 32},
    {0, 0, 0, 0, 0, 48},
    ...
```

The 1st record (`{0, 0, 0, 0, 0, 0}`) is duplicated. This is not a problem, since the 1st occurrence is never used, but if we wanted to remove it then this is the ticket about it.

=== Detailed description ===

`Objects/unicodetype_db.h` is generated by `Tools/unicode/makeunicodedata.py` (I removed irrelevant lines):

```py
def makeunicodetype(unicode, trace):
    dummy = (0, 0, 0, 0, 0, 0)
    table = [dummy]  # (1)
    cache = {0: dummy}  # (2)

    for char in unicode.chars:
        # Things...
        item = (upper, lower, title, decimal, digit, flags)

        i = cache.get(item)  # (3)
        if i is None:
            cache[item] = i = len(table)
            table.append(item)
        index[char] = i
```

- (1) - list which contains unique character properties (as `(upper, lower, title, decimal, digit, flags)` tuples)
- (2) - mapping from character properties to index in `table` - improperly initialized as a mapping from index to character properties
- (3) - we check if the current tuple is in `cache`

=== Result ===

The first time we get to a character that has `(0, 0, 0, 0, 0, 0)` properties (which is code point 0 - `NULL`) we check if it is in the cache. It is not (there is an entry that goes from index `0` to `(0, 0, 0, 0, 0, 0)` - the other way around), so we add this entry to `table` and `cache`.

=== Fix ===

In the line `(2)` we should have: `cache = {dummy: 0}`. Obviously after doing so we have to run `makeunicodedata.py` - this is why this simple change modifies a lot of lines.

I will submit a PR on GitHub in just a sec... |
|||
| msg416892 | Author: LiarPrincess (LiarPrincess) * | Date: 2022-04-06 17:48 | |
The CLA is signed, but there is this 'it might take a few days before your tracker profile is updated' notice. Added version 3.11 (the issue is also present in previous versions, but there is no point in back-porting it). GitHub: https://github.com/python/cpython/pull/32376 |
|||
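The cache mix-up described in msg416889 can be reproduced in isolation. The following is only a minimal sketch, not the actual generator code: `build_table`, its `fixed` flag, and the `records` sample are hypothetical names and data invented for illustration, and the loop merely mirrors the simplified excerpt quoted in the message.

```python
DUMMY = (0, 0, 0, 0, 0, 0)


def build_table(items, fixed):
    """Deduplicate `items` into a table; return (table, index).

    Hypothetical helper: `fixed=False` mimics the initialisation quoted in
    the message (index -> record), `fixed=True` the proposed one
    (record -> index).
    """
    table = [DUMMY]
    cache = {DUMMY: 0} if fixed else {0: DUMMY}
    index = []
    for item in items:
        i = cache.get(item)  # lookup is always keyed by the record tuple
        if i is None:
            cache[item] = i = len(table)
            table.append(item)
        index.append(i)
    return table, index


# Made-up stand-ins for the records of the first few characters.
records = [
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 32),
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 48),
]

buggy_table, buggy_index = build_table(records, fixed=False)
fixed_table, fixed_index = build_table(records, fixed=True)

print(buggy_table.count(DUMMY), fixed_table.count(DUMMY))
```

In this sketch both variants resolve every index to the right record, which matches the observation that the duplicate is harmless; the only difference is that the buggy variant never hands out index `0` and stores the all-zero record twice.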
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:58 | admin | set | github: 91399 |
| 2022-04-06 17:48:43 | LiarPrincess | set | messages: + msg416892; versions: + Python 3.11 |
| 2022-04-06 17:28:51 | LiarPrincess | set | keywords: + patch; stage: patch review; pull_requests: + pull_request30419 |
| 2022-04-06 17:23:31 | LiarPrincess | create | |