
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Normalise non-ASCII variable names in __all__
Type: behavior
Stage:
Components: Unicode
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Nate Soares, ezio.melotti, mbussonn, mrabarnett, steven.daprano, vstinner
Priority: normal
Keywords:

Created on 2017-06-26 18:08 by Nate Soares, last changed 2022-04-11 14:58 by admin.

Messages (6)
msg296928 - Author: Nate Soares (Nate Soares) Date: 2017-06-26 18:08
[NOTE: In this comment, I use BB to mean unicode character 0x1D539, b/c the issue tracker won't let me submit a comment with unicode characters in it.]
Directory structure:
repro/
 foo.py
 test_foo.py
Contents of foo.py:
 BB = 1
 __all__ = ['BB']
Contents of test_foo.py:
 from .foo import *
Error message:
 AttributeError: module 'repro.foo' has no attribute 'BB'
If I change foo.py to have `__all__ = ['B']` (note that 'B' is not the same as 'BB'), then everything works "fine", modulo the fact that now foo.B is a thing and foo.BB is not a thing.
[Recall that in the above, BB is a placeholder for U+1D539, which the issuetracker prevents me from writing here.]
msg296934 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2017-06-26 19:27
I can reproduce the issue:
$ cat foo.py 
BB = 1
__all__ = ['BB']
$ python3 -c 'import foo; print(dir(foo)); from foo import *'
['BB', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
Traceback (most recent call last):
 File "<string>", line 1, in <module>
AttributeError: module 'foo' has no attribute 'BB'
(Note the ascii 'BB' in the dir(foo))
There's also an easier way to reproduce it:
>>> BB = 3
>>> BB
3
>>> BB
3
>>> globals()['BB']
3
>>> globals()['BB']
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
KeyError: 'BB'
>>> globals()
{'__name__': '__main__', '__spec__': None, '__builtins__': <module 'builtins' (built-in)>, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__doc__': None, 'BB': 3, '__package__': None}
>>> class Foo:
...     BB = 3
... 
>>> Foo.BB
3
>>> Foo.BB
3
It seems the non-ASCII 'BB' gets normalized to the ASCII 'BB' when it's used as an identifier, but not when it appears in a string. I'm not sure why this happens though.
msg296935 - Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2017-06-26 19:49
See PEP 3131 -- Supporting Non-ASCII Identifiers
It says: """All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC."""
>>> import unicodedata
>>> unicodedata.name(unicodedata.normalize('NFKC', '\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}'))
'LATIN CAPITAL LETTER B'
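The asymmetry between identifiers and strings can be seen without creating any files. The following is only a minimal sketch: the names `DOUBLE_STRUCK_B`, `namespace`, `found_ascii` and `found_raw` are made up for illustration, and `exec` stands in for importing a module.

```python
import unicodedata

# U+1D539, the character the original report writes as "BB".
DOUBLE_STRUCK_B = "\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}"

namespace = {}
# The parser converts the identifier to NFKC before the name is bound.
exec(f"{DOUBLE_STRUCK_B} = 1", namespace)

# The binding is stored under the normalised (ASCII) name...
found_ascii = "B" in namespace
# ...while the raw string, which nothing normalises, is not a key.
found_raw = DOUBLE_STRUCK_B in namespace

print(found_ascii, found_raw)  # True False
```

A string in `__all__` is looked up like the raw key here, which would explain why the star-import cannot find the attribute.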
msg297284 - Author: Nate Soares (Nate Soares) Date: 2017-06-29 17:03
To be clear, the trouble I was trying to point at is that if foo.py didn't
have __all__, then it would still have a BB attribute. But if the module is
given __all__, the BB is normalized away into a B. This seems like pretty
strange/counterintuitive behavior. For instance, I found this bug when I
added __all__ to a mathy library, where other modules had previously been
happily importing BB and using <module>.BB etc. with no trouble.
In other words, I could accept "BB gets normalized to B always", but the
current behavior is "modules are allowed to have a BB attribute but only if
they don't use __all__, because __all__ requires putting the BB through a
process that normalizes it to B, and which otherwise doesn't get run".
If this is "working as intended" then w/e, I'll work around it, but I want
to make sure that we all understand the inconsistency before letting this
bug die in peace :-)
On Wed, Jun 28, 2017 at 10:55 AM Brett Cannon <report@bugs.python.org>
wrote:
>
> Changes by Brett Cannon <brett@python.org>:
>
>
> ----------
> resolution: -> not a bug
> stage: -> resolved
> status: open -> closed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue30772>
> _______________________________________
>
msg297333 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2017-06-30 00:20
I think that the names in __all__ should have the same NFKC normalisation applied as the identifiers.
Re-opening for 3.7.
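Until something like that exists in the import machinery, a module author could apply the normalisation by hand. This is a sketch of a possible workaround, not what CPython does; the helper name `normalized_all` is hypothetical.

```python
import unicodedata


def normalized_all(names):
    """Return the names with the NFKC normalisation that identifiers get."""
    return [unicodedata.normalize("NFKC", name) for name in names]


# In foo.py this would be written as:
#     __all__ = normalized_all(['<U+1D539>'])
raw_names = ["\N{MATHEMATICAL DOUBLE-STRUCK CAPITAL B}", "plain_name"]
print(normalized_all(raw_names))  # ['B', 'plain_name']
```

Names that are already in NFKC form, including all pure-ASCII names, pass through unchanged, so the helper is safe to apply to every entry.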
msg297739 - Author: Matthias Bussonnier (mbussonn) * Date: 2017-07-05 13:36
> I think that the names in __all__ should have the same NFKC normalisation applied as the identifiers.
 
Does it make sense to add to this issue: ensure that all elements of __all__ are str? (Or at least emit a warning?)
I have encountered a small number of libraries where some members of __all__ are the actual objects. It is an easy mistake to make if you write a public decorator:
 __all__ = []
 def public(o):
     __all__.append(o)
     return o
 @public
 def bar():
     pass
Happy to open a different issue if deemed necessary. Thanks!
History
Date                 User            Action  Args
2022-04-11 14:58:48  admin           set     github: 74955
2017-07-05 13:36:35  mbussonn        set     nosy: + mbussonn
                                             messages: + msg297739
2017-06-30 00:20:16  steven.daprano  set     status: closed -> open
                                             title: If I make an attribute " -> Normalise non-ASCII variable names in __all__
                                             messages: + msg297333
                                             versions: + Python 3.7, - Python 3.6
                                             type: behavior
                                             resolution: not a bug ->
                                             stage: resolved ->
2017-06-29 17:03:40  Nate Soares     set     messages: + msg297284
                                             title: If I make an attribute "[a unicode version of B]", it gets assigned to "[ascii B]", and so on. -> If I make an attribute "
2017-06-28 17:55:21  brett.cannon    set     status: open -> closed
                                             resolution: not a bug
                                             stage: resolved
2017-06-27 14:54:40  steven.daprano  set     nosy: + steven.daprano
2017-06-26 19:49:27  mrabarnett      set     nosy: + mrabarnett
                                             messages: + msg296935
2017-06-26 19:27:47  ezio.melotti    set     messages: + msg296934
2017-06-26 18:08:51  Nate Soares     create
