64

I want to build a Python function that calculates,

[image: equation]

and would like to name my summation function Σ. In a similar fashion, I would like to use Π for product, and so on. Is there a way to name a Python function in this fashion?

def Σ (..):
 ..
 ..

That is, does Python support unicode identifiers, and if so, could someone provide an example for it?

Thanks!


Original motivation for this was a piece of Clojure code I saw today that looks like,

(defn entropy [X]
 (* -1 (Σ [i X] (* (p i) (log (p i))))))

where Σ is a macro defined as,

(defmacro Σ
 ... )

and I thought that was pretty cool.


BTW, to address a couple of comments about readability - with a lot of stats/ML code for instance, being able to compose operations with symbols would be really helpful. (Especially for really complex integrals et al)

φ(z) = ∫(N(x|0,1,1), -∞, z)

vs

Phi(z) = integral(N(x|0,1,1), -inf, z)

or even just the lambda character for lambda()!

asked Apr 15, 2010 at 22:52
13
  • 7
    Although not as cool, Python's summation function is pretty elegant: sum() Commented Apr 15, 2010 at 23:04
  • 3
    Sounds like a horrible idea for ease of input (presumably $\sum$ wouldn't work, right?) Commented Apr 15, 2010 at 23:34
  • 1
    Maybe you want to have a look at Fortress which allows Unicode and TeX style notation. Commented Apr 16, 2010 at 8:09
  • 7
    "Sounds like a horrible idea for ease of input" — depends what keyboard shortcuts you’ve got, doesn’t it? Curly quotes, like the kind I used at the start of this comment, are a bit of a drag to type by default in Windows (I believe), but have decent shortcuts on the Mac. If you do a lot of mathy programming, you could configure shortcuts to make the typing easy. Commented Apr 16, 2010 at 9:30
  • 1
    ϕ (U+03D5) and φ (U+03C6) are variants of the same symbol, so it makes sense for them to be the same identifier (especially when you're reading code out loud) Commented Sep 26, 2018 at 14:35

5 Answers

50

(I think it’s pretty cool too, that might mean we’re geeks.)

You’re fine to do this with the code you have above in Python 3. (It works in my Python 3.1 interpreter at least.) See:

But in Python 2, identifiers can only be ASCII letters, numbers and underscores.
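As a rough sketch of what that looks like in Python 3 (the function bodies and the `entropy` helper are my own illustration, loosely following the Clojure snippet in the question, not something from the docs):

```python
import math

def Σ(terms):
    """Sum an iterable of numbers."""
    return sum(terms)

def Π(factors):
    """Multiply an iterable of numbers (math.prod needs Python 3.8+)."""
    return math.prod(factors)

def entropy(p):
    """Shannon entropy (natural log) of a list of probabilities; hypothetical helper."""
    return -Σ(p_i * math.log(p_i) for p_i in p if p_i > 0)

print(Σ([1, 2, 3, 4]))  # 10
print(Π([1, 2, 3, 4]))  # 24
```

Both names are ordinary Greek capital letters (U+03A3 and U+03A0), which is why the parser accepts them.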

answered Apr 15, 2010 at 22:58

1 Comment

Is the Python 2 incompatibility the reason for the following quote from the Tutorial: "don’t use non-ASCII characters in identifiers if there is only the slightest chance people speaking a different language will read or maintain the code"? Or is UTF-8 still discouraged for international purposes in Python 3?
33

It's worth pointing out that Python 3 does support Unicode identifiers, but only allows letter-like or number-like symbols (see http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers for full details). That's why Σ works (remember that it's a Greek letter, not just a math symbol), but √ doesn't.

For anyone interested, I made a website that lists every Unicode character that is valid in a Python variable name: https://www.asmeurer.com/python-unicode-variable-names/ (be warned that there are quite a lot of them, over 100,000 in fact).
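If you just want to check a handful of characters yourself, a small sketch like this works; `describe` is a hypothetical helper name, and the characters are written as escapes so nothing depends on your editor's encoding:

```python
import unicodedata

def describe(ch):
    """Return (name, general category, usable as a one-character identifier)."""
    return unicodedata.name(ch), unicodedata.category(ch), ch.isidentifier()

# Greek capital sigma, N-ary summation, square root, Greek small alpha
for ch in ["\u03a3", "\u2211", "\u221a", "\u03b1"]:
    name, category, ok = describe(ch)
    print(f"U+{ord(ch):04X} {name}: category={category}, identifier={ok}")
```

The letters come back with categories `Lu` and `Ll` and pass `str.isidentifier()`, while the two math symbols are category `Sm` and fail.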

answered Feb 18, 2014 at 23:31


22

(this answer is meant to be a minor addendum not a complete answer)

The additional gotcha with Unicode identifiers (which @mike-desimone mentions, and which I discovered quickly when I thought this was a cool thread and switched to a terminal to play with it) is that visually similar glyphs are not equivalent, and which one you get depends on how you type it on each platform. For example, Σ (Greek capital letter sigma, U+03A3, [can't find a direct Mac input method]) is fine, but unfortunately ∑ (N-ary summation, U+2211, opt/alt-w on Mac OS X) is not a valid identifier.

>>> Σ = 20
>>> Σ
20

but

>>> ∑ = 20
File "<input>", line 1
 ∑ = 20
 ^
SyntaxError: invalid character in identifier

Using Σ specifically (and probably Unicode characters in general) as an identifier might generate some very hard-to-diagnose errors if you have multiple developers on multiple platforms contributing to your code. For example, debug this visually:

∑ looks very similar to Σ, depending on the typeface selected

The two glyphs are easier to differentiate on this page, but depending on the font used, this may not be the case.

Even the traceback isn't much clearer unless Σ is printed near the ∑

 File "~/Dev/play_python33/identifiers.py", line 12
 print(∑([2, 2, 2, 2, 2]))
 ^
SyntaxError: invalid character in identifier
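
When the glyphs are too similar to tell apart by eye, one option is to print the code point and Unicode name of anything suspicious on the offending line. This is only a sketch: `suspicious_chars` is a hypothetical helper, and it will also flag non-ASCII symbols inside strings or comments, which are perfectly legal there.

```python
import unicodedata

def suspicious_chars(line):
    """Yield (column, code point, name) for non-ASCII characters that
    cannot appear inside a Python identifier."""
    for col, ch in enumerate(line):
        # Prefix with "a" so the check asks "can ch continue an identifier?"
        if ord(ch) > 127 and not ("a" + ch).isidentifier():
            yield col, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")

bad_line = "print(\u2211([2, 2, 2, 2, 2]))"  # uses N-ary summation, not sigma
for col, code_point, name in suspicious_chars(bad_line):
    print(f"column {col}: {code_point} {name}")
```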
answered Apr 24, 2015 at 18:43

2 Comments

Another gotcha is that there are multiple glyphs that are equivalent. Define ϕ = 5, then φ is ϕ → True
@endolith this is exactly what I discovered in horror today.
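
For what it's worth, the equivalence mentioned in these comments comes from Python 3 applying NFKC normalization to identifiers while parsing. A minimal sketch, with the two phi code points (U+03D5, the phi symbol, and U+03C6, the small letter phi) written as escapes, and `exec` used only so both spellings can be shown side by side:

```python
import unicodedata

phi_symbol = "\u03d5"  # GREEK PHI SYMBOL
phi_letter = "\u03c6"  # GREEK SMALL LETTER PHI

# As plain strings the two characters are different...
print(phi_symbol == phi_letter)
# ...but NFKC normalization folds the symbol into the letter.
print(unicodedata.normalize("NFKC", phi_symbol) == phi_letter)

# Assigning to the symbol form creates a variable named with the letter form.
namespace = {}
exec(f"{phi_symbol} = 5", namespace)
print(phi_letter in namespace, phi_symbol in namespace)
```

So the two spellings refer to one variable, and only the normalized name ends up in the namespace.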
16

According to is it bad, you can use some Unicode characters, but not all: you are restricted to characters identified as letters.

>>> α = 3
>>> Σ = sum
>>> import math
>>> √ = math.sqrt
  File "<stdin>", line 1
    √ = math.sqrt
    ^
SyntaxError: invalid character in identifier

Besides, I think it is very cool to be able to use Unicode in identifiers, and I wish I could use all of it.

I use the neo keyboard layout, which gives me greek and math symbols on extra layers:

αβχδεφγψιθκλνοπφστ[&ωξυζ
∀⇐CΔ∃ΦΓΨ∫Λ⇔Σ∈QR∂⊂√∩Ξ

answered Aug 23, 2011 at 9:40

2 Comments

Also, there are often distinct versions of characters that are also Greek letters. For example, the Greek capital sigma is U+03A3, while the math sigma is U+1D6BA, U+1D6F4, U+1D72E, U+1D768, or U+1D7A2 depending on styling. Similarly, Greek capital omega is U+03A9, math omegas start at U+1D6C0, and the ohm sign is U+2126.
Another nice way to enter most symbols is the compose key, e.g. on Windows via WinCompose
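
A quick way to see how the styled variants listed above relate to the plain Greek letters is to look up their names and NFKC forms. This is only a sketch; `folds_to` is a hypothetical helper name.

```python
import unicodedata

GREEK_SIGMA = "\u03a3"
GREEK_OMEGA = "\u03a9"
MATH_SIGMAS = [0x1D6BA, 0x1D6F4, 0x1D72E, 0x1D768, 0x1D7A2]

def folds_to(code_point, target):
    """True if the character's NFKC form equals the target string."""
    return unicodedata.normalize("NFKC", chr(code_point)) == target

for cp in MATH_SIGMAS:
    name = unicodedata.name(chr(cp))
    print(f"U+{cp:05X} {name}: folds to sigma = {folds_to(cp, GREEK_SIGMA)}")

print(unicodedata.name("\u2126"), folds_to(0x2126, GREEK_OMEGA))
```

Because Python normalizes identifiers with NFKC, each of these styled sigmas ends up naming the same variable as the plain Σ.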
8

Python 2.x does not support Unicode identifiers, and consequently does not support Σ as an identifier. Python 3.x does support Unicode identifiers, although many people will get cross if they have to edit source files with, for example, identifiers A and Α (Latin A and Greek capital alpha). Sigma is often readable enough, but still not as readable as the word sigma, so why bother?
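
If a team does want to keep look-alike names such as Latin A and Greek capital alpha out of a codebase, something along these lines could list the non-ASCII identifiers in a file. It is a sketch only: `non_ascii_names` is a hypothetical helper, it covers just variables, function and class names, and arguments, and `str.isascii` needs Python 3.7+.

```python
import ast

def non_ascii_names(source):
    """Return the sorted non-ASCII identifiers found in Python source text."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return sorted(name for name in names if not name.isascii())

# Latin "A" and Greek capital alpha (U+0391) look alike but are distinct names.
sample = "A = 1\n\u0391 = 2\nprint(A + \u0391)\n"
print(non_ascii_names(sample))
```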

answered Apr 15, 2010 at 22:59

12 Comments

I think readability of words versus symbols depends on context. When I’m reading something mathy, I find symbols (e.g. x + y) more readable than the wordy equivalents you’d get in, say, AppleScript (e.g. add x to y). Symbols are terser, and generally let you get by on shape recognition alone, which I think is easier on the brain than reading. I don’t do enough mathy stuff to have felt the need to add a sigma sign to my code though.
That doesn't look any more readable with unicode identifiers to me.
"That doesn't look any more readable with unicode identifiers to me." — It does look more similar to the equation posted at the top of the question though. If someone was used to reading equations like that, mightn’t they find the symbol-y Python code more readable too?
@Paul: sure, readability is always subjective. The audience is important. Which is why you need to consider the audience more than your own preferences. It's easy if you're always going to be your own entire audience, of course, but frequently things that start out that way end up in a wider distribution, and with a wider set of contributors.
One place where Unicode identifiers will be nice is in IPython Notebook, because you can have variable names that match the symbols of the quantities they represent. For example, the variable representing a chip's thermal impedance from junction to ambient is θJA, and constantly writing it as THETA_JA makes it harder for non-programmers to read the code.