I'm working on a statically typed language with type inference and streamlined syntax, and I need to make a final decision about the syntax for variable declaration versus assignment. Specifically, I'm trying to choose between:
// Option 1. Create new local variable with :=, assign with =
foo := 1
foo = 2
// Option 2. Create new local variable with =, assign with :=
foo = 1
foo := 2
Creating functions will use `=` regardless:
// Indentation delimits blocks
square x =
    x * x
And assignment to compound objects will do likewise:
sky.color = blue
a[i] = 0
Which of options 1 or 2 would people find most convenient/least surprising/otherwise best?
3 Answers
There are many more aspects one should consider when settling on an assignment/declaration syntax than simple `=` vs. `:=` bikeshedding.
Type inference or not, you will want a syntax for explicit type annotations. In some type systems, inference may not be possible without occasional explicit annotations. There are two possible classes of syntax for this:
- A type-variable statement without further operators implies a declaration, e.g. `int i` in C. Some languages use postfix types like `i int` (Golang, to a certain degree).
- There is a typing operator, often `:` or `::`. Sometimes, this declares the type of a name: `let i : int = 42` (e.g. OCaml). In an interesting spin on this, Julia allows a programmer to use type assertions for arbitrary expressions, along the lines of `sum = (a + b)::Int`.
You may also want to consider an explicit declaration keyword, like `var`, `val` or `let`. The advantage is not primarily that they make parsing and understanding of the code much easier, but that they unambiguously introduce a variable. Why is this important?
If you have closures, you need to precisely declare which scope a variable belongs to. Imagine a language without a declaration keyword, and implicit declaration through assignment (e.g. PHP or Python). Both of these are syntactically challenged with respect to closures, because they either ascribe a variable to the outermost or innermost possible scope. Consider this Python:
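def make_closures():
    x = 42
    def incr():
        x = x + 1  # the assignment implicitly declares a new local x in incr()
    def value():
        print(x)
    return incr, value

i, v = make_closures()
v()  # prints 42: only reading the enclosing x works
i()  # expected: the enclosing x becomes 43; behaviour: UnboundLocalError, because x is local to incr() and read before it is assigned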
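For reference, Python 3 works around this with the `nonlocal` statement, which is effectively a declaration keyword in reverse: it marks the names that an assignment should not declare. A minimal sketch, reusing the function names from the example above (returning the value instead of printing it is a change made here for illustration):

```python
def make_closures():
    x = 42

    def incr():
        nonlocal x  # opt out of "assignment declares a new local"
        x = x + 1

    def value():
        return x

    return incr, value


incr, value = make_closures()
incr()
print(value())  # 43
```

The closure now behaves as intended, but only because the programmer remembered to exempt `x` by hand.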
Compare with a language that allows explicit declaration:
var make_closures = function() {
  var x = 42,
      incr  = function() { x++ },
      value = function() { console.log(x) };
  return { incr: incr, value: value };
};

var closures = make_closures();
closures.value(); // everything works
Explicit declarations allow variable shadowing. While generally a bad practice, it sometimes makes code much easier to follow – no reason to disallow it.
Explicit declarations offer a form of typo detection, because unbound variables are not implicitly declared. Consider:
var1 = 42
if 0 < 1:
    varl = 12
print(var1)  # 42
versus:
use strict;
use feature 'say';  # needed for say
my $var1 = 42;
$varl = 12 if 0 < 1;  # Doesn't compile: Global symbol "$varl" requires explicit package name
say $var1;
You should also consider whether you would like to (optionally) enforce single-assignment form, e.g. through keywords like `val` (Scala), `let`, or `const`, or by default. In my experience, such code is easier to reason about.
How would a short declaration, e.g. via `:=`, fare on these points?
- Assuming you have typing via a `:` operator and assignment via `=`, then `i : int = 42` could declare a variable, the syntax `i : = 42` would invoke inference of the variable's type, and `i := 42` would be a nice contraction, but not an operator in itself. This avoids problems later on.
- Another rationale is the mathematical syntax for the declaration of new names, `x := expression` or `expression =: x`. However, this has no significant difference to the `=` relation, except that the colon draws attention to one name. Simply using `:=` for similarity to maths is silly (considering the `=` abuse), as is using it for similarity to Pascal.

We can declare some more or less sane characteristics for `:=`, like:
- It declares a new variable in the current scope
- which is re-assignable,
- and performs type inference.
- Re-declaring a variable in the same scope is a compilation error.
- Shadowing is permitted.
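The rules in this list can be modelled with a small scope chain. The following is only a sketch under stated assumptions: `Scope`, `ScopeError`, `declare`, `assign` and `lookup` are hypothetical names invented here, runtime exceptions stand in for compile-time errors, and type inference is left out entirely.

```python
class ScopeError(Exception):
    """Raised where a compiler for such a language would report an error."""


class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, name, value):
        """Model `name := value`: always targets the current scope."""
        if name in self.names:
            raise ScopeError(f"{name!r} is already declared in this scope")
        self.names[name] = value  # may shadow a binding in an outer scope

    def _find(self, name):
        """Return the innermost scope that binds `name`."""
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope
            scope = scope.parent
        raise ScopeError(f"{name!r} is not declared")

    def assign(self, name, value):
        """Model `name = value`: update the nearest existing binding."""
        self._find(name).names[name] = value

    def lookup(self, name):
        return self._find(name).names[name]


outer = Scope()
outer.declare("x", 1)        # x := 1
outer.assign("x", 2)         # x = 2
inner = Scope(parent=outer)
inner.declare("x", 10)       # shadowing in a nested scope is allowed
print(inner.lookup("x"), outer.lookup("x"))  # 10 2
```

In this model, a second `outer.declare("x", ...)` and an `assign` to a name that was never declared both raise `ScopeError`, which corresponds to the re-declaration and typo-detection rules.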
But in practice, things get murky. What happens when you have multiple assignments (which you should seriously consider), like
x := 1
x, y := 2, 3
Should this throw an error because `x` is already declared in this scope? Or should it just assign `x` and declare `y`? Go takes the second route, with the result that typo detection is weakened:
var1 := 1
varl, var2 := 2, 3 // oops, var1 is still 1, and now varl was declared
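To make the weakness concrete, here is a rough Python model of Go's rule that a multi-variable `:=` only needs at least one new name on its left-hand side. The function `short_declare` and the use of a plain dict as the scope are assumptions made for this sketch; Go's additional requirement that re-used names keep their type is ignored.

```python
def short_declare(scope, names, values):
    """Model `names := values` with a Go-like rule: existing names are
    assigned, unknown names are declared, and at least one must be new."""
    new_names = [name for name in names if name not in scope]
    if not new_names:
        raise ValueError("no new variables on left side of :=")
    scope.update(zip(names, values))
    return new_names


scope = {}
short_declare(scope, ["var1"], [1])                     # var1 := 1
declared = short_declare(scope, ["varl", "var2"], [2, 3])  # typo: varl
print(declared)        # ['varl', 'var2']
print(scope["var1"])   # 1
```

The misspelled `varl` is accepted as a fresh declaration because `var2` already satisfies the "at least one new name" requirement, so nothing flags the typo.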
Note that the "RHS of typing-operator is optional" idea from above would disambiguate this, as every new variable would have to be followed by a colon:
x: = 1
x:, y: = 2, 3  # error because x is already declared
x, y: = 2, 3   # OTOH this would have been fine
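A sketch of how this colon-marked form could be checked, written in Python. Everything here is hypothetical: `bind`, `DeclarationError` and the string encoding of targets (`"x:"` declares, bare `"x"` assigns) are invented for illustration, the scope is a plain dict, and explicit type annotations after the colon are not handled.

```python
class DeclarationError(Exception):
    """Stands in for a compile-time error in the hypothetical language."""


def bind(scope, targets, values):
    """Apply `targets = values`; a target ending in ":" declares the name,
    a bare target assigns to an existing name."""
    if len(targets) != len(values):
        raise DeclarationError("target/value count mismatch")
    # Validate every target before mutating anything, so a rejected
    # statement leaves the scope untouched.
    names = []
    for target in targets:
        declares = target.endswith(":")
        name = target.rstrip(":").strip()
        if declares and name in scope:
            raise DeclarationError(f"{name!r} is already declared")
        if not declares and name not in scope:
            raise DeclarationError(f"{name!r} is not declared")
        names.append(name)
    for name, value in zip(names, values):
        scope[name] = value


scope = {}
bind(scope, ["x:"], [1])                 # x: = 1
try:
    bind(scope, ["x:", "y:"], [2, 3])    # x:, y: = 2, 3
except DeclarationError as err:
    print("error:", err)                 # error: 'x' is already declared
bind(scope, ["x", "y:"], [2, 3])         # x, y: = 2, 3
print(scope)                             # {'x': 2, 'y': 3}
```

Because each name states its own intent, a misspelled bare target such as `"varl"` is rejected as undeclared instead of being silently created.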
Should `=` be declaration but `:=` be assignment? Hell no. First, no language I know of does this. Second, when you don't use single-assignment form, then assignment is more common than declaration. Huffman-coding of operators requires that the shorter operator be used for the more common operation. But if you don't generally allow reassignment, the `=` is somewhat free to use (depending on whether you use `=` or `==` as comparison operator, and whether you could disambiguate a `=` from context).
Summary
- If assignment and declaration use the same operator, bad things happen: Closures, variable shadowing, and typo detection all get ugly with implicit declarations.
- But if you don't have re-assignments, things clear up again.
- Don't forget that explicit types and variable declarations are somewhat related. Combining their syntax has served many languages well.
- Are you sure you want such little visual distinction between assignment and declaration?
Personal opinion
I am fond of declaration keywords like `val` or `my`. They stand out, making code easier to grok. Explicit declarations are always a good idea for a serious language.
- Yeah, I'm using `:` for optional explicit type, so as you say, `i := 42` is shorthand for `i: int = 42`. I am allowing reassignment by default (so there needs to be some distinction), but it can be disabled with a `final` modifier as in Java. And I'm not a huge fan of declaring multiple variables on one line so I'm okay with losing that. – rwallace, Oct 28, 2013 at 21:07
Both alternatives are bad. The first because it is far from obvious that a `:=` operator creates a local variable, and the second because it means you have two different meanings for the `=` operator. Learn Dennis Ritchie's lesson, and don't have two operators that appear to be assignments, one of which is not.
- Further, if I see `:=`, I assume Pascal assignment. That also means I expect `=` to test equality, not declare a variable. – Telastyn, Oct 27, 2013 at 21:23
- Trying to read source code without understanding its notation is a bad habit. And your "obvious" point won't work when you know the notation. Otherwise `:=` and `=` are visually distinguishable very well. – lorus, Oct 28, 2013 at 6:48
- @lorus Tell that to every seasoned C programmer who's typed `if (a = b) ...`. It's not just a rookie mistake, everyone does it once in a while. In other words, it's a language design flaw. – Ross Patterson, Oct 28, 2013 at 9:49
- The problem with C syntax is that assignment is an expression. If it were a statement, the problem wouldn't occur. – lorus, Oct 29, 2013 at 4:35
New variables should be declared with `x := 5` and should be updated/reassigned with `x = 5`. Kind of the norm now, eight years later. Mostly thanks to golang I think.
`foo = ...` always introduce or assign to a local, with syntax to exempt one name from the "introducing" part, instead making `=` alter a global/closed-over variable (i.e., like Python's `global` and `nonlocal`, possibly unified into one concept)?