I'm trying to wrap my head around Scala, and one thing that keeps throwing me is the ordering of a variable/value declaration when specifying the type.
val a = 0
makes perfect sense. This looks pretty much like any other language.
val a: Int = 0
parses really weird in my head; it just seems nonsensical. Why is the type immediately on the left of the assignment operator? When I cut this in my head, I see "... Int = 0", which obviously doesn't make any sense.
Is there a logical reason behind this that I can refer to? Obviously, as I look at more Scala code, I will adjust to it, but I'm also curious why Martin Odersky chose to arrange it this way. It can't be just to stand out from other languages, where (as far as I know) the type identifier, if there is one, precedes the declaration.
stand out from other languages
No. As Jörg already commented, this form is actually used in many languages. It is probably the most common form of variable declaration by number of languages that use it. Pascal and related languages used it long ago, and newer languages such as TypeScript, Go, Rust, and Scala use it now (Go also puts the type after the name, though without the colon).
type identifier, if there is one, precedes the declaration
The type identifier [= value] form of declaration in C was in some respects a big mistake. Its serious problem is that it makes the grammar of the language context-sensitive. Type and object identifiers look the same syntactically, but this form of declaration cannot be recognized without knowing that the first identifier names a type. So the compiler can't build the syntax tree without consulting the table of already defined types. This causes problems for templates, because the interpretation may depend on a template parameter, so the compiler can't yet know whether it is looking at a type.
In C++ this means you have to use the typename keyword in the ambiguous cases. Java and C# dodge this by not having typedefs, so you can't have related types, but that seriously limits the usefulness of their generics. And it still complicates the compiler anyway.
On the other hand, with declarations in the form keyword identifier [: type] [= value], the identifier after : (and after some keywords like new) always means a type, and an identifier in any other place never does, so the grammar is context-free and everything is much simpler.
It is also more regular when the type is optional: you just omit it. In the C form, you have to replace it with a special keyword, such as auto in C++ or var in C#.
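To make the keyword identifier [: type] [= value] shape concrete, here is a minimal Scala sketch. The object and member names (DeclarationForms, Meters, and so on) are made up for illustration and do not come from the question. It shows the type being omitted or supplied in the same position, and a type and a value sharing one name, which works because an identifier after the colon is always read as a type.

```scala
object DeclarationForms {
  // Type omitted: the compiler infers Int from the initializer.
  val inferred = 0

  // Type supplied: it follows the name, after the colon.
  val annotated: Int = 0

  // The same name-colon-type shape is used for vars, parameters and results.
  var counter: Long = 0L
  def double(n: Int): Int = n * 2

  // Hypothetical example: a type alias and a value with the same name.
  // After the colon `Meters` is read as the type; elsewhere it is the value.
  type Meters = Double
  val Meters: Meters = 1.5

  def main(args: Array[String]): Unit = {
    println(inferred + annotated)
    println(double(21))
    println(Meters)
  }
}
```

Nothing in the declaration has to be looked up in a symbol table to decide which part is the type: the position alone settles it.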
- This is a bit of a tangent, but another way in which C's declaration syntax is an unfortunate historical mistake is that C objects are not variables in the mathematical sense. A variable stands for an unknown value and once bound doesn't change. A C object is a reference to a block of memory that just happens to be implicitly dereferenced for you. Both the terminology and the use of the = symbol are highly confusing to beginners because they break the mental model they've built up in math classes. – Doval, Dec 4, 2014 at 15:36
- @Doval: Well, in procedural programming "variable" always means a box that can contain a value, and it did so long before C. Only functional and logic programming commonly come with variables in the mathematical sense. The terminology mismatch is a bit unfortunate, but it's the first thing you have to understand when learning programming, independent of language. And the use of := does not really make much difference. <- looks like a better symbol, but the only language I know that uses it is R (which might get it from S+, but I don't know that). – Jan Hudec, Dec 4, 2014 at 16:14
- @JanHudec: Smalltalk used ← initially (and ↑ for return). However, these characters only existed in the character sets and on the keyboards of Xerox's own workstations; they didn't exist anywhere else. When transferring Smalltalk source code to an ASCII-based system, those codepoints are interpreted as _ (and I forgot the other one). I believe Squeak still accepts _ for assignment, but the spec was changed to use := for assignment and ^ for return. – Jörg W Mittag, Sep 2, 2015 at 12:37
You're thinking about it the wrong way. The type isn't immediately to the left of the assignment; it's immediately to the right of the declarator. This syntax has the advantage of being unambiguous, whereas, for example, val a = 0 : Int is ambiguous: does the type specifier refer to the literal, the declaration, or the entire statement? And if the initializer is more complicated than just a literal, it gets really confusing.
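As a rough illustration of why the position of the type matters, the sketch below contrasts a type written after the declared name (an annotation on the declaration) with a type written after an expression (an ascription on that expression). The object and value names are invented for this example.

```scala
object AnnotationVsAscription {
  // Type annotation: Long is the type of the declared name `total`.
  val total: Long = 1 + 2

  // Type ascription: Long is the type requested for the expression `(1 + 2)`;
  // the type of `widened` is then inferred from it.
  val widened = (1 + 2): Long

  // Both together: the expression is ascribed as List[Int],
  // while the declared name is only promised to be a Seq[Int].
  val items: Seq[Int] = List(1, 2, 3).map(_ * 2): List[Int]

  def main(args: Array[String]): Unit = {
    println(total)
    println(widened)
    println(items)
  }
}
```

With the type next to the name, a reader never has to work out how much of a long initializer the type is meant to cover.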
- Note that val a: Long = 0: Short is legal. It is a type annotation for a and a type ascription for 0. It doesn't make much sense here, but it is legal. – Jörg W Mittag, Sep 2, 2015 at 12:33
val a: int = 0
is valid Standard ML, and OCaml and F# use the same name: type form with let instead of val. So rather than standing out, it fits right in with other functional languages which have probably influenced Scala (e.g. pattern matching).