In C++ and Java and many other OOP languages, in order to send a message to an object, you have to use the object.function() syntax, for example:
myCar.start(); // send the start() message to the myCar object
But why was this syntax chosen to send a message? Why, for example, wasn't the function(object) syntax chosen instead? And do all OOP languages use the object.function() syntax to send a message?
In C++ and Java and many other OOP languages, in order to send a message to an object, you have to use the object.function() syntax [...]
First, an important correction: they use the syntax object.message(), not object.function(). Messages and functions are fundamentally different.
But why was this syntax chosen to send a message? Why, for example, wasn't the function(object) syntax chosen instead?
Frame challenge
I want to challenge the premise of your question because there are object-oriented programming languages that use or at least alternatively allow this syntax.
For example in Lark
list.length
is simply syntactic sugar for
length list
Also, interestingly in Python, methods are defined as functions taking the receiver as an argument:
class Foo:
    def bar(baz, qux):
        # within the method body, `baz` is the receiver
        pass
but they are called with the receiver as a special argument using the dot-message-sending syntax:
foo = Foo()
foo.bar(42)
# within the method body, `baz` is now `foo`.
Conventionally, this first parameter is named self, but that is only a convention.
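A minimal sketch of the point above: in Python, the dot syntax and the plain function-call syntax reach the same function, and the receiver is simply bound to the first parameter. The class `Foo` and the method `bar` follow the example above; the attribute `seen` is a hypothetical name added only to make the effect observable.

```python
class Foo:
    def bar(baz, qux):
        # `baz` is the receiver; naming it `self` is only a convention.
        baz.seen = qux
        return baz


foo = Foo()

# Dot syntax: the receiver is passed implicitly as the first argument.
assert foo.bar(42) is foo
assert foo.seen == 42

# Function-call syntax: the same function, with the receiver passed explicitly.
assert Foo.bar(foo, 43) is foo
assert foo.seen == 43
```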
Simula
This is the syntax used by Simula (1962), considered to be the first object-oriented programming language. Unfortunately, both designers of Simula, Kristen Nygaard and Ole-Johan Dahl, died in 2002, so we can't ask them why they chose it. However, you might find that they have documented their rationale in one of their papers.
Still, there is a very good reason to distinguish the "special" zeroth argument from the other arguments: the "special" argument is, well, special.
A method has privileged access to the internal representation and the private API of the receiver. Therefore, the receiver is different, and it makes sense to distinguish it from the rest of the arguments. If you have
message(object, arg1, arg2)
then there is no indication that object is treated differently from arg1 and arg2. But they are different: the method can access the internal representation and the private API of object but only the public API of arg1 and arg2. Whereas with
object.message(arg1, arg2)
you can clearly see that object is different from arg1 and arg2.
Additionally, Simula was designed to be a fairly faithful superset of ALGOL, and function(object) already has a meaning in ALGOL (subroutine call) that is different from a message send, so that might also have been a reason not to overload it with two different meanings. It gets especially confusing when you have a subroutine named foo in scope and object also has a foo method: what does foo(object) mean then?
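The potential for confusion can be sketched in Python, which keeps the two syntaxes separate. The names `foo` and `Thing` are hypothetical; the free function and the method share a name but are otherwise unrelated, and each call syntax selects exactly one of them.

```python
def foo(obj):
    # A free subroutine named `foo` in the enclosing scope.
    return "free function"


class Thing:
    def foo(self):
        # A method that happens to share the subroutine's name.
        return "method"


thing = Thing()

# Subroutine call: `foo` is looked up in the current scope.
free_result = foo(thing)

# Message send: `foo` is looked up via the receiver.
method_result = thing.foo()

assert free_result == "free function"
assert method_result == "method"
```

If `foo(thing)` could also mean "send `foo` to `thing`", the language would need an extra rule to decide between these two results.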
And do all OOP languages use the object.function() syntax to send a message?
No, of course not. There are thousands of languages; it would be a miracle if they all used the same syntax.
Smalltalk, Self, Newspeak, Objective-C, Objective-C++, Objective Modula-2, Fancy, Finch, Nu
The other extremely influential OO language next to Simula, Smalltalk, uses the so-called "Smalltalk keyword selector syntax". The message sending operator is simply the space, and the message is a "keyword selector" with the arguments written between the keywords. A "sentence" is ended with a period.
anArray ← Array new.
anArray append: 23.
anArray append: 42.
anArray at: 2 put: 4711.
"anArray will now contain the elements (23 4711)."
"If you want to send multiple messages to the same receiver, you can use a cascade:"
anArray
    append: 23;
    append: 42;
    at: 2 put: 4711.
Note: The name of the method in the last line is at:put:.
This syntax is also used by many of Smalltalk's descendants, successors, and derivatives, including Self (one of the influences on ECMAScript) and Newspeak as well as Fancy, Finch, Nu, and countless others. The most well-known language that inherited it from Smalltalk is Objective-C, which was until recently the primary language for macOS, iOS, iPadOS, tvOS, and watchOS development, and still plays an important role there.
Objective-C++ is interesting. Objective-C was created by taking the Smalltalk object model and Smalltalk message sending syntax and adding it to C as an orthogonal language extension. ("Extension" meaning that every legal C program is also a legal Objective-C program with identical semantics, and "orthogonal" meaning that the C part and the Objective part mostly don't interact.) Because Objective-C was designed as an orthogonal extension, it was actually possible to take the "Objective" part and apply it to other languages. (For example, Objective Modula-2.)
So, what do you get when you orthogonally extend C++ with "Objective"? You get a language with two separate object models and two different message sending syntaxes!
PLASMA
Carl Hewitt's PLASMA uses
message ⇒ object
object ⇐ message
interchangeably, depending on which one reads better in a certain context.
Note that in PLASMA, the message is not just a "message name" as it is in Simula or Smalltalk, but an object itself. For example, to add some numbers together, you would send the message [1 2 3 4] (an array of numbers) to the object + like this:
[1 2 3 4] ⇒ +
+ ⇐ [1 2 3 4]
On the receiver side, you define a receiver pattern, which uses a triple-shafted arrow:
⇛ pattern
body
Io, Ioke, Seph, Atomo, Scala
In Io, Ioke, Seph and other languages based on similar ideas (e.g. Atomo), whitespace is used as the message sending operator as in Smalltalk, but arguments are passed in a parenthesized argument list like in ALGOL-style subroutine calls:
object message(arg1, arg2)
If there is only one argument, you may omit the parentheses, so you can write
2 + 3
instead of having to write
2 +(3)
Scala uses object.message() by default, but it allows you to leave out the ., and if there is only a single argument, it also allows you to leave out the parentheses.
Lisps (e.g. Clojure, Hy)
Lisps that support object-orientation typically keep the Lisp-style syntax. In Clojure and Hy, a method call looks like this:
(.message object arg1 arg2)
Grace
Grace uses a mixture of Simula-style and Smalltalk-style:
"abcdefghi".substringFrom(3)to(6)
"abcdefghi".substringFrom(3) to(6)
"abcdefghi".substringFrom 3 to 6
sends the message substringFrom(_)to(_), passing the arguments 3 and 6.
Ruby, CoffeeScript, CokeScript, Coco
Ruby, CoffeeScript, Coco, and many others use Simula-style but allow you to leave out the parentheses:
object.message arg1, arg2
Lua
Lua does not have OOP as part of its language semantics, but it does have some syntactic features that make it possible to implement OOP as a library while still looking like it is part of the language. This is quite interesting because it makes it possible to have multiple competing object systems in the same language, and choose the best one for the job.
In Lua,
object:message(arg1, arg2)
desugars into
object.message(object, arg1, arg2)
but it guarantees that object will only be evaluated once.
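The difference between the sugar and its naive expansion can be sketched in Python, used here only for illustration since it is not Lua. The dictionary-based object, the `make_object` factory, the `counter` dictionary, and the `send` helper are all hypothetical names; `send` plays the role of the colon syntax by receiving an already-evaluated receiver and passing it on as the first argument.

```python
# Counts how often the receiver expression is evaluated.
counter = {"evaluations": 0}


def make_object():
    # Stands in for an arbitrary receiver expression with a visible side effect.
    counter["evaluations"] += 1
    return {"name": "obj", "message": lambda self, a, b: (self["name"], a + b)}


def send(receiver, name, *args):
    # Rough analogue of Lua's `receiver:name(...)`: the receiver was evaluated
    # once by the caller and is reused as the first argument.
    return receiver[name](receiver, *args)


# Naive textual expansion: the receiver expression appears (and runs) twice.
naive = make_object()["message"](make_object(), 1, 2)
assert counter["evaluations"] == 2

# Colon-style send: the receiver expression runs only once.
counter["evaluations"] = 0
sugared = send(make_object(), "message", 1, 2)
assert counter["evaluations"] == 1
assert naive == sugared == ("obj", 3)
```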
Erlang
Whether or not you consider Erlang object-oriented is a matter of opinion, but it definitely supports message sending:
object ! message
"Interesting" syntax sugar
Scala has right-associative operators. Any operator that ends with a colon : is right-associative and has the argument on the left and the receiver on the right:
a + b
// is syntactic sugar for
a.+(b)
// but
a +: b
// is syntactic sugar for
b.+:(a)
Languages in which everything is a message send
In some languages, really absolutely everything is a message send. For example, in Ioke, Seph, and Monte, list or array literals are message sends, and even number literals can be.
For example, in Ioke
[1, 2, 3]
is actually the message send
[](1, 2, 3)
and even
42
is the message send
internal:createNumber("""42""")
[Note: """42""" is not actually legal Ioke syntax. These methods take a so-called "strange object" as an argument, which is an object that belongs to the underlying platform and cannot be represented in Ioke. I am using this syntax here to mean "the strange object representing 42".]
and
"Hello World"
is the message send
internal:createText("""Hello World""")
So, this is sending the message internal:createText to what Ioke calls the "current ground", and the argument is a so-called "strange object", which is not actually an Ioke object but an object of the host platform. (We have to start the bootstrapping process somewhere.) So, in the JVM implementation of Ioke, this would be a java.lang.String and in the CLI implementation, it would be a System.String.
By overloading these methods, I can overload the meaning of literals. This is used in Ioke's parser generator library, where "a" for example does not mean "the string containing the character a" but "a parser that recognizes the character a".
Others ...
Of course, until now, we have only looked at single-dispatch classical and prototype-based OO. There is a whole slew of other approaches such as multiple-dispatch OO that by necessity have different syntax.
+1 for the broad knowledge of many languages people rarely get a chance to learn about. – J.G., Commented Jun 26, 2020 at 21:16
This was likely a design decision by the original authors of some of those older languages. As others have pointed out in comments, other syntaxes exist in languages that just didn't catch on. C and C++ caught on, and their challenges inspired a new generation of language authors. The newer generations were comfortable with the syntax, so they copied it into their new languages. My assumption is that the syntax worked well enough and did not warrant a redesign.
That isn't to say object.method() is the best. It just turned out to be tradition.
As to why those original language authors chose that syntax? You'll need to ask those authors.
I can give one suggestion of why object.function() is used rather than function(object).
If the function took an extra parameter, then we'd be comparing object.function(foo) with function(object, foo). As OOP languages generally provide some form of polymorphism, often with single dynamic dispatch, it's useful to clearly distinguish between the parameter which is used to look up the function to call (object), and the parameter which is not involved in that process.
This is made clearer by having the parameter used for function lookup syntactically separated from the parameter which is not.
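To illustrate, here is a small Python sketch contrasting the two spellings. The shape classes and the `scaled_area` names are hypothetical. `functools.singledispatch` is a standard-library decorator that dispatches on the type of the first argument only, so it offers the `function(object, foo)` spelling with the same lookup rule; nothing in that call syntax marks which argument drives the lookup.

```python
from functools import singledispatch


class Square:
    def __init__(self, side):
        self.side = side

    def scaled_area(self, factor):
        # Method form: the implementation is looked up via the receiver.
        return factor * self.side ** 2


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def scaled_area(self, factor):
        # Uses 3 as a crude stand-in for pi to keep the arithmetic exact.
        return factor * 3 * self.radius ** 2


@singledispatch
def scaled_area(shape, factor):
    # Function form: fallback when no implementation is registered.
    raise TypeError(f"unsupported shape: {type(shape).__name__}")


@scaled_area.register
def _(shape: Square, factor):
    return factor * shape.side ** 2


@scaled_area.register
def _(shape: Circle, factor):
    return factor * 3 * shape.radius ** 2


for shape in (Square(2), Circle(1)):
    # Both spellings dispatch on the type of `shape`; `factor` plays no part.
    assert shape.scaled_area(10) == scaled_area(shape, 10)
```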
Last question: Not all. In Objective-C you write [object method], and Smalltalk is similar.
But why was this syntax chosen to send a message?
It really comes down to this: those languages (starting from C++ and extending onward) do not consider this syntax to be "sending a message to an object" at all. They consider it to be "calling a function which is a member of the object." The syntax in C for getting a member of a struct is to use object.member_name. In C, if that member_name just so happens to be a function pointer, then you get to do object.member_name(parameters).
That's the basic idea. These languages are not built in terms of "sending messages". They're built in terms of "calling functions". You can conceptually consider "calling a function" to be a form of "sending a message", but those languages don't really agree.
Why, for example, wasn't the function(object) syntax chosen instead?
There are many different reasons why this wasn't done, usually down to the specifics of the different languages.
In C, function(object) already has a well-understood meaning: there is a global function named function, which takes an argument of a type equivalent to that of object. That's what the compiler does with that code, and it gives you an error if any of those conditions is not satisfied.
C++ was designed to be backwards-compatible with C (mostly), so it inherited this meaning. Also, C++ wanted to have non-member functions too, so it was always going to inherit this meaning.
Allowing function(object) to look up function declarations within the scope of object's type would be... weird. It could be very confusing as to which function would be called in any given scenario, especially since people can write non-member functions anywhere that could conflict with such a call.
The ISO C++ committee tried for many years to reconcile this weirdness so as to allow object.function and function(object) to mean the same thing. Nothing came of these efforts, and they've more-or-less given up on the idea of making one automatically map to the other.
Remember again that we're not talking about "messages"; we're talking about static constructs. object has some type, and function must be a function declared within object's type. This is what allows the compiler to resolve object.function down to a specific piece of code.
Scripting languages like Python or Lua use similar object.function() notation as well, for somewhat separate reasons. In languages like those, functions are first-class objects. As such, "member functions" are a convention, not a distinct idea. A "member function" is just a regular function whose value happens to be stored in that object, just like any other value. So object.function() means "retrieve the function member of object and call it with object as one of its parameters." This is functionally no different from object.value used to retrieve or set the value member of object.
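As a rough illustration in Python (the `Tally` class and its members are hypothetical names), member lookup and the call are two separate steps, and a function stored on an object is a value like any other. One caveat worth stating: in Python, a function found on the class is wrapped into a bound method during lookup, which is how the object ends up as the first argument, whereas a function stored directly on the instance is returned as-is.

```python
class Tally:
    def __init__(self):
        self.value = 0

    def increment(self, step):
        self.value += step
        return self.value


tally = Tally()

# Step 1: retrieve the `increment` member; step 2: call it.
member = tally.increment
assert member(5) == 5

# The bound method pairs the plain function with the object it was found on.
assert member.__func__ is Tally.increment
assert member.__self__ is tally

# A function stored on the instance is just a value; no receiver is passed.
tally.double = lambda x: 2 * x
assert tally.double(21) == 42
```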
Since these languages also have non-member functions, they also have the issue that function(object) already has a well-defined meaning: find the function with the name function that is accessible from this scope and call it with the object parameter. Trying to make it mean something else as well could get really confusing.
As you speak about Lua, object:function() is also a thing there, equivalent to object.function(object) but only evaluating object once. – Deduplicator, Commented Jun 26, 2020 at 15:32
@Deduplicator: Yes, but I didn't want to get into syntactic differences in a particular language. The broad idea is the difference between object-access-member vs. function_name-call-object. – Nicol Bolas, Commented Jun 26, 2020 at 15:42
function(object) calls the free function or callable function with the single argument object. object.function(), on the other hand, calls the member function function of object, passing a pointer to object if it isn't static, or it calls the callable function belonging to it or its class. Thus, they do different things. Still, there is an effort to unify those two.
object message: argument. The object.message() syntax most likely has its roots in the object.attribute notation for accessing record variables or struct members.
object.function(...) to be transformed by the compiler into function(object, ...) and/or vice-versa, depending on a variety of things. This effort stalled out a few years ago and hasn't seriously been taken up again by the committee.