In programming language theory, a type is a set of values. E.g. the type "int" is the set of all integer values.
In OOP languages, a class is a type, is it?
When a class is defined with more than one members, e.g.
class myclass{
int a;
double b;
}
When we talk about a class, do we mean
- "
(a,b)
wherea
is an int andb
is a double", or - "{
(x,y)
|x
is any int,y
is any double}"?
What does an instance of myclass
mean?
- "
(a,b)
wherea
is an int andb
is a double", or - an object which occupies a memory space and which can (not necessarily, i.e. can be empty) store
(x,y)
, wherex
is any int andy
is any double?
5 Answers 5
Neither.
I take it you're asking whether having the same set of field types is enough to classify as being the same class, or whether they have to be named identically as well. The answer is: "Not even having the same types and the same names is sufficient!" Structurally equivalent classes are not necessarily type-compatible.
For instance, if you have a CartesianCoordinates
and a PolarCordinates
class, they might both have two numbers as their fields, and they might even have the same Number
type and the same names, but they would still not be compatible, and an instance of PolarCoordinates
would not be an instance of CartesianCoordinates
. The ability to separate types by their intended purpose and not their current implementation is a very useful part of writing safer, more maintainable code.
-
9It should be noted that in some languages being structurally equivalent is enough to make one type a subtype of the other (and often vice versa). That though is decidedly uncommon/unpopular.Telastyn– Telastyn2015年01月21日 14:36:02 +00:00Commented Jan 21, 2015 at 14:36
-
7@Tim A typedef doesn't create a type, it aliases the name used to refer to an existing type.Doval– Doval2015年01月21日 14:53:07 +00:00Commented Jan 21, 2015 at 14:53
-
1@DevSolar He explicitly mentioned C, and other than C++, I don't know of any other language that uses that keyword.Doval– Doval2015年01月21日 15:59:51 +00:00Commented Jan 21, 2015 at 15:59
-
3@Telastyn - those languages should be killed with fire.Jon Story– Jon Story2015年01月21日 17:24:20 +00:00Commented Jan 21, 2015 at 17:24
-
4@JonStory Structural subtyping is useful at a module level; the lack of it is what forces you to turn everything into an
interface
in Java and C# if you ever want to do unit testing. You end up writing a ton of boilerplate just so you can change the particular class your program will use even though you don't have any intention of changing it at runtime.Doval– Doval2015年01月21日 18:27:07 +00:00Commented Jan 21, 2015 at 18:27
Types are not sets.
You see, set theory has a number of features which simply don't apply to types, and vice-versa. For instance, an object has a single canonical type. It may be an instance of several different types, but only one of those types was used to instantiate it. Set theory has no notion of "canonical" sets.
Set theory allows you to create subsets on the fly, if you have a rule that describes what belongs to the subset. Type theory generally does not allow this. While most languages have a Number
type or something similar, they do not have an EvenNumber
type, nor would it be straightforward to create one. I mean, it's easy enough to define the type itself, but any existing Number
s which happen to be even will not be magically transformed into EvenNumber
s.
Actually, saying that you can "create" subsets is somewhat disingenuous, because sets are a different kind of animal altogether. In set theory, those subsets already exist, in all the infinite ways you can define them. In type theory, we usually expect to be dealing with a finite (if large) number of types at any given time. The only types which are said to exist are those we've actually defined, not every type we could possibly define.
Sets are not allowed to directly or indirectly contain themselves. Some languages, such as Python, provide types with less regular structures (in Python, type
's canonical type is type
, and object
is considered an instance of object
). On the other hand, most languages do not allow user-defined types to engage in this sort of trickery.
Sets are commonly allowed to overlap without being contained in one another. This is uncommon in type theory, though some languages do support it in the form of multiple inheritance. Other languages, such as Java, only allow a restricted form of this or disallow it entirely.
The empty type exists (it's called the bottom type), but most languages do not support it, or do not regard it as a first-class type. The "type that contains all other types" also exists (it's called the top type) and is widely supported, unlike set theory.
NB: As some commenters previously pointed out (before the thread was moved to chat), it is possible to model types with set theory and other standard mathematical constructs. For instance, you could model type membership as a relation rather than modeling types as sets. But in practice, this is much simpler if you use category theory instead of set theory. This is how Haskell models its type theory, for example.
The notion of "subtyping" is really quite different from the notion of "subset." If X
is a subtype of Y
, it means we can substitute instances of Y
for instances of X
and the program will still "work" in some sense. This is behavioral rather than structural, though some languages (e.g. Go, Rust, arguably C) have chosen the latter for reasons of convenience, either to the programmer or the language implementation.
-
Comments are not for extended discussion; this conversation has been moved to chat.user28988– user289882015年01月23日 21:08:01 +00:00Commented Jan 23, 2015 at 21:08
Algebraic data types are the way to discuss this.
There are three fundamental ways you can combine types:
Product. That's basically what you're thinking of:
struct IntXDouble{ int a; double b; }
is a product type; its values are all possible combinations (i.e. tuples) of one
int
and onedouble
. If you consider the number types as sets, then the cardinality of the product type is in fact the product of the fields' cardinalities.Sum. In procedural languages this is a bit awkward to express directly (classically it's done with tagged unions), so for better understanding, here is a sum type in Haskell:
data IntOrDouble = AnInt Int | ADouble Double
the values of this type have either the form
AnInt 345
, orADouble 4.23
, but there's always only one number involved (unlike for the product type, where each value has two numbers). So the cardinality: you first enumerate allInt
values, each must be combined with theAnInt
constructor. Plus, allDouble
values, each combined withADouble
. Hence sum type.Exponentiation1. I'll not discuss that in detail here because it has no clear OO correspondence at all.
So what about classes? I deliberately used the keyword struct
rather than class
for IntXDouble
. The thing is, a class as a type isn't really characterised by its fields, those are merely implementation details. The crucial factor is rather, what distinguishable values can the class have.
What is relevant though is, a class's value can be values of any of it's subclasses! So a class is actually a sum type rather than a product type: if A
and B
would both be derived from myClass
, the myClass
would essentially be the sum of A
and B
. Regardless of actual implementation.
1This is functions (in the mathematical sense!); a function type Int -> Double
is represented by the exponential Double
Int
. To bad if your language doesn't have proper functions...
-
2Sorry, but I think this is a very poor answer. Functions do have a clear OO analogue, namely methods (and single-method interface types). The basic definition of an object is that it has both state (fields/data members) and behavior (methods/member functions); your answer ignores the latter.ruakh– ruakh2015年01月22日 06:35:49 +00:00Commented Jan 22, 2015 at 6:35
-
@ruakh: no. You can of course implement functions in OO, but in general, methods are not functions (because they modify state etc.). Nor are "functions" in procedural languages functions, for that matter. Indeed, single-static-method interfaces come closest to function/exponential-types, but I hoped to avoid the discussion of that because it has no relevance for this question.leftaroundabout– leftaroundabout2015年01月22日 12:34:42 +00:00Commented Jan 22, 2015 at 12:34
-
...more importantly, my anwer does consider behaviour. Indeed behaviour is usually the reason why you use inheritance, and the unification of different possible behaviours precisely captures the sum-type aspect of OO classes.leftaroundabout– leftaroundabout2015年01月22日 12:37:03 +00:00Commented Jan 22, 2015 at 12:37
-
@ruakh A method can't without its object. The closest analog is
static
methods, except they're still not first-class values. The thing about most OO languages is that they take objects as the smallest building block, so if you want anything smaller you have to fake it with objects, and you still end up dragging in a bunch of semantics functions shouldn't have. E.g. it doesn't make sense to compare functions for equality, but you can still compare two faux-function objects.Doval– Doval2015年01月22日 12:46:00 +00:00Commented Jan 22, 2015 at 12:46 -
@Doval 1) you can pass methods around AFAIK, so they are first-class values; 2) it does make sense to compare functions for equality, JS people do it all the time.Den– Den2015年01月27日 10:25:01 +00:00Commented Jan 27, 2015 at 10:25
Sorry but I don't know about the "raw" theory. I can only provide a practical approach. I hope this is acceptable at programmers.SE; I'm not familiar with the etiquette here.
A central theme of OOP is information hiding. What the data members of a class are, exactly, should be of no interest to its clients. A client sends messages to (calls methods / member functions of) an instance, which might or might not modify internal state. The idea is that the internals of a class might change, without the client being affected by it.
A corrolary to this is that the class is responsible for ensuring that its internal representation remains "valid". Let's assume a class that stores a (simplified) phone number in two integers:
int areacode;
int number;
These are the data members of the class. However, the class will probably be much more than just its data members, and it is certainly not definable as "set of all possible values of int x int". You shouldn't have direct access to the data members.
Construction of an instance might deny any negative numbers. Perhaps the construction would also normalize the areacode in some way, or even verify the whole number. You would thus end up much closer to your "(a,b) where a is an int and b is a double"
, because it's certainly not any two int stored in that class.
But that doesn't really matter as far as the class is concerned. It is neither the type of the data members, nor the range of their possible values that defines the class, it's the methods that are defined for it.
As long as those methods remain the same, the implementor could change the data types to floating point, BIGNUMs, strings, whatever, and for all practical purposes, it would still be the same class.
There are design patters to ensure that such changes of internal representation can be made without the client even being aware of it (e.g. the pimpl idiom in C++, which hides any data members behind an opaque pointer).
-
1
It is neither the type of the data members, nor the range of their possible values that defines the class, it's the methods that are defined for it.
The data members don't define a class only when you hide them. That may be the most common case, but it certainly can't be said that it's true of all classes. If even a single field is public it's just as important as its methods.Doval– Doval2015年01月21日 16:01:56 +00:00Commented Jan 21, 2015 at 16:01 -
1Unless you're coding in Java, where you have no choice in the matter and even your dumb no-behavior faux-records must be
class
es. (Marking themfinal
helps get the point across, but still). You still have a problem withprotected
members though, which can be inherited and thus form part of a second API for implementors of subclasses.Doval– Doval2015年01月21日 16:05:42 +00:00Commented Jan 21, 2015 at 16:05 -
1@Doval: I understood this to be a "theory" question, which is why I stayed as clear of actual language issues as possible. (Just like I stay as clear of Java and
protected
as possible in practice. ;-) )DevSolar– DevSolar2015年01月21日 16:16:16 +00:00Commented Jan 21, 2015 at 16:16 -
3The problem is that a
class
is a language-dependent construct. As far as I know there's no such thing as aclass
in type theory.Doval– Doval2015年01月21日 16:26:09 +00:00Commented Jan 21, 2015 at 16:26 -
1@Doval: Wouldn't that mean that type theory per se doesn't apply to classes, as they are a construct out of that theory's scope?DevSolar– DevSolar2015年01月21日 16:46:14 +00:00Commented Jan 21, 2015 at 16:46
A type is a description of a category/range of values, compound structures, or what have you. OOPwise, it is akin to an "interface". (In the language-agnostic sense. The language-specific sense, not so much. In Java, for example,
int
is a type, but has no relation to aninterface
. Public/protected field specifications, as well, are not part of aninterface
, but are part of an "interface" or type.)Main point being, it's much more a semantic definition than a concrete one. Structure only factors in inasmuch as the exposed fields/behaviors and their defined purposes align. If you don't have both, you don't have type compatibility.
A class is the realization of a type. It's a template that actually defines the internal structure, attached behavior, etc.
Explore related questions
See similar questions with these tags.
a
andb
are members of that type, as Killian Forth mentions. Myclass is isomorphic to records with fieldsa
andb
of typeint
anddouble
- you could take a record like that and turn it into an instance ofmyclass
.