class in OOP language and type

Question 1

In programming language theory, a type is a set of values. E.g. the type "int" is the set of all integer values.

In OOP languages, a class is a type, is it?

When a class is defined with more than one members, e.g.

class myclass{
 int a; 
 double b;
}

When we talk about a class, do we mean

"(a,b) where a is an int and b is a double", or
"{(x,y) | x is any int, y is any double}"?

What does an instance of myclass mean?

"(a,b) where a is an int and b is a double", or
an object which occupies a memory space and which can (not necessarily, i.e. can be empty) store (x,y), where x is any int and y is any double?

Question 2

A class is a type. "{(x,y) | x is any int, y is any double}"" would be almost correct, except for two things: 1) you've used a tuple while a class is conceptually a record - you refer to its fields by name, not position; and 2) Not every record with fields a and b are members of that type, as Killian Forth mentions. Myclass is isomorphic to records with fields a and b of type int and double - you could take a record like that and turn it into an instance of myclass.

Question 3

In strongly-typed languages, a class is a type. In weakly-typed languages, it may or may not be a type.

Question 4

In programming language theory, a type is a set of values? I think you need to get yourself an other book or other teacher or both. A 'variable' or a 'constant' has a 'type' and often has a 'value'. There are zero value types, scalars and composit value types where the value of the variable or constant contains sub-variables/sub-constants.

Question 5

@user1703394 A type is a set of values. A 32-bit integer type is a set of 2^32 distinct values. If an expression evaluates to a value of that type, you know that value is in that set. The mathematical operators are just functions over values of that set.

Question 6

I would also be cautious considering types as sets of values. Sets have relations that are not strictly applicable to types. For conceptualizing types it's a good model, but it breaks down once you start looking at things more closely - and even moreso when you introduce subtyping.

Question 7

Neither.

I take it you're asking whether having the same set of field types is enough to classify as being the same class, or whether they have to be named identically as well. The answer is: "Not even having the same types and the same names is sufficient!" Structurally equivalent classes are not necessarily type-compatible.

For instance, if you have a CartesianCoordinates and a PolarCordinates class, they might both have two numbers as their fields, and they might even have the same Number type and the same names, but they would still not be compatible, and an instance of PolarCoordinates would not be an instance of CartesianCoordinates. The ability to separate types by their intended purpose and not their current implementation is a very useful part of writing safer, more maintainable code.

Question 8

It should be noted that in some languages being structurally equivalent is enough to make one type a subtype of the other (and often vice versa). That though is decidedly uncommon/unpopular.

Question 9

@Tim A typedef doesn't create a type, it aliases the name used to refer to an existing type.

Question 10

@DevSolar He explicitly mentioned C, and other than C++, I don't know of any other language that uses that keyword.

Question 11

@Telastyn - those languages should be killed with fire.

Question 12

@JonStory Structural subtyping is useful at a module level; the lack of it is what forces you to turn everything into an interface in Java and C# if you ever want to do unit testing. You end up writing a ton of boilerplate just so you can change the particular class your program will use even though you don't have any intention of changing it at runtime.

Question 13

Types are not sets.

You see, set theory has a number of features which simply don't apply to types, and vice-versa. For instance, an object has a single canonical type. It may be an instance of several different types, but only one of those types was used to instantiate it. Set theory has no notion of "canonical" sets.

Set theory allows you to create subsets on the fly, if you have a rule that describes what belongs to the subset. Type theory generally does not allow this. While most languages have a Number type or something similar, they do not have an EvenNumber type, nor would it be straightforward to create one. I mean, it's easy enough to define the type itself, but any existing Numbers which happen to be even will not be magically transformed into EvenNumbers.

Actually, saying that you can "create" subsets is somewhat disingenuous, because sets are a different kind of animal altogether. In set theory, those subsets already exist, in all the infinite ways you can define them. In type theory, we usually expect to be dealing with a finite (if large) number of types at any given time. The only types which are said to exist are those we've actually defined, not every type we could possibly define.

Sets are not allowed to directly or indirectly contain themselves. Some languages, such as Python, provide types with less regular structures (in Python, type's canonical type is type, and object is considered an instance of object). On the other hand, most languages do not allow user-defined types to engage in this sort of trickery.

Sets are commonly allowed to overlap without being contained in one another. This is uncommon in type theory, though some languages do support it in the form of multiple inheritance. Other languages, such as Java, only allow a restricted form of this or disallow it entirely.

The empty type exists (it's called the bottom type), but most languages do not support it, or do not regard it as a first-class type. The "type that contains all other types" also exists (it's called the top type) and is widely supported, unlike set theory.

NB: As some commenters previously pointed out (before the thread was moved to chat), it is possible to model types with set theory and other standard mathematical constructs. For instance, you could model type membership as a relation rather than modeling types as sets. But in practice, this is much simpler if you use category theory instead of set theory. This is how Haskell models its type theory, for example.

The notion of "subtyping" is really quite different from the notion of "subset." If X is a subtype of Y, it means we can substitute instances of Y for instances of X and the program will still "work" in some sense. This is behavioral rather than structural, though some languages (e.g. Go, Rust, arguably C) have chosen the latter for reasons of convenience, either to the programmer or the language implementation.

Question 14

Comments are not for extended discussion; this conversation has been moved to chat.

Question 15

Algebraic data types are the way to discuss this.

There are three fundamental ways you can combine types:

Product. That's basically what you're thinking of:
```
struct IntXDouble{
 int a; 
 double b;
}
```
is a product type; its values are all possible combinations (i.e. tuples) of one int and one double. If you consider the number types as sets, then the cardinality of the product type is in fact the product of the fields' cardinalities.
Sum. In procedural languages this is a bit awkward to express directly (classically it's done with tagged unions), so for better understanding, here is a sum type in Haskell:
```
data IntOrDouble = AnInt Int
 | ADouble Double
```
the values of this type have either the form AnInt 345, or ADouble 4.23, but there's always only one number involved (unlike for the product type, where each value has two numbers). So the cardinality: you first enumerate all Int values, each must be combined with the AnInt constructor. Plus, all Double values, each combined with ADouble. Hence sum type.
Exponentiation¹. I'll not discuss that in detail here because it has no clear OO correspondence at all.

So what about classes? I deliberately used the keyword struct rather than class for IntXDouble. The thing is, a class as a type isn't really characterised by its fields, those are merely implementation details. The crucial factor is rather, what distinguishable values can the class have.

What is relevant though is, a class's value can be values of any of it's subclasses! So a class is actually a sum type rather than a product type: if A and B would both be derived from myClass, the myClass would essentially be the sum of A and B. Regardless of actual implementation.

¹_{This is functions (in the mathematical sense!); a function type Int -> Double is represented by the exponential Double^Int. To bad if your language doesn't have proper functions...}

Question 16

Sorry, but I think this is a very poor answer. Functions do have a clear OO analogue, namely methods (and single-method interface types). The basic definition of an object is that it has both state (fields/data members) and behavior (methods/member functions); your answer ignores the latter.

Question 17

@ruakh: no. You can of course implement functions in OO, but in general, methods are not functions (because they modify state etc.). Nor are "functions" in procedural languages functions, for that matter. Indeed, single-static-method interfaces come closest to function/exponential-types, but I hoped to avoid the discussion of that because it has no relevance for this question.

Question 18

...more importantly, my anwer does consider behaviour. Indeed behaviour is usually the reason why you use inheritance, and the unification of different possible behaviours precisely captures the sum-type aspect of OO classes.

Question 19

@ruakh A method can't without its object. The closest analog is static methods, except they're still not first-class values. The thing about most OO languages is that they take objects as the smallest building block, so if you want anything smaller you have to fake it with objects, and you still end up dragging in a bunch of semantics functions shouldn't have. E.g. it doesn't make sense to compare functions for equality, but you can still compare two faux-function objects.

Question 20

@Doval 1) you can pass methods around AFAIK, so they are first-class values; 2) it does make sense to compare functions for equality, JS people do it all the time.

Question 21

Sorry but I don't know about the "raw" theory. I can only provide a practical approach. I hope this is acceptable at programmers.SE; I'm not familiar with the etiquette here.

A central theme of OOP is information hiding. What the data members of a class are, exactly, should be of no interest to its clients. A client sends messages to (calls methods / member functions of) an instance, which might or might not modify internal state. The idea is that the internals of a class might change, without the client being affected by it.

A corrolary to this is that the class is responsible for ensuring that its internal representation remains "valid". Let's assume a class that stores a (simplified) phone number in two integers:

 int areacode;
 int number;

These are the data members of the class. However, the class will probably be much more than just its data members, and it is certainly not definable as "set of all possible values of int x int". You shouldn't have direct access to the data members.

Construction of an instance might deny any negative numbers. Perhaps the construction would also normalize the areacode in some way, or even verify the whole number. You would thus end up much closer to your "(a,b) where a is an int and b is a double", because it's certainly not any two int stored in that class.

But that doesn't really matter as far as the class is concerned. It is neither the type of the data members, nor the range of their possible values that defines the class, it's the methods that are defined for it.

As long as those methods remain the same, the implementor could change the data types to floating point, BIGNUMs, strings, whatever, and for all practical purposes, it would still be the same class.

There are design patters to ensure that such changes of internal representation can be made without the client even being aware of it (e.g. the pimpl idiom in C++, which hides any data members behind an opaque pointer).

Question 22

It is neither the type of the data members, nor the range of their possible values that defines the class, it's the methods that are defined for it.

The data members don't define a class only when you hide them. That may be the most common case, but it certainly can't be said that it's true of all classes. If even a single field is public it's just as important as its methods.

Question 23

Unless you're coding in Java, where you have no choice in the matter and even your dumb no-behavior faux-records must be classes. (Marking them final helps get the point across, but still). You still have a problem with protected members though, which can be inherited and thus form part of a second API for implementors of subclasses.

Question 24

@Doval: I understood this to be a "theory" question, which is why I stayed as clear of actual language issues as possible. (Just like I stay as clear of Java and protected as possible in practice. ;-) )

Question 25

The problem is that a class is a language-dependent construct. As far as I know there's no such thing as a class in type theory.

Question 26

@Doval: Wouldn't that mean that type theory per se doesn't apply to classes, as they are a construct out of that theory's scope?

Question 27

A type is a description of a category/range of values, compound structures, or what have you. OOPwise, it is akin to an "interface". (In the language-agnostic sense. The language-specific sense, not so much. In Java, for example, int is a type, but has no relation to an interface. Public/protected field specifications, as well, are not part of an interface, but are part of an "interface" or type.)

Main point being, it's much more a semantic definition than a concrete one. Structure only factors in inasmuch as the exposed fields/behaviors and their defined purposes align. If you don't have both, you don't have type compatibility.
A class is the realization of a type. It's a template that actually defines the internal structure, attached behavior, etc.

Kilian Foth Kilian Foth 111k45 gold badges301 silver badges323 bronze badges · Answer 1 · 2015-01-21 14:11:14Z

Neither.

I take it you're asking whether having the same set of field types is enough to classify as being the same class, or whether they have to be named identically as well. The answer is: "Not even having the same types and the same names is sufficient!" Structurally equivalent classes are not necessarily type-compatible.

For instance, if you have a CartesianCoordinates and a PolarCordinates class, they might both have two numbers as their fields, and they might even have the same Number type and the same names, but they would still not be compatible, and an instance of PolarCoordinates would not be an instance of CartesianCoordinates. The ability to separate types by their intended purpose and not their current implementation is a very useful part of writing safer, more maintainable code.

It should be noted that in some languages being structurally equivalent is enough to make one type a subtype of the other (and often vice versa). That though is decidedly uncommon/unpopular.
@Tim A typedef doesn't create a type, it aliases the name used to refer to an existing type.
@DevSolar He explicitly mentioned C, and other than C++, I don't know of any other language that uses that keyword.
@JonStory Structural subtyping is useful at a module level; the lack of it is what forces you to turn everything into an interface in Java and C# if you ever want to do unit testing. You end up writing a ton of boilerplate just so you can change the particular class your program will use even though you don't have any intention of changing it at runtime.

Kevin Kevin 2,6682 gold badges17 silver badges23 bronze badges · Answer 2 · 2015-01-21 20:24:37Z