How programming languages are capable of handling generic types

Question 1

How programming languages are capable of handling generics ?

Here is some context on why I got the question.

Constructs like Lists, Dictionaries and Sets in Python can handle any type of data without even specifying the type of data being added. Whereas in Java Collections require you to specify the type at the time of creating those collections objects.

Both Python and Java are originally implemented in C\C++. So the capability of Java and Python to handle Generics should had obtained from C\C++. Even Javascript engines are built on C++.

To answer my own questions I started thinking why we need to specify the type when we are creating an object, my understanding for this is because to inform the compiler the amount of memory to be allocated in order for the object to live in Memory (RAM). Thus if we are creating an ArrayList in Java of say a MyRandomClass, it should likely allocate an array of Memory Blocks(MB) where each MB is of the size of MyRandomClass

However in Python a list can hold any type of Class without even specifying the type at the time of creation.

What happens under the hood ?

Question 2

en.wikipedia.org/wiki/Type_erasure

Question 3

You seem to be confusing Programming Languages and Programming Language Implementations. Which of the two are you talking about? Also, there seem to be some misconceptions, e.g. you write "Even Javascript engines are built on C++", which is not true: Rhino, Nashorn, and dyn.js a written in Java, IronJS is written in F#, the reference implementation for ECMAScript 4 is written in Standard ML, BESEN is written in Pascal, and Narcissus is even written in JavaScript.

Question 4

@JörgWMittag I gave examples of where generics are already in practice to put some perspective and how generics are used in Python, Java and Javascript which have implementations in C++. So if it can be understood how they are done in their underlying C/C++ that should give an understanding on how generics can be implemented in general. The idea behind mentioning programming languages in my question is how underlying implementations can handle variations for ex:- List in Python and ArrayList in java.

Question 5

People are answering based on static types and type erasure (@Pseudonym and gnasher's answer). The OP's confusion, though, seems to be about how dynamic and completely generic types are implemented e.g. a list of different objects of different types in Python, like [3, "hello", [4, 5]].

Question 6

How Python is implemented is totally irrelevant. You could implement Python using C easily.

Question 7

You mention Python, which is a dynamically typed language. The answer is that it represents the type explicitly as part of the object. You can imagine the following picture: an element of a list or a dictionary is a structure with the following fields

type: <string or type ID>
data: <pointer to memory>

This is how types are represented: as an explicit byte in memory which says what type something is.

In fact, if you're curious, in Python you can see this directly. Fire up a python shell and do x = 3, then type(x). Python knows that x has type int. How does it know? Well check out

dir(x)

to see that there are a bunch of attributes that x has:

'__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes'

and among these, there's one called __class__. If you do

x.__class__

then Python prints out, <class 'int'>.

This is, to a rough approximation, how basically all dynamically typed languages do types. In statically typed languages like C++ or Rust, things are different. Instead of representing the type as part of the object, the compiler figures out the type at compile-time, and uses it to construct machine code for your program, and makes sure that all the types are as expected in the machine code, a process called type-checking. Once the compiler is certain all the types are correct, it then throws them away so that the machine code and explicit memory don't have any types anywhere.

As to generics, the process is similar: in dynamically typed languages the type is part of the object, in statically typed languages like C++ or Rust, the compiler has to figure out all the generic types that you use at compile-time, generate code for each one, and then get rid of the types in the machine code.

Question 8

As an example: Apple's Swift compiler will compile the required code for every type needed, then compare the various versions, and if two are identical then one of them is picked.

For example, lots of the code for an array of unsigned 32 bit integers and for an array of signed 32 bit integers may very well be identical, while the code for an array of 64 bit integers, signed or unsigned, would be different. The code for an array of any reference type might be different.

It's up to the language designer and to some extent the designer of the standard library to make sure not too much unnecessary code is being produced.

score 1 · Answer 1 · 2021-12-12 19:01:40Z

You mention Python, which is a dynamically typed language. The answer is that it represents the type explicitly as part of the object. You can imagine the following picture: an element of a list or a dictionary is a structure with the following fields

type: <string or type ID>
data: <pointer to memory>

This is how types are represented: as an explicit byte in memory which says what type something is.

In fact, if you're curious, in Python you can see this directly. Fire up a python shell and do x = 3, then type(x). Python knows that x has type int. How does it know? Well check out

dir(x)

to see that there are a bunch of attributes that x has:

'__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes'

and among these, there's one called __class__. If you do

x.__class__

then Python prints out, <class 'int'>.

This is, to a rough approximation, how basically all dynamically typed languages do types. In statically typed languages like C++ or Rust, things are different. Instead of representing the type as part of the object, the compiler figures out the type at compile-time, and uses it to construct machine code for your program, and makes sure that all the types are as expected in the machine code, a process called type-checking. Once the compiler is certain all the types are correct, it then throws them away so that the machine code and explicit memory don't have any types anywhere.

As to generics, the process is similar: in dynamically typed languages the type is part of the object, in statically typed languages like C++ or Rust, the compiler has to figure out all the generic types that you use at compile-time, generate code for each one, and then get rid of the types in the machine code.

gnasher729 gnasher729gnasher729 32.5k36 silver badges56 bronze badges · Answer 2 · 2021-07-15 06:40:36Z

As an example: Apple's Swift compiler will compile the required code for every type needed, then compare the various versions, and if two are identical then one of them is picked.

For example, lots of the code for an array of unsigned 32 bit integers and for an array of signed 32 bit integers may very well be identical, while the code for an array of 64 bit integers, signed or unsigned, would be different. The code for an array of any reference type might be different.

It's up to the language designer and to some extent the designer of the standard library to make sure not too much unnecessary code is being produced.

Stack Exchange Network

How programming languages are capable of handling generic types

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How programming languages are capable of handling generic types

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions