For background, the task is to prepare some training material, which should also explain a bit of why things are the way they are.
I tried to get some idea of how C began based on this tutorial.
However, it does not answer the question, and it already has more in the language than I expected.
I learned at university that the first C implementations only had int, so there was no need to give a data type, and that this is the reason why leaving out the type still yields an integer today.
Interestingly, the tutorial explains that int can be e.g. 16, 32 or 36 bits, depending on the machine.
First, can someone confirm that there was really only int at the beginning and no float/double? (I would not be surprised, since to my understanding C was written in order to have a compiler for UNIX.)
Second, can someone give me a reference for how and when int came to be 32 bits?
5 Answers
The C standard doesn't specify 32 bits for int; it specifies "at least" 16 bits. See https://stackoverflow.com/questions/589575/what-does-the-c-standard-state-the-size-of-int-long-type-to-be
In practice, most implementations since the 32-bit machine era picked int to be 32 bits, but C implementations for 16-bit machines used 16-bit ints.
float appears to also have been present in K&R C, before the modern standardised ANSI C, so it has been around since at least the first edition of K&R in 1978. I can't confirm whether the very first compiler had float.
(An even worse situation is the size of pointers; this has always been implementation-defined, but in the MS-DOS era you could have "near" (16-bit) and "far" (32-bit) pointers and get different answers depending on compiler options.)
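Since all of these widths are implementation-defined, the simplest way to see what a given compiler picked is to ask it. A minimal sketch in standard C; the values in the comments are only typical examples, not guarantees:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        /* All of these are implementation-defined; the standard only
           guarantees minimums (char at least 8 bits, int at least 16,
           long at least 32). */
        printf("CHAR_BIT       = %d\n", CHAR_BIT);
        printf("sizeof(int)    = %zu bytes\n", sizeof(int));
        printf("sizeof(long)   = %zu bytes\n", sizeof(long));
        printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
        /* Typical x86-64 Linux prints 8 / 4 / 8 / 8.
           A 16-bit MS-DOS compiler could give 8 / 2 / 4 / 2 (or 4 for
           the pointer, depending on the memory model). */
        return 0;
    }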
-
I used a compiler that gave you a choice of whether int should be 16 or 32 bits. That was around the time when compilers changed from 16- to 32-bit ints, so with this compiler you had the choice. More precisely, the "compiler program" was two different implementations and you could choose which one. I think there are still compilers that let you choose whether plain char should behave like signed or unsigned char. (It is always a distinct type; see the sketch after this comment thread.) – gnasher729, Oct 23, 2023 at 11:30
-
@gnasher729 A lot of modern 8-bit compilers still allow you to choose between 16- and 32-bit ints. – slebetman, Oct 23, 2023 at 16:23
-
Yeah, the 16-bit PC C compilers (e.g., Borland Turbo C and Microsoft QuickC) used 16-bit int. When 32-bit processors (Intel 386) and 32-bit operating systems (Linux, Windows NT) got popular, C compilers switched to 32-bit int. However, int did not change during the 32- to 64-bit transition, mostly because there wouldn't have been a name left for the 32-bit integer type. – dan04, Oct 23, 2023 at 18:23
-
@supercat If it is such a problem, then don't use C compilers with such a feature, or turn off that feature on the C compilers. You can always set -fwrapv on GCC; in fact, just using gcc without any options will usually just work. It's only when the user invokes gcc with -O/-O2/-O3 and without -fwrapv that there's a problem, and it's a bit bizarre to hear Unix C programmers complain that giving a command the wrong arguments leads to it giving unwanted results. – prosfilaes, Oct 24, 2023 at 22:33
-
@supercat Your "Creative Language-Abusing Nonsense Generator" reminds me of the "DeathStation 9000" (DS9K for short). I think every compiler should have such an option, to allow developers to harden their programs against implicit assumptions (or teach them a lesson about the actual possibility of doing so). – Holger, Oct 25, 2023 at 8:16
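To illustrate the point in the thread above that plain char is always a distinct type from signed char and unsigned char, here is a small C11 sketch using _Generic, which can tell the three apart at compile time:

    #include <stdio.h>

    /* char, signed char and unsigned char are three distinct types,
       even though plain char must behave like one of the other two. */
    #define TYPE_NAME(x) _Generic((x),          \
        char:          "char",                  \
        signed char:   "signed char",           \
        unsigned char: "unsigned char",         \
        default:       "something else")

    int main(void) {
        char c = 0; signed char sc = 0; unsigned char uc = 0;
        /* Prints: char / signed char / unsigned char, regardless of
           whether plain char is signed or unsigned on this platform. */
        printf("%s / %s / %s\n", TYPE_NAME(c), TYPE_NAME(sc), TYPE_NAME(uc));
        return 0;
    }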
My first edition of K&R C defines an int as the natural word length of the machine, which made sense back then because there were many different word lengths on the machines of the time: everything from 12 to 36 bits was available.
-
Burroughs Large Systems (e.g. model 5900) had 48-bit ints. Right enough, they never had C. – Paul_Pedant, Oct 23, 2023 at 22:54
-
This stopped being true with 64-bit machines; most 64-bit ABIs use 32-bit int (Cray being a notable exception, with 64-bit int). For machines at least 32 bits wide, long is often the width of an integer register. (But not in Windows x64, where only long long and pointers are 64-bit. This is why we now have types like uintptr_t and uint64_t for use when you don't want to depend on random ABI designers choosing the primitive type widths in a way that's useful for your function or data structure.) – Peter Cordes, Oct 24, 2023 at 0:33
-
@PeterCordes: I think of int as being the largest type that's almost as efficient as anything smaller. The definition of "almost" is a bit fuzzy, but on most machines there was a type that was almost twice as fast as anything bigger. On the 68000, operations on 32-bit values were about 50% slower than on 16-bit values, but on successor platforms they were about the same speed, so 68000 compilers often allowed int to be configured as 16 or 32 bits. On 64-bit machines, CPU-dominated tasks are about the same speed for 32 vs 64, but for memory-dominated tasks, 32-bit types are way faster. (See the related sketch after this comment thread.) – supercat, Oct 24, 2023 at 16:29
-
@PeterCordes That didn't stop being true with 64-bit machines; it was false from the beginning. There are many systems where the natural word length is 8 bits and most instructions can only be done on 8-bit / byte registers. – 12431234123412341234123, Oct 24, 2023 at 17:18
-
@dan04: Or you just make int32_t and friends native compiler types. – Joshua, Oct 25, 2023 at 1:10
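supercat's description above of int as "the largest type that's almost as efficient as anything smaller" has a modern counterpart in the C99 int_fastN_t typedefs, which ask for the fastest type with at least N bits and leave the choice to the implementation. A small sketch; the concrete sizes in the comment are examples from common libraries, not guarantees:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* "Fastest type with at least N bits" - the implementation decides,
           much like the historical choice of what plain int should be. */
        printf("int_fast16_t: %zu bytes\n", sizeof(int_fast16_t));
        printf("int_fast32_t: %zu bytes\n", sizeof(int_fast32_t));
        printf("int_fast64_t: %zu bytes\n", sizeof(int_fast64_t));
        /* For example, glibc on x86-64 makes int_fast16_t and int_fast32_t
           8 bytes, while musl and MSVC make them 4 bytes. */
        return 0;
    }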
I'm an old guy and I used 6th Edition Unix around 1974. I believe that's the edition that Dennis Ritchie shared as a result of a conversation in response to Dennis' CACM article of 1972. (This was before the AT&T lawyers got hold of Unix; hilarity ensued when they finally did, but that's a different story.)
My recollection is that on the PDP-11/45 at that time, int was 16 bits and long was 32 bits. Float was 32 bits and double was 64 bits. They used the native PDP-11 floating-point formats from the optional floating-point processor (FP11). Ints were automatically widened to 32 bits and floats to 64 bits when passed as arguments to functions (see the sketch below).
As they learned to port Unix to other ISAs, the rules on data type sizes had to become more flexible, hence the "int must be at least 16 bits" rule, etc. As a result, <types.h> in standard C has a set of implementation-dependent #defines for things like int32 and uint16 so a developer can portably choose a data type size. If you really care, you use those definitions.
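The automatic widening mentioned above survives in today's language as the "default argument promotions": in a call to a variadic function such as printf, char and short arguments are promoted to int, and float is promoted to double. A minimal sketch:

    #include <stdio.h>

    int main(void) {
        short s = -5;
        float f = 1.5f;
        /* When passed through "...", s is promoted to int and f to double,
           which is why %d works for short-or-int and %f expects a double. */
        printf("s = %d, f = %f\n", s, f);
        return 0;
    }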
-
The 1974/1975 C Reference Manual documents the existence of machines where int was larger than 32 bits, but does not document long, suggesting that type hadn't yet been invented. – supercat, Oct 24, 2023 at 16:24
-
Best I can tell, there's no <types.h> in the C standard. There is a <sys/types.h> in POSIX, though. An easy mistake to make, but I believe important in this context. – jaskij, Oct 25, 2023 at 7:52
-
Starting in C99, fixed-width integer types are defined in <stdint.h>, including things like int32_t. – cbarrick, Nov 10, 2023 at 1:07
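To make that concrete, the C99 headers <stdint.h> and <inttypes.h> provide exact-width types and matching printf macros. A minimal sketch:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* Exact-width types: exactly 32 and 16 bits, or they don't exist. */
        int32_t  a = -123456;
        uint16_t b = 65535;

        /* PRId32 / PRIu16 expand to the right printf conversion for each
           type, whatever the underlying primitive type happens to be. */
        printf("a = %" PRId32 ", b = %" PRIu16 "\n", a, b);

        /* uintptr_t can hold an object pointer converted to an integer. */
        uintptr_t p = (uintptr_t)&a;
        printf("&a as an integer: 0x%" PRIxPTR "\n", p);
        return 0;
    }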
C was designed to be maximally portable, and while 8-bit bytes / 16-bit words / 32-bit longwords are common, they aren't universal. There are real-world architectures that use 9-bit bytes and 36-bit words, there are architectures with guard or parity bits that count against the word size but aren't used for representing values, etc.
The language definition only specifies the minimum range of values the int type must be able to represent: [-32767..32767]. This means an int must be at least 16 bits wide, but may be wider. short and long are similarly specified. The only type whose size is specified in bits is char, which must be at least eight bits wide.
int is commonly the same size as the native word, so once 32-bit machines became common, compilers started using 32 bits to represent int.
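Those guarantees can be checked directly against <limits.h>. The assertions in this sketch (using C11 _Static_assert) hold on every conforming implementation; the printed values are whatever the particular implementation chose:

    #include <limits.h>
    #include <stdio.h>

    /* Minimum ranges guaranteed by the language definition. */
    _Static_assert(CHAR_BIT >= 8,           "char has at least 8 bits");
    _Static_assert(INT_MAX  >= 32767,       "int covers at least [-32767, 32767]");
    _Static_assert(INT_MIN  <= -32767,      "int covers at least [-32767, 32767]");
    _Static_assert(LONG_MAX >= 2147483647L, "long covers at least 32 bits' worth");

    int main(void) {
        /* What this implementation actually chose (commonly a 32-bit int
           even on 64-bit machines). */
        printf("int:  [%d, %d]\n", INT_MIN, INT_MAX);
        printf("long: [%ld, %ld]\n", LONG_MIN, LONG_MAX);
        return 0;
    }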
-
Now with native 64-bit registers there is the problem that with a 64-bit int you cannot have 8-, 16- and 32-bit types in C, because below int you only have the two type names char and short. – gnasher729, Aug 23 at 10:59
int wasn't made 16 bits to support the PDP-11 specifically, but to support a machine that was popular at the time. Today, we support machines that are popular today. You can get 64-bit ARM Cortex processors for very little money, so today you want type sizes that work well on 64-bit processors.
A 16-bit int practically forces char = 8, short = 16, long = 32 and long long = 64 bits. On a 64-bit processor you would prefer long = 64 bits.
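These combinations have conventional names (not defined by the C standard): ILP32 for typical 32-bit platforms, LP64 for 64-bit Unix-like systems (long and pointers are 64-bit), and LLP64 for 64-bit Windows (only long long and pointers are 64-bit). A small sketch to report which model the current compiler uses:

    #include <stdio.h>

    int main(void) {
        if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
            puts("LP64  (typical 64-bit Unix: long and pointers are 64-bit)");
        else if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 8)
            puts("LLP64 (64-bit Windows: only long long and pointers are 64-bit)");
        else if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
            puts("ILP32 (typical 32-bit platform)");
        else
            puts("some other data model");
        return 0;
    }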
-
…int" - are you possibly talking about the default type of function arguments, together with the fact that int-returning functions didn't need a forward declaration? Perhaps interacting with the separate fact that some rudimentary hardware only had the word type natively in their instruction sets?
-
…float and double. You may be thinking of the predecessor language B.
-
…char to typeless B was only called 'new B', and the first language called C added struct and explicit pointer types; while it was already known additional types were needed for FP, since on PDP-11 they weren't (always) the same size, he doesn't say if those types were implemented yet. The bwk link shows they were implemented by 1974.