Python is written in C, which is written in a C with an older compiler, which is written in a C with an even older compiler, which is written in B, which is written in BCPL. Presumably there is an original language BCPL is written in.
Every programming language is written in an older programming language.
What programming language came first?
What was that coded in?
- Does this answer your question? How do I create my own programming language and a compiler for it – whatsisname, Oct 15, 2020 at 1:07
- It's turtles all the way down. – davidbak, Oct 15, 2020 at 17:19
- Wikipedia article: History of compiler construction – rwong, Nov 22, 2020 at 6:02
- Question should be: What languages are compilers and interpreters written in? – gnasher729, Dec 29, 2023 at 12:50
- In my experience (which isn't that significant) they are mostly written in BNF – JimmyJames, Dec 29, 2023 at 22:31
6 Answers
What are programming languages written in?
Programming language compilers and runtimes are written in programming languages, though not necessarily in languages that are older than, or different from, the one they take as input. Some runtime code drops into assembly to access hardware instructions or code sequences that are not easily obtained through the compiler.
Once bootstrapped, programming languages can self-host, so they are often written in the same language they compile. For example, C compilers are written in C or C++, and C#'s Roslyn compiler is written in C#.
When the Roslyn team adds a new language feature, they don't use it in the compiler's own source code until it is debugged and working (e.g. released). This is akin to the bootstrapping exercise, but limited to a new feature rather than the whole language.
But to be clear, a language implementation can be (and often is) written in the latest version of its own input language.
So what came first, and what was that coded in?
Machine code came first. The first assemblers were themselves very simple, because early assembly languages were easy to parse and to translate into machine code. They were written in machine code until they could be bootstrapped and become self-hosting. It helped that the early machines had simple instruction sets with, for example, fixed word-sized instructions. Note also that as soon as one machine had an assembler, an assembler for a new machine could be written in assembly language on the older machine, which made bootstrapping the new machine easier.
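To show why an assembler for a fixed-width instruction set is such a small program, here is a minimal sketch in Python. The machine is hypothetical: the mnemonics, the opcode numbers, and the 8-bit layout (4-bit opcode, 4-bit operand) are assumptions made for illustration and do not describe any historical computer.

```python
# Hypothetical 8-bit machine: high 4 bits = opcode, low 4 bits = operand.
# The mnemonics and opcode numbers are invented for this sketch.
OPCODES = {"HALT": 0x0, "LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "JUMP": 0x4}


def assemble(source: str) -> bytes:
    """Translate one-instruction-per-line assembly text into machine words."""
    words = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop ';' comments and whitespace
        if not line:
            continue  # skip blank lines
        parts = line.split()
        mnemonic = parts[0].upper()
        operand = int(parts[1]) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {operand}")
        # Pack opcode and operand into a single fixed-width word.
        words.append((OPCODES[mnemonic] << 4) | operand)
    return bytes(words)


program = assemble("""
    LOAD 14   ; load the value at address 14
    ADD 15    ; add the value at address 15
    STORE 13  ; store the sum at address 13
    HALT
""")
print(program.hex())
```

Each line maps to exactly one word through a table lookup and a shift, which is the kind of translation that could plausibly be hand-coded in machine code.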
Think of it this way. Python is written in C,
No, it is not.
You seem to be confusing a Programming Language like Python or C with a Programming Language Implementation (e.g. a Compiler or Interpreter) like PyPy or Clang.
A Programming Language is a set of semantic and syntactic rules and restrictions. It is just an idea. A piece of paper. It isn't "written in" anything (in the sense that e.g. Linux is "written in" C). At most, we can say it is written in English, or more precisely, in a specific jargon of English, a semi-formal subset of English extended with logic notation.
Different specifications are written in different styles. Here are some examples:
- The Java Language Specification
- The Scala Language Specification
- The Haskell 2010 Language Report
- The Revised⁷ Report on the Algorithmic Language Scheme
- The ECMA-262 ECMAScript® Language Specification
- Python does not really have a single Language Specification like many other languages do. The information is splintered between the Python Language Reference, the Python Enhancement Proposals, and a lot of implicit institutional knowledge that exists only in the collective heads of the Python community
There are multiple Python implementations in common use today, and only one of them is written in C:
- Brython is written in ECMAScript
- IronPython is written in C#
- Jython is written in Java
- GraalPython is written in Java, using the Truffle Language Implementation Framework
- PyPy is written in the RPython Programming Language (a statically typed language roughly at the abstraction level of Java, roughly with the performance of C, with syntax and runtime semantics that are a proper subset of Python) using the RPython Language Implementation Framework
- CPython is written in C
In other words, every programming language is written in an older programming language. So what came first, and what was that coded in?
Again, you are confusing Programming Languages and Programming Language Implementations.
Programming Languages are written in English. Programming Language Implementations are written in Programming Languages. They can be written in any Programming Language. For example, Jython is a Python implementation written in Java, and Java is younger than Python. GHC is a Haskell implementation written in Haskell. GCC is a C compiler written in C. tsc is a TypeScript compiler written in TypeScript. rustc is a Rust compiler written in Rust. NSC is a Scala compiler written in Scala. javac is a Java compiler written in Java. Roslyn is a C# compiler written in C#.
And so on and so forth, there really is no restriction on the language used to implement a compiler or interpreter. (There is a theoretical limitation in that an interpreter for a Turing-complete language must also be written in a Turing-complete language.)
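The distinction can be made concrete with a toy example. In the sketch below, the comment block plays the role of the language (a set of rules written in English), and the function is one possible implementation of it, written in Python. The language "Postfix" and the function name `run` are hypothetical, invented only for this illustration.

```python
import operator

# "Specification" of a hypothetical toy language called Postfix:
#   1. A program is a sequence of whitespace-separated tokens.
#   2. An integer literal pushes its value onto a stack.
#   3. "+", "-" and "*" pop two values and push the result.
#   4. The value of the program is the single value left on the stack.

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def run(program: str) -> int:
    """One implementation of the rules above; others could be written in any language."""
    stack = []
    for token in program.split():
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(int(token))
    if len(stack) != 1:
        raise ValueError("program must leave exactly one value on the stack")
    return stack[0]


print(run("2 3 + 4 *"))  # (2 + 3) * 4
```

The same four rules could just as well be implemented in C, Java, or in Postfix's own successor; the rules themselves are not "written in" any of those.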
- Programming Languages can also be written in Human Languages other than English. – Stack Exchange Broke The Law, Oct 15, 2020 at 10:51
- Now that the difference between language and language implementation is clear, the question remains the same (or almost): "What are programming language implementations written in?" Not complaining about this answer (which I like!) but it does not fully respond to the "implicit" question. Otherwise it would be the perfect answer. – Laiv, Oct 15, 2020 at 11:41
- As @Laiv says, although this highlights the difference between the languages and their implementations, the question still stands. All implementations are written in other languages, on and on and on... – fartgeek, Oct 15, 2020 at 16:19
- @Laiv: I am answering that question in the last two paragraphs: language implementations are written in languages. Now, if you want to know specifics, then I cannot answer that. Programming language choice is often a subjective personal choice, so they are written in whatever language the author chooses to. – Jörg W Mittag, Oct 15, 2020 at 16:27
- @fartgeek: Programming language choice is often a subjective personal choice, so a language implementation will typically be written in whatever language the author likes. I personally like Scala and Ruby, so if I had to implement a language, I would likely do it in Scala or Ruby, but that is a completely personal choice, and someone else may choose a different language. – Jörg W Mittag, Oct 15, 2020 at 16:28
Each machine has an instruction set it natively executes.
That instruction set is the first language.
The first higher-level language was assembly, which literally allowed the programmer to write a readable mnemonic like mov ax, bx instead of the corresponding binary word.
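For the x86 example above, the "corresponding binary word" can be computed by hand. The sketch below assumes 16-bit operand size and uses the `MOV r/m16, r16` form (opcode 0x89 followed by a ModR/M byte); an assembler may equally choose the alternative 0x8B form, so treat this as one valid encoding rather than the only one. The helper name `encode_mov_reg16` is made up for this illustration.

```python
# Register numbers used in the x86 ModR/M byte for 16-bit registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode_mov_reg16(dest: str, src: str) -> bytes:
    """Encode `mov dest, src` as opcode 0x89 (MOV r/m16, r16) plus ModR/M."""
    # ModR/M layout: mod (2 bits) | reg (3 bits) | r/m (3 bits).
    # mod = 0b11 selects register-to-register; reg = source, r/m = destination.
    modrm = (0b11 << 6) | (REG16[src] << 3) | REG16[dest]
    return bytes([0x89, modrm])


machine_code = encode_mov_reg16("ax", "bx")
print(machine_code.hex())                       # hexadecimal form
print(" ".join(f"{b:08b}" for b in machine_code))  # the raw bits
```

Writing the mnemonic instead of working out these bit fields by hand is exactly the convenience the first assemblers provided.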
The first compiler was written in machine language, though by today's standards it would more accurately be called an assembler. It took the assembly language and translated it to the binary encoding.
This happened many times over for many different machines, until the first cross-compilers were developed that could translate a program into another machine's language.
Even now, though, there are still languages that are first implemented in terms of a machine language.
- All true, but worth pointing out that even before we had the actual machines we had the instruction sets. This all used to just be mathematics that ran in our heads. We've come a long way. – candied_orange, Oct 15, 2020 at 1:05
- @candied_orange True, Lambda the Ultimate – Kain0_0, Oct 15, 2020 at 2:42
The oldest programmable computers (like the ENIAC) were programmed by physically setting switches and plugging wires into certain configurations, which caused the machine to perform the desired operations. So I guess these physical configurations would be the "original language".
If we don't count physical configurations as a programming language, then the first programming language is binary-encoded instructions, also known as "machine code". An instruction is a set of bits in memory that triggers the CPU to perform a specific operation. The bits were entered directly into memory by a binary keyboard.
Programs were developed on paper using symbols, but they were manually converted to binary codes before being entered into the computer. Here is the first such program:
Note the symbolic code on the left and how this is manually converted into binary which can be entered into the computer.
At some point, programs were developed which could take simple textual instructions (like ADD, JUMP etc.) and automatically convert them into machine code. The first such program would have been written in machine code, but from this point on, higher and higher level languages could be developed.
- I have seen photos of, I think, a PDP-8 where you had to enter a tiny program manually that could start the tape reader (5 bit punched holes). – gnasher729, Dec 29, 2023 at 19:59
There's actually a large class of machines called "computers" with different operational principles, but in this context about "programming languages" you're probably asking specifically about general-purpose, stored-program, digital computers.
Their development first leapt forward during WW2, when there was a newly pressing need to execute a variety of complicated computations, particularly relating to cryptanalysis and physical simulations.
The labour requirements of executing computations by hand (as had been the norm) soared as they increased in complexity, hence the desire for a mechanised solution that would be faster than an army of clerks, and more economical with labour.
The earliest digital computers did not store their program in memory, but were instead configured by physically altering their hardware: changing the wiring or adjusting electro-mechanical elements.
However this physical adjustment was itself found to be slow, labour-intensive, and error-prone, and the process of doing so took the machinery out of action in the meantime.
Very quickly it was found more effective to control the computer by implementing a set of fixed computational operations in hardware, and then controlling the application of these fixed operations through the use of "machine code".
This machine code was designed to be stored in the same fashion as the data which the computer operated with.
Arguably that is the first "programming language", if what we mean is some kind of symbolic representation that specifically controls this kind of computing machine.
All current computers still operate using machine code.
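The stored-program idea, where machine code sits in the same memory as the data it operates on, can be sketched with a small simulator. The machine below is hypothetical: the 8-bit word layout (4-bit opcode, 4-bit address), the opcode numbers, and the name `execute` are all assumptions chosen for illustration.

```python
def execute(memory):
    """Run the program stored in `memory`, starting at address 0.

    Hypothetical machine: each word is 8 bits, high 4 bits = opcode,
    low 4 bits = address. Code and data share the same 16-word memory.
    """
    memory = list(memory)  # work on a copy
    acc = 0  # accumulator register
    pc = 0   # program counter
    while True:
        word = memory[pc]                    # fetch
        opcode, addr = word >> 4, word & 0xF  # decode
        pc += 1
        if opcode == 0x0:    # HALT
            break
        elif opcode == 0x1:  # LOAD: acc = memory[addr]
            acc = memory[addr]
        elif opcode == 0x2:  # ADD: acc += memory[addr], wrapping at 8 bits
            acc = (acc + memory[addr]) & 0xFF
        elif opcode == 0x3:  # STORE: memory[addr] = acc
            memory[addr] = acc
        elif opcode == 0x4:  # JUMP: continue execution at addr
            pc = addr
        else:
            raise ValueError(f"illegal opcode {opcode:#x} at address {pc - 1}")
    return memory


# Addresses 0-3 hold instructions; addresses 14 and 15 hold the data 2 and 3.
image = [0x1E, 0x2F, 0x3D, 0x00] + [0] * 10 + [2, 3]
result = execute(image)
print(result[13])  # the sum stored by the program
```

Nothing in `image` marks which words are instructions and which are data; only the program counter decides, which is the essence of the stored-program design.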
The idea of controlling machines with codes was not completely new at the time though. The Strowger switch, for example, which controlled telecoms routing by means of Morse-code-like pulses, was invented a generation earlier in the late 19th century. The Jacquard loom in the textile industry was also controlled by punch-cards, and was again invented in the 19th century.
And obviously, by WW2, we were already using mathematical syntaxes and natural language instructions to control computation executed by humans, and machine programming languages still rely latently on these human languages for their own design and meaning.
That is, computers must be understood as a mechanisation of computation, not as originating all the concepts of computation.
It's therefore impossible to be rigorous about exactly when "programming languages" first emerged and what each is "written in", unless we are arbitrary about exactly which machines we're talking about, because everything eventually reduces to the underlying natural languages, and many other kinds of machines have had codes and "languages" which control their operation.
Anyway, back to WW2-era computers. Once hardware was to be treated as fixed, analysis ensued on what kind of operations had to be furnished in hardware to provide a full capability for performing all kinds of computation. This is where you get ideas like the "Turing machine" and similar.
Expressing instructions in raw machine code is not very ergonomic for human use, so assembly languages were devised, which allow a programmer to specify the instructions in a mnemonic form. These are translated by a "compiler" program (an assembler) into raw machine code. This translation is simple enough that an assembler can be written by hand (by a skilled programmer) in machine code.
The process from there is basically recursive, with a compiler for a new language each time being written first in some more primitive language. By the late 1950s, there were already FORTRAN and COBOL.
Later the concepts of "cross-compilers" and "transpilers" also emerged, where you can write a compiler for completely new hardware (i.e. hardware whose machine code is different from any previous scheme), using relatively advanced languages already available for execution on existing hardware.
So we don't really need to start at the bottom anymore, implementing a programming language compiler first in machine code, so long as we continue to have more advanced facilities that have already been built from the bottom up.
Most commonly now, a compiler would be written in a high-level language (like C), and the machine code for the target architecture would only make an appearance as the output of the compiler, not as the origin language in which the compiler itself was written.
Compilers are often not written in an older version of the language they compile.
Say you have a compiler for language X, and X is quite suitable for writing compilers. You define an improved language X’. You create a compiler for X’ by starting with the compiler for X and making the minimal changes.
Now you have a compiler for X’, so you can do two things. First, change your compiler's source to take full advantage of all X’ features, so that you now have compilers for both the X and X’ languages written in X’. Note that your X compiler is now written in a newer language than X! Second, change the X’ compiler so that compiled X’ code can use all the new X’ features, fully optimised.
So at this point we have two compilers for two different languages, both written in the newer language.
But further, we might have a compiler for a language Y which is better suited to writing compilers than X. So we might change our compilers to be written in the Y language.
Many languages have variants, say C89 vs. C23, but only one program compiling all of them. Which variant it compiles depends on a command line option. So the compilers for all variants are written in the same language.
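As a rough sketch of how one program can serve several variants of a language, the Python below selects a keyword set from a `--std` option. It is not how any real compiler is structured: the function names, the option handling, and the small keyword samples are assumptions for illustration, although the listed keywords were introduced by the standards they are filed under.

```python
import argparse

# A small, non-exhaustive sample of keywords introduced by each C standard.
NEW_KEYWORDS = {
    "c89": {"int", "return", "while"},
    "c99": {"inline", "restrict", "_Bool"},
    "c11": {"_Static_assert", "_Generic"},
    "c23": {"nullptr", "constexpr", "typeof"},
}
ORDER = ["c89", "c99", "c11", "c23"]  # oldest to newest


def keywords_for(std: str) -> set:
    """Return every sampled keyword recognised up to and including `std`."""
    enabled = set()
    for name in ORDER[: ORDER.index(std) + 1]:
        enabled |= NEW_KEYWORDS[name]
    return enabled


def classify(word: str, std: str) -> str:
    """Decide how a hypothetical lexer would treat `word` under `std`."""
    return "keyword" if word in keywords_for(std) else "identifier"


def main(argv=None) -> str:
    # Hypothetical command line: one front end, variant chosen by --std.
    parser = argparse.ArgumentParser(description="Toy multi-variant lexer")
    parser.add_argument("--std", choices=ORDER, default="c23")
    parser.add_argument("word")
    args = parser.parse_args(argv)
    result = classify(args.word, args.std)
    print(f"{args.word!r} is a(n) {result} under {args.std}")
    return result
```

Calling `main(["--std", "c89", "nullptr"])` and `main(["--std", "c23", "nullptr"])` runs the same code with different rule sets, which is the sense in which the compilers for all variants are one program written in one language.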