A lexeme is a raw piece of text that your code contains. For example:
- "int"
- "x"
- "="
- "69"
- ";"
A token is the label or category your language gives to that lexeme.
- "int" → KEYWORD
- "x" → IDENTIFIER
- "=" → ASSIGN_OP
- "69" → NUMBER_LITERAL
- ";" → SEMICOLON
A powerful technique for extracting meaning from the user's source code without having to execute it.
Breaking a multi-argument function into a chain of single-argument functions.
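A minimal sketch of this idea (often called currying) in Python; `add3` and `curried_add3` are made-up names for illustration:

```python
# Ordinary three-argument function.
def add3(a, b, c):
    return a + b + c

# Curried form: a chain of single-argument functions.
def curried_add3(a):
    def take_b(b):
        def take_c(c):
            return a + b + c
        return take_c
    return take_b

print(add3(1, 2, 3))          # 6
print(curried_add3(1)(2)(3))  # 6
```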
A closure is a function that retains the environment in which it was created, allowing it to access surrounding variables later, even after the enclosing scope has returned.
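A small Python sketch; `make_counter` is a hypothetical example showing the captured variable outliving the call that created it:

```python
# `increment` closes over `count`, so the variable survives
# after make_counter() has returned.
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```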
Also known as lexing (lexical analysis). A scanner (or lexer) takes in the linear stream of characters, chunks them together into "words" (the lexemes), and labels each one with its token type.
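A toy Python scanner for the `int x = 69;` example from the top of this section; the regex table and the scan loop are just an illustration, not how a production lexer has to work:

```python
import re

# Token names follow the list earlier in this section.
TOKEN_SPEC = [
    ("NUMBER_LITERAL", r"\d+"),
    ("IDENTIFIER",     r"[A-Za-z_]\w*"),
    ("ASSIGN_OP",      r"="),
    ("SEMICOLON",      r";"),
    ("SKIP",           r"\s+"),   # whitespace: consumed, no token emitted
]
KEYWORDS = {"int"}

def scan(source):
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in TOKEN_SPEC:
            match = re.match(pattern, source[pos:])
            if match:
                lexeme = match.group(0)
                pos += len(lexeme)
                if name == "SKIP":
                    break
                if name == "IDENTIFIER" and lexeme in KEYWORDS:
                    name = "KEYWORD"
                tokens.append((name, lexeme))   # (token type, lexeme) pair
                break
        else:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
    return tokens

print(scan("int x = 69;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'x'), ('ASSIGN_OP', '='),
#  ('NUMBER_LITERAL', '69'), ('SEMICOLON', ';')]
```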
A parser takes the flat sequence of tokens and builds a tree structure, often referred to as a parse tree or abstract syntax tree (AST). The parser is also the friend/foe that reports syntax errors.
[AST diagram]
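A minimal recursive-descent parser sketch in Python. The grammar (just `+` and `*` over numbers) and the tuple-based AST are made up for illustration; it shows how a flat token list becomes a tree where `*` binds tighter than `+`:

```python
def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expression():          # expression -> term ("+" term)*
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    def term():                # term -> number ("*" number)*
        nonlocal pos
        node = number()
        while peek() == "*":
            pos += 1
            node = ("*", node, number())
        return node

    def number():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return ("num", int(token))

    return expression()

print(parse(["1", "+", "2", "*", "3"]))
# ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
```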
If the language is statically typed, this is when we type check. Suppose we have an expression like a + b: we check where a and b are declared and figure out their types; if those types don't support being added to each other, we report a type error.
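A toy type check for that a + b case, assuming a made-up rule that both operands must have the same type:

```python
# Hypothetical AST shapes: ("var", name) and ("add", left, right).
def type_of(expr, env):
    kind = expr[0]
    if kind == "var":
        return env[expr[1]]                # type recorded at its declaration
    if kind == "add":
        left = type_of(expr[1], env)
        right = type_of(expr[2], env)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown expression kind: {kind}")

env = {"a": "int", "b": "string"}          # pretend these came from declarations
print(type_of(("add", ("var", "a"), ("var", "a")), env))  # int
type_of(("add", ("var", "a"), ("var", "b")), env)          # raises TypeError
```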
IR is the in-between form that sits between the AST and machine code. The compiler translates the AST into IR so it can optimize it.
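A sketch of lowering the tuple AST from the parser example into a simple three-address-code style IR; the instruction format is invented for illustration:

```python
import itertools

# `node` uses the same tuple AST as the parser sketch: ("num", n) or (op, left, right).
def lower(node, code, temps):
    kind = node[0]
    if kind == "num":
        temp = f"t{next(temps)}"
        code.append(f"{temp} = const {node[1]}")
        return temp
    # Binary operator: lower both operands first, then emit one instruction.
    left = lower(node[1], code, temps)
    right = lower(node[2], code, temps)
    temp = f"t{next(temps)}"
    code.append(f"{temp} = {left} {kind} {right}")
    return temp

code = []
lower(("+", ("num", 1), ("*", ("num", 2), ("num", 3))), code, itertools.count(1))
print("\n".join(code))
# t1 = const 1
# t2 = const 2
# t3 = const 3
# t4 = t2 * t3
# t5 = t1 + t4
```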
Self explanatory... and most language hackers spend their entire careers here, squeezing out every drop of performance.
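One classic example is constant folding: arithmetic whose operands are already known at compile time gets evaluated inside the compiler instead of at runtime. A sketch over the same tuple AST:

```python
# If both children of an operator are constants, compute the result now.
def fold(node):
    kind = node[0]
    if kind == "num":
        return node
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if kind == "+" else left[1] * right[1]
        return ("num", value)
    return (kind, left, right)

print(fold(("+", ("num", 1), ("*", ("num", 2), ("num", 3)))))
# ('num', 7)
```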
After applying all the optimizations you can think of, we generate code: primitive, assembly-like instructions which the CPU can run.
If the compiler produces bytecode, you're not done, because there's no chip that runs bytecode directly: you either compile the bytecode down to native code for each target chip, or write a virtual machine that executes it in software.
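A toy stack-based virtual machine sketch, with a made-up three-instruction bytecode, to show what "software pretending to be the chip" looks like:

```python
def run(bytecode):
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# Bytecode for 1 + 2 * 3
program = [("PUSH", 1), ("PUSH", 2), ("PUSH", 3), ("MUL", None), ("ADD", None)]
print(run(program))  # 7
```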
While some compilers read the source code several times in order to build the AST, the IR, or to run optimizations, a single-pass compiler reads the source code once, from top to bottom, and generates output as it goes, without ever rewinding or re-parsing.
Pascal and C were designed around this limitation because memory at the time was so precious that a compiler might not even be able to hold the entire source file.
Some programming languages begin executing code right after parsing it to an AST, with maybe a bit of static analysis applied.
It's not widely used for general-purpose languages since it tends to be slow.
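A minimal tree-walking interpreter sketch over the same tuple AST: it evaluates nodes directly, with no bytecode or machine code step in between:

```python
def evaluate(node):
    kind = node[0]
    if kind == "num":
        return node[1]
    left, right = evaluate(node[1]), evaluate(node[2])
    return left + right if kind == "+" else left * right

print(evaluate(("+", ("num", 1), ("*", ("num", 2), ("num", 3)))))  # 7
```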
A transpiler, also known as a source-to-source compiler, is a tool that takes code in one language and converts it into another language, so developers don't have to write a full-fledged compiler back end of their own.
A JIT compiles the code at runtime, on the fly... Alien Tech.