BASICO - Programming Language, Self-compiling compiler
Author: Andre Adrian
Version: 22.apr.2008
Abstract
BASICO
- is a small imperative programming language that is just powerful
enough to compile
itself (compiler bootstrapping).
- has no GOTO, but has while-break-wend and multiple return
- has C-like string handling.
- is implemented in less then 1000 source code lines for the
compiler.
- produces real binary programs for x86 processors, not P-code or
Byte-Code.
- uses the C compiler toolchain (assembler, linker)
- uses C library functions like printf(), getchar(), strcpy(),
isdigit(), rand() for run-time support.
Please check the section
BASICO source
code reading to get a detailed discussion of the source code!
Contents
Introduction
In January 2006 Dr. Dobb's journal celebrated the
30th anniversary of
Tiny BASIC. Tiny BASIC was specified by Dennis Allison and was
implemented by Dick Whipple and John Arnold. Tiny BASIC needed 3
Kilobytes of memory and was sold for 5 US-dollars. The Microsoft BASIC
of this time needed 8 Kilobytes and 150 US-dollars.
The original Tiny BASIC interpreter is written in an intermediate
language (IL) to save memory. The CPU runs a virtual machine (VM) to
emulate the IL processor. The IL processor interprets the Tiny BASIC
program. Li-Chen Wang created a Tiny BASIC version without IL for the
Z80 8-bit CPU.
One drawback of this approach is that the IL can handle strings,
because it has to parse the BASIC keywords like PRINT, GOTO, but the
Tiny BASIC itself has no string handling commands. IL has small memory
requirement but also slow execution speed.
Dennis Ritchie used
compiler bootstrapping for the C compiler, like Niklaus Wirth did for
the computer languages PASCAL, Modula and Oberon. The
compiler itself is written in the
language the compiler can translate. After an first iteration that
needs external help, the compiler of iteration N can compile the source
code that creates compiler of iteration N+1.
The project is pure fun. It is just a "what if" you ask 30 years later.
In 1976 the knowledge for BASICO was already present. The "
goto
considered harmful" paper of
Edsger W. Dijkstra was written in 1968. The first PASCAL
compiler was written in 1971. The UNIX developers used compiler
bootstrapping in 1973.
The compiler tools
lex
(Lesk, 1975) and
yacc (Johnson, 1974)
were available.
The word BASICO can be read as "
Beginner's
All-purpose Symbolic Instruction Code
zero" or as Basico, just something with an italian sound like lambrusco.
Note: The project started with the name "Tiny BASIC no GOTO (TB-NG)",
but BASICO sounds so much better. Basico is the name of a
small italian
town (745 inhabitants).
BASICO syntax ideas
Today the programming language C is the standard.
Other languages are comments to C. PASCAL
is C with nested procedures but without stdio library. C++ is C
overkill. Modula-2 is C without printf but with import and export. Ada
is PASCAL++ for the DoD.
JAVA is C++ with garbage collection but byte-code. Tcl/Tk is no C at
all, but a language with minimum syntax. Last but not
least BASIC is still not dead, but has included a lot of syntax from
other languages.
GOTO considered harmful
Today everybody agrees on the "one coding block shall have one entry
and one exit" paradigma " to shorten the conceptual gap between
the static program and the dynamic process, to make the correspondence
between the program (spread out in text space) and
the process (spread out in time) as trivial as possible". But in the
same paper Dijkstra also wrote "I remember having read the explicit
recommendation
to restrict the use of the
go to statement to alarm
exits".
Just deleting the goto statement and not introduce some other syntax
for "alarm exits" does make programs less structured. Some alarm exits
are
- conditional break in a loop
- multiple return in a function
- conditional return stack un-rolling (try/throw/catch or
setjmp/longjmp)
The traditional structured control flow for the "alarm exit" problem
needs additional state variables or special function return-values
that remember the alarm-status. Specially for return stack un-rolling
the dynamic control flow is hard to see out of the static program
listing.
Statement separator
The original BASIC of 1964 used linefeed as statement terminator.
PASCAL
uses ; as statement separator, C uses ; as statement terminator.
Beginners often forget this silent
syntax element. BASIC later adopted the : as an additional statement
separator. BASICO avoids silent syntax elements, therefore both
linefeed
and
; are used as statement terminators. With linefeed as statement
terminator a long statement can not be split in two source code
lines. Such a statement needs a continuation symbol to undo the
statement terminator effect of
the linefeed. In Tcl/Tk the \ as last character before linefeed works
as continuation symbol. BASICO follows Tcl/Tk syntax.
Statement list
PASCAL encloses statements between the keywords BEGIN and END. C uses {
and } for the same purpose. These statement list keywords give
trouble to the beginner. First, they are silent syntax elements, second
the beginner has problems to see the difference between statement and
statement list.
The statement list keywords are not needed if the control flow keywords
include them. One example is the WHILE condition DO
statement-list WEND syntax. Another is the IF condition THEN
statement-list ENDIF construct. The first statement-list is enclosed by
DO and WEND, the second statement-list with THEN and ENDIF.
Dangling else
Dangling else is solved with the IF condition THEN statement-list ELSE
statement-list ENDIF construct. With ENDIF there is no more
dangling else shift/reduce conflict.
Assignment and Equal
BASIC uses = both for assigment and for test on equivalence, as it is
mathematical tradition. Pascal uses := and =, C uses = and ==. As long
as multiple assignment like a = b = c is not needed, the both meanings
of = do not produce a conflict. The BASICO meaning of a = b = c
is a assign (b equal c), that is a becomes 1 if b is equal to c and 0
else.
The author's all time favorite in simple C bugs, the nasty if (a = b)
instead of if (a == b)
is gone with BASICO.
Evaluation of
conditional expressions
The symbols '<', '>=', .. are defined as conditional operators.
The
expression a >= b can result in 1 for true or 0 for false. There is
no short circuit evaluation for conditional expressions. This is fine
for today's CPUs that do not like jumps very much because of the
execution queue flushing.
Variable type
declaration
Pascal and C have strict variable types. Without declaration no
variable can be used. BASIC needs a variable declaration only for array
variables. Scalar variables can be used directly. BASICO follows Pascal
and C. The variable types are integer, character, one-dimensional array
of integer and one dimensional array of character.
Automatic type
conversion
All calculations are done with integer. Therefore the char variables
are promoted to integer. The BASICO char is an unsigned char. There are
no negative char values.
Function calls
The C library functions like printf() or strcpy() can be called out of
a BASICO program. The C call convention for the GNU C compiler is used.
The function arguments are pushed from right to left on the stack. All
arguments have the same size (32bit for GNU C on Intel x86). The return
value in placed into a CPU register (eax for GNU C on Intel x86).
Nested functions
BASICO does not support nested functions. I even think that nested
functions are a bad method for information hiding.
Call-by-value and
Call-by-reference
Call-by-value is the method for int and char variables as function
arguments: A copy of the variable is forwarded to the function. With
call-by-value only input parameters are possible. Array variables are
forwarded with call-by-reference: The memory location of the start of
the array is given. This method allows arrays as input and output
parameters.
The return value of the function is one output parameter. To get more
output parameters, the trick in BASICO is to use arrays as function
arguments to hold these output parameters.
Named constants
Pascal has named constants with the const keyword. C uses the #define
preprocessor command for constants. BASICO has no named constants (for
now).
Comment
BASIC uses REM to start a one-line comment. PASCAL and C have
multi-line comments with { } or /* */. C++ re-introduced the one-line
comment with //. BASICO uses the C++ one-line comment.
Compiler error
handling
The compiler stops at the first error. The source code line is shown up
to the point where the scanner, parser or code-generator detected the
bug and an error message is given. For the batch processing age
compiler gurus
out there: one error is all you get from this compiler.
BASICO syntax
The following LL(1) context-free grammar in Extended Backus-Naur-Form
(EBNF) for BASICO
is final.
It is based on the
PASCAL
syntax. The syntax is free of
shift/reduce conflicts. All variable declarations are before variable
use. A simple LL(1) predictive parser needs one pass to
translate the source. Syntax terminals are enclosed in ' ' or
" " like '<' and "if". Capital letter non-terminals come from the
scanner (CONST, IDENT, ..). The empty set is commented.
The scanner translates some linefeeds into ';'. This trick makes the
syntax
semicolon-free.
The EBNF was checked by
Coco/R a
scanner and parser generator for LL(1) grammars from Hanspeter
Mössenböck. In EBNF, [ ] is 0 to 1 repetition, { } is 0 to n
repetion, ( ) is grouping alternatives.
basico =
[ "var" { globalDecl ';' } ]
{ "func" IDENT '(' paramList ')' ':' returnType ';'
blockOrForward ';' }
.
globalDecl =
IDENT ':' (
"int"
| "char"
| "array" CONST (
"int"
| "char"
)
) .
paramList =
[ paramDecl { ',' paramDecl } ]
.
paramDecl =
IDENT ':' (
"int"
| "char"
| "array" (
"int"
| "char"
)
) .
returnType =
"int"
|
"char"
|
"void"
.
blockOrForward =
pblock3
| "forward"
.
pblock3 =
[ "var" { localDecl ';' } ]
"begin" stmtList "end"
.
localDecl =
IDENT ':' (
"int"
| "char"
| "array" CONST (
"int"
| "char"
)
) .
stmtList =
{ [ statement ] ';' }
.
statement =
IDENT (
[ '[' expr ']' ] '=' expr
| exprList
)
| "if" expr "then" stmtList [ "else" stmtList ] "endif"
| "while" expr "do" stmtList "wend"
| "return" [ expr
]
| "break"
.
exprList =
'(' [ expr { ',' expr } ] ')'
.
expr =
addexpr { '=' addexpr
| '#' addexpr
| '<' [ '=' ]
addexpr
| '>' [ '=' ]
addexpr
} .
addexpr =
term { '+' term
| '-' term
| '|' term
| '^' term
} .
term =
unaryfact { '*' unaryfact
|
'/' unaryfact
|
'%' unaryfact
|
'&' unaryfact
} .
unaryfact =
'-'
fact
| '~' fact
| fact
.
fact =
IDENT [ '[' expr ']' | exprList ]
| CONST
| CHRCONST
| STRCONST
| '(' expr ')'
.
BASICO example program
This is the old "Guess a number" game in BASICO. The example program
uses the C functions getchar(), printf(), time(), srand() and
rand(). The
function
main() is the entry point like in C. This is the first
BASICO
game ever written. Please note the structured use of alarm exits with
"return decimal" in getDecimal() and "break" in main().
// guess.bas
// Guess a number game
// Compile with BASICO version 0.9
func getDecimal():int
var decimal: int
ch: char
begin
decimal = 0;
while 1 do
ch = getchar()
if (ch>='0')&(ch<='9') then
decimal = decimal * 10 + ch - '0'
else
return decimal
endif
wend
end
func main():void
var myNumber: int
yourNumber: int
guesses: int
ch: char
begin
printf("Guess a number game\n")
printf("Numbers are between 1 and 50\n")
srand(time(0)) // get time in seconds since 1970, init
Random Generator
while 1 do
myNumber = rand() // get a random
number
myNumber = myNumber % 50 + 1
guesses = 0
while 1 do
printf("Your guess: ")
yourNumber = getDecimal()
guesses = guesses + 1
if yourNumber = myNumber then
printf("You guessed it in %d
guesses!\n", guesses)
break
else
if yourNumber > myNumber
then
printf("Your
guess is to high\n")
else
printf("Your
guess is to low\n")
endif
endif
wend
printf("Another game (y or n): ")
ch = getchar();
if (ch='n')|(ch='N') then
break
endif
getchar() // eat \n
wend
printf("Goodbye.\n")
end
Compile program:
./basico<guess.bas
>guess.s
cc -o guess guess.s
Run program:
./guess
Guess a number game
Numbers are between 1 and 50
Your guess: 25
Your guess is to high
Your guess: 13
Your guess is to low
Your guess: 19
Your guess is to high
Your guess: 16
You guessed it in 4 guesses!
Another game (y or n): n
Goodbye.
BASICO development
strategy
Compiler bootstrapping is done in several steps. We start with the
development environment for one language, in our case C, and we end
with the development environment for another language, BASICO. To
minimize our effort, we only replace the C compiler with the BASICO
compiler but keep the assembler, linker and library of the C tool-chain.
- Define the BASICO source language. For compiler bootstrap you
need character, character arrays (strings), integer and integer arrays.
A recursive descent parser needs recursive function calls. Function
calls need parameters (formal arguments), local variables, return
values and call-by-reference in/out variables. See
basico05_bnf.txt.
- Define the BASICO target language. The output of a recursive
descent parser (compiler) are op-codes for a stack-machine. These stack
op-codes are further translated into op-codes (assembler listing) for a
real register machine like the Intel x86 or the Renesas R8C. See
x86.txt.
- Implement a throw-away compiler that translates BASICO source
into target op-codes. This initial compiler is done with lex and yacc.
- Establish the tool-chain with BASICO throw-away compiler,
assembler, linker and library.
- Convert the BASICO grammer to LL(1) with the help of the CoCo/R
scanner and parser generator.
- Write the first version of the BASICO compiler in BASICO source
code.
- Translate the first BASICO compiler with the initial lex/yacc
compiler into a binary program.
- Now the development environment is complete. We have the compiler
source, a throw-away compiler that can translate this source into a
program. This program can compile it's own source.
Predictive Parser for
Expression
The following parser is an enhanced version of the infix to postfix
translator in chapter 2.5 of "COMPILERS Principles, Techniques and
Tools" by Aho, Sethi and Ullman. The program can only handle single
digit numbers and can't skip white space, but does understand all
single character BASICO operators like =, #, <, >, +, -, |, ^, *,
/, %, &, ~, ( and ).
var lookahead: int
func main(): void
begin
lookahead = getchar()
expr()
putchar('\n')
end
func expr(): void
begin
addexpr()
while 1 do
if lookahead =
'=' then
match('='); addexpr(); putchar('=')
else if
lookahead = '#' then
match('#'); addexpr(); putchar('#')
else if
lookahead = '<' then
match('<'); addexpr(); putchar('<')
else if
lookahead = '>' then
match('>'); addexpr(); putchar('>')
else; break;
endif; endif; endif; endif
wend
end
func addexpr(): void
begin
term()
while 1 do
if lookahead =
'+' then
match('+'); term(); putchar('+')
else if
lookahead = '-' then
match('-'); term(); putchar('-')
else if
lookahead = '|' then
match('|'); term(); putchar('|')
else if
lookahead = '^' then
match('^'); term(); putchar('^')
else; break;
endif; endif; endif; endif
wend
end
func term(): void
begin
unaryfact();
while 1 do
if lookahead =
'*' then
match('*'); unaryfact(); putchar('*')
else if
lookahead = '/' then
match('/'); unaryfact(); putchar('/')
else if
lookahead = '%' then
match('%'); unaryfact(); putchar('%')
else if
lookahead = '&' then
match('&'); unaryfact(); putchar('&')
else; break;
endif; endif; endif; endif
wend
end
func unaryfact(): void
begin
if lookahead = '-' then
match('-');
fact(); putchar('-')
else if lookahead = '~'
then
match('~');
fact(); putchar('~')
else; fact(); endif; endif
end
func fact(): void
begin
if lookahead = '(' then
match('(');
expr(); match(')')
else if isdigit(lookahead)
then
putchar(lookahead); match(lookahead)
else; error(); endif; endif
end
func match(t: int): void
begin
if lookahead = t then
lookahead =
getchar()
else; error(); endif
end
func error(): void
begin
printf("syntax error\n")
exit(1)
end
Compile program:
./basico04<rpn3.bas
>rpn3.s
cc -o rpn3 rpn3.s
Run program:
./rpn3
7=1+2*3
7123*+=
The infix expression 7 test for equal with 1 plus 2 multiply by 3 is
translated into the postfix expression push 7, 1, 2, 3, then multiply 2
by 3, add this with 1 and test this for equal with 7.
Examples for Compiler
bootstrapping
- Niklaus Wirth published the source code of many small languages
like PL/0 in 1976, PASCAL-S
in 1975 and Oberon-0
(sources)
in 1996. These
languages can not bootstrap themselves. BASICO follows the PASCAL
syntax.
- P4-Pascal
and UCSD-Pascal
can bootstrap themselves.
- Dennis Ritchie used compiler bootstrapping for the UNIX C
compiler.
Today the sources of the "last1120c"
compiler are on his home page.
This compiler has no struct and creates PDP11/20 object code. Together
with the UNIX version 1 binaries and the PDP11 emulator you can go back
to 1973 and program like the old heros. Even the UNIX version 1 man
pages are available again. BASICO follows the C semantics.
- Jack W. Crenshaw published between 1988 and 1989 the TINY
language. He used characters instead of constants for keywords in the
parser. This idea is used in BASICO, too. TINY can not bootstrap
itself. BASICO follows TINY's code generation ideas.
- Euphoria is
a BASIC like programming language. Programs can run interpreted or
compiled with help of an Euphoria to C translator. There is a free
Euphoria interpreter
in Euphoria.
- Edmund Grimley Evans wrote in 2001 "Bootstrapping a simple
compiler from nothing". He started on the pure binary machine code
level. The language is Forth like.
- Fabrice Bellard wrote in 2002 the Obfuscated Tiny C Compiler
(OTCC). Macro extended with gcc -E and pretty printed with indent the
source is 463 lines and can compile itself on Linux into 80386 machine
code. See this masterpiece yourself (readable otcc.c).
Symbol Table
Niklaus Wirth used in PASCAL-S a two-dimensional symbol table. BASICO
provides only one-dimensional arrays. The compiler program itself
has to do the abstraction from 1-dim to 2-dim.
The identifiers in the program can be keywords (if, then, else), global
variable names, function names, function parameter names or local
variable names. The search in the symbol table is keywords first, then
local and parameter variables and last global variables and
function names. These three namespaces (keywords, local, global) can be
implemented with three symbol tables. Or in one symbal table that has
three
sections. The first approach is implemented.
The variable name is the unique identifier in the symbol table to get
and set the attributes of the
variable. One attribute is the variable type like int, char,
array-of-int and array-of char. Another is the storage type like global
storage or local storage. One more attribute for parameter and local
variables is the frame-pointer
offset (frame-pointer relative memory location). See symtab.c for
details.
Code Generator
The recursive decent parser creates op-codes for a stack-machine.
Normal microprocessors are register machines that can handle 1 to 3
addresses per op-code. The Intel x86 chips or the Renesas R8C chip can
handle 2 addresses. To bridge
the gap between stack-machine and 2-address machine two or more
registers of
the CPU are used as stack-cache, that is TOS (top of stack), NOS
(next of stack), third of stack and fourth of stack are in registers.
Instead of moving
the contents of the CPU registers to perform the push and pop stack
operations, the stack-machine cache labels (TOS, NOS, THIRD, FOURTH)
are
re-mapped. Therefore CPU register %eax can be NOS at one time and TOS
at
another time. See
codegen.c for details. Some
examples of code generation for the x86. The BASICO statement follows
as comment the assembler code:
movl a, %eax
movl 1,ドル %ebx
subl %ebx, %eax
movl %eax, x
# x=a-1
movl a, %eax
movl b, %ebx
subl %ebx, %eax
movl c, %ebx
movl d, %ecx
subl %ecx, %ebx
movl %edx, %esi
cltd
idivl %ebx
movl %esi, %edx
movl %eax, x
# x=(a-b)/(c-d)
movl a, %eax
negl %eax
movl b, %ebx
movl c, %ecx
movl d, %edx
imul %edx, %ecx
subl %ecx, %ebx
subl %ebx, %eax
movl %eax, x
# x=-a-(b-c*d)
movl i, %eax
movl 1,ドル %ebx
addl %ebx, %eax
movl n, %ebx
movl i, %ecx
movl bb(,%ecx,4), %ecx
imul %ecx, %ebx
movl %ebx, bb(,%eax,4)
# bb[i+1] = n*bb[i]
movl a, %eax
movl b, %ebx
notl %ebx
andl %ebx, %eax
movl a, %ebx
notl %ebx
movl b, %ecx
andl %ecx, %ebx
orl %ebx, %eax
movl %eax, x
# x=a&~b|~a&b
Peephole optimization
Wiktionary
definition: An optimization that works by eliminating redundant
instructions from a small area of code.
Some examples of possible peephole optimization for the Basico compiler:
original code
optimized code
comment
movl 0,ドル %eax
movl %eax, -12(%ebp)
movl
0,ドル -12(%ebp)
The x86 CPU can move a constant
to a memory location.
movl 48,ドル %ebx
cmpl %ebx, %eax
cmpl 48,ドル %eax
The x86 CPU can compare a
constant with a register.
movl 48,ドル %ebx
subl %ebx, %eax
subl 48,ドル %eax
The x86 CPU can subtract a
constant from a register.
movl 0,ドル %eax
pushl %eax
pushl 0ドル
The x86 CPU can push a constant.
cmpl
%ebx, %eax
sete %al
movzbl %al, %eax
andl %eax, %eax
jz .L8
cmpl
%ebx, %eax
jnz .L8
The compare op code sets the
flags.
cmpl %ebx, %eax
setg %al
movzbl %al, %eax
andl %eax, %eax
jz .L10
cmpl %ebx, %eax
jng
.L10
The compare op code sets the
flags.
A peephole optimizer uses pattern matching to identify a code segment.
BASICO sources
The source code of BASICO version 0.8 is discussed in detail. The
findings of this code review are implemented in BASICO version 0.9.
BASICO source code reading
Version 0.9
Here are the bugfixes as mentioned in the source code reading.
BASICO version 0.9
Version 0.8
The statements of a function that is only called from one location is
copied to this location (inline the function). This reduced the source
lines to 975. The stripped throw-away compiler binary is 16432
bytes long and needs 46ms to
process the basico.bas input file, the BASICO version is 18124 bytes
long and needs 82ms. The GNU C compiler was using -O1 optimization. The
simple idea of using CPU registers as stack machine cache pays of very
well.
BASICO version 0.8
Version 0.7
The parser calls now the code generation functions. The BASICO compiler
is written in BASICO. Compiler bootstrapping is possible. For the input
file basico.bas the BASICO version 0.5 compiler produces the same
assembler code as the BASICO 0.7 compiler. The compiler is 1279 source
lines long.
BASICO version 0.7
Version 0.6
Now scanner, symbol table and parser are in BASICO. The parser can
successful parse itself. The BASICO parser was transcribed from the
CoCo/R Parser.cpp output file of the basico06.atg grammar input file.
BASICO version 0.6
Version 0.5
The compiler components scanner and symbol table are now available in
BASICO. The BASICO version scanner.bas was created by transcribing
scanner.c. The input file scanner.bas produces the same output file
with the BASICO version and the C version of the scanner - like it
should be. The compiled C version is 5844 bytes long and needs 12ms to
process the scanner.bas input file, the BASICO version is 9928 bytes
long and needs 22ms.
Version 0.51 combines the infix to postfix translator in BASICO with
the scanner and symbol table. This is an intermediate step to the full
BASICO parser in BASICO.
BASICO version 0.51
BASICO version 0.5
Version 0.4
The scanner is now written in C. For easy re-writing of the scanner in
BASICO only a subset of C was used. The BASICO source is copied as
comment into the output assembler listing.
Version 0.41 has the infix to postfix translator examples.
BASICO version 0.41
BASICO version 0.4
Version 0.3
This version uses symbol tables. Now different variable types and
different storage types are working. Even call-by-reference for arrays
as function parameters is implemented. The first version of our
throw-away compiler is ready. Now work moves from back-end (code
generation) to front-end (scanner, parser).
In version 0.31 the lex scanner translates the keywords into
characters. The bison parser uses these characters as tokens. These
changes prepare the parser for the new scanner. Another detail is the
new handling of <= and >=. The boolean operator ^ for xor is new.
The not operator is now ~ as in C.
BASICO version 0.31
BASICO version 0.3
Version 0.2
This version can compile expressions with global integer and
global array-of-integer variables. Function call with return value is
possible. Function call to C library
functions with 3 parameters maximum is working, too. The central
missing element is the symbol table. The symbol table tells the type of
the variable like char variable and/or local variable. Op-code
generation for char and local variables is missing, too. But we have
reached some level of "Tiny BASIC Compiler".
Version 0.21 allows comments after a statement.
BASICO version 0.21
BASICO version 0.2
Version 0.1
This version uses lex and yacc to implement the parser of the
throw-away compiler. The parser does not only check that the syntax is
okay,
but does emit some semantic information. Some very first ideas on
symbol table and code generation are included too.
BASICO version 0.1
Books
Compiler bauen mit UNIX
Introduction to compiler construction with UNIX
Axel-Tobias Schreiner with H. G. Friedman, Jr.,
german 1985, Hanser, München, ISBN 3-446-14359-9;
englisch 1985, Prentice-Hall Software Series, New York, ISBN
0-13-474396-2;
The book is no longer in print. But you can download the
example programs
here.
The UNIX Programming Environment
Der UNIX - Werkzeugkasten. Programmieren mit UNIX
Brian W. Kernighan, Rob Pike
german 2002, Hanser, München, ISBN 978-3446142732;
english 1984, Prentice Hall, Inc., ISBN 0-13-937681-X
The english version of the book is still in print. The
example programs are
here.