I bought my first TTL data book in 1979. I was learning 6502 machine code at the time and dreamt of building a simple TTL CPU. I sketched some circuit ideas, but that was as far as it went. Now, 25 years later, I've finally done it! Working evenings and weekends, the Mark 1 took a month to design, 4 months to build and a month to program. Here's the result:
Myself is a standard way to implement recursion in FORTH. Even if you're not familiar with FORTH, if I tell you it's a stack-based language, and uses reverse polish notation (RPN), you might be able to figure out how this works.
Register | Bits | Description
--- | --- | ---
ALU | 8 | Arithmetic and logic unit
OP | 8 | Operand register
µPC | 12 | Microcode program counter
W | 16 | FORTH Working register
IP | 16 | FORTH Instruction Pointer
PSP | 8 | FORTH Parameter stack pointer
RSP | 8 | FORTH Return stack pointer
Stack RAM | 16 | Dedicated stack RAM
0 | 8 | Force 00H on data bus

ALU: The ALU data path is a bottleneck. It takes four clock cycles to load the inputs, set the ALU function, and read the result. This is the least satisfactory aspect of the whole design.

OP and µPC: OP is loaded into the uppermost 8 bits of µPC. The lower 4 bits are reset to zero.

W and IP: The 16-bit index registers, IP and W, support increment, decrement, and can address memory.

PSP, RSP and Stack RAM: The stack pointers, RSP and PSP, are 8-bit up/down counters feeding the A1-A9 address inputs of the stack RAMs. The least significant address input (A0) selects the upper or lower byte. Logically, the stacks are 16-bits wide by 256 words deep. The FORTH word length is 16 bits.
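The stack addressing described above can be sketched in Python. This is only an illustrative model: the function names are hypothetical, the choice that A0 = 1 selects the upper byte is an assumption (the text does not say which way round it is), and the pre-decrement push is taken from the ENTER routine shown later on this page.

```python
def stack_ram_address(sp: int, upper_byte: bool) -> int:
    """Form a stack RAM byte address from an 8-bit stack pointer.

    The pointer drives the address lines above A0; A0 picks the byte.
    Assumption: A0 = 1 selects the upper byte.
    """
    if not 0 <= sp <= 0xFF:
        raise ValueError("stack pointer must fit in 8 bits")
    return (sp << 1) | int(upper_byte)


def push16(ram: bytearray, sp: int, value: int) -> int:
    """Pre-decrement push of one 16-bit cell; returns the new pointer."""
    sp = (sp - 1) & 0xFF  # the 8-bit counter wraps around
    ram[stack_ram_address(sp, False)] = value & 0xFF
    ram[stack_ram_address(sp, True)] = (value >> 8) & 0xFF
    return sp
```

With a 512-byte array standing in for one stack RAM, pushing a cell from pointer 0 wraps the pointer to FFh and writes the two bytes at the top of the array.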
The Mark 1 is a micro-programmed machine with a highly encoded "vertical" microcode. The microinstruction (µ) is only 8-bits wide. One normally thinks in terms of "horizontal" microcodes, which are wider and less encoded. Some are very wide indeed. The Mark 1 is more like a RISC processor.
The 8-bit µ-instruction (µ) is encoded as follows:
The source and destination fields of the move instructions are coded as follows:
The Mark 1 executes 1 µ-instruction per clock cycle (1 MHz).
Decoding is done centrally using 74HC138 1-of-8 decoders. Decoded control signals are distributed via the back plane. Simple gating is then required at card level to complete the decoding.
The conditional is a skip not a branch. It inhibits loading of another µ-instruction for a specified number of cycles if the test is true. The skip distance is decremented to zero at which point normal execution resumes.
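A rough Python model of the skip behaviour may help. It is a sketch under assumptions: the tuple format for µ-instructions, the flag names, and the function name are all hypothetical, and only the skip countdown itself comes from the description above.

```python
def run_microcode(program, flags):
    """Step through a list of µ-instructions, one slot per clock cycle.

    Each entry is either ("skip_if", flag_name, distance) or (name,).
    While the skip counter is non-zero the slot is not loaded.
    """
    executed = []
    skip = 0
    for instr in program:
        if skip > 0:
            skip -= 1          # skip distance counts down to zero
            continue
        if instr[0] == "skip_if":
            _, flag, distance = instr
            if flags[flag]:
                skip = distance
        else:
            executed.append(instr[0])
    return executed
```

In this model a skipped slot still costs a clock cycle, so the sequence takes the same time whether or not the test is true.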
The "µPC←OP*16" instruction (1011xxxx) a.k.a. XOP loads the uppermost 8-bits of the program counter (µPC) from the operand register. It's a form of indirect jump.
The other jump instruction (1001xxxx) has a 4-bit operand and can only reach the first 16 bytes of the µ-ROM.
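The two jump forms reduce to simple address arithmetic on the 12-bit µPC. A small Python sketch (the function names are hypothetical):

```python
def xop_target(op: int) -> int:
    """µPC after XOP: OP fills the upper 8 bits, the lower 4 bits are zero."""
    return (op & 0xFF) << 4


def jump_target(operand: int) -> int:
    """µPC after the 4-bit JUMP: only addresses 0..15 are reachable."""
    return operand & 0x0F
```

For example, opcode 12h lands on µ-ROM address 120h, and the highest opcode FFh lands on FF0h, the start of the last 16-byte page.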
The 74181-based ALU requires 8 control signals. These are decoded from the 4-bit ALU function field in the µ-instruction using a 7x16 diode matrix ROM. The shaded squares indicate positions where diodes are fitted:
[Diode matrix table. Columns: OP, S0, S1, S2, S3, M, FLAG, D6, D7]
The control signals are transmitted from the diode matrix to the ALU via the data bus.
M is hard-wired to µ3.
D6 and D7 control the carry input as follows:
FLAG selects sign or overflow testing. Sign testing routes the most significant bit of the ALU result to the conditional test multiplexer. Overflow testing required the addition of a quad-XOR gate. The 74181 does not generate an overflow signal (the later 382 variant does). This was not used in the end because FORTH implements signed comparison by testing the sign of the result after subtraction.
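The difference between sign testing and overflow-corrected testing can be illustrated in Python. This is a sketch, not the Mark 1 circuit: the XOR-based overflow formula is the usual textbook one for subtraction and is an assumption about what the quad-XOR gate would have computed, and the function names are hypothetical.

```python
MASK = 0xFFFF
SIGN = 0x8000


def less_by_sign(a: int, b: int) -> bool:
    """Signed a < b judged only by the sign of the 16-bit result a - b."""
    return bool((a - b) & MASK & SIGN)


def less_with_overflow(a: int, b: int) -> bool:
    """Signed a < b using sign XOR overflow of the subtraction."""
    r = (a - b) & MASK
    sign = bool(r & SIGN)
    # Overflow on subtract: operands differ in sign and the result's
    # sign differs from the first operand's sign.
    overflow = bool((a ^ b) & (a ^ r) & SIGN)
    return sign != overflow
```

The two agree for most operands and differ only when the subtraction overflows, for example when comparing 7FFFh (32767) with 8000h (-32768).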
The Mark 1 is housed in a 3U 19" IEC297 sub-rack with a 64-way DIN 41612 backplane. The bus layout is shown below. The "A" row resembles a standard 8-bit microprocessor bus. The "C" row carries the µ-instruction and various decoded control signals. The fully-bussed pins (1, 2, 31, & 32) carry power supply and clocks.
Pin | Row A | Row C | Row C function
--- | --- | --- | ---
1 | +5V | +5V |
2 | CLK1 | CLK1 |
3 | D0 | µ0 |
4 | D1 | µ1 |
5 | D2 | µ2 |
6 | D3 | µ3 |
7 | D4 | µ4 |
8 | D5 | µ5 |
9 | D6 | µ6 |
10 | D7 | µ7 |
11 | A0 | µ₂₁₀=101 | Dest=OP
12 | A1 | µ₂₁₀=110 | Dest=ALU A
13 | A2 | µ₂₁₀=111 | Dest=ALU B
14 | A3 | µ₂₁₀=000 | W
15 | A4 | µ₂₁₀=001 | IP
16 | A5 | µ₂₁₀=010 | Dest=TOS, PSP
17 | A6 | µ₂₁₀=011 | Dest=R, RSP
18 | A7 | SRC=111 | ALU
19 | A8 | SRC=000 | W
20 | A9 | SRC=001 | IP
21 | A10 | SRC=010 | TOS
22 | A11 | SRC=011 | R
23 | A12 | µ=1000xxxx | INC / DEC
24 | A13 | µ=1001xxxx | JUMP #
25 | A14 | µ=1010xxxx | ALU Function
26 | A15 | µ=1011xxxx | JUMP OP
27 | MR (Memory Read) | M@IP | Address = IP
28 | MW (Memory Write) | M@W | Address = W
29 | RESET | LO | LO-byte
30 | IRQ | HI | HI-byte
31 | CLK2 | CLK2 |
32 | 0V | 0V |
The clocks are in quadrature.
CLK1 falls at the beginning of the machine cycle.
CLK2 is used to generate write-enable signals for the RAM and I/O.
All control signals are active low.
The sequencer has a 12-bit micro-program counter (µPC). The uppermost 8-bits can be loaded from the OP Latch, effecting a jump to one of 256 microcode routines. Each routine starts on a 16-byte µ-page boundary.
Opcodes, the machine language of the macro-machine, are loaded into the OP latch from the data bus under micro-program control. They can be fetched from memory using one of the index registers as a program counter. A simple micro-interpreter consists of the following 3 µ-instructions:

    OP←Memory[Index]     Load OP latch from memory
    Index←Index+1        Increment "program counter"
    µPC←OP*16            Execute microcode routine

How do these 3 µ-instructions get executed? One possibility is to append them to the end of every µ-routine. A slower but more space-efficient option is to append a jump to them. Mark 1 microcode has a jump specifically for this.
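The fetch and dispatch loop can be modelled in a few lines of Python. This is a sketch only: the routine table, the routine names, and the "halt" convention are hypothetical stand-ins for real microcode routines.

```python
def micro_interpreter(memory, routines, index=0, max_opcodes=1000):
    """Fetch opcodes through an index register and dispatch on OP*16.

    `routines` maps a µ-ROM start address to a routine name; a routine
    named "halt" stops the model (an invention for this sketch).
    """
    executed = []
    for _ in range(max_opcodes):
        op = memory[index]              # OP <- Memory[Index]
        index = (index + 1) & 0xFFFF    # Index <- Index+1
        upc = (op & 0xFF) << 4          # µPC <- OP*16
        name = routines[upc]
        if name == "halt":
            break
        executed.append(name)
    return executed, index
```

Each opcode selects a 16-byte µ-page, so opcode 1 starts at 010h, opcode 2 at 020h, and so on.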
It's possible to customise the instruction set and thereby create a "virtual machine".
Opcodes can have zero, one, or more operands. The µ-routines consume operands by incrementing the program counter. During development, I used this two-operand POKE instruction to test the UART. This expects a 16-bit address followed by a data byte:
    Poke: Index.Lo ← Memory [PC]     ; Address LO
          PC ← PC+1
          Index.Hi ← Memory [PC]     ; Address HI
          PC ← PC+1
          Temp ← Memory [PC]         ; Data byte
          PC ← PC+1
          Memory [Index] ← Temp      ; Do the POKE
          Jump Next
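The same POKE routine, transcribed step for step into Python as a sketch. The function name and the flat 64K `bytearray` are assumptions made for illustration.

```python
def poke(memory: bytearray, pc: int) -> int:
    """Consume a 16-bit address (low byte first) and a data byte at pc.

    Returns the program counter after the three operand bytes.
    """
    lo = memory[pc]                   # Index.Lo <- Memory[PC]
    pc += 1
    hi = memory[pc]                   # Index.Hi <- Memory[PC]
    pc += 1
    temp = memory[pc]                 # Temp <- Memory[PC]
    pc += 1
    memory[(hi << 8) | lo] = temp     # Memory[Index] <- Temp
    return pc
```

The returned value is where the micro-interpreter would fetch the next opcode.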
The Mark 1 was designed to support the FORTH virtual machine. The following FORTH primitives are micro-programmed:
EXIT LIT EXECUTE BRANCH 0BRANCH (LOOP) (DO) U* U/ AND OR XOR LEAVE R> >R R 0= 0< + D+ MINUS DMINUS OVER DROP SWAP DUP @ C@ ! C! (DOES)

My original plan was to build a subroutine-threaded FORTH. High-level definitions were to be called explicitly, primitives were to be compiled inline:
I abandoned this idea because most FORTHs use indirect threading and I wanted a full-featured standard FORTH with all the usual compiler facilities. Many compiling words are tightly coupled to the indirectly threaded model.
Indirectly threaded code is a list of execution tokens. An execution token is a code field address. The code field is a pointer to machine code. This presented a problem on the Mark 1 because it has separate macro and micro address spaces. What should go in the code field? My solution was to shorten it to 1 byte and store the opcode:
Subroutine-threaded

    Foo: DB OP_DUP
         DB OP_SWAP
         DB OP_DROP
         DB OP_EXIT
    Bar: DB OP_OVER
         DB OP_CALL
         DW Foo
         DB OP_ROT
         DB OP_EXIT

Indirectly-threaded

    cfa_Foo:  DW Enter
    pfa_Foo:  DW cfa_DUP
              DW cfa_SWAP
              DW cfa_DROP
              DW cfa_Exit
    cfa_Bar:  DW Enter
    pfa_Bar:  DW cfa_OVER
              DW cfa_Foo
              DW cfa_ROT
              DW cfa_Exit
    cfa_Exit: DW pfa_Exit
    pfa_Exit: .. code ..
    cfa_DUP:  DW pfa_DUP
    pfa_DUP:  .. code ..

Mark 1

    cfa_Foo:  DB OP_ENTER
    pfa_Foo:  DW cfa_DUP
              DW cfa_SWAP
              DW cfa_DROP
              DW cfa_Exit
    cfa_Bar:  DB OP_ENTER
    pfa_Bar:  DW cfa_OVER
              DW cfa_Foo
              DW cfa_ROT
              DW cfa_Exit
    cfa_Exit: DB OP_EXIT
    cfa_DUP:  DB OP_DUP
    cfa_SWAP: DB OP_SWAP
In Mark 1 assembly language, the FORTH inner interpreter (NEXT) looks like this:
    NEXT: mov w.l, [ip]    ; W ← XT, IP ← IP+2
          inc ip
          mov w.h, [ip]
          inc ip
          mov op, [w]      ; OP ← [CFA]
          inc w            ; W ← PFA
          xop              ; µPC ← OP*16
It follows the convention of invoking primitives with the PFA in W as required by ENTER:
    ENTER: dec rsp         ; Push IP
           mov rs, ip
           mov ip, w       ; IP ← PFA
           jmp NEXT
    EXIT:  mov ip, rs      ; Pop IP
           inc rsp
           jmp NEXT
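To see NEXT, ENTER and EXIT working together with a 1-byte code field, here is a small Python model. It is a sketch under assumptions: the opcode numbers, the memory addresses, the word `Baz` (defined as SWAP DUP), and the OP_HALT opcode used to stop the model are all invented for illustration.

```python
# Hypothetical opcode numbers; the real assignments are not given here.
OP_ENTER, OP_EXIT, OP_DUP, OP_SWAP, OP_HALT = range(1, 6)


def dw(mem, addr, value):
    """Store a 16-bit cell low byte first, matching w.l / w.h in NEXT."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def run(mem, ip, stack):
    """Run threaded code starting at ip until OP_HALT is executed."""
    rstack = []
    while True:
        # NEXT
        w = mem[ip] | (mem[ip + 1] << 8)   # W <- XT
        ip += 2                            # IP <- IP+2
        op = mem[w]                        # OP <- [CFA]
        w += 1                             # W <- PFA
        if op == OP_ENTER:
            rstack.append(ip)              # push IP
            ip = w                         # IP <- PFA
        elif op == OP_EXIT:
            ip = rstack.pop()              # pop IP
        elif op == OP_DUP:
            stack.append(stack[-1])
        elif op == OP_SWAP:
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == OP_HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {op}")


def demo():
    mem = bytearray(256)
    cfa_dup, cfa_swap, cfa_exit, cfa_halt = 0x10, 0x11, 0x12, 0x13
    mem[cfa_dup] = OP_DUP
    mem[cfa_swap] = OP_SWAP
    mem[cfa_exit] = OP_EXIT
    mem[cfa_halt] = OP_HALT
    cfa_baz = 0x20                         # Baz: SWAP DUP EXIT
    mem[cfa_baz] = OP_ENTER
    for i, xt in enumerate([cfa_swap, cfa_dup, cfa_exit]):
        dw(mem, cfa_baz + 1 + 2 * i, xt)
    main = 0x40                            # top-level thread: Baz HALT
    dw(mem, main, cfa_baz)
    dw(mem, main + 2, cfa_halt)
    return run(mem, main, [1, 2])
```

Note that the parameter field starts one byte after the code field, which is why a single `inc w` is enough to turn the CFA into the PFA.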
Note: My microcode assembler expands 16-bit moves into a pair of 8-bit µ-instructions.
Notice how the last 3 µ-instructions in NEXT resemble the simple micro-interpreter described earlier. This was used to advantage in the implementation of multiplication and division.
The math primitives U* and U/ were split into 3 opcodes to optimise performance:
    cfa_UMUL: DB OP_MUL_BEGIN, 16 DUP(OP_MUL_BIT), OP_MUL_END
    cfa_UDIV: DB OP_DIV_BEGIN, 16 DUP(OP_DIV_BIT), OP_DIV_END
I use the Microsoft Assembler (MASM) to create ROM images for the macro memory space. The syntax "16 DUP()" tells MASM to repeat the enclosed byte 16 times. It's equivalent to:
    cfa_Uxxx: DB OP_XXX_BEGIN
    pfa_Uxxx: DB OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT
              DB OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT
              DB OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT
              DB OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT, OP_XXX_BIT
              DB OP_XXX_END
First, NEXT calls _BEGIN with the address of the PFA (i.e. the first _BIT) in W. _BEGIN and _BIT end by jumping to the 3rd from last instruction in NEXT. This executes _BIT 16 times incrementing W as it goes. W acts as the loop counter or temporary program counter. Finally, _END jumps to the high-level NEXT.
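The control flow of the three-opcode scheme can be sketched in Python. The per-bit arithmetic below is an ordinary MSB-first shift-and-add multiply and is an assumption; the actual µ-routines are not shown on this page. Only the BEGIN, 16 x BIT, END sequencing with W stepping through the opcode bytes follows the description above.

```python
def u_mul(a: int, b: int) -> int:
    """Unsigned 16 x 16 -> 32 bit multiply driven by an opcode list."""
    body = ["BEGIN"] + ["BIT"] * 16 + ["END"]   # mirrors the DB line for U*
    w = 0                                       # W walks the opcode bytes
    product = 0
    multiplier = 0
    while True:
        op = body[w]                            # mov op, [w]
        w += 1                                  # inc w
        if op == "BEGIN":
            product, multiplier = 0, b & 0xFFFF
        elif op == "BIT":
            product <<= 1
            if multiplier & 0x8000:             # test the next bit, MSB first
                product += a & 0xFFFF
            multiplier = (multiplier << 1) & 0xFFFF
        else:                                   # END: back to high-level NEXT
            return product & 0xFFFFFFFF
```

There is no loop counter register in this model; as in the text, the position of W in the opcode list is what ends the loop.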
Reset forces µPC to 000H. The first 8 bytes of the µ-ROM contain the following:

    RESET: mov w, 0        ; W ← 0002h
           inc w
           inc w
           dis             ; Disable IRQ
           mov ip.l, [w]   ; IP ← Cold start vector
           inc w
           mov ip.h, [w]
    Next:  ...
This initialises the high-level instruction pointer (IP) from a cold start vector at location 0002h in main memory. It then drops through into NEXT.
The high-level ROM assembly begins like this:
    .Model Tiny
    .Code
    Include OPS.INC
    ORG 0
          DW 0FFFFh        ; Reserved for IRQ vector
          DW Reset         ; Cold-start vector
    Reset DW UART_Init
          ...
Offset 0000h is reserved for the interrupt vector.
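The reset sequence and the ROM layout fit together as follows. This Python sketch mirrors the RESET µ-routine step by step; the function name and the sample ROM bytes are hypothetical.

```python
def cold_start_ip(rom: bytes) -> int:
    """Read the cold-start vector at offset 0002h the way RESET does."""
    w = 0            # mov w, 0
    w += 1           # inc w
    w += 1           # inc w  -> W = 0002h
    lo = rom[w]      # mov ip.l, [w]
    w += 1           # inc w
    hi = rom[w]      # mov ip.h, [w]
    return (hi << 8) | lo
```

With a ROM that starts with the words FFFFh and 0004h, IP ends up at 0004h, the first cell after the two vectors, which is where the label Reset sits in the listing above.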
Burning EPROMs soon became tedious and I wrote a ROM-resident monitor to accept Intel Hex downloads via the serial port. This is how FORTH was originally loaded; but the latest version is ROM-resident. I now have a PC-based simulator for debugging, FORTH is fairly stable, and there's less need for the monitor.
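For readers unfamiliar with Intel Hex, each line is a record of the form `:LLAAAATT<data>CC`, where the checksum makes all bytes sum to zero modulo 256. The Python sketch below parses one record; it illustrates the file format only and is not the monitor's code, and the function name is hypothetical.

```python
def parse_record(line: str):
    """Parse one Intel Hex record into (record_type, address, data)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("length field does not match record size")
    if sum(raw) & 0xFF != 0:
        raise ValueError("bad checksum")
    count = raw[0]
    address = (raw[1] << 8) | raw[2]
    record_type = raw[3]          # 00 = data, 01 = end of file
    data = raw[4:4 + count]
    return record_type, address, data
```

A loader would call this for each received line, copy data records to the given address, and stop at the end-of-file record.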
My original FORTH, posted here in 2003, reversed the stack order of quotients and remainders left by division words. This has been corrected. Here's the latest code:
This implementation of fig-FORTH is based on the original May 1979 Installation Manual for the 6502 by Bill Ragsdale. It deviates from the standard in the following ways:
You'll find more homemade computers on my links page.
Please visit the other sites on the web ring (below) and don't forget to have a look at my Mark 2 FORTH Computer.