
Sixteen Registers

A very complex imaginary architecture was defined here, primarily motivated by what I perceived as a flaw in the architecture of the IBM System/360: that the displacement field in an instruction was only 12 bits long, instead of 16 bits long, as on many microprocessors or the PDP-11.

On the previous page, the possibility of creating instruction formats very closely resembling that of the IBM System/360 which would address this issue was examined.

On this page, an alternative approach, resulting in something closer to my imaginary architecture, able to retain most of its capabilities, is illustrated. Important similarities to the 360 are retained: registers are in sets of 16 rather than 8, and the instructions are limited to a small number of lengths, with length decoding kept very simple.

Here are the instruction formats:

Because sets of 16 registers are used, a full indexed memory-reference instruction is forced to be 48 bits long. So a non-indexed format is provided, and also a short indexed format. In the short indexed format, the destination register is one of general registers 0 to 7, the index register is one of general registers 1 through 7, and the base register is one of base registers 8 to 15; as in my architecture, unlike the System/360, a separate set of registers is used for base registers rather than having general registers be consumed by being allocated as base registers, even though there are sixteen of each instead of eight.

It's all very well to draw pretty pictures, but is there actually enough opcode space to fit all the instructions in that I would want using these formats?

There's only one way to find out: allocate them.

The fixed-point register-to-register instructions take up half of the opcode space allocated to 16-bit instructions:

00 SWBR Swap Byte Register
01 CBR Compare Byte Register
02 LBR Load Byte Register
04 ABR Add Byte Register
05 SBR Subtract Byte Register
08 IBR Insert Byte Register
09 UCBR Unsigned Compare Byte Register
0A ULBR Unsigned Load Byte Register
0B XBR XOR Byte Register
0C NBR AND Byte Register
0D OBR OR Byte Register
10 SWHR Swap Halfword Register
11 CHR Compare Halfword Register
12 LHR Load Halfword Register
14 AHR Add Halfword Register
15 SHR Subtract Halfword Register
16 MHR Multiply Halfword Register
17 DHR Divide Halfword Register
18 IHR Insert Halfword Register
19 UCHR Unsigned Compare Halfword Register
1A ULHR Unsigned Load Halfword Register
1B XHR XOR Halfword Register
1C NHR AND Halfword Register
1D OHR OR Halfword Register
1E MYHR Multiply Extensibly Halfword Register
1F DYHR Divide Extensibly Halfword Register
20 SWR Swap Register
21 CR Compare Register
22 LR Load Register
24 AR Add Register
25 SR Subtract Register
26 MR Multiply Register
27 DR Divide Register
28 IR * Insert Register
29 UCR Unsigned Compare Register
2A ULR * Unsigned Load Register
2B XR XOR Register
2C NR AND Register
2D OR OR Register
2E MYR Multiply Extensibly Register
2F DYR Divide Extensibly Register
30 SWLR Swap Long Register
31 CLR Compare Long Register
32 LLR Load Long Register
34 ALR Add Long Register
35 SLR Subtract Long Register
36 MLR Multiply Long Register
37 DLR Divide Long Register
39 UCLR Unsigned Compare Long Register
3B XLR XOR Long Register
3C NLR AND Long Register
3D OLR OR Long Register
3E MYLR Multiply Extensibly Long Register
3F DYLR Divide Extensibly Long Register

The instructions marked with an asterisk are valid only in 64-bit mode. Load sign-extends into the unused high portion of the register, Unsigned Load clears it, Insert leaves it undisturbed. Multiply and Divide take two inputs and return one output of the same length, just like Add and Subtract. Multiply Extensibly returns a double-length product, with the additional part in the high portion of the destination register if possible; if not, the high part of the product is in the destination register, and the low part is in the following register. Divide Extensibly has a double-length dividend, a single-length divisor, a double-length quotient and a single-length remainder. The quotient is therefore placed where the dividend was found, and the remainder in the next available register.
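
The difference between Load, Unsigned Load, and Insert can be sketched in a few lines of Python. This is only an illustration: the 64-bit register width (REG_BITS) and the function names are assumptions made for the example, not part of the architecture definition.

```python
REG_BITS = 64                      # assumed register width (64-bit mode)
REG_MASK = (1 << REG_BITS) - 1

def load(value, width):
    """Load: sign-extend a width-bit value into the whole register."""
    value &= (1 << width) - 1
    if value >> (width - 1):       # sign bit of the short operand is set
        value -= 1 << width        # reinterpret as a negative number
    return value & REG_MASK

def unsigned_load(value, width):
    """Unsigned Load: clear the unused high portion of the register."""
    return value & ((1 << width) - 1)

def insert(old_register, value, width):
    """Insert: replace the low width bits, leave the rest undisturbed."""
    low_mask = (1 << width) - 1
    return (old_register & REG_MASK & ~low_mask) | (value & low_mask)
```

For a byte operand of hexadecimal 80, load gives FFFFFFFFFFFFFF80, unsigned_load gives 80, and insert changes only the last two hexadecimal digits of the previous register contents.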

Because IEEE 754 floating-point will be used, there will be no need for unnormalized floating-point instructions, and so only 1/4 of the opcode space for 16-bit instructions is used by the floating-point instructions:

40 SWMR Swap Medium Register
41 CMR Compare Medium Register
42 LMR Load Medium Register
44 AMR Add Medium Register
45 SMR Subtract Medium Register
46 MMR Multiply Medium Register
47 DMR Divide Medium Register
48 SWFR Swap Floating Register
49 CFR Compare Floating Register
4A LFR Load Floating Register
4C AFR Add Floating Register
4D SFR Subtract Floating Register
4E MFR Multiply Floating Register
4F DFR Divide Floating Register
50 SWDR Swap Double Register
51 CDR Compare Double Register
52 LDR Load Double Register
54 ADR Add Double Register
55 SDR Subtract Double Register
56 MDR Multiply Double Register
57 DDR Divide Double Register
58 SWER Swap Extended Register
59 CER Compare Extended Register
5A LER Load Extended Register
5C AER Add Extended Register
5D SER Subtract Extended Register
5E MER Multiply Extended Register
5F DER Divide Extended Register

Only half of what is left is needed for the shift instructions:

60 110 SHLB Shift Left Byte
60 10 SHLH Shift Left Halfword
60 0 SHL Shift Left
61 110 SHRB Shift Right Byte
61 10 SHRH Shift Right Halfword
61 0 SHR Shift Right
63 110 ASRB Arithmetic Shift Right Byte
63 10 ASRH Arithmetic Shift Right Halfword
63 0 ASR Arithmetic Shift Right
64 110 ROLB Rotate Left Byte
64 10 ROLH Rotate Left Halfword
64 0 ROL Rotate Left
65 110 RORB Rotate Right Byte
65 10 RORH Rotate Right Halfword
65 0 ROR Rotate Right
66 110 RLCB Rotate Left through Carry Byte
66 10 RLCH Rotate Left through Carry Halfword
66 0 RLC Rotate Left through Carry
67 110 RRCB Rotate Right through Carry Byte
67 10 RRCH Rotate Right through Carry Halfword
67 0 RRC Rotate Right through Carry
68 SHLL Shift Left Long
69 SHRL Shift Right Long
6B ASRL Arithmetic Shift Right Long
6C ROLL Rotate Left Long
6D RORL Rotate Right Long
6E RLCL Rotate Left through Carry Long
6F RRCL Rotate Right through Carry Long

leaving the other half for the relative branch instructions:

71 BL Branch if Low
72 BE Branch if Equal
73 BLE Branch if Low or Equal
74 BH Branch if High
75 BNE Branch if Not Equal
76 BHE Branch if High or Equal
77 BNV Branch if No Overflow
78 BV Branch if Overflow
7A BC Branch if Carry
7B BNC Branch if No Carry
7F BRA Branch

This should have been the hard part, as restricting the 32-bit memory-reference instructions to aligned operands, and making use of that to distinguish between operations on data of different lengths by means of the least-significant bits of the address field (an idea pioneered by the SEL 32 computer) produces an extreme saving of opcode space, but even with that saving, in order to have as much space available as I deemed required, it was necessary to complicate the formatting of the 48-bit and 64-bit long instructions:

80 0 SWHX Swap Halfword Indexed
80 01 SWX Swap Indexed
80 011 SWLX Swap Long Indexed
82 0 CHX Compare Halfword Indexed
82 01 CX Compare Indexed
82 011 CLX Compare Long Indexed
84 0 LHX Load Halfword Indexed
84 01 LX Load Indexed
84 011 LLX Load Long Indexed
86 0 STHX Store Halfword Indexed
86 01 STX Store Indexed
86 011 STLX Store Long Indexed
88 0 AHX Add Halfword Indexed
88 01 AX Add Indexed
88 011 ALX Add Long Indexed
8A 0 SHX Subtract Halfword Indexed
8A 01 SX Subtract Indexed
8A 011 SLX Subtract Long Indexed
8C 0 MHX Multiply Halfword Indexed
8C 01 MX Multiply Indexed
8C 011 MLX Multiply Long Indexed
8E 0 DHX Divide Halfword Indexed
8E 01 DX Divide Indexed
8E 011 DLX Divide Long Indexed
90 0 IHX Insert Halfword Indexed
90 01 * IX Insert Indexed
92 0 UCHX Unsigned Compare Halfword Indexed
92 01 UCX Unsigned Compare Indexed
92 011 UCLX Unsigned Compare Long Indexed
94 0 ULHX Unsigned Load Halfword Indexed
94 01 ULX Unsigned Load Indexed
94 011 ULLX Unsigned Load Long Indexed
96 0 XHX XOR Halfword Indexed
96 01 XX XOR Indexed
96 011 XLX XOR Long Indexed
98 0 NHX AND Halfword Indexed
98 01 NX AND Indexed
98 011 NLX AND Long Indexed
9A 0 OHX OR Halfword Indexed
9A 01 OX OR Indexed
9A 011 OLX OR Long Indexed
9C 0 MYHX Multiply Extensibly Halfword Indexed
9C 01 MYX Multiply Extensibly Indexed
9C 011 MYLX Multiply Extensibly Long Indexed
9E 0 DYHX Divide Extensibly Halfword Indexed
9E 01 DYX Divide Extensibly Indexed
9E 011 DYLX Divide Extensibly Long Indexed
A0 0 SWMX Swap Medium Indexed
A0 01 SWFX Swap Floating Indexed
A0 011 SWDX Swap Double Indexed
A0 0111 SWEX Swap Extended Indexed
A2 0 CMX Compare Medium Indexed
A2 01 CFX Compare Floating Indexed
A2 011 CDX Compare Double Indexed
A2 0111 CEX Compare Extended Indexed
A4 0 LMX Load Medium Indexed
A4 01 LFX Load Floating Indexed
A4 011 LDX Load Double Indexed
A4 0111 LEX Load Extended Indexed
A6 0 STMX Store Medium Indexed
A6 01 STFX Store Floating Indexed
A6 011 STDX Store Double Indexed
A6 0111 STEX Store Extended Indexed
A8 0 AMX Add Medium Indexed
A8 01 AFX Add Floating Indexed
A8 011 ADX Add Double Indexed
A8 0111 AEX Add Extended Indexed
AA 0 SMX Subtract Medium Indexed
AA 01 SFX Subtract Floating Indexed
AA 011 SDX Subtract Double Indexed
AA 0111 SEX Subtract Extended Indexed
AC 0 MMX Multiply Medium Indexed
AC 01 MFX Multiply Floating Indexed
AC 011 MDX Multiply Double Indexed
AC 0111 MEX Multiply Extended Indexed
AE 0 DMX Divide Medium Indexed
AE 01 DFX Divide Floating Indexed
AE 011 DDX Divide Double Indexed
AE 0111 DEX Divide Extended Indexed
C0 0 SWHA Swap Halfword Aligned
C0 01 SWA Swap Aligned
C0 011 SWLA Swap Long Aligned
C2 0 CHA Compare Halfword Aligned
C2 01 CA Compare Aligned
C2 011 CLA Compare Long Aligned
C4 0 LHA Load Halfword Aligned
C4 01 LA Load Aligned
C4 011 LLA Load Long Aligned
C6 0 STHA Store Halfword Aligned
C6 01 STA Store Aligned
C6 011 STLA Store Long Aligned
C8 0 AHA Add Halfword Aligned
C8 01 AA Add Aligned
C8 011 ALA Add Long Aligned
CA 0 SHA Subtract Halfword Aligned
CA 01 SA Subtract Aligned
CA 011 SLA Subtract Long Aligned
CC 0 MHA Multiply Halfword Aligned
CC 01 MA Multiply Aligned
CC 011 MLA Multiply Long Aligned
CE 0 DHA Divide Halfword Aligned
CE 01 DA Divide Aligned
CE 011 DLA Divide Long Aligned
D0 0 IHA Insert Halfword Aligned
D0 01 * IA Insert Aligned
D2 0 UCHA Unsigned Compare Halfword Aligned
D2 01 UCA Unsigned Compare Aligned
D2 011 UCLA Unsigned Compare Long Aligned
D4 0 ULHA Unsigned Load Halfword Aligned
D4 01 ULA Unsigned Load Aligned
D4 011 ULLA Unsigned Load Long Aligned
D6 0 XHA XOR Halfword Aligned
D6 01 XA XOR Aligned
D6 011 XLA XOR Long Aligned
D8 0 NHA AND Halfword Aligned
D8 01 NA AND Aligned
D8 011 NLA AND Long Aligned
DA 0 OHA OR Halfword Aligned
DA 01 OA OR Aligned
DA 011 OLA OR Long Aligned
DC 0 MEHA Multiply Extensibly Halfword Aligned
DC 01 MEXA Multiply Extensibly Aligned
DC 011 MELA Multiply Extensibly Long Aligned
DE 0 DYHA Divide Extensibly Halfword Aligned
DE 01 DYA Divide Extensibly Aligned
DE 011 DYLA Divide Extensibly Long Aligned
E0 0 SWMA Swap Medium Aligned
E0 01 SWFA Swap Floating Aligned
E0 011 SWDA Swap Double Aligned
E0 0111 SWEA Swap Extended Aligned
E2 0 CMA Compare Medium Aligned
E2 01 CFA Compare Floating Aligned
E2 011 CDA Compare Double Aligned
E2 0111 CEA Compare Extended Aligned
E4 0 LMA Load Medium Aligned
E4 01 LFA Load Floating Aligned
E4 011 LDA Load Double Aligned
E4 0111 LEA Load Extended Aligned
E6 0 STMA Store Medium Aligned
E6 01 STFA Store Floating Aligned
E6 011 STDA Store Double Aligned
E6 0111 STEA Store Extended Aligned
E8 0 AMA Add Medium Aligned
E8 01 AFA Add Floating Aligned
E8 011 ADA Add Double Aligned
E8 0111 AEA Add Extended Aligned
EA 0 SMA Subtract Medium Aligned
EA 01 SFA Subtract Floating Aligned
EA 011 SDA Subtract Double Aligned
EA 0111 SEA Subtract Extended Aligned
EC 0 MMA Multiply Medium Aligned
EC 01 MFA Multiply Floating Aligned
EC 011 MDA Multiply Double Aligned
EC 0111 MEA Multiply Extended Aligned
EE 0 DMA Divide Medium Aligned
EE 01 DFA Divide Floating Aligned
EE 011 DDA Divide Double Aligned
EE 0111 DEA Divide Extended Aligned
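
The way the operand length is taken from the low bits of the address field can be sketched as follows. This sketch assumes that the bit patterns listed after each opcode above (0, 01, 011, 0111) are the least-significant bits of the 16-bit displacement, and that clearing them yields the aligned displacement; the function name and the tables of lengths in bits are made up for the illustration.

```python
FIXED_BITS = (16, 32, 64)          # Halfword, 32-bit, Long
FLOAT_BITS = (48, 32, 64, 128)     # Medium, Floating, Double, Extended

def decode_length(displacement, floating=False):
    """Return (operand length in bits, displacement with the pattern cleared)."""
    ones = 0
    while ones < 4 and (displacement >> ones) & 1:
        ones += 1                  # count the trailing one bits
    table = FLOAT_BITS if floating else FIXED_BITS
    if ones >= len(table):
        raise ValueError("pattern not listed for this class of instruction")
    alignment = 2 << ones          # 2, 4, 8 or 16 bytes
    return table[ones], displacement & ~(alignment - 1)
```

Under these assumptions a displacement of hexadecimal 1003 in a fixed-point instruction denotes a Long operand at displacement 1000, while 1000 itself denotes a Halfword (or, for floating-point, a Medium) operand.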

In order to double the space available to indicate registers in the Short Indexed instructions, there are no 48-bit indexed memory reference instructions for the byte data type, which would use up fully half of the opcode space available as they require an undiminished displacement field.

Only a tiny bit of space is left to include 32-bit versions of the conditional jump instructions and subroutine call instructions, but it can be made to serve.

The destination register field, while it serves to indicate where to store the return address for the jump to subroutine instruction, is unused for a jump instruction, and is therefore available to indicate the condition applicable to a conditional jump, as was done on the IBM System/360. This is complicated somewhat for the Short Indexed format, in which the destination register field has been reduced to three bits in length, but sufficient opcode space is still available.

Given that instructions are aligned on halfword boundaries, and there is no alternate kind of instruction that is aligned on 32-bit boundaries to jump to, both possible values of the one spare bit in the displacement may be used given that 16-bit alignment is the terminal alignment for this instruction type. This is needed because while there is extra space among the Aligned format instructions, none is available among the Short Indexed format instructions.

In order to fit the Short Indexed jump instructions in the limited space available, their format is modified. Only index registers 0 to 3 are available for them, unlike registers 0 to 7, made available for the other Short Indexed memory reference instructions. Also, since the destination register field is three bits long, instead of four, two opcodes rather than one are allocated to the conditional jump instructions.

E8 0 JSRA Jump to Subroutine Register Aligned
E9 0 JSBA Jump to Subroutine Base Aligned
EA1 0 JLA Jump if Low Aligned
EA2 0 JEA Jump if Equal Aligned
EA3 0 JLEA Jump if Low or Equal Aligned
EA4 0 JHA Jump if High Aligned
EA5 0 JNEA Jump if Not Equal Aligned
EA6 0 JHEA Jump if High or Equal Aligned
EA7 0 JNVA Jump if No Overflow Aligned
EA8 0 JVA Jump if Overflow Aligned
EAA 0 JCA Jump if Carry Aligned
EAB 0 JNCA Jump if No Carry Aligned
EAF 0 JA Jump Aligned
E8 1 JSRX Jump to Subroutine Register Indexed
E9 1 JSBX Jump to Subroutine Base Indexed
EB0 1 JLX Jump if Low Indexed
EA2 1 JEX Jump if Equal Indexed
EB2 1 JLEX Jump if Low or Equal Indexed
EA4 1 JHX Jump if High Indexed
EB4 1 JNEX Jump if Not Equal Indexed
EA6 1 JHEX Jump if High or Equal Indexed
EB6 1 JNVX Jump if No Overflow Indexed
EA8 1 JVX Jump if Overflow Indexed
EAA 1 JCX Jump if Carry Indexed
EBA 1 JNCX Jump if No Carry Indexed
EBE 1 JX Jump Indexed

Two versions of the Jump to Subroutine instruction are required; one saves the return address to a specified general register, the other to a specified base register. In the case of the Jump to Subroutine Base instruction, the possible targets are base registers 8 through 15, even though the instruction can only use base registers 11 through 15 with its destination address.

The Aligned instruction format clearly poses no issues, as it handles all non-indexed memory references well, with full four-bit fields available for both the destination register and the base register.

It is hoped that the Indexed instruction format will handle most indexed memory references: the destination register can be anything from 0 to 3, the index register anything from 1 to 7, and the base register field indicates a base register from 8 to 15.

Adequate opcode space is still available for the 48-bit full memory reference instructions, multiple register instructions, and vector register instructions, despite their having been allocated only a very small portion of total opcode space.

Thus, for the full memory-reference instructions, we have:

FE 0nn

as the opcode, where nn is the opcode of the corresponding register-to-register instruction; i.e., we have

FE 024 A Add

for the instruction that performs 32-bit addition.

Also, there are the jump instructions:

FE 060 JSR Jump to Subroutine Register
FE 061 JSB Jump to Subroutine Base
FE1 062 JL Jump if Low
FE2 062 JE Jump if Equal
FE3 062 JLE Jump if Low or Equal
FE4 062 JH Jump if High
FE5 062 JNE Jump if Not Equal
FE6 062 JHE Jump if High or Equal
FE7 062 JNV Jump if No Overflow
FE8 062 JV Jump if Overflow
FEA 062 JC Jump if Carry
FEB 062 JNC Jump if No Carry
FEF 062 J Jump
FE 064 JXLE Jump if Index Low or Equal
FE 065 JXH Jump if Index High

The Jump if Index Low or Equal instruction increments the general register indicated in the index register field, and jumps if its contents are less than or equal to those of the general register indicated in the destination register field. Jump if Index High decrements, and jumps if greater, instead.
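
As a rough model of these two loop-control instructions, the following Python sketch treats the general registers as a list. The step of one, the function names, and the helper count_loop are assumptions made for illustration; each function returns whether the jump is taken.

```python
def jxle(regs, index_reg, limit_reg):
    """Jump if Index Low or Equal: increment, then compare."""
    regs[index_reg] += 1                       # step of 1 is assumed
    return regs[index_reg] <= regs[limit_reg]  # True means the jump is taken

def jxh(regs, index_reg, limit_reg):
    """Jump if Index High: decrement, then compare."""
    regs[index_reg] -= 1
    return regs[index_reg] > regs[limit_reg]

def count_loop(first, last):
    """Loop body followed by JXLE back to the top; returns the index values seen."""
    regs = [0] * 16
    regs[1], regs[2] = first, last
    seen = []
    while True:
        seen.append(regs[1])                   # the loop body
        if not jxle(regs, 1, 2):
            break
    return seen
```

With these definitions, count_loop(0, 3) runs the loop body for index values 0, 1, 2, and 3.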

This provides space for further expansion to handle additional data types.

To permit register-to-register instructions for additional data types, the opcode CF, which would indicate a Divide Extensibly Byte instruction if one were useful, can be used to indicate a 32-bit instruction that contains a 16-bit instruction in its second half and which provides additional opcode bits.

Thus, we have:

CF02 nnds FEdx 2nnb aaaa
 0 1 2 3 4 5 6 7 8 9 A B C D E F
0 SFSWH SFSW SFSWL
1 SFCH SFC SFCL 
2 SFLH SFL SFLL 
3 SFSTH SFST SFSTL
4 SFAH SFA SFAL 
5 SFSH SFS SFSL 
6 SFMH SFM SFML 
7 SFDH SFD SFDL 
8 SFMEUH SFMEU SFMEUL
9 SFDEUH SFDEU SFDEUL
A SFLUH SFLU SFLUL 
B SFSTUH SFSTU SFSTUL
C SFAUH SFAU SFAUL 
D SFSUH SFSU SFSUL 
E SFMUH SFMU SFMUL 
F SFDUH SFDU SFDUL 

for the Simple Floating instructions,

CF03 nnds FEdx 3nnb aaaa
 0 1 2 3 4 5 6 7 8 9 A B C D E F
0
1 RPC RCDC 
2 RPME RCDME 
3 RPDE RCDDE 
4 RPA RCDA 
5 RPS RCDS 
6 RPM RCDM 
7 RPD RCDD 
8 
9 RPCL RCDCL 
A RPMEL RCDMEL
B RPDEL RCDDEL
C RPAL RCDAL 
D RPSL RCDSL 
E RPML RCDML 
F RPDL RCDDL 

for the Register Packed Decimal and the Register Compressed Decimal instructions, and

CF04 nnds FEdx 4nnb aaaa
 0 1 2 3 4 5 6 7 8 9 A B C D E F
0 SWFRC SWDRC SWERC SWNFRC SWNDRC SWNERC
1 CFRC CDRC CERC CNFRC CNDRC CNERC 
2 LFRC LDRC LERC LNFRC LNDRC LNERC 
3 STFRC STDRC STERC STNFRC STNDRC STNERC
4 AFRC AFRCH ADRC AFDCH AERC AFDCH ANFRC ANDRC ANERC 
5 SFRC SFRCH SDRC SFDCH SERC SFDCH SNFRC SNDRC SNERC 
6 MFRC MFRCH MDRC MFDCH MERC MFDCH MNFRC MNDRC MNERC 
7 DFRC DFRCH DDRC DFDCH DERC DFDCH DNFRC DNDRC DNERC 
8
9
A LUFRC LUDRC LUERC 
B STUFRC STUDRC STUERC
C AUFRC AUDRC AUERC 
D SUFRC SUDRC SUERC 
E MUFRC MUDRC MUERC 
F DUFRC DUDRC DUERC

for the Floating Register Compressed Decimal instructions.

Finally, rather than switching into a special mode for the Subdivided Floating and Subdivided Medium data types, they as well may be given their own instructions, although this will mean that the instructions are longer:

CF05 nnds FEdx 5nnb aaaa
 0 1 2 3 4 5 6 7 8 9 A B C D E F
0 SWSM
1 CSM
2 LSM
3 STSM STRSM STRM STRD
4 ASM
5 SSM
6 MSM
7 DSM
8 SWSF
9 CSF
A LSF
B STSF STRSF STRF
C ASF
D SSF
E MSF
F DSF

Store Rounded instructions are shown for both these formats and the regular floating-point formats other than Extended: they will be discussed below.

Thus, the floating-point formats available on this machine are the following:

where Floating is 32 bits long, Subdivided Floating is 36 bits long, Medium is 48 bits long, Subdivided Medium is 51 bits long, Double is 64 bits long, and Extended is 128 bits long.

The formats are based on those specified by the IEEE 754 standard for 32 and 64 bit floating-point numbers. The sizes of the exponent fields for other sizes of floating-point numbers were chosen on the basis of the following rationales:

In the case of 36-bit floats, since the hidden first bit grants an extra bit of precision, the floating-point precision of the IBM 7090 can be attained while adding a bit to the exponent to approximate the exponent range of the IBM 360. This allows a broader range of older FORTRAN programs to run without modification if this type of number is used for single precision (provided, of course, that the unusual storage layout is not an issue).

In the case of 48-bit and 51-bit floats, the size of the exponent was chosen because these formats provide a precision of 11 or 12 decimal digits, thus approximating that provided by a typical scientific pocket calculator; therefore, it was deemed desirable that the exponent range exceed that of such a calculator as well (10^-99 to 10^99).


Medium floating point numbers, as can be seen from the instruction formats, are aligned on 16 bit boundaries. Floating, Double, and Extended numbers are aligned when placed on multiples of their lengths, as they have power-of-two lengths.

It is assumed that memory is connected to a microprocessor using this ISA by a 256-bit data bus. A 256 bit memory word can be divided into seven 36-bit Subdivided Floating numbers with four bits left over, and into five 51-bit Subdivided Medium floating numbers with one bit left over.

In order to avoid the slow operation of dividing by either five or seven when addressing numbers of these types, however, they are stored with some additional wastage.

32 Subdivided Floating numbers are stored in five 256-bit memory words, leaving the 33rd, 34th, and 35th slots vacant.

64 Subdivided Medium numbers are stored in thirteen 256-bit memory words, leaving the 65th slot vacant.

In this way, only multiplication plus a small table lookup is required to locate the Nth element of an array of numbers of these types for any value of N. Numbers of these types are addressed as if they are 32 bits long, with only the first seven or the first five positions in a 256-bit memory word being valid. When an instruction is indexed, the index is treated as a displacement in floating point numbers rather than one in bytes. The last five bits, in the case of Subdivided Floating, or the last six bits, in the case of Subdivided Medium, are used to indicate the position within a block, and the higher portion of the index is multiplied by the length of a block in 256-bit memory words.
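
The address arithmetic just described might look like this in Python. It is a sketch under assumptions: the names are invented, the array is taken to start at memory word 0, and the division of the position within a block by seven or five stands in for what would be a small lookup table in hardware.

```python
WORD_BYTES = 32   # one 256-bit memory word

# kind: (elements per block, memory words per block, elements per memory word)
LAYOUTS = {
    "subdivided_floating": (32, 5, 7),
    "subdivided_medium": (64, 13, 5),
}

def locate(kind, n):
    """Return (memory word, slot within word, address as if elements were 32 bits)."""
    per_block, words_per_block, per_word = LAYOUTS[kind]
    # per_block is a power of two, so this divmod is only a shift and a mask
    block, position = divmod(n, per_block)
    # in hardware this would be the small table lookup, not a division
    word_in_block, slot = divmod(position, per_word)
    word = block * words_per_block + word_in_block
    return word, slot, word * WORD_BYTES + slot * 4
```

For example, element 31 of a Subdivided Floating array falls in slot 3 of memory word 4, and element 32 starts the next block at memory word 5, leaving the last three slots of word 4 unused.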

The portion of the basic address, formed by the sum of the base register contents and the displacement field in the instruction, that indicates a 32-bit word within a 256-bit memory cell, when shifted as required to make it in units of 32 bits, is added to the index register contents before processing. In the indexed case, values of 7 or 5 or higher respectively will work properly, whereas only values of 0 to 6 or 0 to 4 are valid for Subdivided Floating and Subdivided Medium respectively without indexing.


Because Medium floating-point numbers do not work well with indexing if the inefficiency of occasional double memory accesses is not to be tolerated, while Subdivided Medium floating-point numbers do not have this issue, but do not allow overlapping offset arrays, it might be useful to mix both types in the same program.

Thus, some points about type interoperability need to be noted.

Internally, all conventional floating-point numbers will be stored with exponents as in the Extended floating point format, followed by a mantissa with the appropriate number of bits, which will be one more than for the external form of any size other than Extended, as there will be no hidden first bit. (Note that these notes apply only to this architecture, not the parent architecture to which reference is made for the definitions of some exotic data types; that architecture allows considerably more flexibility in floating-point formats, and so it needs to do type conversions explicitly and store all numbers internally in their memory format.)

The floating-point registers, however, are still essentially 128-bit wide fast memory locations without special circuitry; the internal format simply allows numbers to be quickly fed to the floating-point ALU. This permits flexibility in register renaming. (This particular item is a characteristic of the parent architecture as well.)

Floating-point operations clear all the unused less significant mantissa bits. But storing a floating-point number at a lesser precision involves truncation, not rounding, to avoid imposing an overhead on the great majority of operations which involve numbers being stored at their own precision. This conflicts with the aim of the IEEE 754 standard to retain the maximum possible accuracy in all calculations, and, thus, the Store Rounded instructions, placed in the same region of opcode space as the Subdivided floating-point instructions, are provided.
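
To make the distinction concrete, here is a small Python sketch that works on a mantissa held as an integer. The function names are invented, and round-to-nearest-even is assumed for Store Rounded since that is the IEEE 754 default; the text above does not say which rounding rule is used.

```python
def store_truncated(mantissa, from_bits, to_bits):
    """Ordinary store at lesser precision: drop the extra low bits."""
    return mantissa >> (from_bits - to_bits)

def store_rounded(mantissa, from_bits, to_bits):
    """Store Rounded, assuming round-to-nearest-even (requires from_bits > to_bits)."""
    shift = from_bits - to_bits
    kept = mantissa >> shift
    dropped = mantissa & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if dropped > half or (dropped == half and kept & 1):
        kept += 1   # may carry out past to_bits; the exponent would then be adjusted
    return kept
```

Shortening the five-bit mantissa 10111 to three bits gives 101 when truncated but 110 when rounded.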

The Floating Register Compressed Decimal numbers are also stored in the floating-point registers. Because they are actually 128 bits long, the internal and external formats of 128-bit Floating Register Compressed Decimal numbers are the same. For maximum efficiency in handling decimal floating-point numbers of other precisions, they are stored internally as a sign, a 15-bit binary exponent, and seven or sixteen BCD digits. So Chen-Ho (or, rather, Densely Packed Decimal) encoding and decoding is combined with memory operations for the shorter formats, but is done during register operations for the extended precision.


The short vector instructions also require an additional field to indicate the mask register being used, and another to indicate if masking is present, so they're 16 bits longer. Fortunately, even among 48-bit instructions, there is space for extra codes, using F8 as the opcode:

 FE00 1nn0 mMds F801 nn0d mMxb aaaa
 0 1 2 3 4 5 6
0 SWBSV SWHSV SWSV SWLSV SWFSV SWDSV SWESV
1
2 LBSV LHSV LSV LLSV LFSV LDSV LESV
3 STBSV STHSV STSV STLSV STFSV STDSV STESV
4 ABSV AHSV ASV ALSV AFSV ADSV AESV
5 SBSV SHSV SSV SLSV SFSV SDSV SESV
6 MHSV MSV MLSV MFSV MDSV MESV
7 DHSV DSV DLSV DFSV DDSV DESV
8 SMBPB SMBPH SMBP SMBPL SMBPF SMBPD SMBPE
9 SMBZB SMBZH SMBZ SMBZL SMBZF SMBZD SMBZE
A SMBNB SMBNH SMBN SMBNL SMBNF SMBND SMBNE
B XBSV XHSV XSV XLSV
C NBSV NHSV NSV NLSV
D OBSV OHSV OSV OLSV
E
F
Key to the instruction formats:
nn: opcode as indicated in the table
aaaa: address displacement field
m: 0000 if no mask, 0001 if masked
M: mask register
d: destination register
s: source register
x: index register
b: base register

and, of course, mnemonics are suffixed R for register to register instructions.

In the case of the multiple register and the vector register instructions, the secondary opcode field is eight bits long rather than twelve, but this is still sufficient:

FE 0 82 LBVR Load Byte Vector Register
 (and so forth)
FE 4 82 LMVR Load Medium Vector Register
 (and so forth)
FE 4 D0 SWSMVR Swap Subdivided Medium Vector
 (and so forth)
FE F2 LM Load Multiple
FE F3 STM Store Multiple
FE F4 LML Load Multiple Long
FE F5 STML Store Multiple Long
FE F6 LME Load Multiple Extended
FE F7 STME Store Multiple Extended

The first digit of the secondary opcode field of these instructions is 8 or greater, to distinguish them from the full memory reference instructions. Accordingly, the primary opcode field contains the first digit of the two-digit opcode of the analogous instruction, the first digit of the secondary opcode field contains 8 plus the supplementary opcode digit, and the second digit of the secondary opcode field contains the second digit of the opcode of the analogous instruction.

Thus, a full memory reference instruction with opcode 123, analogous to a register to register instruction with opcode 23, would correspond to a vector instruction with the split opcode 2 93.


The vector memory-reference instructions are 80 bits long, rather than 64 bits long, partly because of a serious shortage of opcode space for the 64-bit packed decimal and string instructions.

For the 80-bit instructions, we have:

FF 0 02 LBV Load Byte Vector
 (and so forth)
FF 4 02 LMV Load Medium Vector
 (and so forth)
FF 4 50 SWSMV Swap Subdivided Medium Vector
 (and so forth)
FF A TR Translate
FF C FMT Format
FF E SC Scan

Because the first bit of the secondary opcode must be zero, to distinguish it from the translate instructions, once again the prefix digit becomes the first digit of the secondary opcode, and the two digits of the corresponding original opcode become the primary opcode and the second digit of the secondary opcode respectively. Thus, the opcode 123 becomes 2 13.
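
Both opcode-splitting rules, the one for the multiple register and vector register instructions and the one for the 80-bit vector instructions, can be written down as a pair of small Python functions. The function names are hypothetical; each takes a three-digit hexadecimal opcode as a string (supplementary digit followed by the two-digit opcode) and returns the primary and secondary opcode fields.

```python
def _digits(opcode):
    """Split a three-digit hexadecimal opcode into its digits."""
    prefix, first, second = (int(ch, 16) for ch in opcode)
    if prefix > 7:
        raise ValueError("supplementary opcode digit must be 0 to 7")
    return prefix, first, second

def split_vector_register(opcode):
    """FE form: the secondary opcode starts with 8 plus the supplementary digit."""
    prefix, first, second = _digits(opcode)
    return f"{first:X}", f"{8 + prefix:X}{second:X}"

def split_vector_memory(opcode):
    """FF (80-bit) form: the supplementary digit leads the secondary opcode."""
    prefix, first, second = _digits(opcode)
    return f"{first:X}", f"{prefix:X}{second:X}"
```

This reproduces the examples in the text: opcode 123 becomes 2 93 in the first form and 2 13 in the second, while 002 (Load Byte) becomes 0 82 and 0 02.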

While indexing works properly with Subdivided Medium and Subdivided Floating vector operations, remember that the address pointed to by the base and the displacement always defines the beginning of a block of floats with unused space at the end (even though it does not have to point to the very first element of that block), and so it is not possible, in general, to have overlapping vectors work the way one would expect from datatypes which fit more neatly into storage.

The rule to remember is that the counter locating array elements for a vector operation is treated like an index register for purposes of address formation with Subdivided floating-point numbers; even stride works properly with them without issues.

Also, not only are vector operations on 51-bit Subdivided Medium numbers allowed, but vector operations on 48-bit Medium numbers are also supported. While elements of such vectors would cross memory word boundaries, handling a vector from memory involves fetching each memory word once, not repeatedly for each element it contains, so no actual efficiency issue results during vector operations, provided no nonunit stride is present; one would exist, however, when operating individually on those elements that cross such boundaries.

The FMT instruction operates as follows: The first operand is a translation table with 256 one-byte entries in which entries 0, 1, and 255 have a special significance. The length field of the instruction determines the length of the source operand. Successive characters from the source operand are moved to the destination operand as follows:

This instruction, except that it does not convert from packed decimal to unpacked, performs a similar function to the edit and edit with mark instructions on the IBM System/360 computer. Thus, a translation table can contain a fill character in position 0, the digit zero in position 1, and a floating currency symbol (or another fill character) in position 255 to convert a raw zoned decimal string (produced by an unpack instruction) to the format used in printing. Note that the decimal point would be placed in the destination operand, with zero bytes in the positions to be filled with digits.


The SC instruction begins by ignoring any characters in the source operand that translate to bytes containing zero in the translation table; then, bytes not translating to zero are copied with translation until an entry in the translate table containing a zero is encountered. The number of characters translated is placed in accumulator/index register 2, giving the length of the result in the destination operand. The number of characters translating to zero that were initially ignored, plus the number of characters translated, is placed in accumulator/index register 1, giving the portion of the source operand that was scanned until the first character translating to zero following a character not translating to zero was found. The remainder of the destination operand is filled with the character found in position 0 of the translate table.

The source operand may not contain a byte having the value 0. If it does, the instruction stops, and an overflow condition is set.

If the instruction completes within the provided length, then it is treated as having a zero result for a subsequent conditional branch instruction; if it did not complete, but characters with nonzero translations were encountered, it has a positive result; if only characters translating to zero were encountered, it has a negative result.
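
The behavior of the SC instruction can be modeled in Python as below. This is a sketch, not a definition: it assumes the destination operand has the same length as the source, represents the condition as -1, 0, or 1, and does not try to say what the registers contain after an overflow. The helper upper_table is a made-up example table for picking out a word and converting it to upper case.

```python
def scan(source, table):
    """Return (destination, register 1, register 2, condition, overflow)."""
    dest = bytearray()
    skipped = translated = 0
    completed = overflow = False
    for byte in source:
        if byte == 0:                  # a zero byte in the source is not permitted
            overflow = True
            break
        out = table[byte]
        if out == 0:
            if translated == 0:
                skipped += 1           # leading characters translating to zero
                continue
            completed = True           # zero translation after copied text: done
            break
        dest.append(out)
        translated += 1
    # fill the rest of the destination (assumed as long as the source)
    dest.extend([table[0]] * (len(source) - len(dest)))
    condition = 0 if completed else (1 if translated else -1)
    return bytes(dest), skipped + translated, translated, condition, overflow

def upper_table():
    """Example table: letters map to upper case, everything else to zero."""
    table = bytearray(256)
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = code
        table[code + 32] = code        # lower-case letter to upper case
    table[0] = ord(" ")                # fill character in position 0
    return table
```

Scanning the nine-byte source "  foo bar" with this table copies FOO into the destination, pads it with six spaces, and leaves 5 in register 1 and 3 in register 2 with a zero condition.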

This instruction can be used for some of the same purposes as the translate and test instruction of the IBM System/360, although it works differently. It can be used to scan for keywords and translate them to upper case, for example.


As for the 64-bit instructions, they do still squeeze in:

F1 CP Compare Packed
F2 MVP Move Packed
F4 AP Add Packed
F5 SP Subtract Packed
F6 MP Multiply Packed
F7 DP Divide Packed
FA MVB Move Byte
FC P Pack
FD UP Unpack

Note that the Pack and Unpack instructions take a packed argument that is half the length of the argument that would be indicated if it were in character form: these instructions have a single length field, like the MVB instruction. Unpack adds hexadecimal 30 to each packed decimal digit, converting it to an ASCII digit.
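
The digit conversion performed by Unpack can be sketched in Python as follows. The pack function is shown as the assumed inverse (keeping the low four bits of each character), which the text does not spell out, and neither function attempts to handle a sign; both names are illustrative only.

```python
def unpack(packed):
    """Unpack: each packed decimal digit plus hexadecimal 30 gives an ASCII digit."""
    out = bytearray()
    for byte in packed:
        out.append(0x30 + (byte >> 4))      # high digit of the byte
        out.append(0x30 + (byte & 0x0F))    # low digit of the byte
    return bytes(out)

def pack(chars):
    """Assumed inverse of unpack: keep the low four bits of each character."""
    if len(chars) % 2:
        raise ValueError("character operand must have an even length")
    return bytes(((chars[i] & 0x0F) << 4) | (chars[i + 1] & 0x0F)
                 for i in range(0, len(chars), 2))
```

The packed bytes 12 34 (hexadecimal) unpack to the four ASCII characters 1234, which is why the packed operand is half the length of the character operand.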

The length fields of these instructions contain the operand length minus one.

