Archives
- April 2025
- March 2025
- February 2025
- January 2025
- December 2024
- November 2024
- October 2024
- September 2024
- August 2024
- July 2024
- June 2024
- May 2024
- April 2024
- March 2024
- February 2024
- January 2024
- October 2023
- September 2023
- August 2023
- July 2023
- June 2023
- May 2023
- April 2023
- March 2023
- January 2023
- December 2022
- November 2022
- October 2022
- September 2022
- July 2022
- June 2022
- May 2022
- April 2022
- March 2022
- February 2022
- January 2022
- December 2021
- November 2021
- October 2021
- September 2021
- August 2021
- July 2021
- June 2021
- May 2021
- April 2021
- March 2021
- February 2021
- January 2021
- December 2020
- November 2020
- October 2020
- September 2020
- August 2020
- July 2020
- June 2020
- May 2020
- April 2020
- March 2020
- February 2020
- January 2020
- December 2019
- November 2019
- October 2019
- September 2019
- August 2019
- July 2019
- June 2019
- May 2019
- April 2019
- March 2019
- February 2019
- January 2019
- December 2018
- November 2018
- October 2018
- August 2018
- July 2018
- June 2018
- May 2018
- April 2018
- March 2018
- February 2018
- January 2018
- December 2017
- November 2017
- October 2017
- August 2017
- July 2017
- June 2017
- May 2017
- April 2017
- March 2017
- February 2017
- January 2017
- December 2016
- November 2016
- October 2016
- September 2016
- August 2016
- July 2016
- June 2016
- May 2016
- April 2016
- March 2016
- February 2016
- January 2016
- December 2015
- November 2015
- October 2015
- September 2015
- August 2015
- July 2015
- June 2015
- May 2015
- April 2015
- March 2015
- February 2015
- January 2015
- December 2014
- November 2014
- October 2014
- September 2014
- August 2014
- July 2014
- June 2014
- May 2014
- April 2014
- March 2014
- February 2014
- January 2014
- December 2013
- November 2013
- October 2013
- September 2013
- August 2013
- July 2013
- June 2013
- May 2013
- April 2013
- March 2013
- February 2013
- January 2013
- December 2012
- November 2012
- October 2012
- September 2012
- August 2012
- July 2012
- June 2012
- May 2012
- April 2012
- March 2012
- February 2012
- January 2012
- December 2011
- November 2011
- October 2011
- September 2011
- August 2011
- July 2011
- June 2011
- May 2011
- April 2011
- March 2011
- January 2011
- November 2010
- October 2010
- August 2010
- July 2010
SCO UNIX 3.2v4.0 vs. IA-32 Semantics Changes
People noticed a long time ago that SCO UNIX 3.2v4.0 won’t boot on anything resembling modern hardware, and it won’t boot in a VM either. In a VM the results may be inconsistent across implementations, but on a physical machine the system reboots right after the kernel is started (i.e. right after loading the N1 floppy). Most, if not all, older and newer 32-bit SCO XENIX/UNIX releases have no such problem.
It is not unreasonable to assume that the OS worked on 386s and 486s back when it was released (1992), but why won’t it work anymore? Intel processors are supposed to be backwards compatible…
Why it won’t work now
Analyzing why SCO UNIX 3.2v4.0 causes a triple fault (and a subsequent reboot) is not particularly difficult. When the kernel initializes, it first switches into 32-bit protected mode without paging, initializes the paging structures, and only then enables paging. This is slightly unusual as the recommended sequence is to enable protected mode and paging at the same time, but there’s nothing fundamentally wrong with this approach. The crashing code in question looks as follows:
mov eax, 000002000h
mov cr3, eax ; load PDBR
mov eax, cr0
or eax, 080000000h ; set PG bit
mov cr0, eax ; enable paging
jmp ecx ; jump to the following instruction
mov eax, 0f0000000h
This code is loaded somewhere near the top of physical memory and its address will depend on the system’s memory size. In an 8MB system, the above sequence might start at 0075f00bh. Once paging is enabled, the code will be mapped near the top of the 4GB address space, at f001000bh (this address does not change with memory size).
The code is obviously not identity-mapped, i.e. its physical address does not equal its linear address once paging is enabled. What’s worse, the physical address is not mapped by the page tables at all. As soon as the CR0.PG bit is set, the immediately following instruction (jmp ecx) suddenly doesn’t exist anymore. The attempt to fetch the instruction causes a page fault, but the IDT isn’t mapped either. That makes exception delivery rather difficult, and the CPU does not have any recourse but to cause a triple fault which triggers a shutdown bus cycle, which typically hard reboots the system.
Why it worked before
The semantics of instruction prefetching and TLB (Translation Lookaside Buffer) management slightly changed between the 386 and later CPUs, and not just once.
Indeed reading Intel’s own 386 programming manual it seems that SCO just followed the CPU vendor’s recommendations. Here’s an excerpt from From the Intel 80386 Programmer’s Reference Manual (1986), section 10.4.4, Page Tables:
The initialization procedure should adopt one of the following strategies to ensure consistent addressing before and after paging is enabled:
- The page that is currently being executed should map to the same physical addresses both before and after PG is set.
- A JMP instruction should immediately follow the setting of PG.
SCO simply chose the latter option, a JMP immediately following the enabling of paging.
On a 386, reloading CR3 causes a TLB flush, which means that the TLB must be reloaded after CR3 is loaded. At that point, the processor is still executing with paging disabled, hence reloading the TLB causes no problem. On the other hand, setting the PG bit in CR0 does not cause a TLB flush. The TLB is still valid and the JMP instruction can be fetched without referring to the paging structures. The JMP will point to an address that is not currently in the TLB, but it can be resolved by looking it up in the page tables. Changing CR0 also doesn’t flush the prefetch queue, hence there is a high likelihood that the JMP instruction is already fetched and decoded anyway.
What changed
The behavior of current IA-32 CPUs is subtly different. The SCO page switching code in UNIX 3.2v4.0 clearly violates the current (2012) Intel IA-32 programming manual which states: Processors need not implement any TLBs. Processors that do implement TLBs may invalidate any TLB entry at any time. Software should not rely on the existence of TLBs or on the retention of TLB entries. SCO’s code implicitly relies on the existence of TLBs and on the CR0 update not flushing said TLBs. Alternatively a prefetch queue with certain semantics would also produce the behavior assumed by SCO.
The current Intel manual (Volume 3, section 4.10.2.1) further states that MOV to CR0 invalidates TLBs if it changes the value of CR0.PG from 1 to 0. That’s not what SCO is doing, but the Intel manual also says that the processor is always free to invalidate additional entries in the TLBs and paging-structure caches, and specifically MOV to CR0 may invalidate TLB entries even if CR0.PG is not changing. CR0.PG is in fact changing here.
Note that this may be sloppy editing on Intel’s part as the case where CR0.PG changes from 0 to 1 is not explicitly documented. AMD’s documentation clearly states that modifying the CR0.PG bit flushes TLBs. The upshot is that on current processors, enabling paging in CR0 pulls the rug from under the SCO UNIX 3.2v4.0 kernel as the following instruction is no longer mapped. The CPU is in a state where it can’t do anything sensible about it either.
When did it change?
The obvious follow-up question is when the behavior actually changed. Since SCO UNIX 3.2v4.0 was released in 1992, it must have supported 386s and 486s, but not necessarily Pentium CPUs (released in 1993, after unexpected delays).
Interestingly, the Intel 486 documentation had already changed in 1990, even though the CPU behavior apparently had not. Intel’s 486 manual stated (section 10.5.2, on paging initialization) : As with the PE bit, setting the PG bit must be followed immediately with a JMP instruction. Also, the code which sets the PG bit must come from a page which has the same physical address after paging is enabled.
Instead of the 386’s “either use identity paging or jump right after enabling paging”, both identity mapping and using a jump instruction is now required. SCO’s code then already violated the 1990 reference manual for the 486, although it is hard to blame the programmers who clearly followed the earlier edition and did not run into any trouble on actual 486 hardware.
The Pentium Processor Family Developer’s Manual from 1995 offers an explanation in Volume 3, section 16.5.3 (paging initialization):
The following guidelines for setting the PG bit (as with the PE bit) should be adhered to
maintain both upwards and downwards compatibility:
- The instruction setting the PG bit should be followed immediately with a JMP instruction. A JMP instruction immediately after the MOV CR0 instruction changes the flow of execution, so it has the effect of emptying the Intel386 and Intel486 processor of instructions which have been fetched or decoded. The Pentium processor, however, uses a branch target buffer (BTB) for branch prediction, eliminating the need for branch instructions to flush the prefetch queue. For more information on the BTB, see the Pentium® Processor Family Developer’s Manual, Volume 1, order number 241428.
- The code from the instruction which sets the PG bit through the JMP instruction must come from a page which is identity mapped (i.e., the linear address before the jump is the same as the physical address after paging is enabled).
The 32-bit Intel architectures have different requirements for enabling paging and switching to protected mode. The Intel386 processor requires following steps 1 or 2 above. The Intel486 processor requires following both steps 1 and 2 above. The Pentium processor requires only step 2 but for upwards and downwards code compatibility with the Intel386 and Intel486 processors, it is recommended both steps 1 and 2 be taken.
This is a little tricky to reconcile with the reasonable assumption that SCO UNIX 3.2v4.0 must have worked on 486s in 1992, unless it only broke on the post-1992 486 models. Then again, there’s the following in the 1997 Intel Architecture Software Developer’s Manual, Volume 3, section 9.7 (Self-Modifying Code): For Intel486 processors, a write to an instruction in the cache will modify it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed. To prevent the old instruction from being executed, flush the instruction prefetch unit by coding a jump instruction immediately after any write that modifies an instruction. That fairly strongly hints that the 486 prefetch behavior was much the same as on the 386; once CR0 was modified, the jump instruction had already been prefetched and ready to execute with paging enabled.
There was another change in the Pentium architecture with regard to prefetch buffers. Among other things, a MOV to CR0 is a serializing instruction in the Pentium, meaning that it causes the prefetch queue to be flushed. That wasn’t the case on the 386, and presumably also not on the 486.
But there is a notable exception, according to section 3.2.2 (Serializing Operations) of Intel’s Pentium Processor Family Developer’s Manual, Volume 1: Whenever an instruction is executed to enable/disable paging (that is, change the PG bit of CR0), this instruction must be followed with a jump. The instruction at the target of the branch is fetched with the new value of PG (i.e., paging enabled/disabled), however, the jump instruction itself is fetched with the previous value of PG. Intel386, Intel486 and Pentium processors have slightly different requirements to enable and disable paging. In all other respects, an MOV to CR0 that changes PG is serializing. Any MOV to CR0 that does not change PG is completely serializing.
In other words, the Pentium architecture implemented a small hack designed to work with page switching code written for previous x86 generations. If and only if a MOV to CR0 changes paging, the jump instruction expected to follow it is fetched with the previous setting of the CR0.PG bit. That effectively emulates the 386 behavior. The Pentium manual claims that page enabling/disabling code must be identity mapped, but the special prefetch semantics for the jump immediately following a CR0 change obviate that.
The 1995 manual for the Pentium Pro, Volume 3, section 7.3 (Serializing Instructions) changed the wording somewhat: When an instruction is executed that enables or disables paging (that is, changes the PG flag in control register CR0), the instruction should be followed by a jump instruction. The target instruction of the jump instruction is fetched with the new setting of the PG flag (that is, paging is enabled or disabled), but the jump instruction itself is fetched with the previous setting. The Pentium Pro processor does not require the jump operation following the move to register CR0 (because any use of the MOV instruction in the Pentium Pro processor to write to CR0 is completely serializing). However, to maintain backwards and forward compatibility with code written to run on other Intel architecture processors, it is recommended that the jump operation be performed.
Reality
The behavior of actual CPUs is somewhat unclear. The OS/2 Museum confirmed that the SCO UNIX 3.2v4.0 boot disk causes a triple fault on Pentium II and Pentium M class processors. It is inconceivable that the disk would not boot on 386 and 486 machines available in 1992. The question then is how Pentium and Pentium Pro processors behave.
It would also appear that the current Intel processor documentation is incorrect when it implies that the old semantics of CR0 change followed by a jump are retained. If that were the case, SCO UNIX 3.2v4.0 should still boot on current systems.
Interested readers with access to old hardware may wish to perform their own experiment. The SCO UNIX 3.2v4.0 N1 boot floppy is still available from SCO’s FTP. All it takes is to ignore the prompt to insert the filesystem floppy and hit Enter (naturally after first hitting enter on the ‘boot:’ prompt).
Conclusion
From time to time, Intel changes the semantics of IA-32 compatible processors in ways that break existing commercially available software (we’re talking about off-the-shelf operating systems, not special-case code crafted to exploit differences between processor generations). Following the recommended programming practice is usually a good way to avoid such pitfalls, but not always, as sometimes even the recommended practice changes.
Intel clearly tries to strike a balance between evolving the IA-32 architecture and retaining backwards compatibility. Sometimes existing software falls on the “incompatible” side when a new CPU generation arrives.
In addition, trawling through old and new Intel reference manuals can be a fascinating exercise, especially when attempting to resolve vague and sometimes contradictory statements.
6 Responses to SCO UNIX 3.2v4.0 vs. IA-32 Semantics Changes
Hello, I downloaded and wrote the floppy image you linked and booted it up in a 233Mhz Pentium MMX and got the following screen:
http://img.photobucket.com/albums/v462/wingzeroismine/IMG_20130126_145039_129_zps3888898c.jpg
Which should indicate that it would have otherwise been fine if the second disk were provided right?
I also tried it on a dual 200Mhz Pentium Pro but it seemed to just hang when pressing enter the second time, nothing new appeared on the screen and the cursor just kept blinking, keyboard went unresponsive. This is an IBM PC 365 and I’m not really sure if there would be any hardware in there that the OS just didn’t like.
Yes, the photo clearly shows that the kernel is working on the Pentium MMX. Thanks for the report – this means that the Pentium documentation is indeed correct and the special paging-enable workaround is implemented in the CPU. I was 95% sure it’d work as I couldn’t find any mention of this UNIX version not working on Pentiums, but it’s good to know it works even on the MMX variant (which was technically a 3rd generation Pentium).
The Pentium Pro behavior is a bit strange. If the system triple faults, it should reboot itself. OTOH if something went wrong with the VGA support (which tends to be very picky in SCO products), the screen should be completely blank or showing garbage. I’ll have to call the result “indeterminate”.
I think both actually have identical nvidia TNT2 cards in them. Would there be a point in which a machine would have too much memory for such an old operating system? The MMX had 256MB while the Pro has 512MB.
Loading from the floppy was significantly slower on the Pro as well, I should probably check to make sure the drive itself isn’t failing.
I don’t know if SCO UNIX 3.2v4.0 would blow up with “too much” RAM. Since it won’t work on any modern CPU at all, that’s a bit hard to test 🙂
If the graphics card is the same then that shouldn’t be a problem… If I understood you correctly, the system was hung and completely unresponsive, but did not actually reboot itself? If you want to test the triple fault response on that system, just run the DOS DEBUG.COM and assemble/run the following instruction: MOV SP,1/PUSH AX. You’ll see if the system reboots itself or not. This should be done without a memory manager (EMM386 etc.) installed BTW.
I was wrong on the video card, it still has the matrox card in it. I actually got the same results through debug, here’s what I typed in case I got it wrong.
debug (enter)
a (enter)
mov sp,1 (enter)
push ax (enter)
(enter)
g(enter)
and locked up with same blinking cursor / no keyboard
You did that right… it should reboot the system. That is to say, it should triple fault the CPU which then initiates a shutdown bus cycle to which the chipset typically responds by a reset, but doesn’t have to. I just tried this on two systems (a circa 2005 ThinkPad and a circa 1998 Pentium II) and both rebooted themselves. By the way this is so far the only method I’m aware of which will triple fault a 286+ CPU in real mode but does not require anything beyond the 8086 instruction set.
Anyway, if your system locks up, that’s probably a good indication SCO UNIX 3.2v4.0 does crash on the Pentium Pro. Not terribly surprising as the Pentium II isn’t that different, but it implies that Intel changed the CPU behavior in the P6 microarchitecture but forgot to document the change clearly.
This site uses Akismet to reduce spam. Learn how your comment data is processed.