SheepShaver

Download SheepShaver from the official form post. It will be a zip file named SheepShaver.zip if you download it before the next update. This is a testing build, however, I have found it to work better than any of the ones from 4+ years prior. Web site created using create-react-app. I am trying to install Classilla, an open-source browser for Mac OS9, on SheepShaver for Windows. In SheepShaverGui.exe, I have the Enable 'My Computer' icon checked. I open up SheepShaver and go. Download SheepShaver from the official form post. It will be a zip file named SheepShaver.zip if you download it before the next update. This is a testing build, however, I have found it to work better than any of the ones from 4+ years prior. Extract this zip file and copy SheepShaver.exe, you will need it after the next step.

Sheepshaver Catalina
Sheepshaver Emulator

When I first got my 133MHz BeBox (not new, sadly), it had 'only' 32MB of memory and it had four more SIMM slots to fill. While Be only officially supported 256MB of RAM, I was blissfully ignorant of that, bought

an additional 256MB of memory in four equally sized 72-pin SIMMs and installed it for 288MB of RAM. (It can actually take up to 1GB, I later learned.) Nice, I said! And then SheepShaver never worked again.

SheepShaver is a desperate pun and an unusual emulator: much like Classic on PowerPC Mac OS X, on big-endian PowerPC most of the MacOS and its applications run natively on the processor, in a form analogous to KVM-PR. In fact, SheepShaver on Leopard is pretty much the best way to run Classic applications on Power Macs that must run Leopard, though it also runs on Tiger and presents certain advantages there as well. It existed first on BeOS as a paid product before becoming open source, though multiple later forks fix various problems on modern platforms.

My original theory was that I had somehow broken something in the update or some other installation, and so I never did much with it (especially since I have plenty of real Power Macs around here). But while I was doing other work on the machine, after a game of BeOS Doom I accidentally double clicked on its icon on the desktop and ... it started up! What could have restored it, I feverishly wondered? Did something monkey around with the memory map? (Foreshadowing music plays here.) It only ran the one time, however, and I spent hours trying to retrace my steps to see if I could make it work again and I never could.

But this at least told me that the install was fine and the problem lay elsewhere. I had never closely looked at it in a debugger. Perhaps it was time.

The BeOS debugger isn't gdb, but you get the idea. The offending instruction was an stbu (store byte with update), but the effective address was ... really weird. It looks like it's wrapped around the entire addressing space back to 0! How did this program even work?

In the source code, for all supported platforms, SheepShaver (and Basilisk II, a 68K emulator it shares substantial code with) has a SIGSEGV handler for trapping segmentation faults; here is BeOS's. My initial thought was that somehow the handler wasn't being installed, but a couple debug printfs in the handler showed that not only was the handler being triggered, it was actually passing the segfault along to the system handler apparently on purpose.

A partial explanation appears in the Darwin (Mac OS X) port:

Under Mach there is very little assumed about the memory map of objectfiles. It is the job of the loader to create the initial memory map of anexecutable. In a Mach-O executable there will be numerous loader commandsthat the loader must process. Some of these will create the initial memorymap used by the executable. Under Darwin the static object file linker,ld, automatically adds the __PAGEZERO segment to all executables. Thedefault size of this segment is the page size of the target system andthe initial and maximum permissions are set to allow no access. This is sothat all programs fault on a NULL pointer dereference. Arguably this isincorrect and the maximum permissions shoould be rwx so that programs canchange this default behavior. Then programs could be written that assumea null string at the null address, which was the convention on somesystems. In our case we need to have 8K mapped at zero for the low memoryglobals and this program modifies the segment load command in thebasiliskII [sic] executable so that it can be used for data.

So, the handler expects to have actual memory mapped indeed at an effective address of zero for the MacOS's low memory globals, a holdover from the 68K days (and if I'd read the Basilisk technical notes, I would have realized that sooner). Since such a fault should never have gotten to the handler in the first place, it just passes it along and crashes. That kind of significant address space remapping clearly could not come from a user-level executable on BeOS; there had to be some sort of system component doing that remapping.

Turns out SheepShaver did in fact install a couple system extensions:The last one is used for tunneling emulated networking through the host machine; the sheep driver is the one we want (the two sheep drivers are actually the same file; the dev/ one is a symlink to the actual file in bin/). After a little digging in the source tree, I found the C source for it. It became rapidly obvious after a cursory readthrough that it manipulates the PowerPC page tables.

On PowerPC (prior to POWER9 which introduces a higher-performance radix MMU), the mapping between virtual addresses and physical addresses is maintained by a set of hashed page tables, divided into page table entry groups, or PTEGs. (There is an alternate pathway using block address translation 'BAT' registers but I'm going to ignore that for the purposes of this discussion.) The low memory globals region is 8K in size, so (with 32-bit PowerPC) we need two 4K memory pages to map to 0x0000 and 0x1000, which needn't be contiguous in real memory since we'll set up mappings for each page individually. The driver allocates three pages with malloc() and takes a page-aligned slice of two pages within it, then tries to find where in physical memory those pages got mapped to using get_memory_map(). Now we want to make those pages' effective address mapping in SheepShaver point to 0x0000 and 0x1000 instead.

To find a real address in 32-bit PowerPC, the top four bits of the effective address select one of 16 segment registers mapping each 256MB effective address block. The segment register's low 24 bits (the Virtual Segment ID) is combined with the 16-bit effective address' page number and 12-bit byte number within that page to generate a 52-bit virtual address. The VSID and the page number then get hashed and combined with the storage description register SDR1 to yield the address of the PTEG, the correct PTE is found within it, and the real page number within it then becomes the upper 20 bits of the resulting real page address. We're going to work this in a similar fashion to find the PTEG that would contain the mapping for these lowest page addresses.

Sheepshaver Catalina

Traditionally the number of PTEGs is optimally half the number of real pages to be accessed, and since the next highest power of two in a 288MB BeBox is 512MB, that means 2²⁹ addressable bytes in (divided by 4K, or 2¹²) 2¹⁷ pages. Halving that yields 2¹⁶, or 65536, 64-byte PTEGs to equal a total size of 4MB. BeOS has a specific memory area for this, appropriately named pte_table, that we can look up with find_area() (thus giving us the effective address of the page table pointed to by SDR1). We find the relevant PTEG for each page by doing the same hashing steps the processor would do to resolve the address. In that PTEG, each PTE's highest bit is whether it's valid, followed by the 24-bit VSID, one bit for the hash type flag, five bits of the effective address called the Abbreviated Page Index, the 20-bit Real Page Number, and protection and access control fields.

We won't know the VSID without looking at the segment registers, but we can just walk the entire page table instead since we only have to set this mapping up once. When we find a valid PTE that matches the API, then we know this is a candidate PTEG and derive the VSID from that. We can then either directly modify an existing PTE within it or take advantage of the fact that each PTEG essentially offers up to eight hash collision resolution slots to add a PTE of our own. If we do this to the first place the CPU will look, we will take over that memory mapping for the life of the process.

The memory mapper conveniently has debug logging support for a simple tool called PortLogger that I patched up for BeOS R5. I compiled it with debugging on, restarted, ran PortLogger, started SheepShaver (it crashed, of course) and looked at the output:The driver seemed to properly reserve memory and find the real address (and thus real page number) for its mapping, and was able to resolve and walk the page table. But one problem jumped out immediately: we only have two pages (here 0 and 1d). Why is it that it found four? Notice that the 'fraternal twin' pages have matching RPNs, but the VSIDs are different and we don't know which VSID is right. Did our algorithm effectively cause its own hash collision?

Continuing on, when we look at the existing PTE we found, the RPN is the first through fifth hex digits in the second word and both effective addresses match their real ones (80069b80 00000010 and 80069b80 00001010). That seems hinky.

Sheepshaver Emulator

My first thought was maybe we had a stale TLB and our PTE change didn't stick, because on the PowerPC 603 and 603e the code doesn't do a tlbsync to synchronize the translation lookaside buffer (which caches all this work) and this BeBox has two 603e CPUs. However, despite the code and Metrowerks saying it's 604-only, tlbsync is listed as a valid instruction in my copy of the 603e User's Manual Appendix A. I forced it to do a tlbsync by commenting out the check, compiled it again, restarted, ran PortLogger and started SheepShaver. Unfortunately, while it didn't do anything worse, it didn't work either.

My next guess was to see if maybe we were working on the wrong 'twin.' Assuming we really did have two sets of colliding hashes, what if we used the other one? A line of code to stop the search at the first page pair rather than the second was added and I tried again:Success! Now we actually have a free PTE, instead of modifying a questionable one, and we alter that. The mapping now takes precedence over anything else for that effective address and SheepShaver starts and runs normally. It also fixed Basilisk II, which would not run for the same reason, though SheepShaver seems to run 68K applications rather better than Basilisk II does.

Why was this never noticed? Well, like I say, Be never advertised support for more than 256MB in the BeBox, and in 1997 that would have been a significant amount of memory (my Power Mac 7300 in 2000 had 192MB and I thought that was a lot). Most PowerPC systems running BeOS probably had substantially less. Like many other bugs due to clock speed and RAM, no one ever dreamed future users would have such a surfeit of them.

The patched binary and source code are on Be-Power.