Date:      Thu, 23 May 1996 21:17:42 -0400 (EDT)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        freebsd-hackers@freebsd.org
Subject:   three stage boot again
Message-ID:  <199605240117.VAA03425@skynet.ctr.columbia.edu>

Yes, I'm still here.

After much hair pulling, code scrutinizing, book reading, and Elvis
only knows how much trial and error, I finally managed to cobble together
an assembly language startup routine that lets me load the existing
second stage bootstrap into a standalone program that can be loaded
with itself. Basically, here's the magic I've managed to unravel:

- The program is an OMAGIC binary link edited for address 0. This
  is necessary because of the real mode/protected mode switching
  business. When in protected mode, we execute at physical memory location
  0x10000, but with a code segment descriptor that basically maps
  0x10000 to 0x0. (So the program thinks it's executing at 0x0 but
  really isn't.) In real mode, we're executing at 0x1000:0, which again
  makes the code think it's executing at 0x0. (And this is why we
  can't make it larger than 64K, since that would cause the program
  to extend into 0x2000:0, and all the offsets and addresses calculated
  by the linker would no longer work.)

- The program ultimately runs at 0x10000, which is the same location
  as the existing bootstrap (this was so that I could steal the
  existing global descriptor table values until I understood them
  well enough to change them). The program is actually loaded into
  memory at a different location and copies itself to 0x10000.
  (It could actually go somewhere else, like 0x20000. I'm saving
  that for later.)

- Even though the binary is link edited for address 0x0, its a.out
  header is massaged by a small fixup program that changes its entry
  address to 0x100000. This is to fool the existing boot block into
  loading it correctly: the second stage boot loads files into
  memory based on their entry points. However, we can't link the
  program for its entry point since then it won't work when we
  relocate it.

- The existing bootstrap needs to be modified slightly to allow loading
  of OMAGIC binaries. Currently, it expects to load ZMAGIC binaries,
  which I think have their sections page aligned. To account for this,
  the bootstrap skips a chunk of memory between loading the text and
  data segments; this makes the bootstrap blow up when it tries to load
  an OMAGIC binary. The code needs to be changed to check the magic
  value of the binary and only skip the space for ZMAGIC binaries
  instead of doing it unconditionally.

- Once the standalone program is loaded, it copies itself down to 0x10000.
  This clobbers the global descriptor table left behind by the second stage
  boot, so we have to build a new one. The standalone image has its own
  table and it resets the GDT register to use it. It then performs an
  intersegment (far) jump to reload the code segment selector and to start
  executing in the new segment (with the 0x10000 offset). Then it sets
  the DS, SS and ES segment selectors to match the new code segment 
  selector, resets the stack pointer and jumps to boot().
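
To make the addressing business above concrete, here's a rough sketch in
C (the real startup code is assembly, so this is illustration only). The
function names are made up for this example, and the access byte 0x9A and
flags nibble 0x4 mentioned afterwards are textbook values for a 32-bit
ring 0 code segment, not necessarily what the existing bootstrap's table
uses. It shows that 0x1000:0 in real mode and offset 0 in a segment with
base 0x10000 in protected mode land on the same physical address, and how
the base and limit get scattered across the 8 bytes of a descriptor.

```c
#include <stdint.h>

/* Physical address the program ends up running at (from the text above). */
#define LOAD_BASE 0x10000u

/* Real mode: physical = segment * 16 + 16-bit offset. */
static uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Protected mode, paging off: physical = descriptor base + offset. */
static uint32_t prot_mode_phys(uint32_t base, uint32_t off)
{
    return base + off;
}

/*
 * Pack one 8-byte i386 segment descriptor.
 *   base   - 32-bit segment base
 *   limit  - 20-bit segment limit
 *   access - present/DPL/type byte
 *   flags  - upper nibble of byte 6 (granularity, default size, ...)
 */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags       -> bits 52-55 */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;  /* base 31:24  -> bits 56-63 */
    return d;
}
```

With these, real_mode_phys(0x1000, 0) and prot_mode_phys(LOAD_BASE, 0)
both come out to 0x10000, and make_descriptor(LOAD_BASE, 0xFFFF, 0x9A, 0x4)
packs a 64K code segment based at 0x10000. The last real mode address
before spilling into 0x2000:0 is real_mode_phys(0x1000, 0xFFFF), which is
where the 64K size limit comes from.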

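The two a.out-related pieces above (the entry address fixup and the
OMAGIC/ZMAGIC check in the loader) might look something like this in C.
This is a sketch under assumptions, not the actual fixup program or boot
code: the struct mirrors the traditional a.out exec header fields, the
SK_ names and function names are invented for the example, the magic
number is assumed to sit in the low 16 bits of the first header word,
and the "skipped chunk" is assumed to be a round-up to the next 4K page.

```c
#include <stdint.h>

#define SK_OMAGIC    0407u   /* text and data contiguous */
#define SK_ZMAGIC    0413u   /* demand paged, sections page aligned */
#define SK_PAGE_SIZE 4096u   /* assumed i386 page size */

/* Field layout of the traditional a.out exec header (hypothetical mirror). */
struct aout_hdr {
    uint32_t a_midmag;   /* magic number (plus machine id and flags) */
    uint32_t a_text;     /* size of text segment */
    uint32_t a_data;     /* size of initialized data */
    uint32_t a_bss;      /* size of uninitialized data */
    uint32_t a_syms;     /* size of symbol table */
    uint32_t a_entry;    /* entry point */
    uint32_t a_trsize;   /* size of text relocation */
    uint32_t a_drsize;   /* size of data relocation */
};

/* Extract the magic number, assuming it is in the low 16 bits. */
static uint32_t hdr_magic(const struct aout_hdr *h)
{
    return h->a_midmag & 0xFFFFu;
}

/* What the fixup program does: overwrite the entry address only. */
static void fixup_entry(struct aout_hdr *h, uint32_t new_entry)
{
    h->a_entry = new_entry;
}

/*
 * Where the loader should put the data segment, given where the text
 * was loaded. Only ZMAGIC binaries get the gap up to the next page
 * boundary; for anything else the data follows the text directly.
 */
static uint32_t data_load_addr(const struct aout_hdr *h, uint32_t text_addr)
{
    uint32_t end = text_addr + h->a_text;

    if (hdr_magic(h) == SK_ZMAGIC)
        end = (end + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
    return end;
}
```

So for an OMAGIC image with 0x1234 bytes of text loaded at 0x100000, the
data lands right at 0x101234, while the same sizes in a ZMAGIC image
would put the data at 0x102000.
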
Sounds simple, right? Hah. All this took me a couple of weeks to
figure out. I started off trying to figure out how the mach_kboot
program worked, but that only frustrated me since it seems to have
been written for a different assembler. Fortunately, I found a couple
of reasonably helpful books on i386 architecture and programming in
the Columbia engineering library. (Being a Columbia University employee 
has its perks: you get to check out books free of charge. :) It was 
with these that I finally learned what a global descriptor table was
and what segment selector registers did (and how they were different from 
segment registers in real mode). Of course, progress was slow even with 
these books since they don't use gas for their examples. (And will 
somebody please tell me what the hell 'data32/addr32' mean?)

Anyway.

Now that that's done, there's still one more obstacle to overcome.
When I went to link the new startup routine with the boot code for
the first time, I ended up with an unresolved symbol called '_disklabel'.
It turns out that this symbol is defined in start.S and looks like
it's meant to just provide a pointer to a particular area of memory.
I assume that this is supposed to contain the BSD disklabel for the
disk from which the bootstrap was loaded, but I can't tell how it's
supposed to know where it is. I also don't quite understand the
following gas syntax

ENTRY(disklabel)
	. = EXT(boot1) + 400

I realize that the ENTRY() macro is being (ab)used to turn disklabel
into a global symbol, but I don't quite understand what the next line
does. In a fuzzy sort of way, I think what's happening is that the
disklabel ends up slapped into the boot block somehow and gets loaded
somewhere along the way by the first stage. My problem is that the
third stage will need this disklabel information. I'm not sure if I
should somehow arrange to save this disklabel info and pass it to the
third stage or if I should make the third stage read it over again.
(It should be able to do it by itself, I suppose.)

I might be able to figure this part out on my own, but my brain
still itches from the last part. Sage advice from those who know
how this stuff works would be most welcome.

-Bill

PS: Yes, I'm having tremendous fun, dammit.

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
License error: The license for this .sig file has expired. You must obtain
a new license key before any more witty phrases will appear in this space.
=============================================================================


