Date:      Wed, 2 Jun 1999 01:08:36 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        SBenjamin@quest.com (Scott Benjamin)
Cc:        freebsd-chat@FreeBSD.ORG
Subject:   Re: Binary Compatibility
Message-ID:  <199906020108.SAA07001@usr09.primenet.com>
In-Reply-To: <1D7D0A00F0E8D111A26600104B873E4C017835FA@exchange.quests.com> from "Scott Benjamin" at Jun 1, 99 03:16:02 pm

> Does anyone know off hand, where I might find some information on Binary
> Compatibility.  I would like to understand, in more detail, how FreeBSD can
> run Linux Binaries.

FreeBSD has an abstraction called an "execution class loader".  This
is a wedge into the execve(2) system call.

What happens is that FreeBSD has a list of loaders, instead of a
single loader with a fallback to the "#!" loader for running any
shell interpreters or shell scripts.

Historically, the only loader on the UNIX platform examined the magic
number (generally the first 4 or 8 bytes of the file) to see if
it was a binary known to the system, and if so, invoked the binary
loader.

If it was not the binary type for the system, the execve returned
a failure, and the shell attempted to start executing it as shell
commands.

The assumption was a default of "whatever the current shell is".

Later, a hack was made for /bin/sh to examine the first two characters,
and if they were ":\n", then it invoked the csh shell instead (I
believe SCO first made this hack, but am willing to be corrected).


What FreeBSD does now is go through a list of loaders.  Next to last
in the list is a generic "#!" loader, which takes the interpreter to
be the characters that follow "#!" up to the next whitespace; last
is a fallback to /bin/sh.
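
As a rough illustration of that loader list (a user-space sketch only,
not the kernel code; the function names and the returned strings are
made up for this example), each loader looks at the start of the file
and either claims it or passes it on:

```python
ELF_MAGIC = b"\x7fELF"  # the four-byte ELF magic number


def elf_loader(header):
    """Claim images that begin with the ELF magic number."""
    return "elf" if header[:4] == ELF_MAGIC else None


def interpreter_loader(header):
    """Claim "#!" scripts; the interpreter name ends at the next whitespace."""
    if header[:2] != b"#!":
        return None
    first_line = header[2:].split(b"\n", 1)[0]
    fields = first_line.split()
    if not fields:
        return None
    return "interp:" + fields[0].decode()


def shell_fallback(header):
    """Last resort: hand the file to /bin/sh."""
    return "interp:/bin/sh"


# Order matters: binary loaders first, "#!" next to last, /bin/sh last.
LOADERS = [elf_loader, interpreter_loader, shell_fallback]


def activate(header):
    """Return the decision of the first loader that claims the image."""
    for loader in LOADERS:
        result = loader(header)
        if result is not None:
            return result
    return None
```

A real list would hold one entry per binary format the kernel knows
about; only ELF is shown here to keep the sketch short.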


For the Linux binary emulation, FreeBSD sees the magic number as
an ELF binary (it makes no distinction between FreeBSD, Solaris,
Linux, or any other OS which has an ELF image type, at this point).

The ELF loader looks for a specialized "brand", which is a comment
section in the ELF image, and which is not present on SVR4/Solaris
ELF binaries.

For Linux binaries to function, they must be "branded" as type
"Linux"; from the "brandelf(1)" man page:

	% brandelf -t Linux file

When this is done, the ELF loader will see the "Linux" brand on
the file.
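
To make the idea of a brand check concrete, here is a small sketch.  It
is an assumption on my part, not the mechanism described above: it
reads the OS/ABI byte that the current ELF specification defines at
index 7 of e_ident, which may not be where the brand discussed in this
message is stored.  The function name and the name table are
hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
EI_NIDENT = 16   # size of the e_ident array at the start of an ELF file
EI_OSABI = 7     # index of the OS/ABI byte within e_ident

# A few OS/ABI values from the ELF specification.
OSABI_NAMES = {0: "SVR4", 3: "Linux", 9: "FreeBSD"}


def elf_brand(e_ident):
    """Return an OS/ABI name for an ELF header, or None if it is not ELF."""
    if len(e_ident) < EI_NIDENT or e_ident[:4] != ELF_MAGIC:
        return None
    return OSABI_NAMES.get(e_ident[EI_OSABI], "unknown")
```

For example, elf_brand(open(path, "rb").read(16)) would report the tag
of the file at path, where path is whatever binary you want to inspect.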

When the ELF loader sees the "Linux" brand, the loader replaces a
pointer in the proc structure.  All system calls are indexed through
this pointer (in a traditional UNIX system, this would be the sysent[]
structure array, containing the system calls).  In addition, the
process is flagged for special handling of the trap vector for the
signal trampoline code, and several other (minor) fixups that are
handled by the Linux kernel module.

The Linux system call vector contains, among other things, a list
of sysent[] entries whose addresses reside in the kernel module.

When a system call is called by the Linux binary, the trap code
dereferences the system call function pointer off the proc structure,
and gets the Linux, not the FreeBSD, system call entry points.
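
The dispatch can be modelled in a few lines.  This is only a toy model
under stated assumptions: the class and function names are invented,
the system call number is illustrative, and the real kernel uses C
structures rather than dictionaries.

```python
def kern_open(path):
    """Stand-in for the shared kernel service both ABIs end up calling."""
    return "opened " + path


def freebsd_open(path):
    """FreeBSD "glue": takes FreeBSD-style arguments, calls the service."""
    return ("freebsd", kern_open(path))


def linux_open(path):
    """Linux "glue": takes Linux-style arguments, calls the same service."""
    return ("linux", kern_open(path))


SYS_OPEN = 5  # illustrative system call number
FREEBSD_SYSENT = {SYS_OPEN: freebsd_open}
LINUX_SYSENT = {SYS_OPEN: linux_open}


class Proc:
    """A process carrying a pointer to its own system call table."""

    def __init__(self):
        self.sysent = FREEBSD_SYSENT


def exec_image(proc, brand):
    """What the ELF loader does on exec: pick the table from the brand."""
    proc.sysent = LINUX_SYSENT if brand == "Linux" else FREEBSD_SYSENT


def syscall(proc, number, *args):
    """The trap code: index through the per-process pointer."""
    return proc.sysent[number](*args)
```

The trap path never asks which OS the binary came from; it just follows
the pointer that the loader set at exec time.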

In addition, the Linux emulation dynamically "reroots" lookups;
this is, in effect, what the "union" option to FS mounts (Note:
_not_ the unionfs!) does.  First, an attempt is made to look up
the file in the "/compat/linux/<original path>" directory, *then*
only if that fails, the lookup is done in the "/<original path>"
directory.  This makes sure that binaries that require other
binaries can run (e.g., the Linux toolchain can all run under
emulation).  It also means that the Linux binaries can load and
exec FreeBSD binaries, if there are no corresponding Linux
binaries present, and that you could place a "uname" command in
the "/compat/linux" directory tree to ensure that the Linux
binaries couldn't tell they weren't running on Linux.
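
The lookup order can be sketched like this (a user-space approximation
only; the function name and the "exists" parameter are made up for the
example, and the kernel does this inside its own path lookup rather
than with a helper like this):

```python
import os

LINUX_ROOT = "/compat/linux"


def reroot_lookup(path, exists=os.path.exists):
    """Prefer the copy under /compat/linux; otherwise use the path as given."""
    if not path.startswith("/"):
        return path  # this sketch only reroots absolute paths
    candidate = LINUX_ROOT + path
    if exists(candidate):
        return candidate
    return path
```

Passing a different "exists" function makes it easy to try the order of
lookups without touching a real file system.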

In effect, there is a Linux kernel in the FreeBSD kernel; the various
underlying functions that implement all of the services provided by
the kernel are common to both the FreeBSD system call table entries
and the Linux system call table entries: file system operations,
virtual memory operations, signal delivery, System V IPC, etc.  The
only difference is that FreeBSD binaries get the FreeBSD "glue"
functions, and Linux binaries get the Linux "glue" functions (most
older OS's only had their own "glue" functions: addresses of functions
in a static global sysent[] structure array, instead of addresses of
functions dereferenced off a dynamically initialized pointer in the
proc structure of the process making the call).

Which one is the native FreeBSD ABI?  It doesn't matter.  Basically
the only difference is that (currently; this could easily be changed
in a future release, and probably will be after this) the FreeBSD
"glue" functions are statically linked into the kernel, and the Linux
glue functions can be statically linked, or they can be accessed via
a kernel module.


Yeah, but is this really emulation?  No.  It's an ABI implementation,
not an emulation.  There is no emulator (or simulator, to cut off the
next question) involved.

So why is it called "Linux emulation"?  To make it hard to sell
FreeBSD!  8-).  Really, it's because the historical implementation
was done at a time when there was really no word other than that to
describe what was going on; saying that FreeBSD ran Linux binaries
wasn't true, if you didn't compile the code in or load a module,
and there needed to be a word to describe what was being loaded --
hence "the Linux emulator".


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

