Date:      Wed, 30 Jul 2008 12:11:54 +0200
From:      Heiko Wundram <modelnine@modelnine.org>
To:        freebsd-stable@freebsd.org
Subject:   Re: Upcoming ABI Breakage in RELENG_7
Message-ID:  <200807301211.54974.modelnine@modelnine.org>
In-Reply-To: <200807300247.34948.david@vizion2000.net>
References:  <1217346345.12322.31.camel@bauer.cse.buffalo.edu> <200807300247.34948.david@vizion2000.net>

On Wednesday, 30 July 2008 11:47:34, David Southwell wrote:
> For those of us who are not as well informed and experienced  as others
> could someone please explain what is meant by an  ABI breakage, its
> implications and how to deal with them.

ABI (Application Binary Interface) is a term used to describe the 
characteristics of binary (i.e. object) code as seen by other binary 
code. Generally, this covers everything from function signatures through 
structure sizes and layouts to the sizes of global data (exported symbols).

One example of ABI-breakage, if you're somewhat proficient in C, could be the 
following:

-- Old code (library header)

/* Some structure which contains a long, and so has a total size of four 
bytes on i386. */
struct x {
	long y;
} __attribute__((packed));

/* Get access to two x objects. This function is implemented in a shared 
library. It returns a pointer to eight bytes of memory (2*struct x). */
extern const struct x* getSomeData(void);

-- Old code (user)

long test() {
	const struct x* ref = getSomeData();
	return ref[0].y + ref[1].y;
}

--

Now, when compiling the user code, it will pick up the specification of the 
structure x through the library supplied header file, seeing that ref[...].y 
is a long value (four bytes on i386), and so that will be translated to some 
machine code which adds four bytes from offset zero (ref[0].y) of the pointer 
that getSomeData() returns to four bytes from offset four (ref[1].y), and 
returns the result as a long value.
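
To make those hardcoded offsets concrete, here is a minimal sketch of what 
the compiled user code effectively does. It is an illustration only: 
int32_t stands in for long so that the member is four bytes on any platform 
(not just i386), and the names x_old and old_user_sum are made up for this 
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the old struct x. int32_t replaces long so the member is
   four bytes everywhere (an assumption made for portability). */
struct x_old {
	int32_t y;
} __attribute__((packed));

/* Roughly what the compiler bakes into the user binary: read four bytes
   at offset 0, read four bytes at offset 4, and add them. */
int32_t old_user_sum(const unsigned char *p) {
	int32_t a, b;
	memcpy(&a, p + 0 * sizeof(struct x_old), sizeof a);
	memcpy(&b, p + 1 * sizeof(struct x_old), sizeof b);
	return a + b;
}
```

The point is that the stride (sizeof(struct x_old), i.e. four) becomes a 
constant in the machine code at compile time; nothing consults the header 
again at run time.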

What happens if the following change to the library is made?

-- New code (library header)

/* Some structure which now contains a short. This reduces the size of the 
structure to two bytes on i386. */
struct x {
	short y;
} __attribute__((packed));

/* Get access to two x objects. This function is implemented in a shared 
library. Due to the change in struct x, this now returns a pointer to (only!) 
four bytes of storage (2*struct x). */
extern const struct x* getSomeData(void);

--

The size of the structure member y of structure x has now been reduced to two 
bytes (because the type was changed from long to short), but the user code 
doesn't know this, because it was compiled with the old headers. After the 
changed shared library is installed, the old user code will still load (!) as 
before, because the symbol getSomeData is still exported from the shared 
library. But it will now erroneously read eight bytes in total (because 
that's hardcoded in the machine code produced for the user binary) from a 
location where, after the incompatible change to the layout of the 
structure, it should only read four.

If you were to recompile the user code after installing the updated shared 
library, everything would go back to functioning normally, because now the 
user code picks up the redefined structure x specification which now says 
that it should load only two bytes from offset zero (ref[0].y) and two bytes 
from offset two (ref[1].y) of the returned pointer.
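
The difference between the stale and the recompiled user code can be 
simulated in a single file. This is only a sketch under stated assumptions: 
int16_t and int32_t stand in for short and long, the "library" is an 
ordinary function (get_some_data_new, a made-up name), and its storage is 
deliberately eight bytes long so that the stale read stays inside a known 
buffer instead of running off the end as it would in the real case.

```c
#include <stdint.h>
#include <string.h>

/* New layout: the member is now two bytes (int16_t stands in for short). */
struct x_new {
	int16_t y;
} __attribute__((packed));

/* Hypothetical library storage. The two x_new objects occupy only bytes
   0..3; bytes 4..7 represent unrelated neighbouring data. */
static unsigned char lib_memory[8];

const unsigned char *get_some_data_new(void) {
	struct x_new objs[2] = { { 1 }, { 2 } };
	memcpy(lib_memory, objs, sizeof objs);       /* the real payload */
	memset(lib_memory + sizeof objs, 0xAA, 4);   /* somebody else's bytes */
	return lib_memory;
}

/* Stale user code: still uses the old four-byte member and stride. */
int32_t stale_user_sum(const unsigned char *p) {
	int32_t a, b;
	memcpy(&a, p, 4);       /* swallows both new members at once */
	memcpy(&b, p + 4, 4);   /* reads the neighbouring data */
	return a + b;
}

/* Recompiled user code: picks up the two-byte member and stride. */
int32_t fresh_user_sum(const unsigned char *p) {
	struct x_new objs[2];
	memcpy(objs, p, sizeof objs);
	return objs[0].y + objs[1].y;
}
```

Calling fresh_user_sum(get_some_data_new()) gives the intended 1 + 2, while 
stale_user_sum() on the same pointer returns a value that depends on byte 
order and on whatever happens to sit next to the payload.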

The API has not changed, because the user code would still compile properly, 
but the ABI has, and as such object code compiled before the change in the 
shared library breaks (most often in very mysterious ways).
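
One common defensive technique (not something taken from the change under 
discussion, just a general C11 idiom) is to pin the layout down with 
compile-time assertions in the header, so that an ABI-relevant change at 
least fails the library's own build and has to be made deliberately. A 
minimal sketch, again using int32_t in place of long and a made-up helper 
name x_abi_size:

```c
#include <stddef.h>
#include <stdint.h>

struct x {
	int32_t y;   /* int32_t stands in for the four-byte long on i386 */
} __attribute__((packed));

/* Compile-time guards: if someone changes the member type or adds a
   member in front of y, the build stops with the given message. */
_Static_assert(sizeof(struct x) == 4, "struct x size changed: ABI break");
_Static_assert(offsetof(struct x, y) == 0, "struct x layout changed: ABI break");

/* Hypothetical helper a library could export so that callers can compare
   the size they were compiled against with the size the library uses. */
size_t x_abi_size(void) {
	return sizeof(struct x);
}
```

This does not prevent ABI breakage; it only turns a silent layout change 
into a visible build failure on the library side.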

I don't know exactly what will be changed in the kernel to warrant the heads 
up for this specific case, but as was said, the vnode structure is being 
changed (because data was added to it), and as such the structure layout 
has changed. If you now have a KLD (kernel module) which expects the old 
structure layout (because it was compiled before the change), you're headed 
for trouble, because it does not yet know that the member offsets and the 
total size of the vnode structure have changed, and as such the offsets it 
uses into the vnode structure can no longer be trusted.
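
The effect of adding data to a structure can be sketched with a toy 
example. The structures below are hypothetical and have nothing to do with 
the real layout of struct vnode; in particular, where the new member is 
placed is an assumption made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout; NOT the real FreeBSD struct vnode. */
struct demo_node_old {
	uint32_t type;
	uint32_t flags;
};

/* Hypothetical "after" layout: a member was added in the middle. */
struct demo_node_new {
	uint32_t type;
	uint32_t refcount;   /* newly added data */
	uint32_t flags;
};

/* Offset of flags that a module built against the old header has
   hardcoded in its machine code. */
size_t stale_flags_offset(void) {
	return offsetof(struct demo_node_old, flags);
}

/* Offset of flags that the updated kernel actually uses. */
size_t real_flags_offset(void) {
	return offsetof(struct demo_node_new, flags);
}
```

A module compiled against the old header would look for flags at offset 4 
and find refcount there instead; it would also allocate or copy 8 bytes 
where the new code expects 12.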

So, if you have a binary-only kernel module which accesses the vnode 
structure, you'll have a problem. If not, you don't.

Does this help?

-- 
Heiko Wundram



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200807301211.54974.modelnine>