Date:      Sun, 10 Oct 2010 22:51:20 +0200
From:      Erik Cederstrand <erik@cederstrand.dk>
To:        FreeBSD Hackers <hackers@FreeBSD.org>
Subject:   Deterministic builds?
Message-ID:  <718D8E86-EA2E-4D07-BAFF-5D8D093FD296@cederstrand.dk>

Hi hackers

As a followup to the "Timestamps in static libraries" thread, which
resulted in a '-D' option to ar(1), I'd like to discuss whether
deterministic builds are a worthy goal for the Project. By that I mean
that for two make build+install world+kernel+distribution runs of the
same sources, every resulting file is bitwise identical between the two runs.
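
To make the "bitwise identical" criterion concrete, here is a minimal sketch of how two installed trees could be compared. It is only an illustration: the function names hash_tree and differing_files are made up for this example, and it assumes both runs were installed into separate directories.

```python
import hashlib
import os

def hash_tree(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests

def differing_files(root_a, root_b):
    """Return sorted relative paths that differ or exist in only one tree."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

A fully deterministic build would make differing_files() return an empty list for two runs of the same sources.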

Deterministic builds would be useful for me, since I'm creating binary
diffs against lots of FreeBSD builds, and smaller diffs are good. Also,
I'd like to detect which files have changed between two commits. I
imagine it would also be useful for things like IDS and freebsd-update.

Currently, this does not hold for static libraries (*.a), kernel modules
(*.ko / *.ko.symbols) and the following:

bthidd
cc1
cc1obj
cc1plus
clang
clang++
ctfconvert
freebsd.cf
freebsd.submit.cf
kernel
kernel.symbols
libcrypto.so.6
libufs.so.5
loader
pxeboot
sendmail.cf
submit.cf
tblgen
zfsloader

Most of the libraries can be made identical by using ar -D.
Some record the absolute OBJDIR path to header files, though (libc.a, for
example).

I tried adding 'D' to ARFLAGS in share/mk/sys.mk, but that's only part
of the solution. ARFLAGS is overridden in hundreds of places in the source
tree, and in some places ARFLAGS isn't even used (or AR, for that
matter). Is it worthwhile to go through the whole tree, fixing up these
calls to ar? A lot of this is in contrib/ code.
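
To get a feel for the size of that job, the ARFLAGS assignments could be listed with something like the sketch below. The file-name heuristic (names starting with "Makefile" or ending in ".mk") and the function name are assumptions made for this example. It only catches explicit ARFLAGS assignments, not places that call ar directly.

```python
import os
import re

# Matches "ARFLAGS =", "ARFLAGS+=", "ARFLAGS?=", "ARFLAGS:=", "ARFLAGS!=".
ARFLAGS_ASSIGN = re.compile(r"^\s*ARFLAGS\s*([:?+!]?=)")

def find_arflags_assignments(src_root):
    """Return sorted (relative path, line number, operator) for each ARFLAGS assignment."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            # Heuristic only: treat Makefile* and *.mk as makefiles.
            if not (name.lower().startswith("makefile") or name.endswith(".mk")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "r", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    m = ARFLAGS_ASSIGN.match(line)
                    if m:
                        hits.append((os.path.relpath(path, src_root), lineno, m.group(1)))
    return sorted(hits)
```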

Another option is to add a WITH_DETERMINISTIC_AR knob to the build that
compiles ar with D as the default behaviour. This would make the above
changes unnecessary, but is more intrusive.

A third option is that this is not a priority for the community, or is
outright unwanted, and that I just post-process my builds myself.
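
For illustration, post-processing a static library could look roughly like the sketch below, which overwrites the mtime, uid and gid fields of every member header in an ar archive with 0. It assumes the common 60-byte ar member header layout. The function name normalize_ar is made up for this example. The file mode field is left untouched, so the result is not claimed to be byte-for-byte what ar -D would produce.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60  # name[16] mtime[12] uid[6] gid[6] mode[8] size[10] magic[2]

def normalize_ar(data: bytes) -> bytes:
    """Return a copy of an ar archive with mtime, uid and gid set to 0 in every header."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    out = bytearray(data)
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(out):
        header = bytes(out[pos:pos + HEADER_LEN])
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        size = int(header[48:58].decode("ascii").strip())
        out[pos + 16:pos + 28] = b"0".ljust(12)  # mtime
        out[pos + 28:pos + 34] = b"0".ljust(6)   # uid
        out[pos + 34:pos + 40] = b"0".ljust(6)   # gid
        # Member data is padded to an even length.
        pos += HEADER_LEN + size + (size & 1)
    return bytes(out)
```

This does nothing about absolute paths embedded in the object files themselves, so it would not help with cases like libc.a.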

I don't know what causes the checksum difference in .ko files: there is
no size difference, and no difference according to strings(1). A bsdiff
of the two is typically around 160B.
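
Since the two files have the same size, one way to narrow this down would be to list the byte ranges where they differ and then look up which part of the file those offsets fall in. A minimal sketch (the function name diff_ranges is made up for this example):

```python
def diff_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) byte ranges where two equal-length inputs differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = i  # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, len(a)))
    return ranges
```

The inputs would be the contents of the same .ko file from two builds, each read in binary mode.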

.ko.symbols have some unique identifiers or addresses internally.

kernel, loader, zfsloader and pxeboot have a build date recorded; kernel
also has the absolute path to GENERIC. That's OK for the kernel, I think,
although it would be easier for me if this were stored in a separate file,
since binary diffs on large files are expensive.

clang, clang++ and tblgen store some absolute paths to .cpp files in the
src repo internally, plus unique identifiers.

freebsd.cf, freebsd.submit.cf, sendmail.cf and submit.cf record the
absolute OBJDIR path to sendmail.

What do you think?


Thanks,
Erik



