From: Erik Cederstrand
Date: Sun, 10 Oct 2010 22:51:20 +0200
To: FreeBSD Hackers
Subject: Deterministic builds?
Message-Id: <718D8E86-EA2E-4D07-BAFF-5D8D093FD296@cederstrand.dk>

Hi hackers

As a followup to the "Timestamps in static libraries" thread, which
resulted in a '-D' option to ar(1), I'd like to discuss whether it is a
worthy goal of the Project to create deterministic builds. By that I
mean that for two make build+install world+kernel+distribution runs,
every contained file is bitwise identical between the two runs.

Deterministic builds would be useful for me, since I'm creating binary
diffs against lots of FreeBSD builds, and smaller diffs are good.
Also, I'd like to detect which files have changed between two commits.
I imagine it would also be useful for things like IDS and
freebsd-update.

Currently, this does not hold for static libraries (*.a), kernel modules
(*.ko / *.ko.symbols) and the following:

bthidd
cc1
cc1obj
cc1plus
clang
clang++
ctfconvert
freebsd.cf
freebsd.submit.cf
kernel
kernel.symbols
libcrypto.so.6
libufs.so.5
loader
pxeboot
sendmail.cf
submit.cf
tblgen
zfsloader

Most of the libraries can be made identical by using ar -D. Some record
the absolute OBJDIR path to header files, though (libc.a for example).

I tried adding 'D' to ARFLAGS in share/mk/sys.mk, but that's only part
of the solution. ARFLAGS is overridden in hundreds of places in the
source code, and in some places ARFLAGS isn't even used (or AR for that
matter). Is it worthwhile to go through the whole tree, fixing up these
calls to ar? A lot of this is in contrib/ code.

Another option is to add a WITH_DETERMINISTIC_AR knob to the build to
compile ar with D as the default behaviour. This would make the above
changes unnecessary, but is more intrusive.

A third option is that this is not a priority for the community, or is
outright unwanted, and that I just post-process my builds myself.

I don't know what causes the checksum difference in .ko files. There is
no size difference, and no difference according to strings(1). A bsdiff
between the two is typically around 160 bytes.

.ko.symbols files have some unique identifiers or addresses internally.

kernel, loader, zfsloader and pxeboot have a build date recorded; kernel
also has the absolute path to GENERIC. That is OK for the kernel, I
think, although it would be easier for me if this was just stored in a
separate file, since binary diffs on large files are expensive.

clang, clang++ and tblgen store some absolute paths to .cpp files in the
src repo internally, plus unique identifiers.

freebsd.cf, freebsd.submit.cf, sendmail.cf and submit.cf record the
absolute OBJDIR path to sendmail.

What do you think?

Thanks,
Erik
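
For reference, the "bitwise identical" check between two runs could be
automated along these lines. This is only a minimal sketch, assuming
each build was installed into its own directory tree; the function names
(file_digest, tree_digests, compare_trees) are hypothetical, and SHA-256
is an arbitrary choice of hash.

```python
import hashlib
import os
import sys


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files (e.g. a kernel) are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only contents are compared.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(root_a, root_b):
    """Return (changed, only_in_a, only_in_b) as sorted relative paths."""
    a = tree_digests(root_a)
    b = tree_digests(root_b)
    changed = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    only_in_a = sorted(a.keys() - b.keys())
    only_in_b = sorted(b.keys() - a.keys())
    return changed, only_in_a, only_in_b


if __name__ == "__main__" and len(sys.argv) == 3:
    changed, only_in_a, only_in_b = compare_trees(sys.argv[1], sys.argv[2])
    for path in changed:
        print("differs:", path)
    for path in only_in_a:
        print("only in first tree:", path)
    for path in only_in_b:
        print("only in second tree:", path)
```

Saved under a hypothetical name such as compare_trees.py, it would be
run with the two install directories as arguments. An empty report would
mean the two runs produced identical file contents. File metadata
(mtime, mode, ownership) is deliberately not compared here.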