From owner-freebsd-current@FreeBSD.ORG Thu Jul 14 23:21:29 2011
Date: Fri, 15 Jul 2011 01:21:26 +0200
From: Marius Strobl
To: Doug Barton
Cc: KOT MATPOCKuH, FreeBSD Current
Subject: Re: named crashes on assertion in rbtdb.c on sparc64/SMP
Message-ID: <20110714232126.GK95673@alchemy.franken.de>
References: <20110707100446.GJ14797@alchemy.franken.de>
 <20110707154958.GK14797@alchemy.franken.de>
 <20110708181102.GA95673@alchemy.franken.de>
 <20110708193236.GB95673@alchemy.franken.de>

On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
> 2011/7/11 KOT MATPOCKuH:
> >> Oops, sorry, I forgot to revert the previous patch when test-compiling.
> >> Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
> > I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
> > and it worked properly till Sun Jul 10 22:25:41 MSD.
> > At 22:25:41 I restarted bind from base system with your
> > sparc64_isc_atomic.h.diff2.
> > From this moment till today, 15:57:05 he crashed 3 times:
> > Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
> > Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
> > Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
> >
> > To make to ensure proper operation of bind from ports, I ran it again
> > at 15:57:05, and, I think, we need to wait several days.
> And from that time till now bind from ports never died and works properly...
>

Okay. Doug, could you please disable the use of atomic operations for
sparc64 in the in-tree BIND via the following patch, in order to match
what the vendor source does?

http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

I have no idea why they don't work properly, apart from the fact that
there should additionally be memory barriers, at least when they are used
for reference counting, as in the alpha version of the ISC atomic
operations. I can only say that they closely match what we use in the
kernel without problems, and that they work as described in the
respective comments when tested stand-alone. So my best guess is that the
BIND source additionally depends on some x86-specific behavior of the
atomic operations, either there or in general, but from a glance at the
source it's not obvious to me what that could be. Given that the vendor
source doesn't even use atomic operations on Solaris/SPARC, I suspect
this is a non-trivial problem.

It would probably be a good idea to also disable the use of atomic
operations for arm again, just like the vendor source does, as they don't
work there either but nobody seems to care (see PR 154306).

Marius
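
To illustrate the remark about memory barriers in reference counting,
here is a minimal sketch in portable C11. It is not the ISC atomic code
and not the sparc64 patch; the names refobj, refobj_attach and
refobj_detach are hypothetical and exist only for this example. The point
it tries to show is that dropping a reference usually needs release
ordering, and that the thread seeing the count reach zero needs an
acquire barrier before it tears the object down. On a weakly ordered
machine a bare atomic add does not provide either.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object; not taken from the BIND source. */
struct refobj {
    atomic_uint refs;   /* number of outstanding references */
    int payload;        /* stands in for the protected data */
};

/* Create the object with one reference held by the caller. */
void refobj_init(struct refobj *o, int payload)
{
    assert(o != NULL);
    o->payload = payload;
    atomic_init(&o->refs, 1);
}

/*
 * Take an additional reference. The caller already holds one, so no
 * ordering against other memory accesses is needed here.
 */
void refobj_attach(struct refobj *o)
{
    assert(o != NULL);
    atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/*
 * Drop a reference. Returns true if this was the last one, in which
 * case the caller may destroy the object.
 */
bool refobj_detach(struct refobj *o)
{
    assert(o != NULL);
    /* Release: our earlier writes to the object become visible to
     * whichever thread ends up dropping the last reference. */
    if (atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release) == 1) {
        /* Acquire: pairs with the release decrements of the other
         * threads before the object is torn down. */
        atomic_thread_fence(memory_order_acquire);
        return true;
    }
    return false;
}
```

A caller would initialize the object with refobj_init(), call
refobj_attach() for each extra holder, and free the object only when
refobj_detach() returns true. Whether missing barriers of this kind
explain the crashes discussed above is not established in this thread.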