From: Tom Judge
Date: Wed, 02 Jan 2008 22:54:33 +0000
To: Jarrod Sayers
Cc: "Marc G. Fournier", freebsd-stable@freebsd.org, freebsd-questions@freebsd.org
Subject: Re: Nagios + 6.3-RELEASE == Hung Process

Jarrod Sayers wrote:
> On 03/01/2008, at 1:56 AM, Tom Judge wrote:
>> I have also seen this issue, but have always put it down to the way that
>> we manage our nagios deployments with cfengine.  I will try to deploy
>> this change and monitor for the problem to see if it persists.
>
> I hope I can confirm your frustrations.  There is a threading issue with
> Nagios when its binaries are linked against the libpthread(3) threading
> library, the default on recent FreeBSD 5.x releases and all 6.x
> releases.  The issue is random and extremely difficult to track down, with
> the symptoms being a second Nagios process sitting on the system hanging
> a CPU.  Rest assured that I have been working on it, and have seen it
> on one system of mine.
>

Not sure if this is related at all, but out of the 3 nagios deployments
we have here I have only ever seen it on one (it currently has 2 nagios
threads spinning CPU time atm).  The differences on that server are:

* It is amd64 compared to i386
* It also runs ndo2db from ndoutils 1.4b7

All the systems run 6.2-RELEASE-p5 and nagios-2.9_1; they are also all
patched with the gnu libltdl patch below.

Don't know if that info is of any use to you.

> Changes have been submitted for net-mgmt/nagios-devel (aka Nagios
> 3.0.r1) to force the build process to link against libthr(3) where
> available, removing the need to map libpthread() out with
> /etc/libmap.conf.  If this goes well, as stated in the PR, I'll
> back-port it to net-mgmt/nagios (aka Nagios 2.10) in the next few days.
>
> If anyone out there is running net-mgmt/nagios-devel and feels like
> trying it for me, see ports/119246 and drop me an email with a before
> and after "ldd /usr/local/bin/nagios".
>
>> On a side note, if you want to use broker modules with nagios from ports
>> you need to change the following in the port Makefile in order to make
>> them load properly:
>>
>> From:
>> USE_AUTOTOOLS= autoconf:259
>> To:
>> USE_AUTOTOOLS= autoconf:259 libltdl:15
>>
>> I sent an email to the maintainer but got no response, and my email did
>> not seem to have affected the last commit to upgrade to 2.10.
>
> I did receive that email and the changes went in with the last commit of
> net-mgmt/nagios-devel to test.  No issues have arisen so I'll be
> back-porting it to net-mgmt/nagios soon for you.  There also has been a
> rather large ports freeze which delayed the upgrade to Nagios 2.10; that
> PR was submitted on the 1st of November and committed on the 13th of
> December.  Unfortunately your email fell somewhere in the middle,
> apologies for not letting you know.
>

Thanks for this, I currently maintain the patch on our build servers.
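
For anyone sending the before and after "ldd /usr/local/bin/nagios" output, a small shell helper along these lines might make the comparison quicker. This is only a sketch: threading_lib is a hypothetical name, not part of Nagios or the port, and it assumes the ldd listing names the thread library as libpthread.so.N or libthr.so.N. It tests for libthr first on the assumption that a libmap.conf mapping would show up as a libpthread entry resolving to a libthr path.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios or the FreeBSD port).
# Reads an ldd listing on stdin and prints which threading library
# the binary appears to resolve to: "libthr", "libpthread" or "none".
threading_lib() {
    listing=$(cat)
    case "$listing" in
        # Checked first: assumes a libmap.conf mapping would appear as
        # a libpthread entry that resolves to a libthr path.
        *libthr.so*)     echo "libthr" ;;
        *libpthread.so*) echo "libpthread" ;;
        *)               echo "none" ;;
    esac
}
```

Usage would be something like "ldd /usr/local/bin/nagios | threading_lib", run once before and once after rebuilding the port. It only looks at library names, so it says nothing about whether the hang itself is gone.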