From owner-freebsd-current@FreeBSD.ORG Thu Aug 23 19:23:24 2007
Date: Fri, 24 Aug 2007 05:23:22 +1000
From: Peter Jeremy
To: Robert Watson
Cc: freebsd-current@FreeBSD.org
Subject: Re: idprio(1) broken in recent -current
Message-ID: <20070823192322.GQ1161@turion.vk2pj.dyndns.org>
In-Reply-To: <20070823162802.T2662@fledge.watson.org>
References: <20070823101737.GA1161@turion.vk2pj.dyndns.org> <20070823162802.T2662@fledge.watson.org>
List-Id: Discussions about the use of FreeBSD-current

On 2007-Aug-23 16:29:26 +0100, Robert Watson wrote:
>> all the ports and a bit of hair removal, I tracked the problem to the
>> 'idprio' command in /usr/local/etc/rc.d/boinc - without that, all works
>> fine.  If I include that, boinc_client stops sending heartbeat messages.

>When I run "idprio 20 echo hi", it seems to execute per normal as root, and
>not at all as an unprivileged user, which I think is the desired symptom of
>using it.

The offending command in the rc.d is (as root):

idprio 31 su - boinc -c '/usr/local/bin/boinc_client ...&'

boinc_client forks the actual computation process (setiathome etc).
The setiathome process is basically CPU bound but calls usleep()
occasionally (so on an otherwise idle system, top shows it using
around 95% CPU and in nanslp).

boinc_client is supposed to write a watchdog flag in a shared SysV SHM
block every second.  The setiathome process regularly polls the SHM
and if it doesn't see the watchdog for 31 seconds, it will abort.
boinc_client basically sits in a loop and uses select() timeouts.

I wrote a program to monitor the SHM and it shows that SHM is not
being updated.  It looks like the kernel isn't cleanly handling the
situation where there are multiple idprio processes.

I will try some more experimenting this evening.

>If you're running things with idprio, is it definitely the case that your
>system is sometimes idle allowing the program to run once in a while?

It used to work fine even whilst doing a buildworld, now it won't
work on an otherwise idle system...

-- 
Peter Jeremy
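
For illustration only, here is a minimal Python sketch of the watchdog
arrangement described above: one side bumps a flag about once a second,
the other side polls it and gives up after 31 seconds of silence.  This
is not BOINC's code and does not use real SysV SHM.  The names
SharedBlock, client_tick and Worker, and the use of a simple counter as
the "watchdog flag", are assumptions made for the example.  The clock is
injectable so the timeout logic can be exercised without waiting.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Timeout taken from the description above; everything else is hypothetical.
HEARTBEAT_TIMEOUT = 31.0  # seconds


@dataclass
class SharedBlock:
    """Stand-in for the shared memory segment (assumed layout: one counter)."""
    heartbeat_seq: int = 0


def client_tick(shm: SharedBlock) -> None:
    """What the client side would do roughly once per second."""
    shm.heartbeat_seq += 1


class Worker:
    """Polling side: remembers when the counter last changed."""

    def __init__(self, shm: SharedBlock,
                 clock: Callable[[], float] = time.monotonic,
                 timeout: float = HEARTBEAT_TIMEOUT) -> None:
        self.shm = shm
        self.clock = clock
        self.timeout = timeout
        self.last_seq = shm.heartbeat_seq
        self.last_change = clock()

    def poll(self) -> bool:
        """Return True while the client looks alive, False once the
        counter has been unchanged for `timeout` seconds or more."""
        now = self.clock()
        if self.shm.heartbeat_seq != self.last_seq:
            self.last_seq = self.shm.heartbeat_seq
            self.last_change = now
        return (now - self.last_change) < self.timeout
```

In this model, a client that never gets scheduled simply stops calling
client_tick(), and the worker's poll() starts returning False once 31
seconds have passed since the last change.  That matches the symptom
reported above, where the SHM stops being updated.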