Date:      Fri, 24 Aug 2007 05:23:22 +1000
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: idprio(1) broken in recent -current
Message-ID:  <20070823192322.GQ1161@turion.vk2pj.dyndns.org>
In-Reply-To: <20070823162802.T2662@fledge.watson.org>
References:  <20070823101737.GA1161@turion.vk2pj.dyndns.org> <20070823162802.T2662@fledge.watson.org>

On 2007-Aug-23 16:29:26 +0100, Robert Watson <rwatson@FreeBSD.org> wrote:
>> all the ports and a bit of hair removal, I tracked the problem to the
>> 'idprio' command in /usr/local/etc/rc.d/boinc - without that, all works
>> fine.  If I include that, boinc_client stops sending heartbeat messages.

>When I run "idprio 20 echo hi", it seems to execute per normal as root, and
>not at all as an unprivileged user, which I think is the desired symptom of
>using it.

The offending command in the rc.d script is (as root):
idprio 31 su - boinc -c '/usr/local/bin/boinc_client ...&'

boinc_client forks the actual computation process (setiathome etc).
The setiathome process is basically CPU bound but calls usleep()
occasionally (so on an otherwise idle system, top shows it using
around 95% CPU and in nanslp).

boinc_client is supposed to write a watchdog flag in a shared SysV SHM
block every second.  The setiathome process regularly polls the SHM
and if it doesn't see the watchdog for 31 seconds, it will abort.
boinc_client basically sits in a loop and uses select() timeouts.
I wrote a program to monitor the SHM and it shows that SHM is not
being updated.
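
To make the failure mode concrete, here is a rough model of the watchdog
handshake described above. It is only a sketch of the logic, not BOINC's code
and not the monitor program mentioned above: the real shared-memory layout is
not shown in this message, so the counter, the class and function names, and
the `writer_runs` predicate are all hypothetical, and treating "31 seconds" as
"strictly more than 31 seconds" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Timeout taken from the description above; everything else is illustrative.
HEARTBEAT_TIMEOUT = 31.0  # seconds


@dataclass
class WatchdogReader:
    """Worker-side view of the heartbeat (the role setiathome plays)."""

    timeout: float = HEARTBEAT_TIMEOUT
    last_value: int = 0
    last_change: float = 0.0

    def poll(self, value: int, now: float) -> bool:
        """Record one poll of the shared flag.

        Returns True when the flag has not changed for longer than
        `timeout`, i.e. when the worker would give up and abort.
        """
        if value != self.last_value:
            self.last_value = value
            self.last_change = now
        return (now - self.last_change) > self.timeout


def simulate(writer_runs: Callable[[int], bool], seconds: int) -> Optional[int]:
    """Step through `seconds` one-second ticks.

    `writer_runs(t)` is a hypothetical predicate saying whether the writer
    (the role boinc_client plays) got to run at second t and so bumped the
    shared counter. Returns the second at which the reader would abort,
    or None if it never does.
    """
    reader = WatchdogReader()
    counter = 0
    for t in range(seconds + 1):
        if writer_runs(t):
            counter += 1  # writer refreshes the watchdog flag
        if reader.poll(counter, float(t)):
            return t
    return None


if __name__ == "__main__":
    print("writer always scheduled:", simulate(lambda t: True, 120))
    print("writer never scheduled: ", simulate(lambda t: False, 120))
    print("writer stalls after 10s:", simulate(lambda t: t <= 10, 120))
```

In this model a writer that runs every second never trips the timeout, while a
writer that stops being scheduled trips it once the timeout has elapsed since
the last update, which matches the symptom of the SHM block not being updated.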

It looks like the kernel isn't cleanly handling the situation where
there are multiple idprio processes.  I will try some more experimenting
this evening.

>If you're running things with idprio, is it definitely the case that your
>system is sometimes idle allowing the program to run once in a while?

It used to work fine even whilst doing a buildworld; now it won't
work even on an otherwise idle system...

-- 
Peter Jeremy



