Date:      Wed, 24 Feb 2010 18:53:59 +1100
From:      Peter Jeremy <peterjeremy@acm.org>
To:        freebsd-stable@freebsd.org
Cc:        gshapiro@freebsd.org
Subject:   Re: sleep(3) sometimes too sleepy on FreeBSD 8.0?
Message-ID:  <20100224075359.GA61876@server.vk2pj.dyndns.org>
In-Reply-To: <20100223013522.GE2303@rwpc12.mby.riverwillow.net.au>
References:  <20100223013522.GE2303@rwpc12.mby.riverwillow.net.au>


Updates following some off-line discussions and debugging with John on
IRC.  I've cc'd gshapiro@ because the problem appears to be sendmail,
rather than the FreeBSD kernel.

On 2010-Feb-23 12:35:22 +1100, John Marshall <john.marshall@riverwillow.com.au> wrote:
>Environment: sendmail 8.14.4 on FreeBSD 8.0-RELEASE-p2

Note that this is stock ISC sendmail, not the sendmail in either the
base system or the port.

>I posted about this in comp.mail.sendmail and was told...
>
>> sleep() should be one of these calls:
>>
>>         if (njobs == 0 && WorkGrp[wgrp].wg_lowqintvl < MIN_SLEEP_TIME)
>>                 sleep(MIN_SLEEP_TIME);
>>         else if (WorkGrp[wgrp].wg_lowqintvl <= 0)
>>                 sleep(QueueIntvl > 0 ? QueueIntvl : MIN_SLEEP_TIME);
>>         else
>>                 sleep(WorkGrp[wgrp].wg_lowqintvl);

Whilst it's true that the code calls sleep(), it's not calling
sleep(3) in the FreeBSD libc.  Instead it's calling a sleep() defined
in libsm/clock.c - which is a horrible maze of #ifdefs.

John has pre-processed that code and the result is at:
http://www.riverwillow.net.au/~john/sm/clock.preprocessed

At a quick look, the code is broken: sm_seteventm() generates a
one-off timer using setitimer(2), which will send SIGALRM when it
expires.  sm_releasesignal() then unblocks SIGALRM.  In theory, the
SIGALRM could be delivered anywhere after the (!SmSleepDone) test and
before pause() is called - in which case, the signal is lost and
pause() will sleep forever.
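
For comparison, the usual way to close that window is to keep SIGALRM
blocked while the flag is tested and let sigsuspend(2) unblock it and
wait in a single atomic step.  The following is only a minimal sketch
of that pattern, not the libsm code: race_free_sleep(), on_alarm() and
sleep_done are hypothetical names, with sleep_done standing in for
SmSleepDone.

```c
#define _XOPEN_SOURCE 700   /* expose sigaction(2)/setitimer(2) in strict -std= modes */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical flag standing in for SmSleepDone; set only by the handler. */
static volatile sig_atomic_t sleep_done;

static void
on_alarm(int sig)
{
    (void)sig;
    sleep_done = 1;
}

/*
 * Sleep for `secs` seconds using a one-shot interval timer.
 * Returns 0 on success, -1 if a system call failed.
 */
int
race_free_sleep(unsigned int secs)
{
    struct sigaction sa, old_sa;
    struct itimerval it;
    sigset_t block, old_mask, wait_mask;
    int rc = -1;

    if (secs == 0)
        return 0;           /* a zero it_value would disarm the timer */

    /* Block SIGALRM *before* arming the timer so it can only be pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        goto restore_mask;

    sleep_done = 0;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = secs;          /* one-shot: it_interval stays zero */
    if (setitimer(ITIMER_REAL, &it, NULL) == -1)
        goto restore_handler;

    /* Wait with the caller's mask, minus SIGALRM. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGALRM);

    /*
     * SIGALRM is still blocked while the flag is tested, so it cannot
     * slip in between the test and the wait.  sigsuspend() installs
     * wait_mask and sleeps atomically, then restores the blocked mask.
     */
    while (!sleep_done)
        (void)sigsuspend(&wait_mask);
    rc = 0;

restore_handler:
    (void)sigaction(SIGALRM, &old_sa, NULL);
restore_mask:
    (void)sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return rc;
}
```

With this shape a SIGALRM that fires early simply stays pending until
sigsuspend() is entered, so the wakeup cannot be lost.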

On 2010-Feb-24 08:13:06 +1100, John Marshall <john.marshall@riverwillow.com.au> wrote:
>My ktrace file was created with 'ktrace -g 48501'.  I have the result of
>'kdump -R -p 48504' available at:
>
> <http://www.riverwillow.net.au/~john/8_0/rwsrv04_201002240725.kdump.gz>

The syscall pattern near the end of this file is significantly different
from that elsewhere in the file - with gettimeofday(), sigprocmask() and
sigsuspend() looping fairly rapidly.  Interestingly, sigsuspend() is
returning EINTR but no signal is reported.  I'm not sure what could
cause this.

This syscall pattern looks like the while() loop in sendmail's sleep().
The loop does appear to exit on that occasion but not on the following
one, though the reason for the difference is unclear.

Overall, it appears that there is a race condition in sendmail and
something in the 8.0 signal handling appears to make this race easier
to lose.

Going back to the original clock.c source code, the other thing that
is obvious is the HAVE_NANOSLEEP block - if this was active, sleep()
would call nanosleep(2) and the whole signal mess would be avoided.
It's not clear when that code was added but clock.c has not been
touched for many years.  In the sendmail in FreeBSD-8.0, there is no
other reference to HAVE_NANOSLEEP within sendmail.  sendmail 8.14.4
(in 8-STABLE) has HAVE_NANOSLEEP enabled on Solaris 11 only.
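
To illustrate why that path avoids the signal handling altogether, here
is a rough sketch of a sleep() replacement built on nanosleep(2).  It
is not the HAVE_NANOSLEEP block from clock.c; the name nanosleep_sleep()
and the choice to resume after EINTR are my own assumptions.

```c
#define _POSIX_C_SOURCE 200809L  /* expose nanosleep(2) in strict -std= modes */
#include <errno.h>
#include <time.h>

/*
 * Sleep for `secs` seconds without touching SIGALRM or any timer.
 * Returns 0 once the full interval has elapsed, or the whole seconds
 * still outstanding if nanosleep() fails for a reason other than EINTR.
 */
unsigned int
nanosleep_sleep(unsigned int secs)
{
    struct timespec req, rem;

    req.tv_sec = (time_t)secs;
    req.tv_nsec = 0;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return (unsigned int)req.tv_sec;
        /* Interrupted by a signal: resume with the time still remaining. */
        req = rem;
    }
    return 0;
}
```

Because no handler or flag is involved, there is no window between a
test and a wait in which a wakeup could be lost.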

Is there any reason why HAVE_NANOSLEEP is not defined for FreeBSD?
Looking back through the commit logs, nanosleep(2) was implemented in
sys/kern/kern_time.c v1.23 on Thu May 8 14:16:25 1997 UTC - that's
just before RELENG_2_2.

-- 
Peter Jeremy
