Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 21 Nov 2014 19:07:14 +0300
From:      Eygene Ryabinkin <rea@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        davidxu@FreeBSD.org, freebsd-threads@FreeBSD.org
Subject:   Re: [CFR][PATCH] Call proper signal handler from libthr for non-SA_SIGINFO case
Message-ID:  <ngq2VsHPChtgNNL/z%2BlRaAXEeTI@1d%2BaJAniZP50FCDdGj54nd51%2Bks>
In-Reply-To: <20141120194925.GK17068@kib.kiev.ua>
References:  <KBP9w7ztz0PP7uy58G1ODYb/voI@1d%2BaJAniZP50FCDdGj54nd51%2Bks> <20141120194925.GK17068@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help

--LXqH7sWsIplhrbjF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Konstantin, good day.

Thu, Nov 20, 2014 at 09:49:25PM +0200, Konstantin Belousov wrote:
> On Thu, Nov 20, 2014 at 09:32:19PM +0300, Eygene Ryabinkin wrote:
> > Was hacking on Squid that used to dump core when signal 15 was
> > delivered to it via 'squid -k shutdown' and found out that the reason
> > for core dumps is that thr_sighandler() is _sometimes_ passed 0x10001
> > as the value of "info" argument despite that _sigaction() always arms
> > flags for __sys_sigaction() with SA_SIGINFO.
> >=20
> > But looking at handle_signal() I had discovered that if libthr client
> > wants no SA_SIGINFO, then actp->sa_sigaction() will be called, though
> > specs,
> >   http://pubs.opengroup.org/onlinepubs/009695399/functions/sigaction.ht=
ml
> > say that in this case sa_handler should be used.  And this is consistent
> > with the non-threaded case.
>
> The sa_sigaction and sa_handler live in the same bits, look at the
> union in the definition of the struct sigaction in sys/signal.h. It is
> the same pointer, cast to different functions signature depending on
> SA_SIGSINFO.

Sorry, should have been studied sigaction(2) more carefully: everything
turned out to be written there.  So, libthr is fine here.

> > The origin of 0x10001 as info isn't yet clear to me, though I have
> > a feeling that it is SI_USER that is slipping somewhere to the wrong
> > place.  Will dig further.
>
> It is not clear to me what your problem is, at all.

The problem is that libthr's _sigaction() proxies sa_flags, so if
SA_RESETHAND was specified by the caller, it will also be passed to
the kernel when thr_sighandler() is installed.  And since sa_flags are
equipped with SA_SIGINFO inside _sigaction(), thr_sighandler() and
handle_signal() always expect to be called with POSIX SA_SIGINFO
semantics.

This isn't the case for SA_RESETHAND, because sigdflt() from
kern/kern_sig.c will be called before mach-dependent sv_sendsig
vector, so
{{{
SIGDELSET(ps->ps_siginfo, sig)
}}}
will be issued.  Thus, any code inside mach-dependent handler won't
see ps_siginfo, so will always use traditional semantics.  This
explains why I see 0x10001 as "siginfo_t *info": it is just "int code"
for the traditional case and the signal in question is really produced
by userland, so it is SI_USER.

So, it looks like a kernel bug (since if we request SA_SIGINFO, we
should get the proper handler to be called even for the SA_RESETHAND
case).  I see two possibilities:

 - invoke SA_RESETHAND processing inside mach-dependent code; that's
   a kind of ugly and makes mach-specific code to deal with the
   generic signal handling logics;

 - pass information about SA_SIGINFO "out-of-band" (not in
   ps->ps_siginfo).

We can't postpone sigdflt() to be called after signal being delivered,
since spec requires it to be done before calling user-space handler.

Had created a patch that adds 4th argument to sv_sendsig and fixes
this problem:
  http://codelabs.ru/fbsd/patches/libthr/11-CURRENT-fix-SIGINFO-processing-=
with-RESETHAND.diff
This changes internal KABI, but hopefully sv_sendsig is an internal
kernel interface that isn't used by anything else outside kern_sig.c.

It works fine with virgin libthr and solves Squid restart problem.
Will try to install it to more test machines and see if it will work
as expected.  Seems like a kernel regression test like the attached
one will also be handy.
--=20
Eygene Ryabinkin                                        ,,,^..^,,,
[ Life's unfair - but root password helps!           | codelabs.ru ]
[ 82FE 06BC D497 C0DE 49EC  4FF0 16AF 9EAE 8152 ECFB | freebsd.org ]

--LXqH7sWsIplhrbjF
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iL4EABEKAGYFAlRvYzJfFIAAAAAALgAoaXNzdWVyLWZwckBub3RhdGlvbnMub3Bl
bnBncC5maWZ0aGhvcnNlbWFuLm5ldDgyRkUwNkJDRDQ5N0MwREU0OUVDNEZGMDE2
QUY5RUFFODE1MkVDRkIACgkQFq+eroFS7PuBvAD/XrhsUYpvMRgUJXW6W22OGN9R
EKZYqQhKw8IyFYl1NLkA/0FOEjikQJWMA1SH6BB3h+3osS7e0AQkS2cwOAulpUM7
=RnhA
-----END PGP SIGNATURE-----

--LXqH7sWsIplhrbjF--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ngq2VsHPChtgNNL/z%2BlRaAXEeTI>