Date:      Tue, 08 Oct 2002 10:14:29 -0400 (EDT)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Don Lewis <dl-freebsd@catspoiler.org>
Cc:        arch@FreeBSD.org, jmallett@FreeBSD.org
Subject:   Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc.,
Message-ID:  <XFMail.20021008101429.jhb@FreeBSD.org>
In-Reply-To: <200210080322.g983MIvU034090@gw.catspoiler.org>

On 08-Oct-2002 Don Lewis wrote:
> On  7 Oct, John Baldwin wrote:
>> 
>> On 07-Oct-2002 Don Lewis wrote:
> 
>>> Probably, but the list is also modified in the exit code.  All those
>>> processes that we are sending SIGKILL to are removing themselves from
>>> the list.
>> 
>> Processes dying from SIGKILL that we send them aren't a problem since
>> we have already read their p_peers member before we kill them.  That's
>> the point of 'nq'.  The problem is that 'nq' could exit and could be
>> an invalid pointer.  If a process later in the list after 'nq' died
>> that is not a problem either.  Well, how about this:
> 
> I missed your use of nq, even though this is a fairly common way of
> handling similar problems if there is only a single thread.
> 
>> http://www.FreeBSD.org/~jhb/patches/ppeers.patch
> 
> That's pretty much what I had envisioned.  I have a little bit of a
> concern that funnelling through a single mutex could be a bottleneck in some
> cases, but it is simple, safe, and otherwise low overhead.

Well, the mutex is only used in the RFTHREAD case most of the time.  In
the one place where it is acquired unconditionally, it is released almost
immediately in the !RFTHREAD case.

> It looks like we've got a potential lock order reversal problem, though.
> In fork1() we grab ppeers_lock while holding a couple of PROC_LOCKs,
> while in the first part of exit1() we grab ppeers_lock before PROC_LOCK.
> My caffeine level is insufficient to judge whether P_WEXIT checking
> would save us in practice.

Bah, fixed the reversal, thanks.  We still need the P_WEXIT check in
fork1() since otherwise a new peer or child could be added after we
have finished going through the entire list.  Hmm, adding this is ugly
though b/c we really need to check after we acquire the ppeers_lock and
do the actual hookup.  Hmm, we can move the RFTHREAD stuff a lot earlier
and then this isn't such a big deal.  Ok, I've updated the patch again.
One note: I've got a question about how to handle the error condition
in that case in fork1().  I'm really starting to think that instead of
returning an error, the peer process should just go ahead and call
exit1() in this case since it is about to be killed anyways.
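
A minimal userland sketch of the check-under-lock idea, for illustration
only.  It is not the code in ppeers.patch: struct peer, list_lock,
peer_hookup() and leader_exit() are hypothetical names, a pthread mutex
stands in for ppeers_lock, the 'exiting' field stands in for P_WEXIT, and
the EBUSY return is an arbitrary choice.  It also collapses all locking
into one mutex for brevity, which the kernel does not do.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the peer linkage kept in struct proc. */
struct peer {
	struct peer *next;	/* next peer in the leader's list */
	struct peer *leader;	/* leader this peer is hooked to */
	int exiting;		/* stand-in for P_WEXIT */
};

/* Stand-in for ppeers_lock; protects every field of struct peer here. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Fork side: test the exiting flag and do the hookup under the same
 * lock acquisition, so no peer can be added once the exit-side walk
 * has started.
 */
static int
peer_hookup(struct peer *leader, struct peer *np)
{
	int error = 0;

	assert(np->leader == NULL && np->next == NULL);
	pthread_mutex_lock(&list_lock);
	if (leader->exiting) {
		error = EBUSY;	/* arbitrary errno for this sketch */
	} else {
		np->leader = leader;
		np->next = leader->next;
		leader->next = np;
	}
	pthread_mutex_unlock(&list_lock);
	return (error);
}

/*
 * Exit side: set the flag, then walk the list.  'nq' is read before
 * 'q' is unlinked, and the lock is held for the whole walk, so 'nq'
 * cannot go stale.  Returns the number of peers visited.
 */
static int
leader_exit(struct peer *leader)
{
	struct peer *q, *nq;
	int n = 0;

	pthread_mutex_lock(&list_lock);
	leader->exiting = 1;
	for (q = leader->next; q != NULL; q = nq) {
		nq = q->next;
		q->next = NULL;
		q->leader = NULL;	/* the kernel would signal q here */
		n++;
	}
	leader->next = NULL;
	pthread_mutex_unlock(&list_lock);
	return (n);
}
```

Whether the failing caller should get an error back or just exit itself
is the open question above; the sketch returns an error only because that
is the simpler thing to show.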

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
