Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 04 Oct 2000 15:04:33 -0700
From:      Peter Wemm <peter@netplex.com.au>
To:        Matt Dillon <dillon@earth.backplane.com>
Cc:        arch@FreeBSD.ORG
Subject:   Re: Mutexes and semaphores 
Message-ID:  <200010042204.e94M4XH22892@netplex.com.au>
In-Reply-To: <200010041639.e94GdVR24384@earth.backplane.com> 

next in thread | previous in thread | raw e-mail | index | archive | help
Matt Dillon wrote:
>     This is one of the few things I really hate in the linux source base.
> 
>     Besides, it isn't guarenteed to work.  What happens if the process
>     double-reparent's and the space pointed to by the original pptr is
>     reallocated and reused in the second reparenting?  That is, this
>     cpu stalls after getting 'parent', the process is reparented
>     (p->p_pptr changes), the process reparents AGAIN to a new just-allocated
>     process which happens to use the original p->p_pptr's process's zalloc
>     slot.  In this case the contents of parent->p_pid can be garbage and
>     the algorithm will not detect it.

If it's double-reparented?  I don't see that as a big issue. Consider:

- Suppose you are a child of pid 10.
- While you are doing a getppid, pid 10 exits and you loose the race.
- You were reparented to pid 1 (init) as part of 10's exit/cleanup.
- If pid 1 exits, where are you going to be reparented to?  init dying is
  a kernel panic offence.

Using ptrace() reparenting is different, it doesn't involve processes going
away.  You are simply switched from one parent to another, and both keep
going.  There is no race here, even if your original parent immediately dies.
If the ptrace() parent dies, we are screwed anyway because we are abandoned
in SSTOP state and never get woken up.  I dont think ptrace() is even nearly
smp safe yet anyway, especially if the debugger and child are on different
cpus and both running at PT_ATTACH time.

.. Unless I am missing something, I dont see any possibility of a race.  And
even if there is one, the double-reparenting can be detected by an additional
pid test or generation numbers, or whatever.

Anyway, before:
getppid:
        pushl %ebp
        movl %esp,%ebp
        movl 8(%ebp),%eax
        movl 68(%eax),%edx
        movl 48(%edx),%edx
        movl %edx,248(%eax)
        xorl %eax,%eax
        leave
        ret

After:
getppid:
        pushl %ebp
        movl %esp,%ebp
        movl 8(%ebp),%edx
.L187:
        movl 68(%edx),%eax
        movl 48(%eax),%ecx
        cmpl 68(%edx),%eax		/* new */
        jne .L187   			/* new */
        movl %ecx,248(%edx)
        xorl %eax,%eax
        leave
        ret

gcc does a pretty good job optimizing this.  Also, we dont have a
preemptable kernel (and if we did, then this is the least of our problems).
In the SMP case, the cpu is going to have to be running an interrupt
for a really long time in order for another cpu to complete the double
exit processing.  A simple interrupt disable would remove this window for
the local cpu as well.  (mutexes disable interrupts internally, so this
is no loss compared to using a mutex).

>     Now, granted, the chance of the above occuring is probably close to nil.
>     From an algorithmic point of view, though, the below code (and the
>     linux code) cannot actually guarentee correct operation without also
>     make assumptions on side effects (for example, in regards to who is
>     or is not actually allowed to become the new parent and whether the
>     situation can occur with those restrictions in place).
> 
>     I would prefer not to see this sort of code in FreeBSD, even if it means
>     slowing things down a little.

I understand.  However, this is a cheap speedup for both getpid() and
getppid() that we can get *now*.  We dont have an infrastructure for
locking the proc hierarchy yet.  As soon as we do, I'd be quite happy to
rip this out and replace it with the correct mutexes or whatever.  getpid()
and/or getppid() are quite heavily hammered in certain benchmarks.


> 					-Matt
> 
> :In the freebsd case, this is the case.  Zones are never cleaned up, and
> :certainly not unmapped.  zfree() will however cause the first few bytes
> :to be clobbered as they are reused for the freelist.
> :
> :int
> :getppid(p, uap)
> :        struct proc *p;
> :        struct getppid_args *uap;
> :{
> : 
> :        p->p_retval[0] = p->p_pptr->p_pid;
> :        return (0);
> :}
> :
> :could safely become:
> :
> :int
> :getppid(p, uap)
> :	struct proc *p;
> :	struct getppid_args *uap;
> :{
> :	struct proc *parent;
> :	pid_t pid;
> :
> :	parent = p->p_pptr;
> :	pid = parent->p_pid;
> :#ifdef SMP
> :	for (;;) {
> :		__asm __volatile (": : : memory");	/* mb(); on x86 */
> :		if (parent == p->p_pptr)
> :			break;
> :		/* lost a race, our parent died and we reparented - redo */
> :		parent = p->p_pptr;
> :		pid = parent->p_pid;
> :	}		
> :#endif
> :	p->p_retval[0] = (register_t)pid;
> :	return 0;
> :}
> :
> :This isn't quite the same as the linux version, but I'm pretty sure the
> :important races are covered just like theirs.  The mb(); replacement could
> :be a real per-cpu mb instruction on arches that require it.  Even if it is
> :just the gcc flush, it would be sufficient for 99.99999% of cases.
> :
> :Cheers,
> :-Peter
> 
> 

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au
"All of this is for nothing if we don't go to the stars" - JMS/B5



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200010042204.e94M4XH22892>