Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 17 Aug 2005 09:33:32 -0700
From:      Luigi Rizzo <rizzo@icir.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        gnn@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject:   Re: Special schedulers, one CPU only kernel, one only userland
Message-ID:  <20050817093332.A85431@xorpc.icir.org>
In-Reply-To: <200508170836.05861.jhb@FreeBSD.org>; from jhb@FreeBSD.org on Wed, Aug 17, 2005 at 08:36:04AM -0400
References:  <42F9ECF2.8080809@freebsd.org> <200508161716.44597.jhb@FreeBSD.org> <m2pssdjoob.wl%gnn@neville-neil.com> <200508170836.05861.jhb@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Aug 17, 2005 at 08:36:04AM -0400, John Baldwin wrote:
> On Tuesday 16 August 2005 10:13 pm, gnn@freebsd.org wrote:
> > At Tue, 16 Aug 2005 16:22:41 -0700,
> >
> > luigi wrote:
> > > ok just for the records, the issue i had in mind is the release/acquire
> > > the mutex around some code that might cause a deadlock, not for the
> > > mutex ops per se, but for the need to make sure that the data
> > > structure is consistent before releasing the lock, and rechecking
...
> > Correct me if I'm wrong but I suspect you're thinking of cases such
> > as:
> >
> > 		RT_UNLOCK(rt);		/* XXX workaround LOR */
> > 		gwrt = rtalloc1(gate, 1, 0);
> > 		RT_LOCK(rt);
> >
> > in the routing/networking code.  A quick, and by no means exhaustive,
> > check of CURRENT with cscope and Emacs looking for these turned up 3.

actually most network device drivers have a similar thing in
the receive interrupt processing. The  code below is from if_fxp.c
but many other drivers have that as well:

                        /*
                         * Drop locks before calling if_input() since it
                         * may re-enter fxp_start() in the netisr case.  
                         * This would result in a lock reversal.  Better  
                         * performance might be obtained by chaining all
                         * packets received, dropping the lock, and then 
                         * calling if_input() on each one.
                         */
                        FXP_UNLOCK(sc);
                        (*ifp->if_input)(ifp, m);
                        FXP_LOCK(sc);
			...

here (and everywhere else)  there is no check that the device is still
in a consistent state when reacquiring the lock.

How safe is this in the face of, say, device removal, i have no idea,
because i cannot find a place that comments what the sc->sc_mtx lock is
supposed to protect.

But in fact, reading fxp_detach() is not very encouraging....
and the leftover spl*() calls are not making it any easier to understand
the code.

Actually, if someone can point me to something that documents how
network locking (device driver and above) is designed, i'd be
very grateful.

	cheers
	luigi

> > It may still be something to look at, but perhaps mroe from a point of
> > view of cleaning up our APIs to not do this kind of jiggery pokery,
> > rather than inventing a new locking paradigm, which is fraught with
> > peril.
> >
> > This isn't exhaustive though, if others can show this kind of thing
> > being done a lot, or becoming a idiom in our code, then there is more
> > cause for concern.
> >
> > Just my 2 yen :-)
> 
> Agreed.  I'd rather that we fix our APIs and try to organize the workflow to 
> minimize the number of times that a lock has to be dropped and then 
> reacquired.
> 
> -- 
> John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20050817093332.A85431>