Date:      Sun, 26 Dec 2004 23:56:51 +0100
From:      Peter Holm <peter@holm.cc>
To:        Bosko Milekic <bmilekic@technokratis.com>
Cc:        current@freebsd.org
Subject:   Re: panic: uma_zone_slab is looping
Message-ID:  <20041226225651.GA87178@peter.osted.lan>
In-Reply-To: <20041226181738.GA21533@technokratis.com>
References:  <20041209144233.GA46928@peter.osted.lan> <20041220234103.GA59225@technokratis.com> <20041222210553.GA28108@peter.osted.lan> <20041222221540.GA70052@technokratis.com> <20041226161153.GA74592@peter.osted.lan> <20041226181738.GA21533@technokratis.com>

On Sun, Dec 26, 2004 at 01:17:38PM -0500, Bosko Milekic wrote:
> 
> On Sun, Dec 26, 2004 at 05:11:53PM +0100, Peter Holm wrote:
> > 
> > Yes, I think that I have verified your excellent analysis of the
> > problem: http://www.holm.cc/stress/log/freeze04.html
> > 
> > So, do you have any fix suggestions? :-)
> 
>   Not yet, because the problem is non-obvious from the trace.
> 
>   I need to know exactly when the UMA RCntSlabs zone recurses _first_,
>   and I need to confirm that it is an actual recursion.  I've looked at
>   the VM code and I don't see how/why recursion on the RCntSlabs zone
>   would happen.
> 
>   Please modify the printf code to look exactly like this:
> 
>    if ((keg->uk_flags & UMA_ZFLAG_INTERNAL) && keg->uk_recurse != 0) {
>        if ((zone == slabzone) || (zone == slabrefzone))
>            panic("Zone %s forced to fail due to recurse non-null: %d\n",
>                zone->uz_name, keg->uk_recurse);
>        return (NULL);
>    }
> 
>   (You don't need to check any global counter -- the counter is imperfect
>   anyway -- because even a single recursion on slabzone or slabrefzone
>   should be illegal).
> 
>   I'd like to see the trace from the above panic, if possible.

Here it is: http://www.holm.cc/stress/log/freeze05.html

> 
>   Also, from your current crash dump, see if you can print the value of
>   keg->uk_recurse (from frame 11, pid 74804).
> 
>   It appears that the other KASSERT being triggered from
>   propagate_priority() is due to some weird side-effect of process
>   74804 looping with the UMA RCntSlabs zone lock held (without it
>   ever being dropped).  We'll have to see.
> 
>   The point is: the trace is useless unless it shows where/when the
>   recursion on slabrefzone _begins_ to happen (not that it has already
>   happened, that part is obvious now). 
> 
> Happy holidays,
> -- 
> Bosko Milekic
> bmilekic@technokratis.com
> bmilekic@FreeBSD.org

-- 
Peter Holm
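
For anyone following the thread, the guard Bosko describes above can be
modeled in userland roughly as follows. This is only a minimal sketch
under stated assumptions, not the uma_core.c code: struct toy_keg,
toy_slab_alloc() and the first_recurse field are hypothetical names
invented for illustration. Where the kernel check would panic(), the
sketch just records the depth at which re-entry was first seen, which is
the piece of information the trace is supposed to reveal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a UMA keg; only the fields the guard needs. */
struct toy_keg {
	const char *name;	/* zone name, e.g. "RCntSlabs" */
	int recurse;		/* models keg->uk_recurse */
	int internal;		/* models the UMA_ZFLAG_INTERNAL bit */
	int first_recurse;	/* depth at first detected re-entry, 0 if none */
};

/*
 * Toy slab allocation.  'nest' tells the simulated backend how many
 * times it should try to re-enter the allocator on the same keg.
 */
static void *
toy_slab_alloc(struct toy_keg *keg, int nest)
{
	static int dummy_slab;	/* placeholder object handed back as a "slab" */
	void *slab;

	/* Guard: an internal keg that is already allocating must not recurse. */
	if (keg->internal && keg->recurse != 0) {
		if (keg->first_recurse == 0) {
			/* The real check would panic() here to capture a trace. */
			keg->first_recurse = keg->recurse;
			fprintf(stderr, "zone %s: first recursion at depth %d\n",
			    keg->name, keg->recurse);
		}
		return (NULL);
	}

	keg->recurse++;
	/* Simulated backend allocation that may call back into us. */
	if (nest > 0)
		(void)toy_slab_alloc(keg, nest - 1);
	slab = &dummy_slab;
	keg->recurse--;

	return (slab);
}
```

Calling toy_slab_alloc(&keg, 0) on an internal keg succeeds and leaves
first_recurse at 0. Calling it with nest set to 1 makes the inner call
hit the guard and fail, sets first_recurse to 1, and the outer call
still returns a slab with the counter back at 0.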


