Date:      Sat, 2 Jul 2005 00:13:08 +0400
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        Gary Mulder <gmulder@infotechfl.com>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: panic in RELENG_5 UMA - two new stack traces
Message-ID:  <20050701201308.GD59610@cell.sick.ru>
In-Reply-To: <42C58373.60008@infotechfl.com>
References:  <20050621090701.GB34406@cell.sick.ru> <20050621105154.GA36538@cell.sick.ru> <42B961B9.7A5856B3@freebsd.org> <20050623104230.GB61389@cell.sick.ru> <20050623141514.GD738@obiwan.tataz.chchile.org> <42BC5EE2.2020003@infotechfl.com> <20050627082958.GB97832@cell.sick.ru> <42C16BBF.4060107@infotechfl.com> <20050701085808.GD52023@cell.sick.ru> <42C58373.60008@infotechfl.com>

On Fri, Jul 01, 2005 at 01:54:59PM -0400, Gary Mulder wrote:
G> >On Tue, Jun 28, 2005 at 11:24:47AM -0400, Gary Mulder wrote:
G> >G> I spent the day yesterday trying to reproduce the crash that I posted 
G> >G> last week and you kindly replied to. This is due to the fact that I 
G> >G> stupidly managed to overwrite the kernel.debug that I used to generate 
G> >G> the stack trace. Sadly I could not cause the system to crash again with 
G> >G> the same sb* errors.
G> >G> 
G> >G> I did however remove both the Berkeley Packet Filter and IPFilter from my
G> >G> custom kernel to try and isolate the problem. This has caused the crash
G> >G> to occur in a different and more reproducible form. I have both 
G> >G> INVARIANTS and WITNESS enabled, as you can see from my kernel conf. 
G> >G> which is included at the end of this e-mail.
G> >G> 
G> >G> Below are the latest stack traces (using bge and then fxp NICs), kernel 
G> >G> conf. and dmesg. Any help would be appreciated. This time I have a copy 
G> >G> of both the core files and corresponding kernel.debug so I can hopefully
G> >G> provide you with any info you need.
G> >
G> >How often does it crash? Does debug.mpsafenet=0 increase stability?
G> 
G> I can reproduce the crash within 60 seconds of firing off 30+ ping/arp 
G> -d scripts, all running in parallel.
G> 
G> debug.mpsafenet=0 seems to have solved the problem. I'm running 100+ 
G> instances of the above script and the system has been stable for over an 
G> hour.

Thanks! This clearly shows that the bug is a race, not broken logic. I am
almost sure that you are experiencing the same bug I described at
the beginning of the thread.

Although there is no fix available yet for the race between 'arp -d' and
an outgoing packet, there is one for the race between an incoming ARP reply
and an outgoing packet. We will probably commit it soon, after more review.
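
To illustrate the kind of race meant here, below is a rough user-space
sketch in Python. It is not the kernel code: the names run, cache and giant,
the address and the MAC are all made up for the example. One thread looks an
entry up and then uses it, the way an outgoing packet would, while another
thread deletes the entry, the way 'arp -d' would. Taking a single global lock
around both paths, which is roughly the effect of running the network stack
under Giant with debug.mpsafenet=0, closes the window.

```python
import contextlib
import threading
import time

IP = "192.0.2.1"  # documentation-range address, purely illustrative


def run(rounds, serialize):
    """Count how often the sender finds its entry gone between check and use."""
    cache = {}                # hypothetical stand-in for the ARP table
    giant = threading.Lock()  # hypothetical stand-in for one global lock
    # With serialize=True both threads take the same lock; otherwise no lock.
    guard = (lambda: giant) if serialize else contextlib.nullcontext
    stale = 0

    def sender():
        nonlocal stale
        for _ in range(rounds):
            with guard():
                cache[IP] = "00:00:5e:00:53:01"  # "resolve" the address
                if IP in cache:                  # check that the entry exists
                    time.sleep(0)                # yield, widening the window
                    if cache.get(IP) is None:    # use it: it may be gone now
                        stale += 1

    def deleter():
        for _ in range(rounds):
            with guard():
                cache.pop(IP, None)              # drop the entry, like 'arp -d'

    threads = [threading.Thread(target=sender),
               threading.Thread(target=deleter)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stale
```

Calling run(20000, serialize=False) may report a nonzero count, depending on
how the threads happen to be scheduled, while run(20000, serialize=True)
always returns 0 because the deleter cannot run between the check and the use.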

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


