Date:      Fri, 1 Feb 2008 14:55:25 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        gnn@freebsd.org
Cc:        freebsd-amd64@freebsd.org
Subject:   Re: Recent problems with 6-STABLE...
Message-ID:  <200802011455.25551.jhb@freebsd.org>
In-Reply-To: <m27ihp5n7r.wl%gnn@neville-neil.com>
References:  <m2fxwgx167.wl%gnn@neville-neil.com> <200801310617.16333.jhb@freebsd.org> <m27ihp5n7r.wl%gnn@neville-neil.com>

On Thursday 31 January 2008 11:12:40 pm gnn@freebsd.org wrote:
> At Thu, 31 Jan 2008 06:17:16 -0500,
> John Baldwin wrote:
> > 
> > On Thursday 31 January 2008 04:37:13 am gnn@freebsd.org wrote:
> > > At Tue, 29 Jan 2008 11:57:39 -0500,
> > > John Baldwin wrote:
> > > > 
> > > > On Tuesday 29 January 2008 07:32:16 am gnn@freebsd.org wrote:
> > > > > Hi,
> > > > > 
> > > > > I have two boxes running 6-STABLE, post 6.3 release, which have both
> > > > > spontaneously rebooted, one under load and one not under load.  I have
> > > > > attached dmesg and some traceback information, from the one trace that
> > > > > looked interesting.  Any thoughts or hints would be appreciated.
> > > > > 
> > > > > To save you scanning all the dmesg first, these are dual processor XEON
> > > > > boxes, each processor has 4 cores.
> > > > 
> > > > Can you do 'x/i 0xffffffff80296642' to show which instruction faulted?
> > > 
> > > (kgdb) x/i 0xffffffff80296642
> > > 0xffffffff80296642 <pfs_exit+114>:      cmp    %ecx,0x8(%rdx)
> > 
> > Hmm, and rdx from your last post was:
> > 
> > > printf "%x\n" 32491047111385957
> > 736e6f69746365
> > 
> > > echo "0x73 0x6e 0x6f 0x69 0x74 0x63 0x65" | dh
> > snoitce
> > 
> > so it appears you have a data corruption issue.  You could check the
> > hardware (RAM, etc.) but if that is ok you might want to see if you
> > can isolate it to a specific driver if a driver has a bug (or
> > hardware has an erratum we don't work around yet).  Do you have any
> > custom drivers for hardware that does DMA?  If not, which storage
> > driver (including pciconf output if ATA) and NIC(s) does this box
> > have?  Also, how much RAM?
> 
> Custom drivers?  Not that I know of.  This box uses Intel Pro/1000
> network drivers and Adaptec AIC7902 SCSI for talking to the disks.
> 
> The box has 8G of RAM in 2G chunks (which has now been subjected to 40
> memtests and passed).

Try hw.physmem=4g at the loader to see if it fixes it.  If so, it's a bug with 
bounce buffering.
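
For anyone following along, the decoding quoted above can be reproduced with a
short Python sketch.  The helper name decode_register is made up for
illustration; the only real input is the rdx value from the earlier post.
amd64 is little-endian, so the bytes sit in memory in the reverse of the order
the hex digits are printed, which is why "snoitce" reads backwards.

```python
def decode_register(value: int, width: int = 8) -> tuple[str, str]:
    """Return (hex digits, ASCII text in memory order) for a register value.

    decode_register is a hypothetical helper, not part of kgdb or FreeBSD.
    """
    # amd64 stores integers little-endian: the lowest byte comes first in memory.
    raw = value.to_bytes(width, byteorder="little")
    # Drop the zero padding in the unused high byte(s) before decoding.
    text = raw.rstrip(b"\x00").decode("ascii", errors="replace")
    return format(value, "x"), text


if __name__ == "__main__":
    hex_digits, text = decode_register(32491047111385957)
    print(hex_digits)  # 736e6f69746365
    print(text)        # ections
```

A pointer that decodes to printable text like this suggests that string data
was written over it, which is consistent with the corruption theory, though it
does not by itself say what did the writing.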

-- 
John Baldwin
