Date:      Wed, 18 Jul 2001 07:00:02 -0700 (PDT)
From:      Bill Moran <wmoran@iowna.com>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: i386/29045: Heavy disk usage causes panic in ffs_blkfree
Message-ID:  <200107181400.f6IE02067016@freefall.freebsd.org>

The following reply was made to PR i386/29045; it has been noted by GNATS.

From: Bill Moran <wmoran@iowna.com>
To: Ian Dowse <iedowse@maths.tcd.ie>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: i386/29045: Heavy disk usage causes panic in ffs_blkfree
Date: Wed, 18 Jul 2001 09:50:13 -0400

 Ian Dowse wrote:
 > 
 > In message <3B54C17D.F31029C2@iowna.com>, Bill Moran writes:
 > >Errr ... I tried, but frame 2 considers those symbols undefined.
 > >Did I misunderstand?
 > 
 > Whoops, no I did. It was frame 3 that should contain these symbols,
 > but now that you have an easier way to get corruption, that vmcore
 > is of less interest.
 
 Oddly enough, the md5ing I did last night did not cause a panic. In fact,
 the rsync process ran at 3:00 AM successfully.
 
 I was just talking to the guy we got the hardware from, and my gut instinct
 is to suspect the ata100 side: either the controller or the driver. We've
 got another system here that needs to go into production in a few weeks,
 with the same mobo but a different HDD. We're going to set it up and run
 some tests to see if we can get it to crash.
 One thing that I thought about (feel free to support or refute this based
 on your own experience) is that we're hitting this hardware hard enough
 that we may be exposing driver bugs that others haven't seen. I guess 
 we'll find out (hopefully).
 
 > >I ran two tests during the "make buildworld" (one right after the other)
 > >I ran a diff on the two resultant files and Lo and Behold! there are a
 > >slew of differences in the hashes.
 > 
 > Ok, that's progress anyway, even if it's not progress in the most
 > desirable direction... I think you can pretty much rule out any
 > filesystem bugs here; either the hardware is bad (disk, ATA
 > controllers, RAM etc) or possibly there is something the ATA driver
 > isn't doing right, such as missing a workaround for a known hardware
 > bug.
 > 
 > One thing that would be very useful is if you can collect a number
 > of samples of "good" and "corrupted" versions of the same file.
 > That may be tricky to do, because right now we don't know anything
 > about where the data is being corrupted. Maybe try to make 2 copies
 > of lots of files to another system, and then md5 each and look for
 > differences.
 
 I'll set my alarm for 4:00 AM tomorrow, get up and run an md5 on both
 the problem server (which is a backup server) and the fileserver that
 the files were synced with. At 4:00 AM (right after the sync) there
 shouldn't be any differences yet.
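
 The md5 comparison described above could be scripted along these lines.
 This is only a sketch, not what was actually run on these machines: the
 function names (md5_of_file, md5_manifest, diff_manifests) and the
 directory paths in the usage note are hypothetical, and it assumes both
 copies of the tree are reachable as local directories.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def diff_manifests(first, second):
    """Compare two manifests.

    Returns three sorted lists: paths whose hashes differ, paths present
    only in the first manifest, and paths present only in the second.
    """
    shared = first.keys() & second.keys()
    mismatched = sorted(path for path in shared if first[path] != second[path])
    only_first = sorted(first.keys() - second.keys())
    only_second = sorted(second.keys() - first.keys())
    return mismatched, only_first, only_second
```

 For example, diff_manifests(md5_manifest("/backup/data"),
 md5_manifest("/mnt/fileserver/data")) would list the files whose hashes
 disagree between the two copies (both paths are made up). Running
 md5_manifest twice on the same unchanged tree and diffing the results
 is the same check as the two back-to-back passes during the buildworld:
 any mismatch there points at the read path rather than at the sync.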
 
 > It would also be well worth trying swapping hardware components
 > to see if you can isolate the cause.
 
 If, come the weekend, we haven't isolated this yet, I'm going to move
 the drive from the Promise controller to the VIA controller and see if
 the problem disappears. My gut instinct is pointing toward the Promise
 controller, so I'll try that first. That doesn't rule out everything,
 however, since FreeBSD seems to only recognize the VIA controller in
 ata66 mode, so it could still be some overall problem with ata100.
 
 -Bill
