Date:      Sat, 22 Dec 2001 09:14:33 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        "Kristian K. Nielsen" <jkkn@jkkn.dk>
Cc:        <bradym@mail.hydrologue.com>, <peter.jeremy@alcatel.com.au>, <davidc@acns.ab.ca>, <rnyberg@it.su.se>, <david@catwhisker.org>, <freebsd-stable@FreeBSD.ORG>
Subject:   Re: RE: 4.4-STABLE crashes - suspects new ata-driver over wd-drivers
Message-ID:  <200112221714.fBMHEX195032@apollo.backplane.com>
References:   <F44A657B39A9D41194450008C70DD94F020AA769@oberon.tv2.dk>

:I did in the bug report at:
:http://www.freebsd.org/cgi/query-pr.cgi?pr=31233
:
:Is that enough information?
:
:/Kristian

    Yah.   I've been analyzing the thread and focusing on Brady's
    kernel cores to try to track this down, but it's tough going.
    This is what I see so far:

	* CPU does not appear to matter (P3, P4, and AMD have been indicated)
	* All machines are doing heavy IDE I/O
	* All machines are running the new ATA drivers
	* DMA vs PIO does not seem to matter
	* All or most machines appear to be running softupdates  ??
	* None of the disk activity is over SCSI
	* VM system is not necessarily being stressed, just the disks.

    I would like you to turn off softupdates if you have it on and
    see if that makes a difference.

    I do not know whether the problem is the ATA driver or whether it
    is simply a side effect - for example, heavy disk I/O creating
    backlogs and delays that are causing another bug to rear its ugly
    head. 

    So far I haven't been able to track down the cause - the problem
    appears to be random corruption.  Each of Brady's three kernel cores
    is corrupted in a different place - most typically in different
    portions of the vm_page_array[], which is the absolute *last* thing I
    would ever expect to get corrupted considering it's static, fixed
    RAM.

    The only bug I know about that could have an effect is the 
    BUF_TIMELOCK bug which I fixed two days ago in -current (not yet
    in stable), but a number of people have already reported
    -current working prior to the fix.  Of course, the timing is
    very different in -current so that might not matter.  Longer
    IDE delays could cause this bug to rear its ugly head but it's
    a long shot.  I am pursuing this with Brady since he seems to be
    able to reliably reproduce the crash.

					-Matt





