Date: Wed, 19 Feb 2003 10:20:12 +1100 (EST)
From: Bruce Evans
To: Ruslan Ermilov
Cc: Alfred Perlstein, Thomas Moestl, Soren Schmidt
Subject: Re: cvs commit: src/sys/kern kern_intr.c src/sys/dev/ata ata-all.c
In-Reply-To: <20030218102408.GA48010@sunbay.com>
Message-ID: <20030219095525.R11144-100000@gamplex.bde.org>

On Tue, 18 Feb 2003, Ruslan Ermilov wrote:

> On Fri, Feb 14, 2003 at 05:10:40AM -0800, Alfred Perlstein wrote:
> > alfred      2003/02/14 05:10:40 PST
> >
> >   Modified files:
> >     sys/kern             kern_intr.c
> >     sys/dev/ata          ata-all.c
> >   Log:
> >   Fix crash dumps on ata and scsi.
> >
> [...]
> >   To fix ata, use what appears to be a polling method if we're dumping,
> >   I stole this from tmm but added code to ensure that this change is
> >   only in effect while dumping.
> >
> >   Tested by:      des
> >
> FWIW, if I propagate this change to the !dumping case, it also
> fixes the ``resume stucks in "ata1: resetting devices .."'' bug
> I was having with my ThinkPad 600X:
>
> %%%
> Index: ata-all.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
> retrieving revision 1.165
> diff -u -p -r1.165 ata-all.c
> --- ata-all.c   14 Feb 2003 13:10:40 -0000      1.165
> +++ ata-all.c   18 Feb 2003 10:08:22 -0000
> @@ -486,8 +486,7 @@ ata_getparam(struct ata_device *atadev,
>
>      /* apparently some devices needs this repeated */
>      do {
> -       if (ata_command(atadev, command, 0, 0, 0,
> -                       dumping ? ATA_WAIT_READY : ATA_WAIT_INTR)) {
> +       if (ata_command(atadev, command, 0, 0, 0, ATA_WAIT_READY)) {
>             ata_prtdev(atadev, "%s identify failed\n",
>                        command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
>             free(ata_parm, M_ATA);
> %%%

There is, or was, something near here that made the whole system go
unresponsive (as seen by nfs clients) for several seconds.  I guess the
main problem was just using polled mode in all cases here.  In RELENG_4,
polling is done at splbio() so normally only disk devices are blocked,
but under -current almost everything is blocked by Giant.

> The resume session (with apm(4)) now looks like this:
>
> : cbb0: PCI Memory allocated: 50103000
> : cbb1: PCI Memory allocated: 50102000
> : pcm0: detached
> : csa: card is Thinkpad 600X/A20/T20
> : pcm0: on csa0
> : pcm0:
> : wakeup from sleeping state (slept 00:00:10)
> : ata0: resetting devices ..
> : done
> : ata1: resetting devices ..
> : ata1-slave: timeout waiting for cmd=ec s=01 e=24
> : ata1-slave: ATA identify failed
> : done

Apparently the timeout is too short or the interrupt got lost.  The
timeout seems to be too short.  It is 10 seconds, but IIRC the spec
says 30 seconds for reset of the master and a bit more for the slave.
Since things work with polling, we know that the device state changed
properly.
We could test for this state change instead of always aborting after
the timeout, and use more, finer-grained sleeps to determine the precise
timeout required.

Bruce
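
A rough userland sketch of the "test for the state change, with finer
grained sleeps" idea follows.  It is not the ata(4) driver code: the
simulated device, the SIM_S_* names, sim_read_status(), sim_sleep() and
wait_ready() are all hypothetical, and only the BUSY (0x80) and READY
(0x40) bit values follow the usual ATA status register layout.  The
loop re-reads the status after each short sleep and reports how long
the device actually took, so the timeout that is really needed can be
measured instead of guessed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit values follow the ATA status register. */
#define SIM_S_BUSY   0x80
#define SIM_S_READY  0x40

/* Simulated device: stays BUSY until ready_at_ms of simulated time. */
struct sim_device {
    uint32_t now_ms;        /* simulated clock, advanced by sim_sleep() */
    uint32_t ready_at_ms;   /* time at which BUSY clears and READY sets */
};

/* Stand-in for reading the status register of a real device. */
static uint8_t
sim_read_status(const struct sim_device *dev)
{
    return (dev->now_ms >= dev->ready_at_ms ? SIM_S_READY : SIM_S_BUSY);
}

/* Stand-in for a short sleep; only advances the simulated clock. */
static void
sim_sleep(struct sim_device *dev, uint32_t ms)
{
    dev->now_ms += ms;
}

/*
 * Check the device state every step_ms for at most max_ms.
 * Returns the number of ms waited when READY was first seen without
 * BUSY, or -1 if the device was still not ready after max_ms.
 */
static long
wait_ready(struct sim_device *dev, uint32_t step_ms, uint32_t max_ms)
{
    uint32_t waited = 0;

    assert(step_ms > 0);
    for (;;) {
        uint8_t st = sim_read_status(dev);

        if ((st & SIM_S_BUSY) == 0 && (st & SIM_S_READY) != 0)
            return ((long)waited);
        if (waited >= max_ms)
            return (-1);
        sim_sleep(dev, step_ms);
        waited += step_ms;
    }
}
```

With a simulated device that needs 12 seconds, wait_ready(&dev, 100,
10000) gives up and returns -1, while wait_ready(&dev, 100, 31000) on a
fresh copy of the device returns 12000, which is the sort of number one
would want logged to pick a real timeout.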