Date:      Wed, 19 Feb 2003 10:00:41 +0200
From:      Ruslan Ermilov <ru@freebsd.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Alfred Perlstein <alfred@freebsd.org>, Thomas Moestl <tmm@freebsd.org>, Soren Schmidt <sos@freebsd.org>, current@freebsd.org
Subject:   Re: cvs commit: src/sys/kern kern_intr.c src/sys/dev/ata ata-all.c
Message-ID:  <20030219080041.GA83452@sunbay.com>
In-Reply-To: <20030219095525.R11144-100000@gamplex.bde.org>
References:  <20030218102408.GA48010@sunbay.com> <20030219095525.R11144-100000@gamplex.bde.org>

On Wed, Feb 19, 2003 at 10:20:12AM +1100, Bruce Evans wrote:
> On Tue, 18 Feb 2003, Ruslan Ermilov wrote:
>
> > On Fri, Feb 14, 2003 at 05:10:40AM -0800, Alfred Perlstein wrote:
> > > alfred      2003/02/14 05:10:40 PST
> > >
> > >   Modified files:
> > >     sys/kern             kern_intr.c
> > >     sys/dev/ata          ata-all.c
> > >   Log:
> > >   Fix crash dumps on ata and scsi.
> > >
> > [...]
> > >   To fix ata, use what appears to be a polling method if we're dumping,
> > >   I stole this from tmm but added code to ensure that this change is
> > >   only in effect while dumping.
> > >
> > >   Tested by: des
> > >
> > FWIW, if I propagate this change to the !dumping case, it also
> > fixes the ``resume gets stuck in "ata1: resetting devices .."'' bug
> > I was having with my ThinkPad 600X:
> >
> > %%%
> > Index: ata-all.c
> > ===================================================================
> > RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
> > retrieving revision 1.165
> > diff -u -p -r1.165 ata-all.c
> > --- ata-all.c	14 Feb 2003 13:10:40 -0000	1.165
> > +++ ata-all.c	18 Feb 2003 10:08:22 -0000
> > @@ -486,8 +486,7 @@ ata_getparam(struct ata_device *atadev,
> >
> >      /* apparently some devices needs this repeated */
> >      do {
> > -	if (ata_command(atadev, command, 0, 0, 0,
> > -		dumping ? ATA_WAIT_READY : ATA_WAIT_INTR)) {
> > +	if (ata_command(atadev, command, 0, 0, 0, ATA_WAIT_READY)) {
> >  	    ata_prtdev(atadev, "%s identify failed\n",
> >  		       command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
> >  	    free(ata_parm, M_ATA);
> > %%%
>
> There is, or was, something near here that made the whole system go
> unresponsive (as seen by nfs clients) for several seconds.  I guess
> the main problem was just using polled mode in all cases here.  In
> RELENG_4, polling is done at splbio() so normally only disk devices
> are blocked, but under -current almost everything is blocked by Giant.
>
The symptoms were as follows: the console is blocked, and anything I
type is not echoed until I drop into DDB, at which point what I have
typed is displayed.

> > The resume session (with apm(4)) now looks like this:
> >
> > : cbb0: PCI Memory allocated: 50103000
> > : cbb1: PCI Memory allocated: 50102000
> > : pcm0: detached
> > : csa: card is Thinkpad 600X/A20/T20
> > : pcm0: <CS461x PCM Audio> on csa0
> > : pcm0: <Cirrus Logic CS4297A ac97 codec>
> > : wakeup from sleeping state (slept 00:00:10)
> > : ata0: resetting devices ..
> > : done
> > : ata1: resetting devices ..
> > : ata1-slave: timeout waiting for cmd=ec s=01 e=24
> > : ata1-slave: ATA identify failed
> > : done
>
> Apparently the timeout is too short or the interrupt got lost.  The
> timeout seems to be too short.  It is 10 seconds, but IIRC the spec
> says 30 seconds for reset of the master and a bit more for the
> slave.  Since things work with polling, we know that the device state
> changed properly.  We could test for this state change instead of
> always aborting after the timeout, and do finer grained and more sleeps
> to determine the precise timeout required.
>
I recall seeing the ``stray irq 15'' too, so yes, that may well be
the case here.  I will try bumping up the ATA_WAIT_INTR timeout
later today and let you know the results.
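
For illustration only, the idea of testing for the state change in
small steps rather than aborting after one fixed timeout could be
sketched roughly as follows.  This is a standalone simulation, not
driver code: every sk_* name, the status bit values and the simulated
clock are assumptions made up for the example, and none of it is taken
from ata-all.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bits, loosely modelled on an ATA status register. */
#define SK_S_BUSY	0x80
#define SK_S_READY	0x40

/* Simulated device: reports BUSY until ready_at_ms of simulated time. */
struct sk_dev {
	int now_ms;		/* simulated clock */
	int ready_at_ms;	/* time at which the device leaves BUSY */
};

static uint8_t
sk_read_status(const struct sk_dev *dev)
{
	return (dev->now_ms >= dev->ready_at_ms ? SK_S_READY : SK_S_BUSY);
}

/* Stand-in for a short sleep; here it only advances the simulated clock. */
static void
sk_sleep(struct sk_dev *dev, int ms)
{
	dev->now_ms += ms;
}

/*
 * Check the device state every step_ms until it is ready or timeout_ms
 * has elapsed.  Returns the elapsed time in ms on success, -1 on timeout.
 */
static int
sk_wait_ready(struct sk_dev *dev, int timeout_ms, int step_ms)
{
	int elapsed = 0;

	assert(step_ms > 0);
	for (;;) {
		uint8_t st = sk_read_status(dev);

		if ((st & SK_S_BUSY) == 0 && (st & SK_S_READY) != 0)
			return (elapsed);
		if (elapsed >= timeout_ms)
			return (-1);
		sk_sleep(dev, step_ms);
		elapsed += step_ms;
	}
}
```

With a simulated device that needs 12 seconds to become ready, a
10 second limit gives up while a 31 second limit succeeds, and the
return value reports how long the wait actually took, which is the
kind of measurement that would help pick a realistic timeout.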


Cheers,
-- 
Ruslan Ermilov		Sysadmin and DBA,
ru@sunbay.com		Sunbay Software AG,
ru@FreeBSD.org		FreeBSD committer,
+380.652.512.251	Simferopol, Ukraine

http://www.FreeBSD.org	The Power To Serve
http://www.oracle.com	Enabling The Information Age

