Date:      Thu, 2 Jan 2003 17:38:12 +0100
From:      Francesco Casadei <fcasadei@inwind.it>
To:        Bruce Campbell <bruce@engmail.uwaterloo.ca>
Cc:        freebsd-hardware@freebsd.org, freebsd-questions@freebsd.org
Subject:   Re: ata "fallback to PIO mode" on dual processor AMD systems
Message-ID:  <20030102163812.GA2350@goku.kasby>
In-Reply-To: <1041368236.3e1204ac45da5@www.nexusmail.uwaterloo.ca>
References:  <1041368236.3e1204ac45da5@www.nexusmail.uwaterloo.ca>

On Tue, Dec 31, 2002 at 03:57:16PM -0500, Bruce Campbell wrote:
>
> I am seeing a problem with ata disks on 4 new systems, which
> I believe is either a bug in the ata driver, or a problem with
> the onboard IDE controller, or something else.  Systems are as follows:
>
> Motherboard: ASUS A7M266-D
> CPUs       : 2 x 2000+ AMD MP
> Memory     : 2 x 512MB Crucial part: CT6472Y265
>
> Disks (all UDMA100):
>
>             Master                   Slave
> System 1:  WDC WD400BB             WDC WD1000BB
> System 2:  WDC WD400BB             WDC WD1000BB
> System 3:  WDC WD400BB             WDC WD800BB
> System 4:  WDC WD400BB             Maxtor 98196H8
>
> Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC):
>
> commented out:
>
>  cpu           I386_CPU
>  cpu           I486_CPU
>
> enabled
>
>  options       SMP                     # Symmetric MultiProcessor Kernel
>  options       APIC_IO                 # Symmetric (APIC) I/O
>
>
> I am running a test with "dbench" (/usr/ports/benchmarks/dbench)
> with a script which runs:
>
>   dbench 1
>   sleep for 5 minutes
>   dbench 2
>   sleep for 5 minutes
>   dbench 3
>   ...
>
> to simulate 1,2,3... clients.
>
> The following has happened on systems 2,3 and 4, after about 15 hours
> of running the test:
>
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
> Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
>
> The test continues to run with the ata controller in PIO mode, with
> slower performance, and higher load average.
>
> Once the master drops to PIO, attempts to access the slave then cause
> it to drop to PIO.
>
> If I run:
>
>   atacontrol mode 0 UDMA100 UDMA100
>
> attempts to access either drive result in a delay until the controller
> drops to PIO, and then operations resume.  A soft reboot and things
> work in UDMA mode again.  Also tried UDMA33 and UDMA66 with no change.
> I also tried "atacontrol reinit 0" with no help.
>
> Theories when I search the web for "fallback to PIO mode" include:
>
>  - bad disks
>  - something to do with thermal recalibration
>
> I don't believe the problems are bad disks, as the slave drops to PIO
> after the master does, and I can't get it back to UDMA, other than by
> soft reboot.  Plus I see the problem on 6 of 8 disks.
>
> The problem is very repeatable.
>
> Can anyone offer any ideas, or suggest investigative steps?  I have a system
> in PIO mode right now.
>
> Thanks,
>
> --
> Bruce Campbell
> Engineering Computing
> CPH-2374B
> University of Waterloo
> (519)888-4567 ext 5889
>
> ----------------------------------------
> This mail sent through www.mywaterloo.ca
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message
>=20
> end of the original message
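
For anyone who wants to reproduce the test, the ramp-up loop described in the
quoted message could be sketched roughly as below. This is only a guess at its
shape, not Bruce's actual script: the function name run_ramp, the DBENCH and
PAUSE variables, and the upper limit argument are all assumptions added for
illustration (the original does not say where the client count stops).

```shell
#!/bin/sh
# Sketch of the ramp-up test described above; NOT the original script.
# DBENCH and PAUSE are hypothetical knobs added for illustration.
DBENCH="${DBENCH:-dbench}"   # benchmark command (benchmarks/dbench port)
PAUSE="${PAUSE:-300}"        # seconds to sleep between runs (5 minutes)

# run_ramp MAX: run dbench with 1, 2, ..., MAX simulated clients,
# sleeping PAUSE seconds between consecutive runs.
run_ramp() {
    max="$1"
    n=1
    while [ "$n" -le "$max" ]; do
        echo "starting dbench with $n client(s)"
        "$DBENCH" "$n" || echo "dbench $n exited with status $?"
        if [ "$n" -lt "$max" ]; then
            sleep "$PAUSE"
        fi
        n=$((n + 1))
    done
}
```

Calling "run_ramp 20" would step from 1 to 20 clients; the limit of 20 is
arbitrary.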

Same problem here, but slightly different configuration:

# atacontrol list
ATA channel 0:
    Master:  ad0 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present
ATA channel 1:
    Master: acd0 <LG CD-ROM CRD-8521B/1.03> ATA/ATAPI rev 0
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present
ATA channel 3:
    Master:  ad6 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present

ad4 and ad6 are attached to a Promise FastTrak 100 TX2 ATA RAID controller.

# atacontrol mode 0
Master = UDMA100
Slave  = ???

# atacontrol mode 1
Master = PIO4
Slave  = ???

# atacontrol mode 2
Master = UDMA100
Slave  = ???

# atacontrol mode 3
Master = PIO4
Slave  = ???

ad6 falls back to PIO mode under heavy I/O activity, i.e. when the system does a
level 0 file system dump from the RAID 1 array (ad4, ad6) to the backup disk
ad0.
Rebooting and rebuilding the array with the Promise BIOS utility temporarily
solves the problem. The system may be up and running for 1-4 weeks, doing a
level 0 dump every morning at 5:30am, and then one day ad6 falls back
to PIO mode again (shortly before the fs dump completes).
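
Since the fallback only shows up after days or weeks, a periodic check of the
mode reported by atacontrol can flag it early. Below is a minimal sketch,
assuming the output format shown above ("Master = UDMA100"); the function name
check_master_mode is made up for illustration and is not part of atacontrol.

```shell
#!/bin/sh
# check_master_mode EXPECTED: read "atacontrol mode N" output on stdin and
# succeed only if the Master line reports EXPECTED (e.g. UDMA100).
# Hypothetical helper; it only parses the "Master = <mode>" line shown above.
check_master_mode() {
    expected="$1"
    # Pick the third field of the first "Master = <mode>" line.
    actual="$(awk '$1 == "Master" && $2 == "=" { print $3; exit }')"
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "master is in ${actual:-unknown} mode, expected $expected"
    return 1
}
```

It could be run from cron as "atacontrol mode 3 | check_master_mode UDMA100",
which prints a message and returns non-zero once ad6 has dropped to PIO4.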

Do the hard drives you are using support ATA tagged queuing? And if so, do
you have TQ enabled?
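
As far as I know, on 4.x the setting is exposed as the hw.ata.tags loader
tunable described in ata(4) and can be read back with sysctl; treat that as an
assumption to verify on your release. A quick check could look like the sketch
below, where tq_status and the SYSCTL override are hypothetical names used only
for illustration.

```shell
#!/bin/sh
# Assumption: the ata driver exposes its tagged queuing setting as the
# read-only sysctl hw.ata.tags (1 = enabled, 0 = disabled).
SYSCTL="${SYSCTL:-sysctl}"   # hypothetical override, mainly for testing

# tq_status: report whether the kernel was booted with tagged queuing on.
tq_status() {
    tags="$("$SYSCTL" -n hw.ata.tags 2>/dev/null)" || tags=""
    case "$tags" in
        1) echo "tagged queuing enabled" ;;
        0) echo "tagged queuing disabled" ;;
        *) echo "hw.ata.tags not available" ;;
    esac
}
```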

	Francesco Casadei

--
You can download my public key from http://digilander.libero.it/fcasadei/
or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)

Key fingerprint is: 1671 9A23 ACB4 520A E7EE  00B0 7EC3 375F 164E B17B


