Date:      Sun, 14 Nov 2004 10:27:32 +0100
From:      Frode Nordahl <frode@nordahl.net>
To:        Søren Schmidt <sos@DeepCore.dk>
Cc:        Garance A Drosihn <drosih@rpi.edu>
Subject:   Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
Message-ID:  <69854275-361F-11D9-B78A-000A95A9A574@nordahl.net>
In-Reply-To: <4195E1FF.5090906@DeepCore.dk>
References:  <25983.1100341229@critter.freebsd.dk> <4195E1FF.5090906@DeepCore.dk>

On Nov 13, 2004, at 11:29, Søren Schmidt wrote:

> Poul-Henning Kamp wrote:
>> In message <4195DB3E.2040807@DeepCore.dk>, Søren Schmidt writes:
>>>> It is not really the task of the ata driver to fail requests at that
>>>> time.   How long is the timeout anyway ?
>>>
>>> Oh, ATA doesn't fail them, it just yells that the request hasn't
>>> been finished yet by the upper layers, it doesn't do anything to the
>>> request.
>>>
>>> Timeout is 5 secs, which is a pretty long time in this context IMHO..
>> Five seconds counted from when ?
>
> Now thats the nasty part :)
> ATA starts the timeout when the request is issued to the device, so
> theoretically the disk could take 4.9999 secs to complete the request
> and then the timeout fires before the taskqueue gets its chance at it,
> but IMHO thats pretty unlikely...
>
> Anyhow, I can just remove the warning from ATA if that makes anyone
> happy, since its just a warning and ATA doesn't do anything with it at
> all.
> However, IMNHO this points at a problem somewhere that we should
> better understand and fix instead.
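
To make the race described above concrete, here is a toy model of it in C. This is only a sketch and not code from the ATA driver: the function name and both parameters are hypothetical, and the only value taken from this thread is the 5 second timeout, which is armed when the request is issued to the device.

```c
#include <stdbool.h>

/* Timeout length quoted in this thread: 5 seconds. */
#define ATA_TIMEOUT_SECS 5.0

/*
 * Toy model only (hypothetical names, not driver code).
 * The timer starts when the request is issued to the device.
 *   device_secs:          time the disk needs to complete the request
 *   taskqueue_delay_secs: extra time before the taskqueue finishes it
 * Returns true when the timeout expires before the taskqueue has
 * finished the request, i.e. when the warning would be printed.
 */
static bool
timeout_fires_first(double device_secs, double taskqueue_delay_secs)
{
	return (device_secs + taskqueue_delay_secs >= ATA_TIMEOUT_SECS);
}
```

In this model a disk that needs 4.9999 secs triggers the warning with only a tiny taskqueue delay, while a disk that answers quickly triggers it only if the taskqueue stalls for nearly the whole 5 seconds.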

Please don't remove the warning until we can find and fix this problem!
Even if it may not be ATA-related, your warning is for now the only
thing that tells me when the problem occurs :-)

I have two brand new systems running 5.3-R that show this problem.
They are two-way 3.06GHz Xeons, and when run in SMP mode, the systems
often panic shortly after these warnings occur, most often in UFS code.

Since we don't know what or where the problem is, the traces might be
completely bogus, but I include them anyway:
http://home.powertech.no/frode/freebsd/

I have just installed CURRENT on one of them. I haven't gotten around to
making it crash yet, but I get this:

ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=119959919
ad4: FAILURE - WRITE_DMA timed out
g_vfs_done():ad4s1f[WRITE(offset=43971141632, length=2048)]error = 5
initiate_write_filepage: already started
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=120428639
ad4: FAILURE - WRITE_DMA timed out
g_vfs_done():ad4s1f[WRITE(offset=44211126272, length=16384)]error = 5
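
As a rough cross-check of the numbers in that output, the byte offsets reported by g_vfs_done() can be converted to sectors and compared with the LBAs in the TIMEOUT lines. The sketch below assumes 512-byte sectors, and assumes that each LBA and the offset that follows it describe the same write. Neither is confirmed by the output itself, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL	/* assumption: 512-byte sectors */

/* Convert a byte offset within the partition to a sector index. */
static uint64_t
offset_to_sector(uint64_t byte_offset)
{
	/* The offsets in the log are expected to be sector-aligned. */
	assert(byte_offset % SECTOR_SIZE == 0);
	return (byte_offset / SECTOR_SIZE);
}

/*
 * Partition start (in sectors) implied by a whole-disk LBA and a
 * partition-relative byte offset, if both refer to the same write.
 */
static uint64_t
implied_partition_start(uint64_t disk_lba, uint64_t byte_offset)
{
	return (disk_lba - offset_to_sector(byte_offset));
}
```

Under those assumptions, implied_partition_start(119959919, 43971141632) and implied_partition_start(120428639, 44211126272) both come out as 34078783, which is at least consistent with each TIMEOUT line and the g_vfs_done() error after it referring to the same request. The "error = 5" is errno 5, EIO.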


From ffs_softdep.c:
         if (pagedep->pd_state & IOSTARTED) {
                 /*
                  * This can only happen if there is a driver that does not
                  * understand chaining. Here biodone will reissue the call
                  * to strategy for the incomplete buffers.
                  */
                 printf("initiate_write_filepage: already started\n");
                 return;
         }

Mvh,
Frode Nordahl

> -- 
>
> -Søren


