From: Søren Schmidt
Date: Sat, 13 Nov 2004 11:45:47 +0100
To: Poul-Henning Kamp
Cc: Garance A Drosihn, Zoltan Frombach, freebsd-current@freebsd.org, Robert Watson
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
Message-ID: <4195E5DB.2070302@DeepCore.dk>
In-Reply-To: <26249.1100342074@critter.freebsd.dk>

Poul-Henning Kamp wrote:
> In message <4195E1FF.5090906@DeepCore.dk>, Søren Schmidt writes:
>
>>>>Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>
>>>Five seconds counted from when ?
>>
>>Now that's the nasty part :)
>>ATA starts the timeout when the request is issued to the device, so
>>theoretically the disk could take 4.9999 secs to complete the request
>>and then the timeout fires before the taskqueue gets its chance at it,
>>but IMHO that's pretty unlikely...
>
> I find that far more likely than kernel threads being stalled for that
> long. ATA disks doing bad-block stuff takes several seconds on some
> of the disks I've had my hands on.
>
>>Anyhow, I can just remove the warning from ATA if that makes anyone
>>happy, since it's just a warning and ATA doesn't do anything with it at all.
>>However, IMNHO this points at a problem somewhere that we should better
>>understand and fix instead.
>
> I would prefer you reset the timer to five seconds in your interrupt
> routine so we can see exactly on which side of that the time is spent.

It would be even better to time how long both ops take and be able to
get that via a sysctl or something (I have that on my TODO list but it's
loooong :) ).

Anyhow, resetting it is easy (patch against 5.3R):

Index: ata-queue.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
retrieving revision 1.32.2.5
diff -u -r1.32.2.5 ata-queue.c
--- ata-queue.c 24 Oct 2004 09:27:37 -0000 1.32.2.5
+++ ata-queue.c 13 Nov 2004 10:44:40 -0000
@@ -216,6 +216,9 @@
             ata_completed(request, 0);
         }
         else {
+            if (!dumping)
+                callout_reset(&request->callout, request->timeout * hz,
+                              (timeout_t*)ata_timeout, request);
             if (request->bio && !(request->flags & ATA_R_TIMEOUT)) {
                 ATA_DEBUG_RQ(request, "finish bio_taskqueue");
                 bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);

--
-Søren
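
To make the timing argument above concrete, here is a rough user-space sketch (not kernel code and not part of the patch) of why re-arming the timer in the interrupt path tells the two cases apart. All names in it (req_timeline, single_timer_fires, classify_with_rearm) are made up for illustration, and the millisecond timestamps are assumed inputs rather than anything ATA records today:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only: which side of the interrupt used up the timeout. */
enum stall_side { STALL_NONE, STALL_DEVICE, STALL_TASKQUEUE };

/* Assumed timestamps (in ms) for one request; not real ATA fields. */
struct req_timeline {
    long issued_ms;     /* request issued to the device, timer armed */
    long interrupt_ms;  /* device signals completion via interrupt */
    long completed_ms;  /* taskqueue finally runs the completion */
};

/*
 * One timer armed at issue time: it fires whenever issue-to-completion
 * exceeds the timeout, and cannot say where the time went.
 */
bool single_timer_fires(const struct req_timeline *t, long timeout_ms)
{
    assert(timeout_ms > 0);
    return t->completed_ms - t->issued_ms > timeout_ms;
}

/*
 * Timer re-armed when the interrupt arrives: each phase gets its own
 * full timeout, so an expiry identifies the slow side.
 */
enum stall_side classify_with_rearm(const struct req_timeline *t,
                                    long timeout_ms)
{
    assert(timeout_ms > 0);
    if (t->interrupt_ms - t->issued_ms > timeout_ms)
        return STALL_DEVICE;     /* expired before the interrupt came */
    if (t->completed_ms - t->interrupt_ms > timeout_ms)
        return STALL_TASKQUEUE;  /* expired after the interrupt, before completion */
    return STALL_NONE;
}
```

With a timeline of issue at 0 ms, interrupt at 4999 ms and completion at 5002 ms and a 5000 ms timeout, single_timer_fires() reports a timeout even though neither side was slow on its own, while classify_with_rearm() returns STALL_NONE. A disk that needs 6000 ms before interrupting would come back as STALL_DEVICE instead.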