From: "Daniel Eriksson"
To: 'Søren Schmidt'
Cc: freebsd-current@freebsd.org, 'Ville-Pertti Keinonen'
Date: Thu, 5 Aug 2004 19:11:22 +0200
Subject: RE: ATA driver races with interrupts
In-Reply-To: <411127F0.6080407@DeepCore.dk>

Søren Schmidt wrote:

> > I just applied your patch to clean sources dated 2004.08.04.13.00.00 and ran
> > some tests. Everything seems to be working as it should (just like after the
> > serialization patch from Ville-Pertti that I tried earlier). I will continue
> > running with this patch applied to see if it stays stable.
>
> Good! please keep me posted!

Unfortunately the machine disconnected one of the SATA discs earlier
today. It did so out-of-the-blue, because there was no activity at all on
either of the two discs other than the SMART monitor.

Aug  5 11:45:47 fortify kernel: ad20: WARNING - removed from configuration
Aug  5 11:45:47 fortify kernel: ata10-master: FAILURE - unknown CMD (0xb0) timed out
Aug  5 11:45:47 fortify smartd[882]: Device: /dev/ad20, not capable of SMART self-check

No other interesting messages in the log. The channel was, as usual,
completely locked after this and it took an extended power-off (2 min) to
unlock it (I really don't know what is up with that).

Once the channel was unlocked it booted up but page-faulted in the middle
of detecting the attached discs (another reboot took care of that problem,
not sure if the page fault info is interesting at all, but here it is):

[...]
ad16: 114473MB [232581/16/63] at ata8-master UDMA100
ad18: 26059MB [52946/16/63] at ata9-master UDMA66
ad20: 239372MB [486344/16/63] at ata10-master SATA150
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0580904
stack pointer           = 0x10:0xdd6e5c1c
frame pointer           = 0x10:0xdd6e5c44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 35 (swi5: clock sio)
[thread 100036]
Stopped at      propagate_priority+0x84:        movl    0x24(%eax),%eax
db> trace
propagate_priority(c2734420,c078a9a0,c056f8a9,c0790780,c26e47d0) at propagate_priority+0x84
turnstile_wait(c2735bc0,c078e960,c078a9a0,0,c27440ac) at turnstile_wait+0x31c
_mtx_lock_sleep(c078e960,c2734420,0,0,0) at _mtx_lock_sleep+0xe8
softclock(0,0,ffffffff,ffffbfff,ffffffff) at softclock+0x248
ithread_loop(c26d0080,dd6e5d48,ffffffff,ffffffff,ffffffff) at ithread_loop+0x1a8
fork_exit(c05439c0,c26d0080,dd6e5d48) at fork_exit+0x80
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xdd6e5d7c, ebp = 0 ---

It should have looked something like this:

[...]
ad16: 114473MB [232581/16/63] at ata8-master UDMA100
ad18: 26059MB [52946/16/63] at ata9-master UDMA66
ad20: 239372MB [486344/16/63] at ata10-master SATA150
ad22: 238475MB [484521/16/63] at ata11-master SATA150
ar0: 476950MB [60802/255/63] status: READY subdisks:
 disk0 READY on ad4 at ata2-master
 disk1 READY on ad5 at ata2-slave
ar1: 478744MB [61031/255/63] status: READY subdisks:
 disk0 READY on ad6 at ata3-master
 disk1 READY on ad7 at ata3-slave
ar2: 388962MB [49585/255/63] status: READY subdisks:
 disk0 READY on ad9 at ata4-slave
 disk1 READY on ad8 at ata4-master
ar3: 228946MB [29186/255/63] status: READY subdisks:
 disk0 READY on ad15 at ata7-slave
 disk1 READY on ad16 at ata8-master
Waiting 5 seconds for SCSI devices to settle
[...]

I have switched back to the patch from Ville-Pertti that serializes the
controller for now, to see if that is more stable.

/Daniel Eriksson