Date:      Thu, 5 Aug 2004 19:11:22 +0200
From:      "Daniel Eriksson" <daniel_k_eriksson@telia.com>
To:        'Søren Schmidt' <sos@DeepCore.dk>
Cc:        'Ville-Pertti Keinonen' <will+freebsd-current@will.iki.fi>
Subject:   RE: ATA driver races with interrupts
Message-ID:  <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA0VcX9IoJqUaXPS8MjT1PdsKAAAAQAAAAMYTGmVxmR0Oc6P/8t/P6dgEAAAAA@telia.com>
In-Reply-To: <411127F0.6080407@DeepCore.dk>

Søren Schmidt wrote:

> > I just applied your patch to clean sources dated 2004.08.04.13.00.00 and ran
> > some tests. Everything seems to be working as it should (just like after the
> > serialization patch from Ville-Pertti that I tried earlier). I will continue
> > running with this patch applied to see if it stays stable.
>
> Good! please keep me posted!

Unfortunately the machine disconnected one of the SATA discs earlier today.
It did so out-of-the-blue, because there was no activity at all on either of
the two discs other than the SMART monitor.

Aug  5 11:45:47 fortify kernel: ad20: WARNING - removed from configuration
Aug  5 11:45:47 fortify kernel: ata10-master: FAILURE - unknown CMD (0xb0) timed out
Aug  5 11:45:47 fortify smartd[882]: Device: /dev/ad20, not capable of SMART self-check

No other interesting messages in the log. The channel was, as usual,
completely locked after this and it took an extended power-off (2 min) to
unlock it (I really don't know what is up with that).

Once the channel was unlocked, the machine booted up but page-faulted in the
middle of detecting the attached discs. Another reboot took care of that
problem. I am not sure if the page fault info is interesting at all, but here
it is:

[...]
ad16: 114473MB <WDC WD1200JB-00DUA3> [232581/16/63] at ata8-master UDMA100
ad18: 26059MB <Maxtor 92732U8> [52946/16/63] at ata9-master UDMA66
ad20: 239372MB <Maxtor 7Y250M0> [486344/16/63] at ata10-master SATA150
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0580904
stack pointer           = 0x10:0xdd6e5c1c
frame pointer           = 0x10:0xdd6e5c44
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         =3D 35 (swi5: clock sio)
[thread 100036]
Stopped at      propagate_priority+0x84:        movl    0x24(%eax),%eax
db> trace
propagate_priority(c2734420,c078a9a0,c056f8a9,c0790780,c26e47d0) at propagate_priority+0x84
turnstile_wait(c2735bc0,c078e960,c078a9a0,0,c27440ac) at turnstile_wait+0x31c
_mtx_lock_sleep(c078e960,c2734420,0,0,0) at _mtx_lock_sleep+0xe8
softclock(0,0,ffffffff,ffffbfff,ffffffff) at softclock+0x248
ithread_loop(c26d0080,dd6e5d48,ffffffff,ffffffff,ffffffff) at ithread_loop+0x1a8
fork_exit(c05439c0,c26d0080,dd6e5d48) at fork_exit+0x80
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xdd6e5d7c, ebp = 0 ---


It should have looked something like this:
[...]
ad16: 114473MB <WDC WD1200JB-00DUA3> [232581/16/63] at ata8-master UDMA100
ad18: 26059MB <Maxtor 92732U8> [52946/16/63] at ata9-master UDMA66
ad20: 239372MB <Maxtor 7Y250M0> [486344/16/63] at ata10-master SATA150
ad22: 238475MB <WDC WD2500JD-00FYB0> [484521/16/63] at ata11-master SATA150
ar0: 476950MB <ATA RAID0 array> [60802/255/63] status: READY subdisks:
 disk0 READY on ad4 at ata2-master
 disk1 READY on ad5 at ata2-slave
ar1: 478744MB <ATA RAID0 array> [61031/255/63] status: READY subdisks:
 disk0 READY on ad6 at ata3-master
 disk1 READY on ad7 at ata3-slave
ar2: 388962MB <ATA RAID0 array> [49585/255/63] status: READY subdisks:
 disk0 READY on ad9 at ata4-slave
 disk1 READY on ad8 at ata4-master
ar3: 228946MB <ATA RAID0 array> [29186/255/63] status: READY subdisks:
 disk0 READY on ad15 at ata7-slave
 disk1 READY on ad16 at ata8-master
Waiting 5 seconds for SCSI devices to settle
[...]


I have switched back to the patch from Ville-Pertti that serializes the
controller for now, to see if that is more stable.

/Daniel Eriksson



