Date: Mon, 1 Dec 2003 13:09:23 +0100
From: Thomas Moestl <t.moestl@tu-bs.de>
To: Robert Watson <rwatson@freebsd.org>
Cc: sparc@freebsd.org
Subject: Re: panic: trap: memory address not aligned in ata_prtdev() with Nov 18 GENERIC
Message-ID: <20031201120923.GA3276@timesink.dyndns.org>
In-Reply-To: <Pine.NEB.3.96L.1031130202153.66375k-100000@fledge.watson.org>
References: <Pine.NEB.3.96L.1031130202153.66375k-100000@fledge.watson.org>

--zYM0uCDKw75PZbzx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, 2003/11/30 at 20:29:09 -0500, Robert Watson wrote:
> Unfortunately, I didn't have dumps set up on this box.  On the other hand,
> given that the panic was in the ata code, perhaps I wouldn't have got a
> dump anyway.  This was with a November 18th GENERIC kernel on a blade100.
> dmesg also below.  This appears to be highly reproduceable, and might be a
> property of the bgfsck running on the system.
>
> [...]
>
> db> show msgbuf
> msgbufp = 0xfffff80000407fe0
> magic = 63062, size = 32736, r= 4790, w = 4860, ptr = 0xfffff80000400000,
> cksum= 377365
> panic: trap: memory address not aligned
> cpuid = 0;
> Debugger("panic")
> ...
> db> trace
> panic() at panic+0x174
> trap() at trap+0x3b4
> -- memory address not aligned sfar=0xdedeadc0ee sfsr=0x40029
> %o7=0xc007eda8 --
> ata_prtdev() at ata_prtdev+0x14
> ata_timeout() at ata_timeout+0x130
> softclock() at softclock+0x1a0
> ithread_loop() at ithread_loop+0x1b8
> fork_exit() at fork_exit+0x84
> fork_trampoline() at fork_trampoline+0x8

This can happen when an ATA operation times out, and is caused by an
access to a freed structure. I have attached a workaround; IIRC sos is
developing a more complete fix for this.

ISTR the timeouts were caused by the fact that Blade 100s come with
ATA66-capable disks and controllers, but a non-ATA66 (40 pin) cable,
and that for some reason the driver check to catch this situation did
not work. I am not seeing this on my machine because I replaced the
cable long ago when I added another disk. Can you confirm that your
box does only have a 40 pin cable?

	- Thomas

-- 
Thomas Moestl <t.moestl@tu-bs.de>	http://www.tu-bs.de/~y0015675/
              <tmm@FreeBSD.org>		http://people.FreeBSD.org/~tmm/
PGP fingerprint: 1C97 A604 2BD0 E492 51D0  9C0F 1FE6 4F1D 419C 776C

--zYM0uCDKw75PZbzx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="ata-timo.diff"

Index: ata-queue.c
===================================================================
RCS file: /vol/ncvs/src/sys/dev/ata/ata-queue.c,v
retrieving revision 1.11
diff -u -r1.11 ata-queue.c
--- ata-queue.c	20 Oct 2003 14:28:37 -0000	1.11
+++ ata-queue.c	20 Nov 2003 00:56:48 -0000
@@ -316,6 +316,8 @@
 ata_timeout(struct ata_request *request)
 {
     struct ata_channel *ch = request->device->channel;
+    struct ata_device *reqdev = request->device;
+    char *reqstr = ata_cmd2str(request);
     int quiet = request->flags & ATA_R_QUIET;
 
     /* clear timeout etc */
@@ -324,10 +326,11 @@
     /* call hw.interrupt to try finish up the command */
     ch->hw.interrupt(request->device->channel);
     if (ch->running != request) {
+        /* request might already be freed - use copies. */
         if (!quiet)
-            ata_prtdev(request->device,
+            ata_prtdev(reqdev,
                        "WARNING - %s recovered from missing interrupt\n",
-                       ata_cmd2str(request));
+                       reqstr);
         return;
     }
 

--zYM0uCDKw75PZbzx--
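
For readers who want to try the idea outside the kernel, here is a minimal
userland sketch of the pattern the workaround relies on: copy whatever is
needed out of the request before calling something that may free it, and
touch only the copies afterwards. All names in it (struct device, struct
request, finish_request, handle_timeout) are hypothetical stand-ins and not
the real ATA driver types or functions. It also assumes that the device and
the command string outlive the request, which is what the patch above
appears to assume as well.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real struct ata_device / ata_request. */
struct device {
    const char *name;
};

struct request {
    struct device *device;  /* assumed to outlive the request */
    const char *cmd;        /* assumed to point at static storage */
    int quiet;
};

/*
 * Stand-in for the ch->hw.interrupt() call in the patch: it may finish
 * the request and free it. This sketch always does, and returns 1 to
 * report that the request was completed.
 */
static int
finish_request(struct request *request)
{
    free(request);
    return 1;
}

/*
 * Timeout handler sketch. Writes the warning into buf and returns 1 if
 * the request was recovered, 0 otherwise.
 */
static int
handle_timeout(struct request *request, char *buf, size_t len)
{
    assert(request != NULL && buf != NULL && len > 0);

    /* Take copies while the request is still known to be valid. */
    struct device *reqdev = request->device;
    const char *reqstr = request->cmd;
    int quiet = request->quiet;

    buf[0] = '\0';
    if (finish_request(request)) {
        /* request might already be freed: use only the copies here. */
        if (!quiet)
            snprintf(buf, len,
                "%s: WARNING - %s recovered from missing interrupt\n",
                reqdev->name, reqstr);
        return 1;
    }
    return 0;
}
```

Calling handle_timeout() with a heap-allocated request whose device name is
"ad0" and whose cmd is "READ" leaves the warning text in buf without ever
dereferencing the request after it has been freed. Reading request->device
at the point of the snprintf() call instead would be the same kind of
use-after-free that the trace above points at.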
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20031201120923.GA3276>