From owner-freebsd-sparc64@FreeBSD.ORG Fri Jan 4 23:53:37 2013
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id B3A38B8A
 for ; Fri, 4 Jan 2013 23:53:37 +0000 (UTC)
 (envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214])
 by mx1.freebsd.org (Postfix) with ESMTP id 4627AF04
 for ; Fri, 4 Jan 2013 23:53:36 +0000 (UTC)
Received: from alchemy.franken.de (localhost [127.0.0.1])
 by alchemy.franken.de (8.14.5/8.14.5/ALCHEMY.FRANKEN.DE) with ESMTP id r04NraDg038528;
 Sat, 5 Jan 2013 00:53:36 +0100 (CET)
 (envelope-from marius@alchemy.franken.de)
Received: (from marius@localhost)
 by alchemy.franken.de (8.14.5/8.14.5/Submit) id r04NraXt038527;
 Sat, 5 Jan 2013 00:53:36 +0100 (CET)
 (envelope-from marius)
Date: Sat, 5 Jan 2013 00:53:36 +0100
From: Marius Strobl
To: Kurt Lidl
Subject: Re: smartmontools panics 9.1-RELEASE on sunfire 240
Message-ID: <20130104235336.GB37999@alchemy.franken.de>
References: <20130104051914.GA22613@pix.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="ZPt4rx8FFjLCG7dd"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20130104051914.GA22613@pix.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-sparc64@freebsd.org
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Jan 2013 23:53:37 -0000

--ZPt4rx8FFjLCG7dd
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Fri, Jan 04, 2013 at 12:19:15AM -0500, Kurt Lidl wrote:
> Greetings all --
>
> I recently endeavored to install the same suite of ports on
> my sparc64 machines as I have installed on my amd64 hosts.
>
> I installed the smartmontools from /usr/ports (it installed
> smartmontools-6.0), configured it thusly:
>
> echo 'DEVICESCAN -a -m somealias@example.com' > \
> /usr/local/etc/smartd.conf
>
> And then when I started the daemon:
>
> /usr/local/etc/rc.d/smartd start
> Starting smartd.
> (pass0:ata2:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 01 00
> (pass0:ata2:0:0:0): CAM status: ATA Status Error
> (pass0:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> (pass0:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 01 00
>
> And the kernel panic'd:
>
> panic: trap: memory address not aligned (kernel)
> cpuid = 0
> KDB: stack backtrace:
> #0 0xc086c934 at trap+0x554
> Uptime: 2h0m44s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
>
> It's running the GENERIC kernel from 9.1-RELEASE.
>
> lidl@host-2: uname -a
> FreeBSD [redacted] 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243836: Tue Dec 4 15:49:34 UTC 2012 root@heller.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC sparc64
>
> It's completely repeatable. The second time it panic'd, the
> machine successfully wrote a crashdump file, and savecore recovered it.
> Here's what kgdb had to say:
>
> root@spork-1: kgdb /boot/kernel/kernel /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "sparc64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> panic: trap: memory address not aligned (kernel)
> cpuid = 0
> KDB: stack backtrace:
> #0 0xc086c934 at trap+0x554
> Uptime: 23m57s
> Dumping 8192 MB (4 chunks)
> chunk at 0: 2147483648 bytes |
>
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/zfs.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/geom_mirror.ko
> Reading symbols from /boot/kernel/aio.ko...Reading symbols from /boot/kernel/aio.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/aio.ko
> Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/accf_data.ko
> Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/accf_http.ko
> Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /boot/kernel/pflog.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pflog.ko
> Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pf.ko
> #0 0x00000000c053199c in doadump (textdump=Variable "textdump" is not available.
> )
>     at /usr/src/sys/kern/kern_shutdown.c:259
> 259 savectx(&dumppcb);
> (kgdb) where
> #0 0x00000000c053199c in doadump (textdump=Variable "textdump" is not available.
> )
>     at /usr/src/sys/kern/kern_shutdown.c:259
> #1 0x00000000c05323a0 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:448
> #2 0x00000000c053284c in panic (fmt=0xc0ab6a40 "trap: %s (kernel)")
>     at /usr/src/sys/kern/kern_shutdown.c:636
> #3 0x00000000c086c93c in trap (tf=0xea36b0b0)
>     at /usr/src/sys/sparc64/sparc64/trap.c:411
> #4 0x00000000c0099060 in tl1_trap ()
> #5 0x00000000c012aebc in ata_pio_read (request=0xfffff80003642800, length=512)
>     at bus.h:548
> #6 0x00000000c012ae0c in ata_pio_read (request=0xfffff800053927b0, length=512)
>     at /usr/src/sys/dev/ata/ata-lowlevel.c:838
> #7 0x00000000c012c004 in ata_end_transaction (request=dwarf2_read_address: Corrupted DWARF expression.
> )
>     at /usr/src/sys/dev/ata/ata-lowlevel.c:282
> #8 0x00000000c0128750 in ata_interrupt_locked (data=0xfffff80003642800)
>     at /usr/src/sys/dev/ata/ata-all.c:586
> #9 0x00000000c01287e8 in ata_interrupt (data=0xfffff80003642800)
>     at /usr/src/sys/dev/ata/ata-all.c:549
> #10 0x00000000c0130100 in ata_generic_intr (data=Variable "data" is not available.
> )
>     at /usr/src/sys/dev/ata/ata-pci.c:814
> #11 0x00000000c04ff8c0 in intr_event_execute_handlers (p=0xfffff800032a2dc8,
>     ie=0xfffff800033b3d00) at /usr/src/sys/kern/kern_intr.c:1262
> #12 0x00000000c05014a4 in ithread_loop (arg=0xfffff800036a15c0)
>     at /usr/src/sys/kern/kern_intr.c:1275
> #13 0x00000000c04fc548 in fork_exit (callout=0xc05013c0 ,
>     arg=0xfffff800036a15c0, frame=0xea36b880)
>     at /usr/src/sys/kern/kern_fork.c:992
> #14 0x00000000c0099270 in fork_trampoline ()
> #15 0x00000000c0099270 in fork_trampoline ()
> Previous frame identical to this frame (corrupt stack?)
> (kgdb) up 5
> #5 0x00000000c012aebc in ata_pio_read (request=0xfffff80003642800, length=512)
>     at bus.h:548
> 548 bus.h: No such file or directory.
> in bus.h
> (kgdb) up 1
> #6 0x00000000c012ae0c in ata_pio_read (request=0xfffff800053927b0, length=512)
>     at /usr/src/sys/dev/ata/ata-lowlevel.c:838
> 838 struct ata_channel *ch = device_get_softc(request->parent);
> (kgdb) list
> 833 }
> 834
> 835 static void
> 836 ata_pio_read(struct ata_request *request, int length)
> 837 {
> 838 struct ata_channel *ch = device_get_softc(request->parent);
> 839 uint8_t *addr;
> 840 int size = min(request->transfersize, length);
> 841 int resid;
> 842 uint8_t buf[2];
> (kgdb) p *request
> $1 = {dev = 0x0, parent = 0xfffff80003563400, unit = 0, u = {ata = {
> command = 161 '¡', feature = 0, count = 1, lba = 0}, atapi = {
> ccb = "¡\000\000\000\000\001\000\000\000\000\000\000\000\000\000",
> sense = {error = 0 '\0', segment = 0 '\0', key = 0 '\0', cmd_info = 0,
> sense_length = 0 '\0', cmd_specific_info = 0, asc = 0 '\0',
> ascq = 0 '\0', replaceable_unit_code = 0 '\0', specific = 0 '\0',
> specific1 = 0 '\0', specific2 = 0 '\0'}, saved_cmd = 0 '\0'}},
> bytecount = 512, transfersize = 512, data = 0xdd551577 "", tag = 0,
> flags = 2, dma = 0x0, status = 88 'X', error = 0 '\0', donecount = 0,
> result = 0, callback = 0, done = {sema_mtx = {lock_object = {lo_name = 0x0,
> lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0},
> sema_cv = {cv_description = 0x0, cv_waiters = 0}, sema_waiters = 0,
> sema_value = 0}, retries = 0, timeout = 20, callout = {c_links = {sle = {
> sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xc1950940}},
> c_time = 1456010, c_arg = 0xfffff800053927b0,
> c_func = 0xc012f040 , c_lock = 0xfffff80003642ac0,
> c_flags = 22, c_cpu = 0}, task = {ta_link = {stqe_next = 0x0},
> ta_pending = 0, ta_priority = 0, ta_func = 0, ta_context = 0x0},
> bio = 0x0, this = 0, composite = 0x0, driver = 0x0, chain = {tqe_next = 0x0,
> tqe_prev = 0x0}, ccb = 0xfffff80005375000}

Uhm, probably a userland buffer which isn't even 16-bit aligned.
If that's the cause, the attached patch should hopefully at least
prevent the panic. If it does, smartmontools still needs to be fixed,
though.

Marius

--ZPt4rx8FFjLCG7dd
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="cam_periph.c.diff"

Index: cam_periph.c
===================================================================
--- cam_periph.c	(revision 245046)
+++ cam_periph.c	(working copy)
@@ -744,6 +744,9 @@ cam_periph_mapmem(union ccb *ccb, struct cam_perip
 	if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE)
 		return(0);
 
+	if ((uintptr_t)ccb->ataio.data_ptr % sizeof(uint16_t) != 0)
+		return (EINVAL);
+
 	data_ptrs[0] = &ccb->ataio.data_ptr;
 	lengths[0] = ccb->ataio.dxfer_len;
 	dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK;
--ZPt4rx8FFjLCG7dd--
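
For illustration only, here is a minimal userland sketch of the alignment test the patch performs. The data pointer printed in the kgdb dump (0xdd551577) is odd, which fits the guess that the buffer is not 16-bit aligned. The helper names `is_16bit_aligned` and `alloc_pio_buffer` are hypothetical; they are not part of CAM or smartmontools, and the use of posix_memalign() is just one assumed way for a caller to obtain a suitably aligned buffer.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Same predicate as the check added by the patch: true when ptr can be
 * accessed as a sequence of 16-bit words.
 */
static bool
is_16bit_aligned(const void *ptr)
{
	return ((uintptr_t)ptr % sizeof(uint16_t) == 0);
}

/*
 * Hypothetical helper: allocate a transfer buffer whose start address is
 * at least 16-bit aligned. Returns NULL on failure; release with free().
 */
static void *
alloc_pio_buffer(size_t len)
{
	void *buf = NULL;

	/* The alignment must be a power-of-two multiple of sizeof(void *). */
	if (posix_memalign(&buf, sizeof(void *), len) != 0)
		return (NULL);
	assert(is_16bit_aligned(buf));
	return (buf);
}
```

A pointer one byte past the start of such a buffer fails the predicate, which is the case the patched cam_periph_mapmem() would reject with EINVAL instead of letting the PIO read trap.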