From owner-freebsd-drivers@FreeBSD.ORG Fri Jun 17 09:39:24 2011 Return-Path: Delivered-To: freebsd-drivers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E7239106566B for ; Fri, 17 Jun 2011 09:39:23 +0000 (UTC) (envelope-from orca@tdlsoftware.org) Received: from server2.hostultra.com (server2.hostultra.com [178.63.65.195]) by mx1.freebsd.org (Postfix) with ESMTP id 5C5F58FC18 for ; Fri, 17 Jun 2011 09:39:23 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=tdlsoftware.org) by server2.hostultra.com with esmtpa (Exim 4.72 (FreeBSD)) (envelope-from ) id 1QXV7b-0009Gg-0t for freebsd-drivers@freebsd.org; Fri, 17 Jun 2011 09:13:47 +0000 Received: from 91.195.58.188 ([91.195.58.188]) (SquirrelMail authenticated user orca@tdlsoftware.org) by tdlsoftware.org with HTTP; Fri, 17 Jun 2011 11:13:47 +0200 Message-ID: <7042daa3e72e642db80c0622cc9acf67.squirrel@tdlsoftware.org> Date: Fri, 17 Jun 2011 11:13:47 +0200 From: "Peter Laursen" To: freebsd-drivers@freebsd.org User-Agent: SquirrelMail/1.4.20 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server2.hostultra.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [26 6] / [26 6] X-AntiAbuse: Sender Address Domain - tdlsoftware.org X-Source: X-Source-Args: X-Source-Dir: Subject: Confused about USB HID devices X-BeenThere: freebsd-drivers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Writing device drivers for FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Jun 2011 09:39:24 -0000 Hi everyone, I am trying to write a device driver for a USB HID device. 
The device in question is a Braille display (electronic equipment that transforms letters into braille dots for blind people to read). Potentially, this driver may be the stepping stone for expanding our installer so that blind people may be able to install FreeBSD without sighted assistance.

My problem is that, seemingly no matter what I do, I cannot get my braille display to show any output if I go through my device driver. If I send output from a test program written with the aid of libusb, the display shows everything correctly, but when I send the same data from my test device driver, I get an error code of 22.

When I send the following data packet from my libusb program, I see the word "Hello": (Every value is in decimal. The first byte is the HID output report ID for this device, the next byte is the offset on the display, and the third is the length of the data to be shown on the display.)

"2 0 40 83 17 7 7 21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"

When I try to send the same data packet from within my device driver, I get an error code with the value 22.
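The packet layout described above can be sketched in plain C as follows. This is only an illustration under the assumptions stated in the message (report ID 2, offset 0, a length byte of 40); the helper name `braille_build_packet` and the macro names are hypothetical and belong to neither FreeBSD nor libusb.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed layout, taken from the description in the message:
 *   byte 0      HID output report ID
 *   byte 1      offset on the display
 *   byte 2      number of cells that follow
 *   bytes 3..   one dot pattern per cell
 */
#define BRAILLE_REPORT_ID   2    /* report ID quoted in the message */
#define BRAILLE_CELLS       40   /* length byte quoted in the message */
#define BRAILLE_PACKET_LEN  (3 + BRAILLE_CELLS)

/*
 * Hypothetical helper: fills buf with a full-width report starting at
 * offset 0. Cells that are not supplied stay blank (0).
 * Returns 0 on success, -1 if more cells are given than the display has.
 */
static int
braille_build_packet(uint8_t buf[BRAILLE_PACKET_LEN],
    const uint8_t *cells, size_t ncells)
{
    if (ncells > BRAILLE_CELLS)
        return (-1);

    memset(buf, 0, BRAILLE_PACKET_LEN);
    buf[0] = BRAILLE_REPORT_ID;
    buf[1] = 0;                 /* offset on the display */
    buf[2] = BRAILLE_CELLS;     /* length of the data */
    memcpy(&buf[3], cells, ncells);
    return (0);
}
```

Calling the helper with the five cell values 83 17 7 7 21 produces the 43-byte packet quoted above.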
I insert my USB callback function below:

    static void
    alva_write_callback(struct usb_xfer* xfer, usb_error_t err)
    {
        struct alva_softc* sc = usbd_xfer_softc(xfer);
        struct usb_device_request req;

        req.bmRequestType = UT_WRITE_CLASS_INTERFACE;
        req.bRequest = UR_SET_REPORT;
        USETW2(req.wValue, UHID_OUTPUT_REPORT, 2);
        req.wIndex[0] = 1;
        req.wIndex[1] = 0;
        USETW(req.wLength, 43);

        mtx_lock(&sc->mtx);
        switch (USB_GET_STATE(xfer)) {
        case USB_ST_SETUP:
            printf("Inside USB write setup state.\n");
            usbd_xfer_set_frame_data(xfer, 0, &req, 8);
            unsigned char Packet[43] = {0};
            Packet[0] = 2;
            Packet[1] = 0;
            Packet[2] = 40;
            Packet[3] = 83;
            Packet[4] = 17;
            Packet[5] = 7;
            Packet[6] = 7;
            Packet[7] = 21;
            for (int i = 8; i < 43; i++)
                Packet[i] = 0;
            usbd_xfer_set_frame_data(xfer, 1, Packet, 43);
            usbd_transfer_submit(xfer);
            break;
        case USB_ST_TRANSFERRED:
            printf("Reached the transferred state.\n");
            break;
        default:
            printf("An error must have occurred. Error code: %d\n", err);
            usbd_transfer_clear_stall(xfer);
            break;
        }
        mtx_unlock(&sc->mtx);
    }

I am quite out of ideas as to how I might solve the problem. I will gladly provide any additional information, but I'm quite new to writing device drivers, so please bear with me if I have missed anything obvious or left out important information. I have looked through the USB HID driver, read the HID specification, and googled, but nothing has given me any clues. I hope someone out here can help guide me in the right direction.

All the best,
Peter.
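For reference, the request that the callback fills in corresponds to the eight-byte setup packet sketched below. This is a userland illustration, not driver code: the helper and macro names are hypothetical, and the numeric values are assumed from the HID class specification's Set_Report request (request type 0x21, request 0x09, report type 2 for output reports). It does not attempt to explain the error code.

```c
#include <stdint.h>

/* Values assumed from the USB HID class specification (Set_Report). */
#define HID_REQTYPE_CLASS_IFACE_OUT  0x21  /* host-to-device, class, interface */
#define HID_REQ_SET_REPORT           0x09
#define HID_REPORT_TYPE_OUTPUT       0x02

/*
 * Hypothetical helper: serializes a Set_Report setup packet.
 * All 16-bit fields are little-endian on the wire.
 */
static void
hid_set_report_setup(uint8_t setup[8], uint8_t report_type,
    uint8_t report_id, uint16_t interface, uint16_t length)
{
    setup[0] = HID_REQTYPE_CLASS_IFACE_OUT;   /* bmRequestType */
    setup[1] = HID_REQ_SET_REPORT;            /* bRequest */
    setup[2] = report_id;                     /* wValue, low byte */
    setup[3] = report_type;                   /* wValue, high byte */
    setup[4] = (uint8_t)(interface & 0xff);   /* wIndex, low byte */
    setup[5] = (uint8_t)(interface >> 8);     /* wIndex, high byte */
    setup[6] = (uint8_t)(length & 0xff);      /* wLength, low byte */
    setup[7] = (uint8_t)(length >> 8);        /* wLength, high byte */
}
```

With the values used in the callback (output report, report ID 2, interface 1, length 43), the helper yields the bytes 0x21 0x09 0x02 0x02 0x01 0x00 0x2b 0x00.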
FreeBSD 8.2-i386 From owner-freebsd-drivers@FreeBSD.ORG Sat Jun 18 14:26:22 2011 Return-Path: Delivered-To: freebsd-drivers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B83B61065670; Sat, 18 Jun 2011 14:26:22 +0000 (UTC) (envelope-from stephane.lapie@darkbsd.org) Received: from quasar.darkbsd.org (shinigami.darkbsd.org [82.227.96.182]) by mx1.freebsd.org (Postfix) with ESMTP id D16338FC08; Sat, 18 Jun 2011 14:26:21 +0000 (UTC) Received: from quasar.darkbsd.org (localhost [127.0.0.1]) by quasar.darkbsd.org (Postfix) with ESMTP id 4F98A6FF5; Sat, 18 Jun 2011 16:07:56 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=darkbsd.org; h=message-id :date:from:mime-version:to:subject:content-type; s=selector1; bh=qxMQ+Z1joU4CsnjA1GngSTtHhqk=; b=mrQ0I/2cbCztGatgJll5G1XS6CSW 7wDz7XQpYdmMD9i6k3FeKNcYNm1XZGoVTTIG+4hQvhBEPaHPm1HfZ5Y2r3t2luph YkXZdwr7CMsMHyT1O/h3pFejCWP/6mWRjYrrlmOkvQpVO6WkyxRnHP8+SmWTj5L3 3s8gkbGjQoerC+Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=darkbsd.org; h=message-id :date:from:mime-version:to:subject:content-type; q=dns; s= selector1; b=qS0+daBzZu7nx51fQowh3IC//wFYOuxpZTM9UVh87WQiu001Y2W eYDQbUnFP4VbjNrWQPSEs7D6JOfrDT6vPDnT/AwcJJxJigAqBKhTeEljSfKzizDg Bd+m9TnCyaVmKf9wczgERte81q2U/nLQOlB40kwSuHGr1joMxF0Rut8A= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=darkbsd.org; h= content-type:content-type:subject:subject:mime-version :user-agent:from:from:date:date:message-id:received:received; s= selector1; t=1308406071; bh=HRSrx71OBoInPYqWhZzR/iCvG3HZKXjGv8EX 0GijCXc=; b=xX5WbHzQrh/GYCJ5WkYe/kbhdcNqFou5DoE6V1TSOjz3hPjO0k89 ilQemy/FMDtQtdhwVzkDNrJOySUfWClce883yye1EM2xDniXV1cKZpUwbJ+53+AJ Yi4Vx8JxEWWGk3jRJwEOkeZQrBY3oppTc4js+J/F40LePlQSKbj9I7I= Received: from quasar.darkbsd.org ([127.0.0.1]) by quasar.darkbsd.org (quasar.darkbsd.org [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id dRF6YtYnkAJN; Sat, 18 Jun 2011 16:07:51 +0200 (CEST) Received: from [192.168.3.42] 
(archer.yomi.darkbsd.org [192.168.3.42]) (Authenticated sender: darksoul) by quasar.darkbsd.org (Postfix) with ESMTPSA id 1984C6FEE; Sat, 18 Jun 2011 16:07:47 +0200 (CEST) Message-ID: <4DFCB12A.6030805@darkbsd.org> Date: Sat, 18 Jun 2011 23:07:38 +0900 From: Stephane LAPIE User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Thunderbird/3.1.10 MIME-Version: 1.0 To: freebsd-hardware@freebsd.org, freebsd-drivers@freebsd.org, freebsd-fs@freebsd.org X-Enigmail-Version: 1.1.2 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig8BE633BB83E59ACA09FD0D2A" Cc: Subject: Problem with a LSILogic SAS/SATA adapter on 8.2-STABLE/ZFSv28 X-BeenThere: freebsd-drivers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Writing device drivers for FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Jun 2011 14:26:22 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig8BE633BB83E59ACA09FD0D2A Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hello list, I have a problem with my 8.2-STABLE/ZFSv28 server. I am currently upgrading my disks from 1.5TB Seagate drives to 2TB Seagate drives, and therefore replacing devices within ZFS. (I have activated deduplication on a few file systems, for the record) I think this is more related to a hardware problem (flaky memory ? flaky controller/driver maybe ?), but I would appreciate any input. 
I experienced several kernel panics, all of which seem to point at mpt0 mis-handling interrupts :
www.darkbsd.org/~darksoul/kernel-panic-mpt1.txt (no target cmd ptrs)
www.darkbsd.org/~darksoul/kernel-panic-mpt2.txt (mpt_intr index == ...)
www.darkbsd.org/~darksoul/kernel-panic-mpt3.txt (NMI in kernel mode)
www.darkbsd.org/~darksoul/kernel-panic-mpt4.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt5.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt6.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt7.txt (LAN CONTEXT REPLY)

I would appreciate any pointers to what on earth "LAN CONTEXT REPLY" means for an LSI controller (using driver mpt(4)), as I have no idea, and the source was not really helpful.

The error message about an NMI and RAM parity error is what is scaring me the most here, and points me in the direction of flaky memory.

This is a personal machine, so I can add debug options and try stuff if it can help figure out what is going on. Also, any critical data is replicated, backed up and accounted for.

Thanks in advance for your time.

Here is a zpool list and a zpool status :

NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
prana  22.7T  17.4T  5.29T    76%  1.18x  DEGRADED  -

  pool: prana
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jun 18 12:43:02 2011
        13.8T scanned out of 17.3T at 236/s, (scan is slow, no estimated time)
        899G resilvered, 79.38% done
config:

NAME                                            STATE     READ WRITE CKSUM
prana                                           DEGRADED     0     0     0
  raidz1-0                                      DEGRADED     0     0     0
    da3                                         OFFLINE      0     0     0
    ad14                                        ONLINE       0     0     0
    ad12                                        ONLINE       0     0     0
    da1                                         ONLINE       0     0     0
    da0                                         ONLINE       0     0     0
  raidz1-1                                      DEGRADED     0     0     0
    ad26                                        ONLINE       0     0     0
    replacing-1                                 DEGRADED     0     0     0
      da6/old                                   OFFLINE      0     0     0
      da6                                       ONLINE       0     0     0  (resilvering)
    da4                                         ONLINE       0     0     0
    da7                                         ONLINE       0     0     0
    da5                                         ONLINE       0     0     0
  raidz1-2                                      ONLINE       0     0     0
    ad28                                        ONLINE       0     0     0
    ad8                                         ONLINE       0     0     0
    ad6                                         ONLINE       0     0     0
    ad16                                        ONLINE       0     0     0
    ad18                                        ONLINE       0     0     0
cache
  gptid/d9c047d5-c1a7-11df-b584-000e0c707d1e    ONLINE       0     0     0
  gptid/da695e56-c1a7-11df-b584-000e0c707d1e    ONLINE       0     0     0
spares
  da8                                           AVAIL
  da9                                           AVAIL

Here is my dmesg trace :

Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #1: Thu Jun 16 23:22:47 JST 2011
darksoul@eirei-no-za.yomi.darkbsd.org:/usr/storage/tech/eirei-no-za.yomi.darkbsd.org/usr/obj/usr/storage/tech/eirei-no-za.yomi.darkbsd.org/usr/src/sys/DARK-2011KERN amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz (2666.68-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x1067a Family = 6 Model = 17 Stepping = 10
Features=0xbfebfbff
Features2=0x408e3fd
AMD Features=0x20100800
AMD Features2=0x1
TSC: P-state invariant
real memory = 8589934592 (8192 MB)
avail memory = 8254509056 (7872 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 2
cpu3 (AP): APIC ID: 3
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
kbd1 at kbdmux0
ichwd module loaded
iscsi: version 2.2.4.2
cryptosoft0: on motherboard
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
cpu1: on acpi0
cpu2: on acpi0
cpu3: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: irq 16 at device 6.0 on pci0
pci3: on pcib1
pcib2: at device 0.0 on pci3
pci4: on pcib2
em0: port 0x2000-0x203f mem 0xdf980000-0xdf99ffff,0xdf900000-0xdf93ffff irq 16 at device 4.0 on pci4
em0: [FILTER]
em0: Ethernet address: 00:0e:0c:70:7d:1e
em1: port 0x2040-0x207f mem 0xdf9a0000-0xdf9bffff,0xdf940000-0xdf97ffff irq 17 at device 4.1 on pci4
em1: [FILTER]
em1: Ethernet address: 00:0e:0c:70:7d:1f
pcib3: at device 0.2 on pci3
pci5: on pcib3
em2: port 0x1820-0x183f mem 0xdfb00000-0xdfb1ffff,0xdfb20000-0xdfb20fff irq 16 at device 25.0 on pci0
em2: Using an MSI interrupt
em2: [FILTER]
em2: Ethernet address: 00:30:48:de:84:88
uhci0: port 0x1840-0x185f irq 16 at device 26.0 on pci0
uhci0: [ITHREAD]
usbus0: on
uhci0
uhci1: port 0x1860-0x187f irq 17 at device 26.1 on pci0
uhci1: [ITHREAD]
usbus1: on uhci1
uhci2: port 0x1880-0x189f irq 18 at device 26.2 on pci0
uhci2: [ITHREAD]
usbus2: on uhci2
ehci0: mem 0xdfb22800-0xdfb22bff irq 18 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus3: EHCI version 1.0
usbus3: on ehci0
pcib4: irq 16 at device 28.0 on pci0
pci6: on pcib4
pcib5: at device 0.0 on pci6
pci7: on pcib5
mpt0: port 0x3000-0x30ff mem 0xdf310000-0xdf313fff,0xdf300000-0xdf30ffff irq 24 at device 1.0 on pci7
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.12.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (10 Max)
atapci0: port 0x3400-0x34ff mem 0xdf200000-0xdf2fffff irq 28 at device 7.0 on pci7
atapci0: [ITHREAD]
ata2: on atapci0
ata2: [ITHREAD]
ata3: on atapci0
ata3: [ITHREAD]
ata4: on atapci0
ata4: [ITHREAD]
ata5: on atapci0
ata5: [ITHREAD]
ata6: on atapci0
ata6: [ITHREAD]
ata7: on atapci0
ata7: [ITHREAD]
ata8: on atapci0
ata8: [ITHREAD]
ata9: on atapci0
ata9: [ITHREAD]
uhci3: port 0x18a0-0x18bf irq 23 at device 29.0 on pci0
uhci3: [ITHREAD]
usbus4: on uhci3
uhci4: port 0x18c0-0x18df irq 22 at device 29.1 on pci0
uhci4: [ITHREAD]
usbus5: on uhci4
uhci5: port 0x18e0-0x18ff irq 18 at device 29.2 on pci0
uhci5: [ITHREAD]
usbus6: on uhci5
ehci1: mem 0xdfb22c00-0xdfb22fff irq 23 at device 29.7 on pci0
ehci1: [ITHREAD]
usbus7: EHCI version 1.0
usbus7: on ehci1
pcib6: at device 30.0 on pci0
pci17: on pcib6
em3: port 0x4080-0x40bf mem 0xdfa00000-0xdfa1ffff irq 20 at device 0.0 on pci17
em3: [FILTER]
em3: Ethernet address: 00:07:e9:0f:a3:80
em4: port 0x40c0-0x40ff mem 0xdfa20000-0xdfa3ffff irq 21 at device 0.1 on pci17
em4: [FILTER]
em4: Ethernet address: 00:07:e9:0f:a3:81
vgapci0: port 0x4000-0x407f mem 0xde800000-0xdeffffff,0xdfa40000-0xdfa4ffff at device 1.0 on pci17
fwohci0: mem 0xdfa54000-0xdfa547ff,0xdfa50000-0xdfa53fff irq 22 at device 3.0 on pci17
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No.
of Isochronous channels is 4.
fwohci0: EUI64 00:30:48:00:00:20:42:f6
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:30:48:20:42:f6
fwe0: Ethernet address: 02:30:48:20:42:f6
fwip0: on firewire0
fwip0: Firewire address: 00:30:48:00:00:20:42:f6 @ 0xfffe00000000, S400, maxrec 2048
dcons_crom0: on firewire0
dcons_crom0: bus_addr 0x80ab60
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode
atapci1: port 0x4420-0x4427,0x4414-0x4417,0x4418-0x441f,0x4410-0x4413,0x4400-0x440f irq 23 at device 4.0 on pci17
atapci1: [ITHREAD]
ata10: on atapci1
ata10: [ITHREAD]
isab0: at device 31.0 on pci0
isa0: on isab0
atapci2: port 0x1c70-0x1c77,0x1c64-0x1c67,0x1c68-0x1c6f,0x1c60-0x1c63,0x1c00-0x1c1f mem 0xdfb22000-0xdfb227ff irq 17 at device 31.2 on pci0
atapci2: [ITHREAD]
atapci2: AHCI called from vendor specific driver
atapci2: AHCI v1.20 controller with 6 3Gbps ports, PM supported
ata11: on atapci2
ata11: [ITHREAD]
ata12: on atapci2
ata12: [ITHREAD]
ata13: on atapci2
ata13: [ITHREAD]
ata14: on atapci2
ata14: [ITHREAD]
ata15: on atapci2
ata15: [ITHREAD]
ata16: on atapci2
ata16: [ITHREAD]
ichsmb0: port 0x1100-0x111f mem 0xdfb23000-0xdfb230ff irq 17 at device 31.3 on pci0
ichsmb0: [ITHREAD]
smbus0: on ichsmb0
smb0: on smbus0
pci0: at device 31.6 (no driver attached)
acpi_button0: on acpi0
atrtc0: port 0x70-0x71 irq 8 on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
ichwd0: on isa0
ichwd0: Intel ICH9R watchdog timer (ICH9 or equivalent)
orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
sc0: at flags 0x100 on
isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
coretemp0: on cpu0
est0: on cpu0
p4tcc0: on cpu0
coretemp1: on cpu1
est1: on cpu1
p4tcc1: on cpu1
coretemp2: on cpu2
est2: on cpu2
p4tcc2: on cpu2
coretemp3: on cpu3
est3: on cpu3
p4tcc3: on cpu3
ZFS filesystem version 5
ZFS storage pool version 28
Timecounters tick every 1.000 msec
firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
firewire0: bus manager 0
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 480Mbps High Speed USB v2.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 12Mbps Full Speed USB v1.0
usbus6: 12Mbps Full Speed USB v1.0
usbus7: 480Mbps High Speed USB v2.0
ugen0.1: at usbus0
uhub0: on usbus0
ugen1.1: at usbus1
uhub1: on usbus1
ugen2.1: at usbus2
uhub2: on usbus2
ugen3.1: at usbus3
uhub3: on usbus3
ugen4.1: at usbus4
uhub4: on usbus4
ugen5.1: at usbus5
uhub5: on usbus5
ugen6.1: at usbus6
uhub6: on usbus6
ugen7.1: at usbus7
uhub7: on usbus7
ad6: 1907729MB at ata3-master UDMA100 SATA 3Gb/s
ad8: 1907729MB at ata4-master UDMA100 SATA 3Gb/s
ad12: 1430799MB at ata6-master UDMA100 SATA 3Gb/s
ad14: 1907729MB at ata7-master UDMA100 SATA 3Gb/s
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
ad16: 1907729MB at ata8-master UDMA100 SATA 3Gb/s
ad18: 1907729MB at ata9-master UDMA100 SATA 3Gb/s
ata10: DMA limited to UDMA33, controller found non-ATA66 cable
ad20: 3823MB at ata10-master UDMA33
ad21: 61136MB at ata10-slave UDMA133
ad26: 1907729MB at ata13-master UDMA100 SATA 3Gb/s
ad28: 1907729MB at ata14-master UDMA100 SATA 3Gb/s
uhub3: 6 ports with 6 removable, self powered
uhub7: 6 ports with 6 removable, self powered
ugen7.2: at usbus7
umass0: on usbus7
umass0:
SCSI over Bulk-Only; quirks = 0x0000
ugen3.2: at usbus3
da0 at mpt0 bus 0 scbus0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 1430799MB (2930277168 512 byte sectors: 255H 63S/T 182401C)
da1 at mpt0 bus 0 scbus0 target 1 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 300.000MB/s transfers
da1: Command Queueing enabled
da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da2 at mpt0 bus 0 scbus0 target 2 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 300.000MB/s transfers
da2: Command Queueing enabled
da2: 61136MB (125206528 512 byte sectors: 255H 63S/T 7793C)
da3 at mpt0 bus 0 scbus0 target 3 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 300.000MB/s transfers
da3: Command Queueing enabled
da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da4 at mpt0 bus 0 scbus0 target 4 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 300.000MB/s transfers
da4: Command Queueing enabled
da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da5 at mpt0 bus 0 scbus0 target 5 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 300.000MB/s transfers
da5: Command Queueing enabled
da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da6 at mpt0 bus 0 scbus0 target 6 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 300.000MB/s transfers
da6: Command Queueing enabled
da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da7 at mpt0 bus 0 scbus0 target 7 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 300.000MB/s transfers
da7: Command Queueing enabled
da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Root mount waiting for: usbus7
umass0:2:0:-1: Attached to scbus2
da8 at umass-sim0 bus 0 scbus2 target 0 lun 0
da8: Fixed Direct Access SCSI-4 device
da8: 40.000MB/s transfers
da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
ugen7.3: at usbus7
uhub8: on usbus7
Root mount waiting for: usbus7
uhub8: 4 ports with 4 removable, self powered
ugen7.4: at usbus7
umass1: on usbus7
umass1: SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus7
umass1:3:1:-1: Attached to scbus3
da9 at umass-sim1 bus 1 scbus3 target 0 lun 0
da9: Fixed Direct Access SCSI-4 device
da9: 40.000MB/s transfers
da9: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
ugen7.5: at usbus7
uhub9: on usbus7
Root mount waiting for: usbus7
uhub9: 4 ports with 4 removable, self powered
Trying to mount root from zfs:prana

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo

--------------enig8BE633BB83E59ACA09FD0D2A
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk38sS4ACgkQ24Ql8u6TF2MdTQCfXGnImFL+4qSWHbV2SW6Qk0DT
DkcAniV5OC8yVxhigvYA/4Cpb+UP1eNk
=6Q2i
-----END PGP SIGNATURE-----

--------------enig8BE633BB83E59ACA09FD0D2A--

From owner-freebsd-drivers@FreeBSD.ORG Sat Jun 18 14:58:49 2011
Return-Path:
Delivered-To: freebsd-drivers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E14DD1065670 for ; Sat, 18 Jun 2011 14:58:48 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org)
Received: from qmta04.emeryville.ca.mail.comcast.net (qmta04.emeryville.ca.mail.comcast.net [76.96.30.40]) by mx1.freebsd.org (Postfix) with ESMTP id CA5178FC15 for ; Sat, 18 Jun 2011 14:58:48 +0000 (UTC)
Received: from omta24.emeryville.ca.mail.comcast.net ([76.96.30.92]) by qmta04.emeryville.ca.mail.comcast.net with comcast id xSd51g0051zF43QA4Sldlf; Sat, 18 Jun 2011 14:45:37 +0000
Received: from koitsu.dyndns.org ([67.180.84.87]) by omta24.emeryville.ca.mail.comcast.net with comcast id xSl51g00u1t3BNj8kSl6hX; Sat, 18 Jun 2011 14:45:07 +0000
Received: by icarus.home.lan (Postfix, from userid 1000) id AFC32102C36; Sat, 18 Jun 2011 07:45:36 -0700 (PDT)
Date: Sat, 18 Jun 2011 07:45:36 -0700
From: Jeremy Chadwick
To: Stephane LAPIE
Message-ID: <20110618144536.GA15627@icarus.home.lan>
References: <4DFCB12A.6030805@darkbsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4DFCB12A.6030805@darkbsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org, freebsd-drivers@freebsd.org, freebsd-hardware@freebsd.org
Subject: Re: Problem with a LSILogic SAS/SATA adapter on 8.2-STABLE/ZFSv28
X-BeenThere: freebsd-drivers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Writing device drivers for FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 18 Jun 2011 14:58:49 -0000

On Sat, Jun 18, 2011 at 11:07:38PM +0900, Stephane LAPIE wrote:
> I have a problem with my 8.2-STABLE/ZFSv28 server. I am currently
> upgrading my disks from 1.5TB Seagate drives to 2TB Seagate drives, and
> therefore replacing devices within ZFS. (I have activated deduplication
> on a few file systems, for the record)
>
> I think this is more related to a hardware problem (flaky memory ? flaky
> controller/driver maybe ?), but I would appreciate any input.
>
> I experienced several kernel panics, all of which seem to point at mpt0
> mis-handling interrupts :
> www.darkbsd.org/~darksoul/kernel-panic-mpt1.txt (no target cmd ptrs)
> www.darkbsd.org/~darksoul/kernel-panic-mpt2.txt (mpt_intr index == ...)
> www.darkbsd.org/~darksoul/kernel-panic-mpt3.txt (NMI in kernel mode)
> www.darkbsd.org/~darksoul/kernel-panic-mpt4.txt (LAN CONTEXT REPLY)
> www.darkbsd.org/~darksoul/kernel-panic-mpt5.txt (LAN CONTEXT REPLY)
> www.darkbsd.org/~darksoul/kernel-panic-mpt6.txt (LAN CONTEXT REPLY)
> www.darkbsd.org/~darksoul/kernel-panic-mpt7.txt (LAN CONTEXT REPLY)
>
> I would appreciate any pointers to what on earth "LAN CONTEXT REPLY"
> means for an LSI controller (using driver mpt(4)), as I have no idea,
> and the source was not really helpful.
>
> The error message about an NMI and RAM parity error is what is scaring
> me the most here, and points me in the direction of flaky memory.
>
> This is a personal machine, so I can add debug options and try stuff if
> it can help figure out what is going on. Also, any critical data is
> replicated, backed up and accounted for.

For readers, the NMI and RAM parity error message in question is shown here:

http://www.darkbsd.org/~darksoul/kernel-panic-mpt2.txt

It is difficult to decode, though, due to the well-established problem with the FreeBSD kernel interspersing text output. (I imagine this gets worse the more cores you have on your system, but that's not relevant to this discussion)

Anyway, to expand on the "RAM parity error" and NMI message: the information I'm going to give you isn't specific to the LSI controller; it's a general piece of information. I've talked about this in the past. Please read it and focus on the SERR/PERR and NMI details:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/010938.html

If you want to rule out actual system RAM issues, I would recommend running memtest86 for about 30 minutes, and then memtest86+ for the same amount of time. This might sound crazy ("why can't I just run one?!"), but you need to review the ChangeLog for memtest86 to see why.
Their support for detecting corrected ECC errors was removed with 4.0, but in 4.0 they added multi-CPU support (which is good to have in this situation), while memtest86+ may still have support for ECC.

Neither of these utilities is as thorough as a hardware RAM tester (which does cool things like sending extreme amounts of voltage through each DRAM module, looking for soft and hard errors, etc.), but those are expensive. Usually system memory problems will show up in memtest86/86+ pretty quickly though.

All that said: it may be possible that the NMIs you're seeing aren't being induced by system RAM issues at all, but somehow are being generated or caused by the LSI controller. I wasn't under the impression that a PCIe MSI and/or MSI-X generated an NMI, but I could be completely wrong.

You may want to try the memtest86/86+ tests with and without the LSI controller plugged into the system to see if there's any difference as well. So that's another hour of testing.

Anyway, hope this helps in some regard.

P.S. -- In the future, try to avoid cross-posting. :-)

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |