From: "Nicola Tiling"
Date: Fri, 3 Aug 2007 10:06:39 +0200
Message-ID: <007001c7d5a5$38c95ff0$c92ca8c0@zora>
Subject: sata problems? / system freezes
List-Id: Production branch of FreeBSD source code

Hi,

I have problems with the following combination:

Mainboard: Intel Serverboard SE7221BK1-E, ICH6 chipset, BIOS version P06
HDD: WD 4000YR, no RAID
FreeBSD 6.2-STABLE-200702 (boot messages further down)

From time to time (every 2-4 weeks) the server hangs without any message on
the console or in the logs. It is not a kernel panic: the system freezes and
does not react to anything, except that the kernel debugger can still be
entered. It seems that the ata driver is stuck handling an interrupt and
doesn't know what to do.

Can anybody give additional information about this?

----------------------------------------------------------------------------
db> bt
Tracing pid 24 tid 100020 td 0xc637a480
kdb_enter(c072a8e2,e4f91bc8,c0,c637a480,c64ac400,...) at kdb_enter+0x30
siointr1(c64ac400,c64b60c0,c636f4c8,e4f91bec,c06d1799,...) at siointr1+0xd1
siointr(c64ac400,c6370000,e4f91bec,0,c637a480,...) at siointr+0x42
intr_execute_handlers(c636f4c8,e4f91c04,e4f91c7c,c06cdb33,37,...) at intr_execute_handlers+0xfa
lapic_handle_intr(37) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc044c3ab, esp = 0xe4f91c48, ebp = 0xe4f91c7c ---
ata_ahci_status(c64a4880,c63ffd38,c63ffc90,e4f91ce0,c0530521,...) at ata_ahci_status+0x57
ata_interrupt(c6479c00,c64a0cc0,4,e4f91ce0,c050ede2,...) at ata_interrupt+0x68
ata_generic_intr(c6489600,c637a480,f18bb,f2539122,c637a480,...) at ata_generic_intr+0x25
ithread_execute_handlers(c63ffc90,c6376300,c63ffc90,c637a480,c63ffc90,...) at ithread_execute_handlers+0x15e
ithread_loop(c6462170,e4f91d38,ffffffff,ffffffff,ffffffff,...) at ithread_loop+0x63
fork_exit(c050eec0,c6462170,e4f91d38) at fork_exit+0x7a
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f91d6c, ebp = 0 ---
----------------------------------------------------------------------------

Running alltrace, then leaving the debugger and re-entering it:

----------------------------------------------------------------------------
...
Tracing command init pid 1 tid 100007 td 0xc6379480
sched_switch(c6379480,0,1,11dd38b4,8d0595a4,...) at sched_switch+0x158
mi_switch(1,0) at mi_switch+0x1d4
sleepq_switch(c637f000,c6379480,0,e4f73c2c,c052ffcb,...) at sleepq_switch+0x91
sleepq_wait_sig(c637f000,5c,c07179db,100,c649c648,...) at sleepq_wait_sig+0x21
msleep(c637f000,c637f068,15c,c07179db,0,...) at msleep+0x288
kern_wait(c6379480,ffffffff,e4f73c78,0,0,...) at kern_wait+0xb10
wait4(c6379480,e4f73d04,10,1a031,0,...) at wait4+0x3c
syscall(3b,3b,bfbf003b,2,bfbfeef8,...) at syscall+0x34a
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (7, FreeBSD ELF32, wait4), eip = 0x8054197, esp = 0xbfbfed6c, ebp = 0xbfbfed88 ---

Tracing command swapper pid 0 tid 0 td 0xc076f500
sched_switch(c076f500,0,1,b624e016,aa96b765,...) at sched_switch+0x158
mi_switch(1,0,0,0,0,...) at mi_switch+0x1d4
scheduler(0,c1e000,c1ec00,c1e000,0,...) at scheduler+0x224
mi_startup() at mi_startup+0xa0
begin() at begin+0x2c
----------------------------------------------------------------------------
db> continue
~KDB: enter: Line break on console
[thread pid 24 tid 100020 ]
Stopped at kdb_enter+0x30: leave
----------------------------------------------------------------------------
db> bt
Tracing pid 24 tid 100020 td 0xc637a480
kdb_enter(c072a8e2,0,0,c637a480,c64ac400,...) at kdb_enter+0x30
siointr1(c64ac400,c64b60c0,c636f4c8,e4f91c80,c06d1799,...) at siointr1+0xd1
siointr(c64ac400,e4f91c74,c053cba3,0,c637a480,...) at siointr+0x42
intr_execute_handlers(c636f4c8,e4f91c98,e4f91ce0,c06cdb33,37,...) at intr_execute_handlers+0xfa
lapic_handle_intr(37) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc06d781f, esp = 0xe4f91cdc, ebp = 0xe4f91ce0 ---
spinlock_exit(1,0,c63ffc90,c637a480,c63ffc90,...) at spinlock_exit+0x28
ithread_loop(c6462170,e4f91d38,ffffffff,ffffffff,ffffffff,...) at ithread_loop+0xf4
fork_exit(c050eec0,c6462170,e4f91d38) at fork_exit+0x7a
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f91d6c, ebp = 0 ---
----------------------------------------------------------------------------

Trying to boot, but the HDD hangs in a timeout loop:

----------------------------------------------------------------------------
db> call boot
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...
FreeBSD/i386 em1: watchdog timeout -- resetting
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029696
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029760
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
timed out
Waiting (max 60 seconds) for system process `syncer' to stop...ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=100029696
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=100029760
...
----------------------------------------------------------------------------

REBOOT over IPMI

----------------------------------------------------------------------------
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE-200702 #11: Sun Jul 15 21:17:16 CEST 2007
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.52-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf43 Stepping = 3
  Features=0xbfebfbff
  Features2=0x649d>
  AMD Features=0x20000000
  Logical CPUs per core: 2
real memory = 2138984448 (2039 MB)
avail memory = 2088189952 (1991 MB)
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
ioapic2 irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi0: reservation of 500, 10 (4) failed
acpi0: reservation of 560, 20 (4) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 2.0 (no driver attached)
pcib1: irq 16 at device 28.0 on pci0
pci2: on pcib1
pcib2: at device 0.0 on pci2
pci4: on pcib2
em0: port 0xef80-0xefbf mem 0xdffe0000-0xdfffffff irq 27 at device 3.0 on pci4
em0: Ethernet address: 00:0e:0c:4a:a7:fd
pcib3: at device 0.2 on pci2
pci3: on pcib3
uhci0: port 0xcc00-0xcc1f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xcc80-0xcc9f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xcd00-0xcd1f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xdfdff800-0xdfdffbff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib4: at device 30.0 on pci0
pci1: on pcib4
em1: port 0xdf80-0xdfbf mem 0xdfee0000-0xdfefffff irq 18 at device 3.0 on pci1
em1: Ethernet address: 00:0e:0c:4a:a7:fc
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376 at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0xcf80-0xcf87,0xcf00-0xcf03,0xce80-0xce87,0xce00-0xce03,0xcd80-0xcd8f mem 0xdfdffc00-0xdfdfffff irq 19 at device 31.2 on pci0
atapci1: AHCI Version 01.00 controller with 4 ports detected
ata2: on atapci1
ata3: on atapci1
ata4: on atapci1
ata5: on atapci1
ichsmb0: port 0x400-0x41f irq 19 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
smbus0: on ichsmb0
ipmi0: on smbus0
ipmi0: SSIF mode found at address 0x42 on smbus
acpi_button0: on acpi0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
pmtimer0 on isa0
orm0: at iomem 0xc9800-0xca7ff,0xca800-0xcb7ff,0xdc000-0xdffff on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: parallel port not found.
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2992519478 Hz quality 800
Timecounters tick every 1.000 msec
em0: link state changed to UP
em1: link state changed to UP
acd0: DVDROM at ata0-master UDMA33
ad4: 381554MB at ata2-master SATA150
ad6: 381554MB at ata3-master SATA150
ad8: 381554MB at ata4-master SATA150
ipmi0: IPMI device rev. 1, firmware rev. 2.81, version 1.5
ipmi0: Number of channels 0
ipmi0: Attached watchdog
Trying to mount root from ufs:/dev/ad4s2a
WARNING: / was not properly dismounted
Loading configuration files.
kernel dumps on /dev/ad4s2b
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/ad4s2b as swap device
Starting file system checks:
/dev/ad4s2a: 3492 files, 72574 used, 940441 free (2689 frags, 117219 blocks, 0.3% fragmentation)
/dev/ad6s2a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2a: clean, 940441 free (497 frags, 117493 blocks, 0.0% fragmentation)
/dev/ad8s1d: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2d: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2e: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2f: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p1: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p2: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p3: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p11: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p10: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p4: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p5: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p12: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p13: DEFER FOR BACKGROUND CHECKING
/dev/ad6s2d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2d: clean, 5009816 free (15968 frags, 624231 blocks, 0.2% fragmentation)
/dev/ad6s2e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2e: clean, 1437962 free (8354 frags, 178701 blocks, 0.3% fragmentation)
/dev/ad6s2f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2f: clean, 1011948 free (156 frags, 126474 blocks, 0.0% fragmentation)
/dev/ad6s3p1: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p1: clean, 1384097 free (2433 frags, 172708 blocks, 0.2% fragmentation)
/dev/ad6s3p2: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p2: clean, 11232980 free (14020 frags, 1402370 blocks, 0.1% fragmentation)
/dev/ad6s3p3: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p3: clean, 9642063 free (15775 frags, 1203286 blocks, 0.1% fragmentation)
/dev/ad6s3p11: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p11: clean, 772504 free (6136 frags, 95796 blocks, 0.4% fragmentation)
/dev/ad6s3p10: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p10: clean, 1298386 free (5226 frags, 161645 blocks, 0.3% fragmentation)
/dev/ad6s3p4: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p4: clean, 2140124 free (5428 frags, 266837 blocks, 0.2% fragmentation)
/dev/ad6s3p5: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p5: clean, 776311 free (4455 frags, 96482 blocks, 0.4% fragmentation)
/dev/ad6s3p12: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p12: clean, 14682517 free (5149 frags, 1834671 blocks, 0.0% fragmentation)
/dev/ad6s3p13: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p13: clean, 7697255 free (6631 frags, 961328 blocks, 0.1% fragmentation)
Mounting local file systems:WARNING: /usr was not properly dismounted