Date:      Fri, 25 May 2007 11:07:10 +0200
From:      Volker <volker@vwsoft.com>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        rmiranda@digitalrelay.ca, freebsd-stable@FreeBSD.ORG
Subject:   Re: ghosthunting: machine freeze 6.2R
Message-ID:  <4656A73E.9040109@vwsoft.com>
In-Reply-To: <20070523215818.GB64723@xor.obsecurity.org>
References:  <200705230717.l4N7HuPW010071@lurza.secnetix.de>	<465408F9.6080302@vwsoft.com> <4654C0C4.2030405@vwsoft.com> <20070523215818.GB64723@xor.obsecurity.org>

Kris, Roger & all,

On 05/23/07 23:58, Kris Kennaway wrote:
> On Wed, May 23, 2007 at 11:31:32PM +0100, Volker wrote:
>> talking to myself... ;)
>>
>> On 2007-05-23 10:27, Volker wrote:
>> Unfortunately three hours later, the machine died completely. It has
>> been a hardware failure which came quietly.
>>
>> Sorry for the noise I've put on this list but when experiencing
>> things like that, one has to think in all possible directions (I
>> first thought about a DoS attack).
> 
> Even though it turned out to be a hardware failure, it was helpful to
> publicize this fact.  It is often difficult to convince users to
> accept the possibility that hardware failure may be the cause of weird
> system behaviour, because "it has always been fine".  It is worth
> remembering that if your hardware is going to fail, then there is
> going to be a first time.

Well, we replaced the broken machine (with totally different hardware), 
moved one of the mirrored hard disks into the replacement machine, 
and put the replacement into production.

Unfortunately it took less than 16 hours for the replacement 
machine to freeze as well. Since it now happens on two different 
machines, I assume the freeze has nothing to do with bad hardware. 
The replacement has no em NICs but shows the same behavior, so I 
don't think it's em related either. As I really want to know what's 
going on, I'm now compiling a new world + kernel with WITNESS and 
INVARIANTS support to see if I can catch something.

I'm using the following additional kernel options:

makeoptions     DEBUG=-g
options         KDB
options         KDB_UNATTENDED
options         KDB_TRACE
options         DDB
options         WITNESS
options         WITNESS_SKIPSPIN
options         INVARIANTS
options         INVARIANT_SUPPORT
options         DIAGNOSTIC
options         PANIC_REBOOT_WAIT_TIME=60
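
For reference, here is a rough annotated sketch of the same lines as 
they might appear at the end of a copy of the kernel config. The 
comments are my own summary of the option descriptions in 
sys/conf/NOTES and should be read as assumptions, not as 
authoritative documentation:

makeoptions     DEBUG=-g                    # build the kernel with debug symbols
options         KDB                         # kernel debugger framework
options         KDB_UNATTENDED              # don't drop into the debugger on panic
options         KDB_TRACE                   # print a stack trace on panic
options         DDB                         # DDB debugger backend
options         WITNESS                     # lock order checking
options         WITNESS_SKIPSPIN            # skip WITNESS checks on spin mutexes
options         INVARIANTS                  # extra run-time sanity checks
options         INVARIANT_SUPPORT           # support code needed by INVARIANTS
options         DIAGNOSTIC                  # extra (noisy) diagnostic output
options         PANIC_REBOOT_WAIT_TIME=60   # seconds to wait before reboot after panic

As far as I understand it, INVARIANTS depends on INVARIANT_SUPPORT, 
and WITNESS_SKIPSPIN only has an effect when WITNESS is enabled.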

Suggestions on these options? Anything more to enable with massive 
performance loss?

`uname -v':
FreeBSD 6.2-RELEASE-p1 #0: Sun Feb 11 22:35:18 CET 2007
While it's now in a box with an Athlon XP, it's still an i386 build.

Is there anything else I can do to debug these freezes?

Thx

Volker


dmesg:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p1 #0: Sun Feb 11 22:35:18 CET 2007
     root@GwMbg.elbekies.net:/usr/obj/usr/src/sys/GwMbg
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP (1198.83-MHz 686-class CPU)
   Origin = "AuthenticAMD"  Id = 0x681  Stepping = 1
   Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
   AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow+,3DNow>
real memory  = 1073676288 (1023 MB)
avail memory = 1041526784 (993 MB)
ACPI APIC Table: <AMIINT VIA_K7  >
ioapic0 <Version 0.3> irqs 0-23 on motherboard
acpi0: <AMIINT VIA_K7> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA 8377 (Apollo KT400/KT400A/KT600) host to PCI bridge> mem 0xe0000000-0xe3ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
rl0: <RealTek 8139 10/100BaseTX> port 0xd400-0xd4ff mem 0xdfffbf00-0xdfffbfff irq 16 at device 12.0 on pci0
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:80:48:15:3f:26
atapci0: <VIA 6420 SATA150 controller> port 0xec00-0xec07,0xe800-0xe803,0xe400-0xe407,0xe000-0xe003,0xdc00-0xdc0f,0xd800-0xd8ff irq 20 at device 15.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
atapci1: <VIA 8237 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: <ATA channel 0> on atapci1
ata1: <ATA channel 1> on atapci1
uhci0: <VIA 83C572 USB controller> port 0xc400-0xc41f irq 21 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xc800-0xc81f irq 21 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xcc00-0xcc1f irq 21 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 21 at device 16.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: <VIA 83C572 USB controller> on uhci3
usb3: USB revision 1.0
uhub3: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: <VIA VT6202 USB 2.0 controller> mem 0xdfffbd00-0xdfffbdff irq 21 at device 16.4 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <VIA VT6202 USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
pci0: <multimedia, audio> at device 17.5 (no driver attached)
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xbc00-0xbcff mem 0xdfffbc00-0xdfffbcff irq 23 at device 18.0 on pci0
miibus1: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> on miibus1
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:13:8f:0f:14:d2
acpi_button1: <Sleep Button> on acpi0
fdc0: <floppy drive controller> port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77b irq 7 drq 0 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <12 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1198831745 Hz quality 800
Timecounters tick every 1.000 msec
Fast IPsec: Initialized Security Association Processing.
acd0: DVDROM <TOSHIBA DVD-ROM SD-M1402/1008> at ata1-master UDMA33
ad4: 76293MB <Maxtor 6Y080M0 YAR51HW0> at ata2-master SATA150
ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
ar0: 76293MB <FreeBSD PseudoRAID RAID1> status: DEGRADED
ar0: disk0 DOWN no device found for this subdisk
ar0: disk1 READY (mirror) using ad4 at ata2-master
cd0 at ata1 bus 0 target 0 lun 0
cd0: <TOSHIBA DVD-ROM SD-M1402 1008> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers



