Date:      Wed, 11 Nov 2009 09:37:47 -0800
From:      David Wolfskill <david@catwhisker.org>
To:        hardware@freebsd.org
Subject:   7.2-STABLE i386 box crashing -- clues?
Message-ID:  <20091111173747.GA1150@albert.catwhisker.org>


--ADZbWkCsHQ7r3kzd
Content-Type: multipart/mixed; boundary="Kj7319i9nmIyA2yE"
Content-Disposition: inline


--Kj7319i9nmIyA2yE
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Sometimes, I find hardware Seriously Annoying.

I'm getting pretty close to that in the present case.

I have a pretty normal machine; it's an Intel D865GBF desktop
system board in a rack-mount 2U chassis; a single 2.6 GHz CPU where I've
(presently) disabled hyperthreading (in an effort to un-complexify
things).

I've enabled KDB.  I've gone into the BIOS to check the hardware "events
log".  I've set up serial console, and am running tip(1) to it from
within script(1) from another machine.  Nothing.  No clues.

Every once in a while, it just crashes -- hard.  It loses video output
at that point; Ctl+Alt+Esc doesn't appear to change anything; entering
(say) "reset" blindly at that point has no apparent effect.

Either a reset switch or a power cycle brings the machine back up again
... for a while.

Someone suggested that memory might be at fault; it had 2 512 MB PC3200
DIMMs.  Local place advertised having compatible Kingston memory of
those specs for $10/DIMM; when I got there, they said that it was out of
stock and they weren't getting any more, but they did have 1GB PC3200
DIMMs.  So I went ahead & got them (though it was rather more than $20),
and the next time it crashed, I swapped the memory.

This morning when I got up, it had crashed again.  I recalled that I had
at one point been hoping to run backups (to tape) from the machine, and
accordingly, had attached a SCSI host adaptor via PCI riser card.  Since
I had nothing actually connected to the card, I pulled it out of the
machine before bringing it back up.  (I also felt around for
excessively warm spots; nothing.  All fans spin up, as well.)

Well, it just crashed again.

Flaky CPU?  Flaky power supply?  How might I tell?

Attached dmesg.boot is from most recent boot, so it won't show the SCSI
card I pulled.

Please include me on replies, as I'm not subscribed to hardware@.

I've set Reply-To for convenience.

Thanks...

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

--Kj7319i9nmIyA2yE
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dmesg.boot"

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-STABLE #173 r198984: Fri Nov  6 13:53:16 PST 2009
    root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/ALBERT i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.60GHz (2593.68-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNXT-ID,xTPR>
  Logical CPUs per core: 2
real memory  = 2145579008 (2046 MB)
avail memory = 2089758720 (1992 MB)
ACPI APIC Table: <INTEL  D865GBF >
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <INTEL D865GBF> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 7fe00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xec00-0xec07 mem 0xf0000000-0xf7ffffff,0xffa80000-0xffafffff irq 16 at device 2.0 on pci0
agp0: <Intel 82865G (865G GMCH) SVGA controller> on vgapci0
agp0: detected 892k stolen memory
agp0: aperture size is 128M
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xdc00-0xdc1f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xe000-0xe01f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xe400-0xe41f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xe800-0xe81f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xffa7fc00-0xffa7ffff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
uhub4: 8 ports with 8 removable, self powered
pcib1: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci1: <ACPI PCI bus> on pcib1
fxp0: <Intel 82801BA (D865) Pro/100 VE Ethernet> port 0xcc00-0xcc3f mem 0xff8ff000-0xff8fffff irq 20 at device 8.0 on pci1
miibus0: <MII bus> on fxp0
inphy0: <i82562ET 10/100 media interface> PHY 1 on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:0c:f1:8f:fd:69
fxp0: [ITHREAD]
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
ichsmb0: <Intel 82801EB (ICH5) SMBus controller> port 0xd800-0xd81f irq 17 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
ichsmb0: [ITHREAD]
smbus0: <System Management Bus> on ichsmb0
smb0: <SMBus generic I/O> on smbus0
pcm0: <Intel ICH5 (82801EB)> mem 0xffa7f800-0xffa7f9ff,0xffa7f400-0xffa7f4ff irq 17 at device 31.5 on pci0
pcm0: [ITHREAD]
pcm0: primary codec not ready!
pcm0: <Analog Devices AD1985 AC97 Codec>
acpi_button0: <Sleep Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model Generic PS/2 mouse, device ID 0
fdc0: <floppy drive controller> port 0x3f0-0x3f1,0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio0: [FILTER]
cpu0: <ACPI CPU> on acpi0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc0000-0xc9fff pnpid ORM0000 on isa0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
ppbus0: [ITHREAD]
plip0: <PLIP network interface> on ppbus0
plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2593681984 Hz quality 800
Timecounters tick every 1.000 msec
ad0: 152627MB <WDC WD1600AAJB-22WRA0 58.01H58> at ata0-master UDMA100
ad1: 238475MB <Hitachi HDS722525VLAT80 V36OA6MA> at ata0-slave UDMA100
acd0: CDROM <SONY CD-ROM CDU5221/1.01> at ata1-master UDMA33
Trying to mount root from ufs:/dev/ad0s2a
WARNING: / was not properly dismounted
Limiting icmp unreach response from 235 to 200 packets/sec

--Kj7319i9nmIyA2yE--

--ADZbWkCsHQ7r3kzd
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.13 (FreeBSD)

iEYEARECAAYFAkr69moACgkQmprOCmdXAD1LgwCdE5dWURRR67kvcSehS2PxOakc
+yMAn0hlMpEl4jg0t/hiuBoFEEAGHLAo
=fhFm
-----END PGP SIGNATURE-----

--ADZbWkCsHQ7r3kzd--


