Date:      Mon, 16 Nov 2009 11:04:10 -0800
From:      David Wolfskill <david@catwhisker.org>
To:        Peter Jeremy <peterjeremy@acm.org>
Cc:        hardware@freebsd.org
Subject:   Re: 7.2-STABLE i386 box crashing -- clues?
Message-ID:  <20091116190410.GA1589@albert.catwhisker.org>
In-Reply-To: <20091116182924.GA30969@server.vk2pj.dyndns.org>
References:  <20091111173747.GA1150@albert.catwhisker.org> <20091112062708.GA16648@server.vk2pj.dyndns.org> <20091112125903.GA1631@albert.catwhisker.org> <20091116182924.GA30969@server.vk2pj.dyndns.org>



On Tue, Nov 17, 2009 at 05:29:24AM +1100, Peter Jeremy wrote:
> ...
> >Yes; the machine is configured to start xdm on transition to
> >multi-user, as my spouse used to use it as a desktop.
>
> Does the problem still appear if you don't start X?

I haven't tried that yet....

> Is it running anything unusual when it crashes?

Not that I can tell, no.  Though I did just notice that the whines about
icmp unreach responses are actually coming from the machine that's
crashing ("albert") rather than the firewall box (which is configured to
log everything to albert).

> >> At this stage, my suggestion would be to try swapping the PSU.
> >
> >Thanks.  I'll discuss it with the "family CFO."
>
> You can't swap it with another of your systems?  Even if it doesn't
> fit neatly into the case, a temporary swap would give you some
> confidence as to whether it was really faulty or not (especially if
> the random reboots move to the other system).

I think it's more a matter of "at all" rather than "neatly." :-}  I tend
to have a variety of hardware, but each machine tends to be from a
different era or have other differences that cause each to be a one-off.

But I'll see what I can find.

In the meantime, the machine crashed this morning after I got in to
work -- but wonder of wonders, it came back up again this time.

And the typescript file that's capturing the serial console activity
showed:

fxp0: link state changed to UP
Limiting icmp unreach response from 234 to 200 packets/sec

FreeBSD/i386 (albert.catwhisker.org) (ttyd0)

login: drm0: <Intel i865G GMCH> on vgapci0
vgapci0: child drm0 requested pci_enable_busmaster
info: [drm] AGP at 0xf0000000 128MB
info: [drm] Initialized i915 1.6.0 20080730
drm0: [ITHREAD]
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
Limiting icmp unreach response from 201 to 200 packets/sec
panic: vm_fault: fault on nofault entry, addr: c3983000
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0bf0330,e7d168f8,c082cae9,c0c1237c,0,...) at 0xc049e9a6 = db_trace_self_wrapper+0x26
kdb_backtrace(c0c1237c,0,c0c07dfe,e7d16904,0,...) at 0xc085a239 = kdb_backtrace+0x29
panic(c0c07dfe,c3983000,2,e7d169fc,e7d169ec,...) at 0xc082cae9 = panic+0x119
vm_fault(c1471000,c3983000,2,0,e7d16a7c,...) at 0xc0a6ec88 = vm_fault+0x178
trap_pfault(c0d4de20,e7d16ac4,c0b2b675,0,c660cb00,...) at 0xc0b3f60e = trap_pfault+0x20e
trap(e7d16b3c) at 0xc0b400b5 = trap+0x445
calltrap() at 0xc0b22dbb = calltrap+0x6
--- trap 0xc, eip = 0xc0b37648, esp = 0xe7d16b7c, ebp = 0xe7d16b88 ---
pmap_try_insert_pv_entry(c08b14c4,c63c0cf0,c63c0cf0,e7d16bbc,c08b4b17,...) at 0xc0b37648 = pmap_try_insert_pv_entry+0x48
pmap_copy(c6cd2d74,c69dd350,33f7d000,f4000,33f7d000,...) at 0xc0b3c1e8 = pmap_copy+0x2e8
vmspace_fork(c69dd2c4,0,2,e7d16c5c,bfbfc824,...) at 0xc0a7698b = vmspace_fork+0x42b
fork1(c63c3b40,14,0,e7d16c78,0,...) at 0xc08051ee = fork1+0x30e
fork(c63c3b40,e7d16cfc,c,8001550d,369e99,...) at 0xc0806b79 = fork+0x29
syscall(e7d16d38) at 0xc0b3f9c5 = syscall+0x335
Xint0x80_syscall() at 0xc0b22e20 = Xint0x80_syscall+0x20
--- syscall (2, FreeBSD ELF32, fork), eip = 0x340cde4b, esp = 0xbfbfc7cc, ebp = 0xbfbfc858 ---
Uptime: 3d4h1m43s
Physical memory: 2033 MB
Dumping 179 MB: 164 148 132 116 100 84 68 52 36 20 4
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...


Taking a quick look at vmcore.1, I see:

albert(7.2-S)[5] kgdb /boot/kernel/kernel vmcore.1
GNU gdb 6.1.1 [FreeBSD]
...
[above stuff...]
...
#0  doadump () at pcpu.h:196
196     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc082c817 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc082cb22 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0a6ec88 in vm_fault (map=0xc1471000, vaddr=3281530880,
    fault_type=2 '\002', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:277
#4  0xc0b3f60e in trap_pfault (frame=0xe7d16b3c, usermode=0, eva=3281531764)
    at /usr/src/sys/i386/i386/trap.c:852
#5  0xc0b400b5 in trap (frame=0xe7d16b3c) at /usr/src/sys/i386/i386/trap.c:541
#6  0xc0b22dbb in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#7  0xc0b37648 in pmap_try_insert_pv_entry (pmap=0xc6cd2d74, va=872140800,
    m=0xc2be5110) at /usr/src/sys/i386/i386/pmap.c:2229
#8  0xc0b3c1e8 in pmap_copy (dst_pmap=0xc6cd2d74, src_pmap=0xc69dd350,
    dst_addr=871878656, len=999424, src_addr=871878656)
    at /usr/src/sys/i386/i386/pmap.c:3677
#9  0xc0a7698b in vmspace_fork (vm1=0xc69dd2c4)
    at /usr/src/sys/vm/vm_map.c:2552
#10 0xc08051ee in fork1 (td=0xc63c3b40, flags=Variable "flags" is not available.
)
    at /usr/src/sys/kern/kern_fork.c:288
#11 0xc0806b79 in fork (td=0xc63c3b40, uap=0xe7d16cfc)
    at /usr/src/sys/kern/kern_fork.c:107
#12 0xc0b3f9c5 in syscall (frame=0xe7d16d38)
    at /usr/src/sys/i386/i386/trap.c:1101
#13 0xc0b22e20 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:262
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
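
kgdb prints several of these arguments in decimal, while the console
backtrace prints the same values in hex, so the two are awkward to compare
by eye.  The following is only a rough cross-check, not part of the
original session: it uses the decimal values copied from frames #3, #4,
#7 and #8 above and assumes the standard 4 KB i386 page size.  The
helper name page_base is made up for illustration.

```python
# Cross-check the decimal values kgdb printed against the hex values
# in the console backtrace.  All numbers are copied from the trace above.

PAGE_SIZE = 4096  # assumption: standard 4 KB i386 base page


def page_base(addr: int) -> int:
    """Round an address down to the start of its page (hypothetical helper)."""
    return addr & ~(PAGE_SIZE - 1)


vaddr = 3281530880     # frame #3, vm_fault(vaddr=...)
eva = 3281531764       # frame #4, trap_pfault(eva=...)
dst_addr = 871878656   # frame #8, pmap_copy(dst_addr=...)
length = 999424        # frame #8, pmap_copy(len=...)
va = 872140800         # frame #7, pmap_try_insert_pv_entry(va=...)

# vm_fault's vaddr is the address named in the panic message (c3983000).
print("vaddr      =", hex(vaddr))
# The raw faulting address, and the page it lies in.
print("eva        =", hex(eva), "-> page", hex(page_base(eva)))
# pmap_copy's range, matching 33f7d000 / f4000 in the console trace.
print("dst_addr   =", hex(dst_addr), "len =", hex(length))
# The user va being copied falls inside that range.
print("va         =", hex(va),
      "in range:", dst_addr <= va < dst_addr + length)
```

All this shows is that the kgdb frames and the console trace describe the
same fault: the kernel address that faulted sits in the page reported by
the panic, and the user va being copied lies within the range handed to
pmap_copy.  It says nothing about why that kernel address was not mapped.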


If the PSU is at fault, I'm not sure there's much to be gained by
spending time on that dump -- I understand that little of what it
contains can be trusted if the PSU is flaky.

Thanks for your help!

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.



