Date:      Sat, 01 Apr 2017 00:18:35 +0000
From:      Mike Meyer <mwm@mired.org>
To:        questions@freebsd.org
Subject:   Help with crashes on FreeBSD 11.0
Message-ID:  <CAD=7U2BkEydwtN45Nkzg=HDRzOJ7pwDj7M6TH0nbSntBEK7wdA@mail.gmail.com>

I've been chasing issues on my 11.0 system, and have about reached a dead
end. I'm hoping I can get some help here.

The initial issue was that Chromium would crash at random with an illegal
instruction. Google couldn't find any reports from others of such problems,
and there were no open issues about it, so I figured it was some kind of
corruption on my system. I checked package checksums, reinstalled packages
that weren't right, and otherwise put things back. Still happened. Tried
reinstalling all packages. Still no help. At that point, I tried Firefox
and it was also crashing, but with a different error (segmentation
violation or bus error). VirtualBox also started crashing, with whichever
of those two errors Firefox wasn't having. All three crashed pretty
reliably, each always with its own error, and no two with the same one. I
tried building a debug version of Chromium, but kept getting link errors
about "environ" being missing (huh?).

So I figured it was a hardware problem. I spent three days running various
versions of memtest86 and found zero errors. I started running some stress
tests. Looping over "make buildworld" ran for a day or so with no problems.
Changing it to "make -j 12 buildworld" (I see 12 CPUs on my hyperthreaded
6-core system) led to pretty consistent reboots after a few hours. But no
crash dumps.
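
A stress loop like the one described could be sketched roughly as follows.
This is only an illustration, not the script actually used: the function
name run_stress_loop, the log path, and the iteration count are all
assumptions. The one idea it adds is syncing a log line to disk before and
after every run, so that after an unexpected reboot the log still shows
which iteration was in flight and when it started.

```python
import os
import subprocess
import sys
import time


def _log(log, message):
    """Append a timestamped line and force it to disk before returning."""
    log.write(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} {message}\n")
    log.flush()
    os.fsync(log.fileno())  # survive a sudden reboot, not just a process exit


def run_stress_loop(cmd, iterations, log_path, cwd=None):
    """Run cmd `iterations` times; return the list of exit codes.

    A "start" line with no matching "end" line in the log means the
    machine went down while that iteration was running.
    """
    exit_codes = []
    with open(log_path, "a") as log:
        for i in range(1, iterations + 1):
            _log(log, f"start iteration={i}")
            result = subprocess.run(
                cmd,
                cwd=cwd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            exit_codes.append(result.returncode)
            _log(log, f"end iteration={i} exit={result.returncode}")
    return exit_codes


# Hypothetical invocation matching the loop described above:
# run_stress_loop(["make", "-j", "12", "buildworld"], 10,
#                 "/var/tmp/buildworld-stress.log", cwd="/usr/src")
```

Comparing the timestamp of the last "start" line against the boot time
reported after the reboot gives a rough idea of how long the load ran
before the machine went down.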

Ok, my swap is on a disk that has partitions in ZFS pools, so I figured
that might be an issue. Added a spare drive, pointed dumpdev at it, and
verified with debug.kdb.panic that I would get a core dump from a panic.
Reran the "make -j 12 buildworld" loop. Still no core dumps.
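
For checking after each reboot whether a dump was actually saved, a small
helper along these lines might do. It is a sketch under assumptions: it
presumes savecore(8) is writing to the default /var/crash directory with
the usual vmcore.N file names, and the name find_crash_dumps is made up
for illustration.

```python
import glob
import os


def find_crash_dumps(crash_dir="/var/crash", since=0.0):
    """Return vmcore.* files in crash_dir modified after `since`.

    `since` is a time in seconds since the epoch, for example the time
    the stress loop was started. Results are ordered oldest first.
    Symlinks (such as vmcore.last) are skipped so each dump is listed
    only once.
    """
    candidates = glob.glob(os.path.join(crash_dir, "vmcore.*"))
    dumps = [
        path
        for path in candidates
        if os.path.isfile(path)
        and not os.path.islink(path)
        and os.path.getmtime(path) > since
    ]
    return sorted(dumps, key=os.path.getmtime)
```

An empty result after a spontaneous reboot, when a deliberate
debug.kdb.panic test does produce a file, suggests the machine is
resetting without ever reaching the kernel's panic path.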

Ok, at this point I'm sort of at a loss. I'm pretty sure it's a hardware
problem, but have no idea how to narrow things down. Running a version of
one of the crashing programs under gdb with debug symbols might help, but
might not, and getting one seems problematic.

Anyone have suggestions? Maybe a good hardware test tool that tests things
other than memory? Any information I can provide that might help?

Thanks,
Mike