Date:      Fri, 1 Aug 2003 13:54:02 +0100
From:      Bruce Cran <bruce@cran.org.uk>
To:        "Karel J. Bosschaart" <K.J.Bosschaart@tue.nl>
Cc:        current@freebsd.org
Subject:   Re: make buildworld: Signal 11; Illegal instruction
Message-ID:  <20030801125402.GA50343@buffy.brucec.backnet>
In-Reply-To: <20030801124116.GA2688@phys9911.phys.tue.nl>
References:  <20030801110415.GA13918@speedy.unibe.ch> <20030801124116.GA2688@phys9911.phys.tue.nl>

On Fri, Aug 01, 2003 at 02:41:16PM +0200, Karel J. Bosschaart wrote:
> On Fri, Aug 01, 2003 at 01:04:16PM +0200, Tobias Roth wrote:
> > On Thu, Jul 31, 2003 at 09:52:08PM +0100, Bruce Cran wrote:
> > > On Thu, Jul 31, 2003 at 03:03:01PM -0400, Chris Shenton wrote:
> > > > Chris Shenton <chris@shenton.org> writes:
> > > > 
> > > > >   *** Signal 11
> > > > >... 
> > > > >   Illegal instruction (core dumped)
> > > > >   *** Error code 132
> > > > 
> > > > Also seeing
> > > > 
> > > > *** Signal 4
> > > > 
> > > > if it matters.  This sounds way too flakey to be SW.
> > > 
> > > I'm seeing the same symptoms.   I got a signal 4 when running 'clean' in the
> > > pam authentication directory, and I've just had a signal 11 running
> > > 'rm -f libradius.so'.  This is an install from a snapshot I built today -
> > > during the install I had panics in _mtx_init_ and a backtrace traced through
> > > vfs and ffs functions, and I only managed to install successfully when I
> > > had the CPU throttled to 30%.  This is the same computer which ran memtest86
> > > for 8 hours without a single fault last night, so I doubt the hardware's
> > > faulty, at least not the memory or the CPU.
> > 
> > memtest86 does not always catch memory errors. sig11 and sig4 at varying
> > locations during buildworld are a sure indicator for a hardware problem.
> > most likely a memory or overheating issue, though other hardware related
> > causes are possible.
> > 
> > if you still are not convinced that this is a hardware issue, run
> > buildworld on a -stable system.
> > 
> > more and more latest generation laptops from different manufacturers show
> > these symptoms during hot days. my guess is that mobile pentium 4 systems
> > are just not as stable as they should be. let's hope things get better with
> > the pentium m chips, or that the manufacturers deploy better quality control
> > to catch the numerous faulty systems.
> 
> My stock Dell Optiplex GX260, P4 based with 256 MB RAM, running -current,
> would spit signals 4, 10 and 11 (and also 6, don't remember) all over the place
> during buildworld when not having these kernel options:
> 
> options         DISABLE_PSE
> options         DISABLE_PG_G
> 
> Search the -current archive, it's due to a processor bug but there is
> no detailed public information about it and hence no 'official' fix.
> 
> You might try and see if it helps for you. memtest86 and other hardware
> testers won't notice anything because it's in the CPU and officially
> unknown.
> 
> But yes, also keep in mind that there might be overheating issues if 
> the weather is hot; yesterday my -stable machine at home rebooted during
> a port build: turned out to be a flatcable being too close to the CPU fan...
> 
> Karel.

Thanks, I'd come to the conclusion it must have been the P4 bug.   The system
gets hot, sometimes 65 deg C during builds, but it very rarely aborts on a
signal 11.   I don't quite understand what happened yesterday to break it so
badly; maybe it was because I was newly installing a -CURRENT snapshot I'd
built with pentium2 optimisations, but I don't know.

--
Bruce Cran


