Date:      Thu, 1 Jun 2017 15:41:28 +0200
From:      Raimo Niskanen <raimo+freebsd@erix.ericsson.se>
To:        <freebsd-questions@freebsd.org>
Subject:   Re: Advice on kernel panics
Message-ID:  <20170601134128.GB2256@erix.ericsson.se>
In-Reply-To: <CAOgwaMvse3h7Kn+eZW_mz2EDR8PqG_x5=F0nGZ=JHk=ap7Dz+Q@mail.gmail.com>
References:  <20170529092043.GA89682@erix.ericsson.se> <20170601051030.GA39861@geeks.org> <20170601082749.GA80543@erix.ericsson.se> <CAOgwaMvse3h7Kn+eZW_mz2EDR8PqG_x5=F0nGZ=JHk=ap7Dz+Q@mail.gmail.com>

On Thu, Jun 01, 2017 at 03:07:36AM -0700, Mehmet Erol Sanliturk wrote:
> On Thu, Jun 1, 2017 at 1:27 AM, Raimo Niskanen <raimo+freebsd@erix.ericsson.se> wrote:
> 
> > On Thu, Jun 01, 2017 at 12:10:30AM -0500, Doug McIntyre wrote:
> > > On Mon, May 29, 2017 at 11:20:43AM +0200, Raimo Niskanen wrote:
> > > > I have a server that panics about every 3 days and need some advice on
> > > > how to handle that.
> > >
> > > I'd expect it is some sort of hardware failure, as I would expect
> > > kernel panics more on the order of once a decade with FreeBSD. I.e.,
> > > I've seen one or two on my hundred or so servers, but it's pretty rare.
> > >
> > > Check and recheck your hardware items.
> >
> > I have removed one of the four memory modules - it panicked again.  I will
> > rotate through all of them...
> >
> > >
> > > Run memtest86+. Check your drive hardware, turn on SMART checking.
> >
> > I have run memtest86+ overnight - no errors found.
> >
> > I have installed smartmontools - no errors found, short and long self tests
> > on both disks run fine.  zpool scrub repaired 0 errors and has no known
> > data errors.
> >
> >
> > Any further hints on how to "Check your drive hardware"?
> >
> >
> > Thank you for your advice.
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 
> 
> 
> Also check the cables, because sometimes a connector does not transmit
> data properly.

I'll see if I can do that.

> Another possibility is a faulty executable binary, because some of its
> bits may have been changed in place.

> Another possibility is the power rating (watts) of the power supply: adding
> new hardware may exceed the capacity of the existing supply. When running
> programs draw more power than the supply can deliver, circuits may
> malfunction.

This is a standard Dell PowerEdge R320 with two disks out of four possible
and no extras except for an extension board with two more Ethernet ports,
so that is rather unlikely, but worth looking into.

> 
> If possible, disconnecting the existing HDDs and installing a new OS on a
> spare disk may show whether a modified binary exists.

I will try "freebsd-update IDS" and see if it finds a checksum error.
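
For illustration only, the general idea of such a check (comparing file
checksums against a known-good list) can be sketched in Python. This is not
how freebsd-update is implemented; the function names and the manifest
format (relative path mapped to SHA-256 hex digest) are assumptions made up
for this sketch.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose digest differs.

    `manifest` is a hypothetical known-good index: relative path -> digest.
    """
    bad = []
    for rel, expected in sorted(manifest.items()):
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

A single flipped bit in a binary changes its digest, so a file damaged in
place would show up in the returned list.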

> 
> If the new install does not panic, then the existing installed parts may
> be defective.
> If the new install also panics, then your hardware (for example, the main
> board or circuits on the main board) has some trouble spots.
> 
> 
> Mehmet Erol Sanliturk

Thank you for your advice.


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


