From: John Baldwin
To: Ireneusz Pluta
Cc: freebsd-hardware@freebsd.org
Date: Mon, 28 Jun 2010 14:09:23 -0400
Subject: Re: System hangs during heavy sequential write to mfi device
Message-Id: <201006281409.23546.jhb@freebsd.org>
In-Reply-To: <4C28E287.5010103@wp.pl>
References: <4C2499B5.3030404@wp.pl> <201006281326.08896.jhb@freebsd.org> <4C28E287.5010103@wp.pl>
List-Id: General discussion of FreeBSD hardware

On Monday 28 June 2010 1:57:27 pm Ireneusz Pluta wrote:
> John Baldwin wrote:
> > On Monday 28 June 2010 12:00:06 pm Ireneusz Pluta wrote:
> >> John Baldwin wrote:
> >>> On Friday 25 June 2010 4:59:57 pm Ireneusz Pluta wrote:
> >>>> John Baldwin wrote:
> >>>>> Hmmm.  You might have a hardware issue.  OTOH, you can try seeing if you
> >>>>> have a BIOS option to disable PCIE error logging.
> >>>>>
> >>>> is it one of them?:
> >>>>
> >>>>    Assert NMI on SERR
> >>>>    Assert NMI on PERR
> >>>>
> >>>> (pdf page 109 of: ->
> >>>> http://download.intel.com/support/motherboards/server/s5520hc/sb/e39529013_s5520hc_s5500hcv_s5520hct_tps_r1_9.pdf)
> >>>>
> >>> Well, that will turn off the NMIs.  Not sure if it will affect the event
> >>> logging, but it is worth a shot.
> >>>
> >> Per BIOS setup documentation:
> >>
> >>    On SERR, generate an NMI and log an error.
> >>    Note: [Enabled] must be selected for the Assert NMI
> >>    on PERR setup option to be visible.
> >>
> >> and:
> >>
> >>    On PERR, generate an NMI and log an error.
> >>    Note: This option is only active if the Assert NMI on
> >>    SERR option is [Enabled] selected.
> >>
> >> However, disabling them did not change anything.
> >>
> >
> > Is it still logging errors and sending NMIs with them disabled?
> >
> with the options I mentioned disabled. They do not have to be the only
> sources of NMIs, do they?

Well, they should be the sources of the log messages you found in your
system event log.  There is a good chance that you have some broken
hardware somewhere.  I'm not sure how easy it is for you to debug that by
swapping out components, but the RAID controller is the first thing I
would try.

-- 
John Baldwin