From: kama
Date: Mon, 25 Sep 2006 18:13:29 +0200 (CEST)
To: Per olof Ljungmark
Cc: freebsd-proliant@freebsd.org
Subject: Re: DL380G2 instability
Message-ID: <20060925174144.R39281@ns1.as.pvp.se>
In-Reply-To: <4517DCE3.2090206@intersonic.se>
References: <4517C9AE.6000107@intersonic.se> <20060925132626.GG26539@voodoo.schug.net> <4517DCE3.2090206@intersonic.se>
List-Id: "Technical discussion of FreeBSD on HP ProLiant server platforms."

On Mon, 25 Sep 2006, Per olof Ljungmark wrote:

> Christoph Schug wrote:
> > On Mon, Sep 25, 2006, Per olof Ljungmark wrote:
> >
> >> I have tried now for some time to get FBSD running on a couple of
> >> DL380G2 with dual PIII 1.4G but never got it right. The boxes are
> >> identical in both hw setup and the way they behave so I figured this may
> >> be FreeBSD related, not a hardware fault.
> > [...]
> >> Anyone out there who has this model running SMP OK?
> >
> > I've got several DL380 G2 running 6-STABLE and 6.1-RELEASE without any
> > hitch, both in UP and SMP configurations. ACPI is disabled, OS set to
> > 'Linux'. Other BIOS settings are default, firmware levels are as of
> > Proliant Software Maintenance CD 7.40.
> >
> > I would rather guess you encountered a hardware issue (faulty power
> > supply; non-identical stepping of CPUs?). OTOH as you're talking about a
> > couple of machines, this is not very likely. Are you sure, your servers
> > have got the redundant fan kit in place as this is mandatory for SMP
> > operations?
>
> Hi Christoph,
>
> Yes, got the fan kit and CPU's are same stepping and it is two identical
> boxes. As most people seem to be ok with "Linux" as the BIOS setting
> this is what I have used for most of the time. Firmware levels are at
> 7.50 but I don't think there are any updates for this box between 7.40
> and .50. Have even switched power supplies, both boxes have redundant psu's.
>
> I guess I should try 5-STABLE SMP, if that works the probability of hw
> fault would be minimal but then again I'm back to square one in terms of
> what the problem could be.

The reason to check 5-STABLE is that I have encountered problems with
DL380s and 6-STABLE (mine are all G3s), but I could not get any real
information out of them and so could not file a report.

Mine just rebooted (power cycled), as if someone had pulled the power
cables and plugged them back in. There was nothing in the log and no
dump in /var/crash (yes, I specified it). The iLO only reports a power
cycle.

What I do know is that it happens more often under high I/O load:
200-300 Mbps of network traffic in and out, plus a lot of memory and
disk activity.

One other thing I noticed is that the reboots mostly occurred at 00, 15,
30 or 45 minutes past the hour (according to the logs and the iLO). I
could not find anything in cron, logrotate or elsewhere that would cause
a reboot at those times.

I disabled the checks for server overload in the BIOS settings and the
machines still power cycled.
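
A minimal sketch of how the quarter-hour pattern could be checked,
assuming the reboot times have been collected by hand from the logs and
the iLO into a list of strings. The function names, the timestamp format
and the sample values are illustrative assumptions, not data from these
servers.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format; adjust to whatever the logs actually use.
DEFAULT_FMT = "%Y-%m-%d %H:%M"


def minute_histogram(timestamps, fmt=DEFAULT_FMT):
    """Count how many reboots fall on each minute of the hour."""
    return Counter(datetime.strptime(ts, fmt).minute for ts in timestamps)


def quarter_hour_share(timestamps, tolerance=1, fmt=DEFAULT_FMT):
    """Fraction of reboots within `tolerance` minutes of :00, :15, :30 or :45."""
    if not timestamps:
        return 0.0
    hits = 0
    for ts in timestamps:
        offset = datetime.strptime(ts, fmt).minute % 15
        # Distance to the nearest quarter-hour mark, before or after.
        if min(offset, 15 - offset) <= tolerance:
            hits += 1
    return hits / len(timestamps)


# Hypothetical reboot times, made up purely for illustration.
sample = [
    "2006-09-01 03:15",
    "2006-09-03 14:30",
    "2006-09-03 22:00",
    "2006-09-07 09:46",
    "2006-09-12 17:22",
]

if __name__ == "__main__":
    print(minute_histogram(sample))
    print(quarter_hour_share(sample))
```

If reboots were spread evenly over the hour, about 20% of them would
land within one minute of a quarter-hour mark (12 of 60 minutes), so a
share well above that would support the pattern.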

It cannot have been a hardware problem, though: with 5-STABLE the
machines run for months without trouble, while with 6-STABLE they crash
after anywhere from 2 hours to a week. Sometimes one crashes several
times in a day and then runs fine for a week.

If 5-STABLE turns out to be solid for you and 6-STABLE is still a
problem, you will be one more person besides me (that I know of) who has
run into this. As my boxes are in production, I cannot switch them to
6-STABLE just to test things. So if you are able to go through all the
necessary steps and submit a problem report, that would be great from my
point of view. Then I can only hope that it is the same issue.

/Bjorn