From owner-freebsd-current@FreeBSD.ORG Sun Oct 30 07:39:18 2005
Return-Path:
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 65C9616A41F
	for ; Sun, 30 Oct 2005 07:39:18 +0000 (GMT)
	(envelope-from chad@shire.net)
Received: from hobbiton.shire.net (hobbiton.shire.net [166.70.252.250])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AA6A243D49
	for ; Sun, 30 Oct 2005 07:39:16 +0000 (GMT)
	(envelope-from chad@shire.net)
Received: from [67.161.222.227] (helo=[192.168.99.68])
	by hobbiton.shire.net with esmtpa (Exim 4.51)
	id 1EW7md-000DbZ-Pf; Sun, 30 Oct 2005 01:39:15 -0600
In-Reply-To: <200510191623.j9JGNSfr007356@magus.nostrum.com>
References: <200510191623.j9JGNSfr007356@magus.nostrum.com>
Mime-Version: 1.0 (Apple Message framework v734)
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Message-Id:
Content-Transfer-Encoding: 7bit
From: "Chad Leigh -- Shire.Net LLC"
Date: Sun, 30 Oct 2005 01:39:14 -0600
To: Philip Kizer
X-Mailer: Apple Mail (2.734)
X-SA-Exim-Connect-IP: 67.161.222.227
X-SA-Exim-Mail-From: chad@shire.net
X-SA-Exim-Scanned: No (on hobbiton.shire.net); SAEximRunCond expanded to false
Cc: freebsd-current@freebsd.org
Subject: Re: Problem remains with FreeBSD 6.0-RC1 as seen in RELENG_5
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 30 Oct 2005 07:39:18 -0000

On Oct 19, 2005, at 10:23 AM, Philip Kizer wrote:

> I have a problem I reported on freebsd-stable several weeks ago in:
>
> http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=47770+187449+/usr/local/www/db/text/2005/freebsd-stable/20051009.freebsd-stable
>
>
> I upgraded a test box to see if all of the reports were true that threading
> and most other major
> problems were better in the 6.x branch, but I have had
> the same kind of hangs with 6.0-RC1 that I was having with RELENG_5.
>
> I get notified that some of my services are unavailable and I verify that
> new connection attempts from remote just hang. Attempts at issuing
> commands from my existing ssh connections will let me send a and
> see a new prompt generated, but any attempt at execing/etc will then hang
> that process.
>
>
> Moving to the console, I get the same behaviour such that if it is already
> logged in, I can hit return in the shell and get a new shell prompt, but
> any command I try (i.e. 'uptime' or 'uname') then results in the same lack
> of any further response. [Though, obviously from below, sending a break to
> activate DDB still works.]
>
> If the console is not already logged in and I hit return at "login:", I get
> a new "login:" prompt; but, as soon as I enter a username+^M or even Ctrl-D
> it, I cease to receive any feedback besides terminal echo from my input.
>

I won't be much help in debugging this, but one of my main production
servers was having this exact set of symptoms (all of them), about
once a month. It started with 5.3-REL (happened once with 5.3-REL
right as I was getting ready to upgrade to 5.4-REL after several
months of running -- during a bunch of "dump" commands to files of md
based filesystems). Happened several times, about every 2-6 weeks,
until I upgraded to 5.4-STABLE a few weeks ago. Has not happened
since but I am not sure it was "fixed."

One interesting thing: When it would happen, I would have to do a
hard boot by pushing the reset button. The problem would happen 2 or
3 more times after rebooting within 10-20 minutes of finishing the
boot (and would have to be hard reset each time), then would no
longer happen for 2-6 weeks.

The machine is a Tyan 2882 dual Opteron running the i386 version of
FreeBSD and has an Adaptec 2200S controller (aac).
One time I saw an aac error on the console but usually not. It has
lots of md based file systems mounted (disk backed). Has many jails
mostly all running out of the md filesystems. Except for ssh, has no
services running on the base install -- everything happens inside a
jail.

I will try and contribute but this is a major production machine and
cannot be screwed around with until such time as an upgrade or
planned reboot happens or the problem happens and we have to reboot
anyway.

# uname -a
FreeBSD bywater.shire.net 5.4-STABLE FreeBSD 5.4-STABLE #7: Sun Oct 2 13:27:37 MDT 2005 chad@bywater.shire.net:/usr/obj/usr/src/sys/BYWATER-SMP i386

Thanks
Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad@shire.net