From: Chuck Swiger
Organization: The Courts of Chaos
Date: Sun, 07 Sep 2003 14:00:44 -0400
To: "Jack L. Stone"
cc: freebsd-questions@freebsd.org
Subject: Re: Random crash and/or reboots
Message-ID: <3F5B724C.8040606@mac.com>
In-Reply-To: <3.0.5.32.20030907102900.01393408@sage-one.net>

Jack L. Stone wrote:
> A while back, on a couple of occasions, I posted a query about some bad
> behavior on my mail server. For the past several months, it has been either
> crashing/reboot or just rebooting.
> It's ALWAYS triggered by a SSH login,
> but at random and ONLY at the "su" to root -- usually the most time before
> reboot is about 2+ weeks and then contrasted by 2 in a row right after the
> reboot -- actually no pattern. It has never happened directly at the console.
[ ... ]
> There are no indications of anything in the logs, and no core dumps. It
> just stops and reboots, and any random time it pick. Only a couple of times
> it has crashed without the remote login.

These two paragraphs contradict each other, at least in part. :-) You're
seeing frequent crashes, which seem to be strongly correlated with logging
in as root, but you've also noticed crashes "without the remote login", too?

You should build a debug kernel, and enable dumping the system to swap upon
a panic ("man crash"), so that you have more information about the crash.

> One tip was that I might have stale NFS mountabs -- cleared them out, but
> problem persisted.
>
> The above tip was suggested when I mentioned that on a couple or more of
> the occurrences, I managed to get to the console quickly enough to see (in
> bright bold) "lockmgr locking against myself" -- or close to that. My
> google of that error does mention stale mounts, but mostly about esoteric
> code stuff. No fix found anywhere.

Hmm. Are you performing local mail delivery to NFS volumes?

Normally (or historically, anyway), NFS locking problems cause rpc.lockd to
crash or wedge, thus resulting in NFS locking not working and possibly grim
results to file consistency for anything being changed by two or more
processes at the same time. However, NFS locking problems generally do not
result in a system panic.

[ ... ]
> http://sageweb/tmp/1-lsof.txt
> http://sageweb/tmp/2-lsof.txt

These URLs aren't fully-qualified hostnames. Please try again. :-)

--
-Chuck