Date: Mon, 31 Mar 2014 23:39:34 +0100
From: Ben Morrow
To: freebsd-stable@freebsd.org
Subject: Re: Process handlers, and zombies, or preap(1)
Message-ID: <20140331223930.GA52538@anubis.morrow.me.uk>
References: <20140331211147.GA52184@anubis.morrow.me.uk>

Quoth "Chris H" :
> > Quoth "Chris H" :
> >> I'm evaluating/experimenting on releng_9. The install, and now
> >> custom kernel have nothing exotic, or anything out of the ordinary.
> >> top(1), and ps(1) indicate a (1) zombie, or process. On
> >> my releng_8 systems, when I occasionally encounter one of these,
> >> they soon disappear (are reaped) from the process table. While I
> >> have not investigated this far enough on both versions to determine
> >> whether the parent process reaped the child on the releng_8 systems,
> >> and the parent on releng_9 is simply an irresponsible parent, eg;
> >> a different parent.
> >
> > What is the parent?
>
> Sorry, that /should/ have been clearer. :)
> Meaning; the processes (parents) that are reaping the zombies on releng_8
> are different than those I'm seeing on releng_9.
> In other words; On releng_8, I see a zombie, then seconds later, it's
> gone. On releng_9, I see a zombie, and it never leaves.
> Is the "parent" of the dead "child" on releng_9, different than that of
> the parent on releng_8. I couldn't possibly expect you to know. But not
> having been able to catch the parent process reaping the defunct child
> on releng_8, before it has reaped it.

Well, *all* processes (except those whose parent is ignoring SIGCHLD)
become zombies for some period of time, before their parent waits for
them. The question is, why are the particular parent processes you are
seeing failing to wait on 9-STABLE?

> I cannot know. Which led me to ask;
> Is there anything different on releng_9, that might cause zombies
> terminally within the process table?
> A bit wordy, perhaps. But makes the point. No? :)

You were perfectly clear. Obviously I wasn't. What program is the parent
process running? Is it an sshd, for instance, or some other piece of base
software?
This might indicate a bug somewhere in the base system (though I'm
running 9.1, and I don't see any zombies except under unusual
circumstances).

> >> Before I do, I was wondering if there was any
> >> specific difference between the 2 versions that might cause better
> >> handling of such situations. While I recognize that resource
> >> starvation is HIGHLY unlikely, except by perhaps a rouge parent
> >
> > A rouge parent? :)
>
> Yes. An unfit parent, that will not watch after its child(ren). We
> have agencies in the US that seek to end such delinquencies. Maybe
> FreeBSD could employ such tactics. :)

I think you meant 'a rogue parent'. A rouge parent would be a sort of
brownish-red colour.

> >> spawning multitudes of zombies. I thought it might be useful for
> >> "housekeeping" to 1) provide a process table housekeeper (zombie
> >> reaper),
> >
> > That's called init(8). When the parent exits, init will wait for the
> > zombie.
> >
> >> or 2) create a system utility/command like SunOS/OpenSolaris
> >> has; preap(1).
> >
> > That seems like a bad idea, to me. Generally speaking I would expect it
> > to be safer to kill and restart the parent, allowing init to do its job.
>
> Maybe. Maybe not. I think it depends on the parent process, and what impact
> HUPing it, will have on the system. Tho this should not be an excuse for
> not fixing the problem parent. But rather, a stop-gap, until a suitable
> fix is created/obtained (for the parent).

Indeed. However, it's nearly always possible to stop and restart any
process safely (after all, power outages happen, and you have to be
ready for them), whereas the effects of reaping a process the parent was
eventually going to wait for cannot be determined without detailed
knowledge of the parent's source.

Ben
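
P.S. To illustrate the point that every exited child is a zombie until
its parent waits for it, here is a minimal Python sketch. It assumes a
POSIX system where os.fork and os.waitid are available; the names
zombie_lifecycle and _pid_exists are made up for this example and come
from no existing tool. It is only a sketch of the lifecycle, not a
diagnostic for the releng_9 problem.

```python
import os


def _pid_exists(pid):
    """Return True if the PID still has a process table entry.

    Signal 0 delivers nothing; the kernel only performs the
    existence and permission checks.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def zombie_lifecycle(exit_code=7):
    """Fork a child, observe it as a zombie, then reap it.

    Returns (present_before_wait, reaped_exit_code, present_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit at once, skipping Python's cleanup handlers.
        os._exit(exit_code)

    # Parent: block until the child has exited, but pass WNOWAIT so the
    # child is left unreaped, i.e. still a zombie in the process table.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    present_before = _pid_exists(pid)

    # The real wait: collects the exit status and removes the zombie.
    _, status = os.waitpid(pid, 0)
    present_after = _pid_exists(pid)

    return present_before, os.WEXITSTATUS(status), present_after


if __name__ == "__main__":
    print(zombie_lifecycle())
```

A parent that never makes the waitpid call leaves the entry in place
until the parent itself exits, at which point init inherits and reaps
the child.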