Date:      Mon, 31 Mar 2014 23:39:34 +0100
From:      Ben Morrow <ben@morrow.me.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: Process handlers, and zombies, or preap(1)
Message-ID:  <20140331223930.GA52538@anubis.morrow.me.uk>
In-Reply-To: <f5bfca4537aaca03ef53ae45950ef764.authenticated@ultimatedns.net>
References:  <20140331211147.GA52184@anubis.morrow.me.uk>

Quoth "Chris H" <bsd-lists@bsdforge.com>:
> > Quoth "Chris H" <bsd-lists@bsdforge.com>:
> >>  I'm evaluating/experimenting on releng_9. The install, and now
> >> custom kernel have nothing exotic, or anything out of the ordinary.
> >> top(1), and ps(1) indicate a (1) zombie, or <defunct> process. On
> >> my releng_8 systems, when I occasionally encounter one of these,
> >> they soon disappear (are reaped) from the process table. While I
> >> have not investigated this far enough on both versions to determine
> >> whether the parent process reaped the child on the releng_8 systems,
> >> and the parent on releng_9 is simply an irresponsible parent, eg;
> >> a different parent.
> >
> > What is the parent?
> 
> Sorry, that /should/ have been clearer. :)
> Meaning; the processes (parents) that are reaping the zombies on releng_8
> are different from those I'm seeing on releng_9.
> In other words; On releng_8, I see a zombie, then seconds later, it's
> gone. On releng_9, I see a zombie, and it never leaves.
> Is the "parent" of the dead "child" on releng_9, different than that of
> the parent on releng_8. I couldn't possibly expect you to know. But not
> having been able to catch the parent process reaping the defunct child
> on releng_8, before it has reaped it.

Well, *all* processes (except those whose parent is ignoring SIGCHLD)
become zombies for some period of time, before their parent waits for
them. The question is, why are the particular parent processes you are
seeing failing to wait on 9-STABLE?
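
A minimal sketch of that lifecycle, assuming a POSIX system with Python
available. The function names are made up for illustration and do not
come from any FreeBSD tool. The first function leaves the child as a
zombie for a short while before waiting for it; the second ignores
SIGCHLD, in which case there is nothing left to wait for.

```python
import os
import signal
import time


def reap_after_delay(exit_code=7, delay=0.2):
    """Fork a child that exits at once; wait for it only after `delay` seconds.

    Between the child's exit and the first waitpid() call the child is a
    zombie: the kernel keeps its process-table entry so that the exit
    status can still be collected.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)            # child: terminate immediately
    time.sleep(delay)                  # parent is busy; child shows as <defunct>
    _, status = os.waitpid(pid, 0)     # reap: the zombie entry is released
    try:
        os.waitpid(pid, 0)             # a second wait finds nothing to reap
        reaped_twice = True
    except ChildProcessError:          # ECHILD
        reaped_twice = False
    return os.WEXITSTATUS(status), reaped_twice


def exit_with_sigchld_ignored(delay=0.2):
    """With SIGCHLD ignored, the exited child is not kept around as a zombie."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)                # child: terminate immediately
        time.sleep(delay)
        try:
            os.waitpid(pid, 0)
            return "status collected"
        except ChildProcessError:      # ECHILD: the kernel already discarded it
            return "no zombie to reap"
    finally:
        signal.signal(signal.SIGCHLD, previous)   # restore the old disposition
```

A parent that never reaches the waitpid() call in the first function is
exactly the case in question: the child stays <defunct> until the
parent either waits or exits.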

> I cannot know. Which led me to ask;
> Is there anything different on releng_9, that might cause zombies
> to remain terminally within the process table?
> A bit wordy, perhaps. But makes the point. No? :)

You were perfectly clear. Obviously I wasn't.

What program is the parent process running? Is it sshd, for instance,
or some other piece of base software? This might indicate a bug
somewhere in the base system (though, I'm running 9.1, and I don't see
any zombies except under unusual circumstances).
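
One way to answer that question is to pair each zombie with its
parent's command line. The following is only a sketch: it assumes
ps(1) accepts `-axo pid,ppid,stat,command` and prints a header line,
and the function names are hypothetical.

```python
import subprocess


def zombie_parents(ps_output):
    """Return (zombie_pid, parent_pid, parent_command) for each zombie.

    `ps_output` is the text printed by `ps -axo pid,ppid,stat,command`,
    header line included. A STAT value starting with 'Z' marks a zombie.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:        # skip the header line
        fields = line.split(None, 3)               # the command may contain spaces
        if len(fields) == 4 and fields[0].isdigit() and fields[1].isdigit():
            rows.append((int(fields[0]), int(fields[1]), fields[2], fields[3]))
    command_of = {pid: command for pid, _, _, command in rows}
    return [
        (pid, ppid, command_of.get(ppid, "?"))     # "?" if the parent is not listed
        for pid, ppid, stat, _ in rows
        if stat.startswith("Z")
    ]


def list_zombie_parents():
    """Run ps(1) and report the parents of any zombies currently present."""
    result = subprocess.run(
        ["ps", "-axo", "pid,ppid,stat,command"],
        capture_output=True, text=True, check=True,
    )
    return zombie_parents(result.stdout)
```

Calling list_zombie_parents() on the affected machine should name the
program that is failing to wait, which is the piece of information
needed to decide whether this is a bug in the base system.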

> >> Before I do, I was wondering if there was any
> >> specific difference between the 2 versions that might cause better
> >> handling of such situations. While I recognize that resource
> >> starvation is HIGHLY unlikely, except by perhaps a rouge parent
> >
> > A rouge parent? :)
> 
> Yes. An unfit parent, that will not watch after its child(ren). We
> have agencies in the US that seek to end such delinquencies. Maybe
> FreeBSD could employ such tactics. :)

I think you meant 'a rogue parent'. A rouge parent would be a sort of
brownish-red colour.

> >> spawning multitudes of zombies. I thought it might be useful for
> >> "housekeeping" to 1) provide a process table housekeeper (zombie
> >> reaper),
> >
> > That's called init(8). When the parent exits, init will wait for the
> > zombie.
> >
> >> or 2) create a system utility/command like SunOS/OpenSolaris
> >> has; preap(1).
> >
> > That seems like a bad idea, to me. Generally speaking I would expect it
> > to be safer to kill and restart the parent, allowing init to do its job.
> 
> Maybe. Maybe not. I think it depends on the parent process, and what impact
> HUPing it, will have on the system. Tho this should not be an excuse for
> not fixing the problem parent. But rather, a stop-gap, until a suitable
> fix is created/obtained (for the parent).

Indeed. However, it's nearly always possible to stop and restart any
process safely (after all, power outages happen, and you have to be
ready for them), whereas the effects of reaping a process the parent was
eventually going to wait for cannot be determined without detailed
knowledge of the parent's source.

Ben
