Date:      Sat, 19 Apr 2014 00:18:41 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        bsd-lists@bsdforge.com
Cc:        freebsd-hackers@FreeBSD.org, hackers@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject:   Re: Process handlers, and zombies, or preap(1)
Message-ID:  <201404190718.s3J7IfAL093043@gw.catspoiler.org>
In-Reply-To: <c28e389b7ebf9a778367e7f59d018222.authenticated@ultimatedns.net>

On  1 Apr, Chris H wrote:
>> On Monday, March 31, 2014 4:06:43 pm Chris H wrote:
>>> Greetings,
>>>  I'm evaluating/experimenting on releng_9. The install, and now
>>> custom kernel have nothing exotic, or anything out of the ordinary.
>>> top(1), and ps(1) indicate a (1) zombie, or <defunct> process. On
>>> my releng_8 systems, when I occasionally encounter one of these,
>>> they soon disappear (are reaped) from the process table. While I
>>> have not investigated this far enough on both versions to determine
>>> whether the parent process reaped the child on the releng_8 systems,
>>> and the parent on releng_9 is simply an irresponsible parent, eg;
>>> a different parent. Before I do, I was wondering if there was any
>>> specific difference between the 2 versions that might cause better
>>> handling of such situations. While I recognize that resource
>>> starvation is HIGHLY unlikely, except by perhaps a rogue parent
>>> spawning multitudes of zombies. I thought it might be useful for
>>> "housekeeping" to 1) provide a process table housekeeper (zombie
>>> reaper), or 2) create a system utility/command like SunOS/OpenSolaris
>>> has; preap(1).
>>>
>>> http://www.freebsd.org/cgi/man.cgi?query=preap&manpath=SunOS+5.10
>>>
>>> Thank you for your time, and consideration.
>>
>> Nothing is different with child processes in 9 vs 8.  It is most
>> likely a misbehaving parent (or the parent is stuck or hung).
> 
> Hello, John, and thank you for the reply.
> Right you are. Julian Elischer was kind enough to remind me that
> ps -alx
> would give me the information I needed to find the seemingly
> "lazy" parent process. But not before I had already (re)created
> a (Free)BSD version of preap(1), and cleared the entry from the
> proc table.
> However, it reappeared. So this time I traced it to its
> parent, and now I can deal with it /properly/. It's an old port
> whose development was taken over by a Windows developer. So he
> doesn't have access to the *NIX-isms. I'll see if I can find
> the time to coordinate some effort(s) to clean it up, or branch
> a NIX version.

A call to
	signal(SIGCHLD, SIG_IGN);
would probably fix the problem with no fuss.  From the signal(3) man
page:

     If a process explicitly specifies SIG_IGN as the action for the signal
     SIGCHLD, the system will not create zombie processes when children of the
     calling process exit.  As a consequence, the system will discard the exit
     status from the child processes.  If the calling process subsequently
     issues a call to wait(2) or equivalent, it will block until all of the
     calling process's children terminate, and then return a value of -1 with
     errno set to ECHILD.



