From owner-svn-src-stable-8@FreeBSD.ORG Mon Apr 5 03:56:31 2010
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E0B8F1065674;
	Mon, 5 Apr 2010 03:56:30 +0000 (UTC)
	(envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4])
	by mx1.freebsd.org (Postfix) with ESMTP id 866198FC1A;
	Mon, 5 Apr 2010 03:56:30 +0000 (UTC)
Received: (qmail 22498 invoked by uid 399); 5 Apr 2010 03:56:29 -0000
Received: from localhost (HELO foreign.dougb.net)
	(dougb@dougbarton.us@127.0.0.1)
	by localhost with ESMTPAM; 5 Apr 2010 03:56:29 -0000
X-Originating-IP: 127.0.0.1
X-Sender: dougb@dougbarton.us
Message-ID: <4BB95F6C.7070202@FreeBSD.org>
Date: Sun, 04 Apr 2010 20:56:28 -0700
From: Doug Barton
Organization: http://SupersetSolutions.com/
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1.9)
	Gecko/20100330 Thunderbird/3.0.4
MIME-Version: 1.0
To: Jilles Tjoelker
References: <201003282019.o2SKJfPg033857@svn.freebsd.org>
	<4BAFBBFA.7020701@FreeBSD.org> <20100328210630.GA2086@stack.nl>
	<4BAFE1EE.9040908@FreeBSD.org> <20100329173214.GA17249@stack.nl>
In-Reply-To: <20100329173214.GA17249@stack.nl>
X-Enigmail-Version: 1.0.1
OpenPGP: id=D5B2F0FB
Content-Type: multipart/mixed;
	boundary="------------070707050009090200000503"
Cc: svn-src-stable@freebsd.org, svn-src-all@freebsd.org,
	src-committers@freebsd.org, svn-src-stable-8@freebsd.org
Subject: Re: svn commit: r205806 - stable/8/etc
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
X-List-Received-Date: Mon, 05 Apr 2010 03:56:31 -0000

This is a multi-part message in MIME format.
--------------070707050009090200000503
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 03/29/10 10:32, Jilles Tjoelker wrote:
> On Sun, Mar 28, 2010 at 04:10:38PM -0700, Doug Barton wrote:
>> On 03/28/10 14:06, Jilles Tjoelker wrote:
>>> On Sun, Mar 28, 2010 at 01:28:42PM -0700, Doug Barton wrote:
>>>> Probably my fault for not saying something sooner, but there is a
>>>> problem with the code in head that sometimes causes it to loop
>>>> repeatedly even though pwait exits successfully. I am trying to track it
>>>> down, but since it only happens about once every 10 shutdowns it's been
>>>> difficult.
>
>>> There is a difference between the two methods in what is waited for
>>> exactly. pwait(1) will wait for the process to terminate; if it is
>>> applied to a zombie it will return immediately (printing the exit status
>>> if -v was given). On the other hand, kill(1) will continue to return
>>> success until the process has been waited for by its parent.
>
>> The process that I see this with most often is devd, does that fit the
>> model you're describing?
>
> Possibly. This would mainly happen because init has been busy, I think
> (or if the parent isn't init).
>
>> What are the implications of moving on after a
>> successful pwait even though there is still a zombie process?
>
> For shutdown/stop, nothing.
>
> For restart, there may be problems if a restarted daemon checks the
> validity of the pid in the pidfile using kill().

Ok, in that case I'm not really comfortable with the idea of ignoring
the results of 'kill -0'. However, I've come up with what I think is a
good solution in the attached patch.

Based on your description and my ongoing analysis, the problem I was
seeing, with the long string of the same pid repeated over and over,
seems to be a side effect of pwait returning successfully (thus the
'|| sleep 2' never kicks in) while 'kill -0' is still able to see the
pid.
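
The zombie behaviour described above can be reproduced with a small
script. This is only a rough sketch and not part of the patch: the
helper name make_zombie and the variable names are made up for the
illustration, and the timings (1, 2 and 5 seconds) are arbitrary. It
starts a short-lived child whose parent never calls wait(2), then shows
that 'kill -0' still succeeds after the child has exited.

```shell
#!/bin/sh
# Sketch: kill -0 keeps succeeding for a child that has exited but has
# not been waited for. make_zombie is a hypothetical helper, not part of
# rc.subr.

make_zombie() {
	# $1: file that receives the pid of the short-lived child.
	# The inner shell starts "sleep 1" in the background, records its
	# pid, then replaces itself with "sleep 5", which never calls
	# wait(2). From roughly t=1s until t=5s the child is a zombie.
	sh -c 'sleep 1 & echo $! > "$1"; exec sleep 5' sh "$1" &
	holder_pid=$!
}

pidfile=$(mktemp)
make_zombie "$pidfile"
sleep 2				# the child has exited by now, unreaped
zombie_pid=$(cat "$pidfile")

if kill -0 "$zombie_pid" 2>/dev/null; then
	zombie_visible=yes	# the pid is still visible to kill -0
else
	zombie_visible=no
fi
echo "kill -0 sees exited child $zombie_pid: $zombie_visible"

# Clean up: once the holder exits, the zombie is handed to init.
kill "$holder_pid" 2>/dev/null || true
wait "$holder_pid" 2>/dev/null || true
rm -f "$pidfile"
```

On FreeBSD, running pwait against $zombie_pid at the same point should
return immediately, per the description quoted above; that is the
mismatch between the two checks.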
So, I've moved the sleep up: if we're not in the first pass but
'kill -0' still sees the pid, we sleep for 1 second, then proceed. I
think that'll handle both the problem I saw, and the odd case where
pwait doesn't return successfully.

Sound good?

Doug

-- 

	... and that's just a little bit of history repeating.
			-- Propellerheads

	Improve the effectiveness of your Internet presence with
	a domain name makeover!    http://SupersetSolutions.com/


--------------070707050009090200000503
Content-Type: text/plain;
 name="wait_for_pids.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="wait_for_pids.diff"

Index: rc.subr
===================================================================
--- rc.subr	(revision 206117)
+++ rc.subr	(working copy)
@@ -359,12 +359,14 @@
 	if [ -z "$_list" ]; then
 		return
 	fi
-	_prefix=
+
+	local _prefix=
 	while true; do
 		_nlist="";
 		for _j in $_list; do
 			if kill -0 $_j 2>/dev/null; then
 				_nlist="${_nlist}${_nlist:+ }$_j"
+				[ -n "$_prefix" ] && sleep 1
 			fi
 		done
 		if [ -z "$_nlist" ]; then
@@ -373,7 +375,7 @@
 		_list=$_nlist
 		echo -n ${_prefix:-"Waiting for PIDS: "}$_list
 		_prefix=", "
-		pwait $_list 2>/dev/null || sleep 2
+		pwait $_list 2>/dev/null
 	done
 	if [ -n "$_prefix" ]; then
 		echo "."

--------------070707050009090200000503--
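
For readers who want to try the loop outside rc.subr, here is roughly
how it reads with the patch applied. This is a sketch under several
assumptions, not the committed code: the function name
wait_for_pids_sketch is hypothetical, the way _list is populated from
the arguments is a guess (the diff does not show it), printf stands in
for 'echo -n' for portability, and the 'command -v pwait' guard is added
only so the sketch also runs where pwait(1) is not installed.

```shell
#!/bin/sh
# Sketch of the wait loop with the patch applied. The name
# wait_for_pids_sketch is hypothetical; the loop body follows the diff.
wait_for_pids_sketch() {
	_list="$*"		# assumption: pids are passed as arguments
	if [ -z "$_list" ]; then
		return
	fi

	local _prefix=
	while true; do
		_nlist=""
		for _j in $_list; do
			if kill -0 $_j 2>/dev/null; then
				_nlist="${_nlist}${_nlist:+ }$_j"
				# Past the first pass the pid is still visible,
				# so pause for a second before checking again.
				[ -n "$_prefix" ] && sleep 1
			fi
		done
		if [ -z "$_nlist" ]; then
			break
		fi
		_list=$_nlist
		# printf replaces the original "echo -n" (portability).
		printf '%s' "${_prefix:-Waiting for PIDS: }$_list"
		_prefix=", "
		# Assumption: skip pwait where it is not available.
		if command -v pwait >/dev/null 2>&1; then
			pwait $_list 2>/dev/null
		fi
	done
	if [ -n "$_prefix" ]; then
		echo "."
	fi
}

# Usage: wait for a short-lived background process to disappear.
sleep 1 &
demo_pid=$!
wait_for_pids_sketch "$demo_pid"
```

With the sleep inside the 'kill -0' branch, a pid that is still visible
after the first pass costs one second per pass whether or not pwait
succeeded, which is why the '|| sleep 2' fallback is no longer needed.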