From owner-freebsd-stable  Sat Dec 23  9:19:32 2000
From owner-freebsd-stable@FreeBSD.ORG  Sat Dec 23 09:19:27 2000
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843399.broadbandoffice.net [64.47.83.135])
	by hub.freebsd.org (Postfix) with ESMTP id 9126A37B400
	for <stable@FreeBSD.ORG>; Sat, 23 Dec 2000 09:19:27 -0800 (PST)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.1/8.9.3) id eBNHIQ197462;
	Sat, 23 Dec 2000 09:18:26 -0800 (PST)
	(envelope-from dillon)
Date: Sat, 23 Dec 2000 09:18:26 -0800 (PST)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200012231718.eBNHIQ197462@earth.backplane.com>
To: Cy Schubert - ITSD Open Systems Group <Cy.Schubert@uumail.gov.bc.ca>
Cc: cjclark@alum.mit.edu, Mikhail Teterin <mi@aldan.algebra.com>,
	Cy Schubert - ITSD Open Systems Group <Cy.Schubert@uumail.gov.bc.ca>,
	stable@FreeBSD.ORG
Subject: Re: an unkillable process (again) 
References:  <200012231434.eBNEYoc09416@cwsys.cwsent.com>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

    This doesn't sound right at all.  It doesn't matter what priority the
    process calling kill() is running at, nor does it matter what priority
    the sleeping process is running at.  If the process making the kill()
    call is able to run, the signal will be delivered.
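
    As a rough userland illustration (a sketch, not kernel code): the
    function below forks a child that sleeps in pause(), lowers the
    parent's own scheduling priority as far as it is allowed to, and then
    sends SIGKILL.  The name kill_from_low_priority() is made up for this
    example, and it assumes a POSIX system where the child's sleep is an
    ordinary interruptible one.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the child was terminated by the
 * SIGKILL sent from the low-priority parent, 0 if it ended some other
 * way, and -1 if a system call failed.
 */
int kill_from_low_priority(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: sleep interruptibly until a signal arrives. */
        for (;;)
            pause();
    }

    /* Parent: drop to the weakest scheduling priority (best effort). */
    (void)setpriority(PRIO_PROCESS, 0, 19);

    /* The signal is posted as soon as this process gets to run. */
    if (kill(pid, SIGKILL) != 0)
        return -1;

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 1 : 0;
}
```

    Calling kill_from_low_priority() from a small main() should report
    that the child died from SIGKILL even though the sender was niced.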

    So it is *not* a priority issue.  I don't think it was one even at the
    time of the original CSRG 4.4BSD.

    Processes that tsleep() in the kernel use the return value of tsleep()
    to determine whether there is a signal pending and act based on that.
    There are many tsleep()'s in the kernel which (appropriately) ignore
    any delivered signals, because they are sleeping while holding the process
    in a sensitive or documented 'uninterruptible' state.  Waiting on swap
    is one of those 'uninterruptible' states.  These signals will be acted
    upon later, after the uninterruptible operation completes.
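
    The "acted upon later" behaviour can be loosely mimicked in userland.
    This is only an analogy, not the kernel's tsleep() path: the sketch
    blocks SIGUSR1 with sigprocmask() to stand in for the uninterruptible
    section (SIGKILL itself cannot be blocked from userland), posts the
    signal, confirms it is merely pending, and then unblocks it.  The
    names deferred_signal_demo() and on_usr1() are invented for the
    example, and a single-threaded process is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handled = 0;

/* Handler only records that the signal was finally acted upon. */
static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/*
 * Hypothetical helper: returns 0 if the signal stayed pending while
 * blocked and was handled once unblocked; nonzero otherwise.
 */
int deferred_signal_demo(void)
{
    struct sigaction sa;
    sigset_t block, old, pending;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Enter the stand-in for the uninterruptible section. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    handled = 0;
    raise(SIGUSR1);                 /* posted, but not acted upon yet */

    sigemptyset(&pending);
    sigpending(&pending);
    if (handled || sigismember(&pending, SIGUSR1) != 1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return 1;                   /* signal was not deferred */
    }

    /* Leave the section: the pending signal is delivered now. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    return handled ? 0 : 2;
}
```

    A return of 0 from deferred_signal_demo() means the signal was held
    back during the blocked section and acted upon only afterwards.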

    Judging from the original article, it looks like communicator is not
    on a swap-writing binge but is instead stuck waiting for a swap I/O
    to complete.  It's hard to tell from a single line of ps output.  I
    suspect the issue here is the swap device itself.  Is it a disk?  Is it
    over NFS?  Something sure is broken with it.

						-Matt


:In message <20001223001223.M96105@149.211.6.64.reflexcom.com>, "Crist 
:J. Clark"
: writes:
:> On Fri, Dec 22, 2000 at 12:57:13PM -0500, Mikhail Teterin wrote:
:> > Cy Schubert - ITSD Open Systems Group once stated:
:> > 
:> > =In message <200012202226.eBKMQf100632@misha.privatelabs.com>, Mikhail 
:> > =Teterin writes:
:> > => Here it is:
:> > => 
:> > =>   425 mi       -18   0 45308K   144K swwrt    4:25  0.10%  0.10% communicator
:> > => -l
:> > => 
:> > => For some bizarre reasons of  its own, Netscape went into swap-writing
:> > => binge. Why did it make it immune to ``kill -9''?
:> > =
:> > =Then it appears  that swwrt has a higher priority  than kill has, which
:> > =it should have.
:> > 
:> > Rather confusing... kill -9 does not deliver any signals to the process.
:> > It is there to kill. Shouldn't it have the higher priority?
:> 
:> It is not a "priority" issue. The process is in the midst of an
:> operation that cannot be interrupted. For some reason, that operation
:> is hanging up. I believe 'swwrt' is writing to swap? I/O calls are the
:> most frequent uninterruptible calls that get hung.
:
:Actually it is a "priority issue".  Read Design and Implementation of 
:the 4.4BSD Operating System pp 83-85, and pp 89:  To prevent a sleeping 
:process, e.g. one waiting for a device to respond, the kernel raises 
:the priority of that sleep to splhigh to prevent interrupts that might 
:cause process-state transitions.  For example, see pp 84, Table 4-2 in 
:the book, if you have a process waiting for swap (PSWP, priority 0) and 
:you issue a kill which would run at the baseline kernel priority, PZERO 
:(priority 22), your kill will have no effect on a process in PSWP state 
:until that process transitions to a lower priority.
:
:> 
:> > Also, anything that  prevents root from killing a process  is not right,
:> > IMHO.
:> 
:> It is usually indicative of a deeper problem.
:
:Agreed.  For example an NFS I/O running at priority PRIBIO (priority 
:16) cannot be killed by a process running at PZERO.  The deeper problem 
:being that a device or in this case an NFS server is not responding.
:
:
:Regards,                         Phone:  (250)387-8437
:Cy Schubert                        Fax:  (250)387-5766
:Team Leader, Sun/Alpha Team   Internet:  Cy.Schubert@osg.gov.bc.ca
:Open Systems Group, ITSD, ISTA
:Province of BC
:
:
:
:
:
:To Unsubscribe: send mail to majordomo@FreeBSD.org
:with "unsubscribe freebsd-stable" in the body of the message
:

