From owner-freebsd-stable Sat Dec 23 9:19:32 2000
From owner-freebsd-stable@FreeBSD.ORG Sat Dec 23 09:19:27 2000
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843399.broadbandoffice.net [64.47.83.135])
	by hub.freebsd.org (Postfix) with ESMTP id 9126A37B400
	for ; Sat, 23 Dec 2000 09:19:27 -0800 (PST)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.1/8.9.3) id eBNHIQ197462;
	Sat, 23 Dec 2000 09:18:26 -0800 (PST)
	(envelope-from dillon)
Date: Sat, 23 Dec 2000 09:18:26 -0800 (PST)
From: Matt Dillon
Message-Id: <200012231718.eBNHIQ197462@earth.backplane.com>
To: Cy Schubert - ITSD Open Systems Group
Cc: cjclark@alum.mit.edu, Mikhail Teterin, Cy Schubert - ITSD Open Systems Group, stable@FreeBSD.ORG
Subject: Re: an unkillable process (again)
References: <200012231434.eBNEYoc09416@cwsys.cwsent.com>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

This doesn't sound right at all. It doesn't matter what priority the
process calling kill() is running at, nor does it matter what priority
the sleeping process is running at. If the process making the kill()
call is able to run, the signal will be delivered. So it is *not* a
priority issue. I don't think it was even at the time of the original
CSRG 4.4.

Processes that tsleep() in the kernel use the return value of tsleep()
to determine whether there is a signal pending and act based on that.
There are many tsleep()'s in the kernel which (appropriately) ignore any
delivered signals, because they are sleeping while holding the process
in a sensitive or documented 'uninterruptible' state. Waiting on swap is
one of those 'uninterruptible' states. These signals will be acted upon
later, after the uninterruptible operation completes.

Judging from the original article, it looks like communicator is not on
a swap-writing binge but is instead stuck waiting for a swap I/O to
complete.
It's hard to tell from the single line of ps output.

I suspect the issue here is the swap device itself. Is it a disk? Is it
over NFS? Something sure is broken with it.

					-Matt

:In message <20001223001223.M96105@149.211.6.64.reflexcom.com>, "Crist
:J. Clark" writes:
:> On Fri, Dec 22, 2000 at 12:57:13PM -0500, Mikhail Teterin wrote:
:> > Cy Schubert - ITSD Open Systems Group once stated:
:> >
:> > =In message <200012202226.eBKMQf100632@misha.privatelabs.com>, Mikhail
:> > =Teterin writes:
:> > => Here it is:
:> > =>
:> > => 425 mi -18 0 45308K 144K swwrt 4:25 0.10% 0.10% communi
:> cator
:> > => -l
:> > =>
:> > => For some bizarre reasons of its own, Netscape went into swap-writing
:> > => binge. Why did it make it immune to ``kill -9''?
:> > =
:> > =Then it appears that swwrt has a higher priority than kill has, which
:> > =it should have.
:> >
:> > Rather confusing... kill -9 does not deliver any signals to the process.
:> > It is there to kill. Shouldn't it have the higher priority?
:>
:> It is not a "priority" issue. The process is in the midst of an
:> operation that cannot be interrupted. For some reason, that operation
:> is hanging up. I believe 'swwrt' is writing to swap? I/O calls are the
:> most frequent uninterruptible calls that get hung.
:
:Actually it is a "priority issue". Read Design and Implementation of
:the 4.4BSD Operating System pp 83-85, and pp 89: To prevent a sleeping
:process, e.g. one waiting for a device to respond, the kernel raises
:the priority of that sleep to splhigh to prevent interrupts that might
:cause process-state transitions. For example, see pp 84, Table 4-2 in
:the book, if you have a process waiting for swap (PSWP, priority 0) and
:you issue a kill which would run at the baseline kernel priority, PZERO
:(priority 22), your kill will have no effect on a process in PSWP state
:until that process transitions to a lower priority.
:
:>
:> > Also, anything that prevents root from killing a process is not right,
:> > IMHO.
:>
:> It is usually indicative of a deeper problem.
:
:Agreed. For example an NFS I/O running at priority PRIBIO (priority
:16) cannot be killed by a process running at PZERO. The deeper problem
:being that a device or in this case an NFS server is not responding.
:
:
:Regards,                         Phone:  (250)387-8437
:Cy Schubert                        Fax:  (250)387-5766
:Team Leader, Sun/Alpha Team   Internet:  Cy.Schubert@osg.gov.bc.ca
:Open Systems Group, ITSD, ISTA
:Province of BC
:
:
:To Unsubscribe: send mail to majordomo@FreeBSD.org
:with "unsubscribe freebsd-stable" in the body of the message
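
As a rough illustration of the signal handling described in Matt's reply
(this is not kernel code), the toy Python model below separates posting a
signal from acting on it. Every name in it (Proc, kill, sleep_begin,
io_done, sleep_result, return_to_user) is hypothetical and exists only for
this sketch; the real tsleep() takes different arguments and the real
mechanism is considerably more involved. The sketch assumes the caller of
kill() is able to run, and that an interruptible sleep reports EINTR when a
signal arrives.

```python
import errno
from dataclasses import dataclass, field

SIGKILL = 9


@dataclass
class Proc:
    """Toy stand-in for a process; NOT the kernel's struct proc."""
    pending: set = field(default_factory=set)  # signals posted, not yet acted on
    sleeping: bool = False
    interruptible: bool = False  # does the current sleep notice signals?
    alive: bool = True


def sleep_begin(p, interruptible):
    """Put the process to sleep inside the (modelled) kernel."""
    p.sleeping = True
    p.interruptible = interruptible


def kill(target, sig):
    """Post a signal. No priority is consulted anywhere in this function."""
    target.pending.add(sig)
    # Only an interruptible sleep is cut short by the signal.
    if target.sleeping and target.interruptible:
        target.sleeping = False


def io_done(p):
    """The awaited event (for example a swap I/O) completes."""
    p.sleeping = False


def sleep_result(p):
    """Modelled return value of the sleep once the process is awake."""
    assert not p.sleeping
    if p.interruptible and p.pending:
        return errno.EINTR
    return 0  # uninterruptible sleeps ignore the signal for now


def return_to_user(p):
    """Pending signals are acted upon once the kernel operation is over."""
    if not p.sleeping and SIGKILL in p.pending:
        p.alive = False


# A process stuck in an uninterruptible wait, as with 'swwrt' in the thread.
stuck = Proc()
sleep_begin(stuck, interruptible=False)
kill(stuck, SIGKILL)      # the signal is delivered and recorded ...
return_to_user(stuck)     # ... but nothing happens while the sleep lasts
still_alive_during_io = stuck.alive
io_done(stuck)            # if the swap device never answers, this never runs
return_to_user(stuck)
```

In this model the process survives `kill -9` only for as long as `io_done()`
has not been called, which matches the diagnosis above: the thing to
investigate is why the swap I/O is not completing, not the priorities of the
processes involved.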