From owner-freebsd-hackers  Wed Dec 16 21:09:05 1998
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id VAA05019
          for freebsd-hackers-outgoing; Wed, 16 Dec 1998 21:09:05 -0800 (PST)
          (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from Genesis.Denninger.Net (kdhome-2.pr.mcs.net [205.164.6.10])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id VAA05011
          for <hackers@FreeBSD.ORG>; Wed, 16 Dec 1998 21:09:03 -0800 (PST)
          (envelope-from karl@Genesis.Denninger.Net)
Received: (from karl@localhost) by Genesis.Denninger.Net (8.9.1/8.8.2) id XAA27449; Wed, 16 Dec 1998 23:08:56 -0600 (CST)
Message-ID: <19981216230855.A27443@Denninger.Net>
Date: Wed, 16 Dec 1998 23:08:56 -0600
From: Karl Denninger <karl@Denninger.Net>
To: Alfred Perlstein <bright@hotjobs.com>
Cc: hackers@FreeBSD.ORG
Subject: Re: yup, found it (NFS)
References: <19981216211723.A27176@Denninger.Net> <Pine.BSF.4.05.9812162341200.378-100000@bright.fx.genx.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.93.2i
In-Reply-To: <Pine.BSF.4.05.9812162341200.378-100000@bright.fx.genx.net>; from Alfred Perlstein on Wed, Dec 16, 1998 at 11:51:39PM -0500
Organization: Karl's Sushi and Packet Smashers
X-Die-Spammers: Spammers will be LARTed and the remains fed to my cat
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Wed, Dec 16, 1998 at 11:51:39PM -0500, Alfred Perlstein wrote:
> On Wed, 16 Dec 1998, Karl Denninger wrote:
> 
> > Remove the intr for now.  If that fixes it then at least we have
> > hard proof of where it is.
> 
> Already done.  I'm silly, not suicidal about things :)
> 
> > The problem is that vinvlbuf is not the only place you can get screwed.
> > There is also a problem in the vm pager (it can hang in there too, as I've
> > now been able to prove and isolate) due to what I *believe* is the same
> > cause.  This of course assumes you mount executable directories (very
> > common in clusters) across NFS.
> 
> You mean, if I'm running an executable over NFS?  I've seen this but not
> nearly as often.  In my case pine is local to the machine, but my mailbox
> isn't.
> 
> Just out of curiosity: is it hanging because the program text
> retrieval from the binary (not swap) has a similar loop?

Yep.  It locks up the process in question.  I suspect, but haven't yet
proven, that if that lockup bites "pagedaemon" you're fucked on a system
level.  I *have* proven that the process in question gets hosed and
deadlocks.

Example:
www      11988  0.0  0.5  6260  612  ??  D     8:12AM   0:00.99 /lbin/httpd.apa
www      11994  0.0  0.5  6288  620  ??  D     8:12AM   0:06.68 /lbin/httpd.apa

Guess what.  Right at 8:12 in the morning the server gets "kicked" to
produce logs (it gets sent a SIGINT).  Hmmm.....
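
A minimal sketch of how one might pick the wedged processes out of a
listing like the one above, assuming the BSD "ps aux" column layout
(USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND).  The names
stuck_processes and SAMPLE are made up for illustration, and the third
sample line is invented; only the two httpd lines come from the real
output.  A "D" state only means disk wait at the moment ps ran, so a hit
is a hint to look closer, not proof of a deadlock.

```python
def stuck_processes(ps_output):
    """Return (pid, stat, command) for every process whose STAT starts with 'D'.

    Assumes 11 whitespace-separated columns in BSD "ps aux" order, with the
    STARTED column being a single token (e.g. "8:12AM").
    """
    stuck = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        # Skip the header line and anything that is not a full process row.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("D"):
            stuck.append((pid, stat, command))
    return stuck


# First two rows are from the listing above; the last row is a hypothetical
# healthy process added only so the filter has something to reject.
SAMPLE = """\
USER       PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED       TIME COMMAND
www      11988  0.0  0.5  6260  612  ??  D     8:12AM   0:00.99 /lbin/httpd.apa
www      11994  0.0  0.5  6288  620  ??  D     8:12AM   0:06.68 /lbin/httpd.apa
root         1  0.0  0.1   496  128  ??  Is    7:00AM   0:00.05 /sbin/init --
"""

if __name__ == "__main__":
    for pid, stat, command in stuck_processes(SAMPLE):
        print(pid, stat, command)
```

Running that from cron a few minutes after the log kick, and comparing
the PIDs across runs, would show whether the same processes stay stuck.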

> > Certainly the expected execution path is basically the same, and I can
> > *trigger it* with a SIGINT to a running process which happens to have some
> > of its working set paged out at the time it receives the signal (ouch!)
> 
> That doesn't seem very good at all.  Does this second case apply to all
> NFS mounts, or only intr mounts?

Don't know yet - still testing.

> Thanks for the attention.  Sorry I took so long to get some proof
> of this bug, it's just that it's a work machine and taking time
> out to do this isn't always possible.
> 
> I'm sure tracking down/fixing the problem is on a totally different
> level, so thanks,
> 
> -Alfred

Yep.  I understand fully.

What I want to know is whether a "ro,soft" mount has the same
vulnerability.  We use them around here for things like mounting
the Usenet spool.

-- 
Karl Denninger (karl@denninger.net) http://www.mcs.net/~karl
I ain't even *authorized* to speak for anyone other than myself, so give
up now on trying to associate my words with any particular organization.
