Date:      Thu, 17 Dec 1998 02:00:53 -0500 (EST)
From:      Alfred Perlstein <bright@hotjobs.com>
To:        David G Andersen <danderse@cs.utah.edu>
Cc:        karl@Denninger.Net, hackers@FreeBSD.ORG
Subject:   Re: yup, found it (NFS)
Message-ID:  <Pine.BSF.4.05.9812170155360.378-100000@bright.fx.genx.net>
In-Reply-To: <199812170644.XAA03482@lal.cs.utah.edu>


On Wed, 16 Dec 1998, David G Andersen wrote:

> Lo and behold, Alfred Perlstein once said:
> > as an interrupt) is sent to a process waiting for an NFS server,
> > the corresponding I/O system call returns with a transient error.
> > (*1)  Normally, the process is terminated by the signal.(*2)
> 
>   Right.  This is simply having the system call return EINTR, one of the
> options I discuss.  However, it's not clear to me that this is the best
> option when the action was triggered by a close() on a file descriptor.
> Certainly, if it happens during a write, most processes know how to cope
> with it -- and the kernel does the right thing, returning EINTR.  However,
> when flushing a dirty block during a close, and you get an intr, what do
> you expect to do?
> 
>   Other BSDs don't have this problem because they call sleep() in this
> context, which isn't interruptible.  That leads to the other simple fix I
> noted, changing the tsleep() call to not have the interruptible flag.
> (Thus making us equivalent to blahBSD, etc).  But that means that you
> could potentially wedge a process on an NFS server that hung, despite the
> 'intr' flag.  It also explains why removing the nfsiods works, because the
> buffer flushing occurs as a part of the write() (which is interruptible
> and behaves properly), instead of being delayed until the close.
> > 
> > This isn't exactly good, a normal write should proceed as normal
> > correct?  Maybe it can delay the signal and try an extra 4-5 times
> > and delay the signal until after the syscall?
> 
>   It's not a write; most programs that are signal aware will notice the
> EINTR and retry the write.  It's the close -- when was the last time you
> saw a program which checked the return value from close()?  (The answer is
> about 20% of the time in usr.bin).
>
>    Solaris isn't affected by this.  It's FreeBSD specific due to the
> sleep() -> tsleep() change.

Solaris has intr mounts too.  Do you mean that a close() there may
ignore the interrupt and possibly block forever?

Does returning EINTR, maybe with a sysctl to auto-retry the operation,
sound bad?

The process should get the signal regardless; any process that is
careful enough to trap signals should be careful enough about close()
problems.  Blocking forever is just wrong, and expecting people to
abide by the 'published' man pages is reasonable.
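
To make "careful enough" concrete, here is a rough userland sketch in C
of what I would expect from a signal-aware program: retry write() when
it comes back with EINTR, and at least look at what close() returns.
The helper names write_all() and checked_close() are made up for this
example and are not from any existing program.  Note that the sketch
only reports a failed close(); whether retrying close() after EINTR is
safe depends on the system, so it does not try.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all len bytes to fd, restarting the
 * write() whenever a signal interrupts it (EINTR).
 * Returns 0 on success, -1 on any other error (errno is left set).
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error */
        }
        p += n;                 /* short write: advance and go on */
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: close fd and report a failure instead of
 * silently ignoring it.  It does not retry, because the state of the
 * descriptor after a failed close() is not something to rely on.
 * Returns 0 on success, -1 on error (errno is preserved).
 */
int checked_close(int fd)
{
    if (close(fd) < 0) {
        int saved = errno;
        fprintf(stderr, "close: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

A program would call write_all() for its data and then treat a -1 from
checked_close() as "the data may not have reached the server", which is
exactly the case that goes unnoticed when the return value is dropped.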

Alfred Perlstein - Programmer, HotJobs Inc. - www.hotjobs.com
-- There are operating systems, and then there's FreeBSD.
-- http://www.freebsd.org/                        3.0-current

>    -Dave
> 

