Date:      Tue, 26 Apr 2005 11:50:43 -0400
From:      Brian Fundakowski Feldman <green@freebsd.org>
To:        Marc Olzheim <marcolz@stack.nl>
Cc:        freebsd-standards@freebsd.org
Subject:   Re: NFS client/buffer cache deadlock
Message-ID:  <20050426155043.GC5789@green.homeunix.org>
In-Reply-To: <20050426151751.GB68038@stack.nl>
References:  <20050419160900.GB12287@stack.nl> <20050419161616.GF1157@green.homeunix.org> <20050419204723.GG1157@green.homeunix.org> <20050420140409.GA77731@stack.nl> <20050420142448.GH1157@green.homeunix.org> <20050420143842.GB77731@stack.nl> <16998.36437.809896.936800@khavrinen.csail.mit.edu> <20050420173859.GA99695@stack.nl> <20050426140701.GB5789@green.homeunix.org> <20050426151751.GB68038@stack.nl>

On Tue, Apr 26, 2005 at 05:17:51PM +0200, Marc Olzheim wrote:
> On Tue, Apr 26, 2005 at 10:07:01AM -0400, Brian Fundakowski Feldman wrote:
> > > Could someone from standards comment here ? I believe Garrett is
> > > right...
> > > (thread is on -hackers and -current)
> > 
> > What prevents you from using O_FSYNC | O_APPEND to get the
> > functionality you desire?  The semantics of IO_UNIT -- atomic writes
> > -- are definitely defined and assumed to function properly by the rest
> > of the kernel.  Allowing asynchronous unbounded atomic appends is
> > impossible, so something must be done to prevent deadlock.  Breaking
> > IO_UNIT really shouldn't be considered as a solution.  Automatically
> > turning the write into a synchronous + atomic append if an asynchronous
> > + atomic append is not possible might follow POLA best.
> 
> I don't care whether a user application corrupts its own data by
> writing simultaneously to the same file from different hosts; that's the
> choice of the application. What I want is when the application behaves
> and is the only one writing to the file, that that writev() succeeds.
> 
> I'm okay with the fact that simultaneous huge writes to the same file
> over NFS could lead to corruption and that the exact outcome is
> undefined.
> 
> This is exactly how it was in FreeBSD 4.x and that's perfectly workable.
> 
> But that's just my way of looking at it and certainly not ideal. :-/

I don't know what you mean.  The exact same bug should exist in 4.x,
and should cause a system deadlock in exactly the same scenario.
Simultaneous huge writes for NFSv3 were and still are atomic and I
do not intend to break that -- just make it so they won't deadlock
the system.  I'm not okay with making applications suddenly start
corrupting data.

Why can't you use O_FSYNC for your huge writes?  I'm willing to bet
that its semantics are exactly what you're looking for.
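
For illustration, a minimal sketch of that approach, assuming a plain
userland caller.  append_sync() is a hypothetical helper name, not an
existing API, and the fallback to O_SYNC is only there so the sketch
builds on systems that lack the BSD O_FSYNC spelling.  It shows the
flag combination only; it does not by itself demonstrate anything
about the NFS client behaviour discussed in this thread.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* O_FSYNC is the BSD spelling; fall back to POSIX O_SYNC elsewhere. */
#ifndef O_FSYNC
#define O_FSYNC O_SYNC
#endif

/*
 * Hypothetical helper: open `path` for synchronous appends and write
 * all of the supplied buffers with a single writev() call.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t
append_sync(const char *path, const struct iovec *iov, int iovcnt)
{
	int fd;
	ssize_t n;

	/* O_APPEND: every write goes to end of file.
	 * O_FSYNC:  writev() returns only after the data is written out. */
	fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_FSYNC, 0644);
	if (fd == -1)
		return (-1);

	n = writev(fd, iov, iovcnt);

	/* Report a failed close as an error, too. */
	if (close(fd) == -1 && n != -1)
		n = -1;
	return (n);
}
```

A caller would fill in an array of struct iovec and pass it in one go,
e.g. append_sync("/some/nfs/file", iov, iovcnt), rather than issuing
one write per buffer.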

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\


