From: Brian Fundakowski Feldman
To: Marc Olzheim
Cc: freebsd-hackers@freebsd.org, freebsd-current@freebsd.org
Date: Wed, 20 Apr 2005 11:52:33 -0400
Subject: Re: NFS client/buffer cache deadlock
Message-ID: <20050420155233.GJ1157@green.homeunix.org>
In-Reply-To: <20050420153528.GC77731@stack.nl>
List-Id: Discussions about the use of FreeBSD-current

On Wed, Apr 20, 2005 at 05:35:28PM +0200, Marc Olzheim wrote:
> On Wed, Apr 20, 2005 at 11:20:38AM -0400, Brian Fundakowski Feldman wrote:
> > Reads should be totally unaffected...
>
> The server was misbehaving. Fixed. :-)
>
> > > Btw.: I'm not sure write(), writev() and pwrite() are allowed to do
> > > short writes on regular files... ?
> >
> > Our manpage is incorrect; POSIX states that they are (see earlier
> > e-mail). There really is no alternative -- we simply can't build
> > an NFS transaction larger than our buffer cache can accommodate.
> > Note that short writes won't happen for normal buffer sizes, only
> > excessively large ones. I really don't believe that writev() is meant
> > to be used so that you can write gigantic data structures in a single
> > transaction...
>
> Ah, I was reading the SUSv2 page:
>
> http://www.opengroup.org/onlinepubs/009695399/functions/write.html
>
> instead of the POSIX version.
>
> But in neither of those I can extrude the fact that it can return
> with result < nbyte, without it being a permanent condition.
> What phrase makes you conclude that it can ?

This specific issue is not clear-cut; the best thing to do lies
somewhere within the range of these scenarios:

"If a write() requests that more bytes be written than there is room
for (for example, [XSI] [Option Start] the process' file size limit or
[Option End] the physical end of a medium), only as many bytes as there
is room for shall be written. For example, suppose there is space for
20 bytes more in a file before reaching a limit. A write of 512 bytes
will return 20. The next write of a non-zero number of bytes would give
a failure return (except as noted below)."

"When attempting to write to a file descriptor (other than a pipe or
FIFO) that supports non-blocking writes and cannot accept the data
immediately:

 * If the O_NONBLOCK flag is clear, write() shall block the calling
   thread until the data can be accepted.

 * If the O_NONBLOCK flag is set, write() shall not block the thread.
   If some data can be written without blocking the thread, write()
   shall write what it can and return the number of bytes written.
   Otherwise, it shall return -1 and set errno to [EAGAIN]."

"[ENOBUFS]
    Insufficient resources were available in the system to perform the
    operation."

I think the first is more useful behavior than the last. Supporting it
should be exactly the same as supporting what happens if the actual
filesystem fills up. In this case, the filesystem is being requested
to write more "than there is room for."

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
      Opinions expressed are my own.                  \,,,,,,,,,,,,,,,,,,,,,,\