Date:      Thu, 9 Jul 2015 14:32:14 -0400
From:      Garrett Wollman <wollman@csail.mit.edu>
To:        freebsd-fs@freebsd.org
Cc:        rmacklem@freebsd.org
Subject:   How does NFS respond when a VFS operation gives ERESTART?
Message-ID:  <21918.48686.157217.979707@khavrinen.csail.mit.edu>

When networked filesystems are not involved, the special error code
[ERESTART] can be returned by the implementation of any system call,
with the effect of causing the system call to be restarted when
execution hits the kernel-user boundary, rather than returning to
userland.  This is used to allow certain system calls to be restarted
after being interrupted by a signal.  However, this normally only
applies to system calls which might potentially sleep for a long time
-- such as write() to a socket or a tty -- and not to disk I/O, which
is normally uninterruptible.
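
To make the restart behavior concrete, here is a minimal userland model of it, not kernel code. The names model_syscall_boundary, fake_write, and struct fake_call are invented for illustration; the only value taken from FreeBSD is ERESTART being -1 in the kernel's <sys/errno.h>.

```c
#include <stddef.h>

/* Kernel-internal pseudo-error; -1 matches FreeBSD's <sys/errno.h>. */
#define MODEL_ERESTART (-1)

/* Hypothetical state for a system call that gets interrupted a few times. */
struct fake_call {
	int interruptions_left;	/* how many more times a signal interrupts it */
	int calls;		/* how many times the handler has been entered */
};

/* Hypothetical system call body: asks for a restart while interrupted. */
static int
fake_write(void *arg)
{
	struct fake_call *c = arg;

	c->calls++;
	if (c->interruptions_left > 0) {
		c->interruptions_left--;
		return (MODEL_ERESTART);
	}
	return (0);
}

/*
 * Model of the kernel-user boundary: ERESTART never reaches the caller.
 * The real kernel backs up the program counter so the trap instruction
 * runs again; this model simply loops.
 */
static int
model_syscall_boundary(int (*handler)(void *), void *arg)
{
	int error;

	do {
		error = handler(arg);
	} while (error == MODEL_ERESTART);
	return (error);
}
```

With a struct fake_call initialized to two pending interruptions, model_syscall_boundary(fake_write, &c) returns 0 after entering the handler three times, and the caller never sees the ERESTART value.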

While investigating an issue reported by our users, I found from an
inspection of the code that ZFS can sometimes return an [ERESTART]
condition, specifically when writing to a dataset that has reached its
quota, AND there are pending block free operations that would reduce
usage below the quota.  But I don't see any code in the NFS (or kernel
RPC) implementation that would actually handle this case, and of
course the NFS server doesn't normally hit the user-kernel boundary at
all.  So does anyone have a theory about what actually happens in this
case, and what *should* happen?  It doesn't seem useful to just spin
on the one operation over and over again until the blocks are freed
(which I think might take a full ZFS transaction sync interval).
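
The scenario above can be sketched as a small userland model. This is an illustration under stated assumptions, not the ZFS or NFS server code: struct dataset_model, model_quota_check, and model_serve_write are hypothetical names, the quota rule is simplified (ERESTART only when the pending frees would make room for this particular write), and the "sync" is simulated by a retry count. EDQUOT is 69 as on FreeBSD.

```c
#include <stdint.h>

#define MODEL_ERESTART	(-1)	/* matches FreeBSD's kernel ERESTART */
#define MODEL_EDQUOT	69	/* FreeBSD's EDQUOT value */

/* Hypothetical, simplified view of a dataset's space accounting. */
struct dataset_model {
	uint64_t used;		/* bytes currently charged to the dataset */
	uint64_t quota;		/* quota in bytes */
	uint64_t pending_free;	/* bytes queued to be freed at the next sync */
};

static uint64_t
usage_after_frees(const struct dataset_model *ds)
{
	return (ds->pending_free >= ds->used ? 0 : ds->used - ds->pending_free);
}

/* 0 if the write fits, ERESTART if it would fit after frees, else EDQUOT. */
static int
model_quota_check(const struct dataset_model *ds, uint64_t nbytes)
{
	if (ds->used + nbytes <= ds->quota)
		return (0);
	if (usage_after_frees(ds) + nbytes <= ds->quota)
		return (MODEL_ERESTART);
	return (MODEL_EDQUOT);
}

/*
 * A server thread with no user boundary that naively retries on ERESTART.
 * The pending frees are applied only once `sync_after` attempts have been
 * made, standing in for a transaction sync; *attempts counts the spins.
 */
static int
model_serve_write(struct dataset_model *ds, uint64_t nbytes,
    unsigned sync_after, unsigned *attempts)
{
	int error;

	*attempts = 0;
	for (;;) {
		error = model_quota_check(ds, nbytes);
		(*attempts)++;
		if (error != MODEL_ERESTART)
			break;
		if (*attempts >= sync_after) {
			/* Simulated sync: the queued frees take effect. */
			ds->used = usage_after_frees(ds);
			ds->pending_free = 0;
		}
	}
	if (error == 0)
		ds->used += nbytes;
	return (error);
}
```

In this model every retry before the simulated sync is wasted work on the same request, which is the spinning concern raised above; whether the real server retries, fails the request, or does something else is exactly the open question.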

The actual symptom which I'm investigating is that sometimes --
despite my fixes to the throttling code -- the server is still getting
throttled, with thousands of requests enqueued for the same file.
(The FHA code does a nice job of directing them all to the appropriate
set of service threads, but that doesn't help the other clients get
anything done because of the global throttle.)  These seem not to make
any progress for a long time, but the condition ultimately clears by
itself -- what I'm trying to figure out is why so many requests get
queued and don't make progress, and so far this seems to be related to
hitting the quota on the filesystem.  So [ERESTART] may be a total red
herring, but it was something that stuck out at me when I was
reviewing the code paths that could set [EDQUOT].

-GAWollman


