Date:      Thu, 4 Nov 2004 12:52:48 +0300 (MSK)
From:      Igor Sysoev <is@rambler-co.ru>
To:        Uwe Doering <gemini@geminix.org>
Cc:        stable@freebsd.org
Subject:   Re: vnode_pager_putpages errors and DOS?
Message-ID:  <20041104124616.S92154@is.park.rambler.ru>
In-Reply-To: <4189666A.9020500@geminix.org>
References:  <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org> <4168578F.7060706@geminix.org> <20041103191641.K63546@is.park.rambler.ru> <4189666A.9020500@geminix.org>

On Thu, 4 Nov 2004, Uwe Doering wrote:

> Igor Sysoev wrote:
> > On Sat, 9 Oct 2004, Uwe Doering wrote:
> >>[...]
> >>I wonder whether the unresponsiveness is actually just the result of the
> >>kernel spending most of the time in printf(), generating warning
> >>messages.  vnode_pager_generic_putpages() doesn't return any error in
> >>case of a write failure, so the caller (syncer in this case) isn't aware
> >>that the paging out failed, that is, it is supposed to carry on as if
> >>nothing happened.
> >>
> >>So how about limiting the number of warnings to one per second?  UFS has
> >>similar code in order to curb "file system full" and the like.  Please
> >>consider trying the attached patch, which applies cleanly to 4-STABLE.
> >>It won't make the actual application causing these errors any happier,
> >>but it may eliminate the DoS aspect of the issue.
> >
> > I have just tried your patch. To test I ran the program from
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/67919
> >
> > The patch allows me to log in to the machine while the system reports
> > "vnode_pager_putpages: I/O error 28". However, file system access is
> > very limited, and after some time the system became unresponsive.
>
> Limited file system access is to be expected, since
> vnode_pager_putpages() keeps the number of dirty buffers
> ('numdirtybuffers') near its upper limit ('hidirtybuffers').  However,
> the unresponsiveness may be caused by another shortcoming I found in the
> meantime.
>
> When 'numdirtybuffers' is greater than or equal to 'hidirtybuffers', the
> function bwillwrite() will block until 'numdirtybuffers' drops below some
> threshold value.  bwillwrite() gets called in a number of places that
> deal with writing data to disk.
>
> Two of these places are dofilewrite() (which is in turn called by
> write() and pwrite()) and writev().  There, bwillwrite() gets called if
> the file descriptor is of type DTYPE_VNODE.  Now, this unfortunately
> doesn't take into account that ttys, including pseudo ttys, and even
> /dev/null and friends, are character device nodes and therefore vnodes
> as well, but have nothing to do with writing data to disk.  That is, in
> case of heavy disk write activity, write attempts to these device nodes
> get blocked, too!  As a consequence, the system appears to
> become unresponsive at the shell prompt, or reacts only sporadically.  Even
> daemonized processes that happen to log data to /dev/null (on stdout &
> stderr, for example) will block.
>
> What we need here is an additional test that makes sure that in case of
> a character device bwillwrite() gets called only if the device is in
> fact a disk.  Please consider trying out the attached patch.  It will
> not reduce the heavy disk activity (which is, after all, legitimate),
> but it is supposed to enable you to operate the system at shell level
> and kill the offending process, or do whatever is necessary to resolve
> the problem.
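
To make the first suggestion quoted above concrete (at most one warning
per second), here is a minimal user-space sketch in C.  It is not the code
from the attached patch, which is not reproduced in this message;
'struct warn_limiter' and 'warn_allowed()' are hypothetical names, and a
kernel version would presumably read the kernel's own clock instead of
taking the time as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical rate limiter: allows one warning per wall-clock second. */
struct warn_limiter {
	bool          armed;       /* true once a warning has been emitted */
	time_t        last;        /* second in which the last warning was emitted */
	unsigned long suppressed;  /* warnings dropped since the limiter was created */
};

/*
 * Return true if a warning may be printed at time 'now'.
 * Further calls within the same second return false and are counted.
 */
bool
warn_allowed(struct warn_limiter *wl, time_t now)
{
	assert(wl != NULL);

	if (wl->armed && now == wl->last) {
		wl->suppressed++;
		return false;
	}
	wl->armed = true;
	wl->last = now;
	return true;
}
```

A caller would guard the printf() with
'if (warn_allowed(&wl, time(NULL)))'.  This only bounds the time spent
printing; as noted in the quoted text, it does nothing for the
application that triggers the errors.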

I've tried the patch from your second email (it requires including
<sys/conf.h> for devsw and D_DISK): the system became unresponsive as well.

The main problem is that I could not kill the offending process: it
was stuck in the biowr state.


Igor Sysoev
http://sysoev.ru/en/