Date:      Sat, 09 Oct 2004 23:26:39 +0200
From:      Uwe Doering <gemini@geminix.org>
To:        stable@freebsd.org
Subject:   Re: vnode_pager_putpages errors and DOS?
Message-ID:  <4168578F.7060706@geminix.org>
In-Reply-To: <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org>
References:  <Pine.NEB.3.96L.1041009150440.93055O-100000@fledge.watson.org>

This is a multi-part message in MIME format.
--------------060704090007080900030007
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Robert Watson wrote:
> On Fri, 8 Oct 2004, Steve Shorter wrote:
> 
>>>	I have some machines that run customers cgi stuff.
>>>These machines have started to hang and become unresponsive.
>>>At first I thought it was a hardware issue, but I discovered in 
>>>a cyclades log the following stuff that got logged to the
>>>console which explains the cause of the system hangs/failures.
>>>
>>>vnode_pager_putpages: residual I/O 65536 at 347
>>>vnode_pager_putpages: I/O error 28]
>>>vnode_pager_putpages: residual I/O 65536 at 285] 
>>
>>	Aha! also at the same time I get in syslog
>>
>>	/kernel: pid 6 (syncer), uid 0 on /chroot/tmp: file system full
>>
>>	Whats happening? Can a full filesystem bring the thing down? 
>>Ideas? Fixes? 
> 
> Ideally not, but many UNIX programs respond poorly to being out of memory
> and disk space ("No space, wot?").  Are you using a swap file, and if so,
> how did you create the swapfile?  Are you using sparse files much?

I wonder whether the unresponsiveness is actually just the result of the
kernel spending most of its time in printf(), generating warning
messages.  vnode_pager_generic_putpages() doesn't return an error when
a write fails, so the caller (the syncer in this case) isn't aware that
the page-out failed.  That is, by design it carries on as if nothing
had happened.

So how about limiting the number of warnings to one per second?  UFS has
similar code to curb "file system full" messages and the like.  Please
consider trying the attached patch, which applies cleanly to 4-STABLE.
It won't make the actual application causing these errors any happier, 
but it may eliminate the DoS aspect of the issue.
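
To illustrate the idea outside the kernel, here is a minimal userland
sketch of the same once-per-second throttle.  It is not the patch
itself: ratelimit_once_per_second() and report_io_error() are
hypothetical names made up for this example, and the current time is
passed in as a parameter (standing in for the kernel's time_second) so
that the behaviour is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not part of the patch): returns true at most
 * once per distinct value of `now`, remembering the last value that
 * was let through in *last.
 */
static bool
ratelimit_once_per_second(time_t *last, time_t now)
{
	assert(last != NULL);
	if (*last == now)
		return false;		/* already logged during this second */
	*last = now;
	return true;
}

/*
 * Reports a simulated write error, suppressing repeats that occur
 * within the same second.  Returns 1 if a message was printed and
 * 0 otherwise.
 */
static int
report_io_error(int error, time_t now)
{
	/* Zero-initialized static, analogous to last_elog in the patch. */
	static time_t last_elog;

	if (error != 0 && ratelimit_once_per_second(&last_elog, now)) {
		printf("putpages (sketch): I/O error %d\n", error);
		return 1;
	}
	return 0;
}
```

With this, a burst of failures that all arrive within the same second
produces a single line, and the next second allows one more.  As in the
patch, the throttle only affects logging; the failed write itself is
handled exactly as before.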

    Uwe
-- 
Uwe Doering         |  EscapeBox - Managed On-Demand UNIX Servers
gemini@geminix.org  |  http://www.escapebox.net

--------------060704090007080900030007
Content-Type: text/plain;
 name="vnode_pager.c.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="vnode_pager.c.diff"

--- src/sys/vm/vnode_pager.c.orig	Fri Oct 31 11:39:38 2003
+++ src/sys/vm/vnode_pager.c	Sun Feb 15 02:38:21 2004
@@ -955,6 +955,7 @@
 	struct iovec aiov;
 	int error;
 	int ioflags;
+	static int last_elog, last_rlog;
 
 	object = vp->v_object;
 	count = bytecount / PAGE_SIZE;
@@ -1035,10 +1036,12 @@
 	cnt.v_vnodeout++;
 	cnt.v_vnodepgsout += ncount;
 
-	if (error) {
+	if (error && last_elog != time_second) {
+		last_elog = time_second;
 		printf("vnode_pager_putpages: I/O error %d\n", error);
 	}
-	if (auio.uio_resid) {
+	if (auio.uio_resid && last_rlog != time_second) {
+		last_rlog = time_second;
 		printf("vnode_pager_putpages: residual I/O %d at %lu\n",
 		    auio.uio_resid, (u_long)m[0]->pindex);
 	}

--------------060704090007080900030007--


