Date:      Sun, 7 Sep 2003 19:51:22 -0700
From:      Aaron Smith <aaron@mutex.org>
To:        sos@freebsd.org
Cc:        freebsd-current@freebsd.org
Subject:   pst driver: timeout explosion? (patch is attached)
Message-ID:  <20030908025121.GQ560@gelatinous.com>


--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

I think I may have found the cause of the pst timeout panics.  I'm running
a Promise SX6000 RAID controller on -CURRENT with the pst driver.
Unfortunately, under sufficiently high I/O load, the box starts printing:

  "pst: timeout mfa=0x00327b90 cmd=0x01"

The 'mfa' address varies. It starts printing more and more rapidly, and
then eventually the machine wedges solid. Sometimes it makes it to:

  "panic: timeout table full"

Here's what I think is happening. Two timeouts are scheduled every time a
timeout fires: pst_timeout schedules one before calling pst_rw to retry
the operation, and then pst_rw schedules ANOTHER one.  Both of these call
pst_timeout, so the number of pending timeouts doubles every 10 seconds,
all of them retrying the same I/O operation, until the timeout table fills
and the machine panics.

Check out the following diff:

  http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/pst/pst-raid.c.diff?r1=1.8&r2=1.9&f=h

This is where pst_rw was changed to schedule its own timeout, but the
timeout call in pst_timeout was never removed.

Do you think this could be the correct explanation? It seems like once
pst_timeout is called, the machine is doomed... I'm recompiling my kernel
now to test the fix under load.

--Aaron

--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="pst-raid.c.patch"

Index: /sys/dev/pst/pst-raid.c
===================================================================
RCS file: /usr/cvs/src/sys/dev/pst/pst-raid.c,v
retrieving revision 1.11
diff -u -r1.11 pst-raid.c
--- /sys/dev/pst/pst-raid.c	24 Aug 2003 17:54:17 -0000	1.11
+++ /sys/dev/pst/pst-raid.c	8 Sep 2003 02:32:58 -0000
@@ -316,11 +316,6 @@
 	mtx_unlock(&request->psc->iop->mtx);
 	return;
     }
-    if (dumping)
-	request->timeout_handle.callout = NULL;
-    else
-	request->timeout_handle =
-	    timeout((timeout_t*)pst_timeout, request, 10 * hz);
     if (pst_rw(request)) {
 	iop_free_mfa(request->psc->iop, request->mfa);
 	biofinish(request->bp, NULL, EIO);

--J2SCkAp4GZ/dPZZf--


