Date:      Mon, 2 Apr 2001 18:29:09 +0930
From:      Greg Lehey <grog@lemis.com>
To:        Andrew Gordon <arg@arg1.demon.co.uk>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: 4.3-RC processes stuck sleeping on "inode" (?vinum) problem update
Message-ID:  <20010402182909.A75576@wantadilla.lemis.com>
In-Reply-To: <20010402094208.D73090@wantadilla.lemis.com>; from grog@lemis.com on Mon, Apr 02, 2001 at 09:42:08AM +0930
References:  <Pine.BSF.4.21.0104020008080.9790-100000@server.arg.sj.co.uk> <20010402094208.D73090@wantadilla.lemis.com>

On Monday,  2 April 2001 at  9:42:08 +0930, Greg Lehey wrote:
> On Monday,  2 April 2001 at  0:36:31 +0100, Andrew Gordon wrote:
>>
>> Further to my previous report:
>>
>>  - This is definitely a problem in 4.3RC: I rolled back to 31st Jan
>>    sources (world & kernel), and the system has now been up for 36 hours
>>    (as opposed to at most 6 hours running 4.3RC).
>>
>>  - New evidence makes me lean towards thinking that Vinum is responsible
>>    (though this is by no means conclusive):
>>
>>     1. I had previously only had my nfsd processes getting stuck
>>        (plus the 'reboot' process itself if I tried to reboot),
>>        however, while doing a 'cvs checkout' onto the vinum filesystem
>>        to build my jan31 world, the cvs process got stuck in "inode" too.
>>
>>     2. That same cvs checkout completed OK on a non-vinum filesystem.
>>
>>     3. I have just noticed in my console logs, that in the "ps"
>>        output showing the nfsd processes stuck in "inode",
>>        the "(syncer)" process is stuck in "vrlock" which is a
>>        vinum wait channel.
>
> Hmm.  This is pretty conclusive.  It's a deadlock.
>
> Tor Egge reported a possible cause of this kind of deadlock.  I've
> been testing a fix, but I'm not sure it doesn't have side effects.
> Try this (in /usr/src/sys/dev/vinum), then rebuild the kernel module
> (in /usr/src/sys/modules/vinum), stop and restart vinum, and see if it
> helps:
>
> RCS file: /home/ncvs/src/sys/dev/vinum/vinumlock.c,v
> retrieving revision 1.18.2.2
> diff -w -u -r1.18.2.2 vinumlock.c
> --- vinumlock.c 2001/03/13 02:59:43     1.18.2.2
> +++ vinumlock.c 2001/04/02 00:09:53
> @@ -169,7 +169,7 @@
>  #endif
>                     plex->lockwaits++;                      /* waited one more time */
>                     tsleep(lock, PRIBIO, "vrlock", 0);
> -                   lock = plex->lock;                      /* start again */
> +                   lock = &plex->lock[-1];                 /* start again */
>                     foundlocks = 0;
>                     pos = NULL;
>                 }
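
The one-line change quoted above makes sense if the enclosing loop
advances "lock" at the end of every pass; the diff context does not
show that loop, so treat this as an assumption.  Under that
assumption, resetting to the first entry would skip it after the
increment, while resetting to one before the first entry resumes at
entry 0.  A minimal userland sketch of the same pattern, using an index
instead of a pointer (the function name is hypothetical and is not part
of vinum):

```c
#include <stddef.h>

/*
 * Return the index of the first table entry examined after a restart,
 * given the value the loop variable is reset to.  The table has four
 * slots and the restart is triggered while looking at slot 2.
 */
static int first_index_after_restart(int reset_value)
{
    int restarted = 0;

    for (int i = 0; i < 4; i++) {
        if (restarted)
            return i;           /* first slot looked at after the restart */
        if (i == 2) {           /* pretend we slept here and must start again */
            restarted = 1;
            i = reset_value;    /* the loop's i++ still runs before the next pass */
        }
    }
    return -1;                  /* not reached with the values used here */
}
```

With a reset value of 0 the scan resumes at slot 1 and never rechecks
slot 0; with -1 it resumes at slot 0.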

OK.  I've tried this change, and I still ended up with problems.  It
seems that from time to time a wakeup gets lost, causing things to
hang.  I've now made a workaround (a one-second timeout on the
tsleep), and things seem to be stable.  Try this fix instead (or, if
you've already applied the change above, just change the tsleep line
as well).  I'm relatively confident that this will fix the problem.
In view of the code freeze, please let me know as soon as possible
whether it does.

RCS file: /home/ncvs/src/sys/dev/vinum/vinumlock.c,v
retrieving revision 1.18.2.2
diff -w -u -r1.18.2.2 vinumlock.c
--- vinumlock.c 2001/03/13 02:59:43     1.18.2.2
+++ vinumlock.c 2001/04/02 08:56:26
@@ -168,8 +168,8 @@
                    }
 #endif
                    plex->lockwaits++;                      /* waited one more time */
-                   tsleep(lock, PRIBIO, "vrlock", 0);
-                   lock = plex->lock;                      /* start again */
+                   tsleep(lock, PRIBIO, "vrlock", hz);
+                   lock = &plex->lock [-1];                /* start again */
                    foundlocks = 0;
                    pos = NULL;
                }
Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message



