Message-ID: <487D6ECA.9040706@ibctech.ca>
Date: Tue, 15 Jul 2008 23:45:14 -0400
From: Steve Bertrand
To: Matthew Dillon
Cc: freebsd-stable@freebsd.org
In-Reply-To: <200807160327.m6G3Rh57012575@apollo.backplane.com>
Subject: Re: taskqueue timeout

Matthew Dillon wrote:

> This issue is vexing a lot of people.

Heh... I can appreciate this.

I would like someone to inform me that this can't be guaranteed to be a ZFS problem... if I can get confirmation that others have this issue aside from ZFS, I would feel content.

> Setting the timeout to 30 will not affect performance, but it will
> cause a 30 second delay in recovery when (if) the problem occurs.
> i.e. when the disk stalls it will just sit there doing nothing for
> 30 seconds, then it will print the timeout message and try to recover.

If I set the timeout to >= 30 and the issue still occurs, the problem must be elsewhere.

> It occurs to me that it might be beneficial to actually measure the
> disk's response time to each request, and then graph it over a period
> of time. Maybe seeing the issue visually will give some clue as to the
> actual cause.

I am interested in following through with this, but can't do it on my own. I'm willing to dedicate the box and bandwidth to anyone who can legitimately test this as you state. In other words, I need either guidance or assistance. This box is ready for the taking.

Beyond this box, I can give legitimate parties other network resources to generate a steady flow of data, so that the issue can be reproduced locally, on demand.

Steve
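
P.S. To make the measurement idea above a little more concrete, here is a rough sketch of what timing each request and graphing the result might look like. It is only an illustration, not anything from Matthew or from the ata driver: the function names (measure_read_latency, slow_samples, ascii_graph), the 64 KB block size and the thresholds are all made up for the example. It times sequential reads from whatever path it is given. Against a regular file the buffer cache will hide the disk entirely, so a real test would presumably point it at the raw device (needs read permission, and block-aligned sizes) and run it for much longer.

```python
import os
import time


def measure_read_latency(path, block_size=65536, count=100):
    """Time up to `count` sequential reads of `block_size` bytes from `path`.

    Returns a list of (offset, seconds) tuples, one per completed read.
    Stops early if the end of the file or device is reached.
    """
    samples = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        for _ in range(count):
            start = time.perf_counter()
            data = os.pread(fd, block_size, offset)
            elapsed = time.perf_counter() - start
            if not data:  # nothing left to read
                break
            samples.append((offset, elapsed))
            offset += len(data)
    finally:
        os.close(fd)
    return samples


def slow_samples(samples, threshold=1.0):
    """Return the samples whose latency exceeded `threshold` seconds."""
    return [(off, sec) for off, sec in samples if sec > threshold]


def ascii_graph(samples, width=50):
    """Render one text bar per sample, scaled to the slowest read."""
    if not samples:
        return ""
    # Guard against a peak of exactly zero on very coarse clocks.
    peak = max(sec for _, sec in samples) or 1e-9
    lines = []
    for off, sec in samples:
        bar = "#" * max(1, int(width * sec / peak))
        lines.append(f"{off:>12} {sec * 1000:9.3f} ms {bar}")
    return "\n".join(lines)
```

Something like print(ascii_graph(measure_read_latency(path))) would give a crude picture, and slow_samples(samples, threshold=5.0) would pick out any reads that stalled for more than five seconds. Logging a wall-clock timestamp with each sample would also make it possible to line stalls up against the timeout messages in the system log.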