From owner-freebsd-stable@FreeBSD.ORG  Tue Feb 18 17:14:53 2014
MIME-Version: 1.0
Sender: mr.kodiak@gmail.com
Received: by 10.60.173.206 with HTTP; Tue, 18 Feb 2014 09:14:21 -0800 (PST)
In-Reply-To: <6F4E2014-5489-4055-962C-4DFC6184A18E@jnielsen.net>
References: <920CC320-1A95-46E2-BB18-B6987805885E@jnielsen.net>
 <18D133C0-E71B-4E66-A13F-6DC3B1BF620C@FreeBSD.org>
 <6F4E2014-5489-4055-962C-4DFC6184A18E@jnielsen.net>
From: Bryan Venteicher <bryanv@freebsd.org>
Date: Tue, 18 Feb 2014 11:14:21 -0600
X-Google-Sender-Auth: gd_OMOvccsDXz3NMKSw2yVA3dn0
Message-ID: <CAGaYwLf+EhtUjLGfz6GynCGe3SwFijETLaqDxNjYA5rpN-HOHQ@mail.gmail.com>
Subject: Re: recovering from or increasing timeouts on virtio block device
To: John Nielsen <lists@jnielsen.net>
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: freebsd-stable@freebsd.org,
 =?ISO-8859-2?Q?Edward_Tomasz_Napiera=B3a?= <trasz@freebsd.org>,
 Bryan Venteicher <bryanv@freebsd.org>

On Tue, Feb 18, 2014 at 10:57 AM, John Nielsen <lists@jnielsen.net> wrote:

> On Feb 18, 2014, at 3:32 AM, Edward Tomasz Napierała <trasz@freebsd.org> wrote:
>
> > On 17 Feb 2014, at 21:21, John Nielsen wrote:
> >> I run several FreeBSD virtual machines in a Linux KVM environment with
> >> a SAN. The VMs use virtio block storage, and the KVM hosts map the
> >> virtual volumes to targets on the SAN. Occasionally, failover or other
> >> maintenance events on the SAN cause it to be unavailable for 30+ seconds.
> >> When this happens, the FreeBSD VMs have hard failures on the vtbd*
> >> devices, and thereafter any attempted reads or writes return immediately
> >> with an error (even after the SAN is responsive again). The only way to
> >> recover a VM once that happens is to hard boot it.
> >>
> >> Is there any way to adjust the timeouts or enable some kind of retry
> >> for the virtio block devices? It would be nice to be able to recover
> >> gracefully after a SAN event without needing to reboot the VMs.
> >
> > Use gmountver(8) perhaps?
>
> Thanks for the tip (and for writing it :), I haven't encountered that one
> before. I will experiment with it but I'm not sure it's a fit for this
> particular scenario (at least not by itself). When a SAN event happens the
> virtual machine's vtbd0 device doesn't disappear, the underlying hardware
> just fails to respond for a long-ish time. I suspect that the driver gives
> up after either a certain length of time or number of errors, but my C
> driver-fu isn't up to figuring it out exactly. Once it gives up, any I/O
> requests to the (still "present") device fail immediately, and I can't see
> a way to get the driver to actually try any (new or old) I/O again.
>


The vtbd driver has no internal retry mechanism. It pays no attention to
errors other than to report them, and it never gives up :)

It is not clear to me whether the I/O is being turned around within FreeBSD
before it reaches the driver, or within the host. Do you continue to see
"hard error ..." messages on the console?
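
If it helps to answer that, here is a minimal sketch for counting such
messages per device in captured console or log output. It assumes only that
a relevant line mentions a vtbd device name followed somewhere by the words
"hard error"; the function name count_hard_errors and the regular expression
are illustrative, not part of any FreeBSD tool, and the exact message layout
may differ on your system.

```python
import re

# Assumption: a relevant line names a vtbd device and later contains
# "hard error". Nothing else about the message layout is relied upon.
HARD_ERROR = re.compile(r"\b(vtbd\d+)\b.*hard error")


def count_hard_errors(lines):
    """Return {device: count} for lines that look like vtbd hard errors."""
    counts = {}
    for line in lines:
        match = HARD_ERROR.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts
```

Feeding it the lines captured before and after a SAN event (for example,
saved dmesg output read with open(...)) should show whether the count keeps
growing once the SAN is back, which would suggest requests are still
reaching the driver.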


>
> JN
>
>