From: John Nielsen
Subject: recovering from or increasing timeouts on virtio block device
Date: Mon, 17 Feb 2014 13:21:19 -0700
Message-Id: <920CC320-1A95-46E2-BB18-B6987805885E@jnielsen.net>
To: freebsd-stable@freebsd.org
Cc: Bryan Venteicher
List-Id: Production branch of FreeBSD source code

I run several FreeBSD virtual machines in a Linux KVM environment with a SAN. The VMs use virtio block storage, and the KVM hosts map the virtual volumes to targets on the SAN.
Occasionally, failover or other maintenance events on the SAN cause it to be unavailable for 30+ seconds. When this happens, the FreeBSD VMs have hard failures on the vtbd* devices, and thereafter any attempted reads or writes return immediately with an error (even after the SAN is responsive again). The only way to recover a VM once that happens is to hard boot it.

Is there any way to adjust the timeouts or enable some kind of retry for the virtio block devices? It would be nice to be able to recover gracefully after a SAN event without needing to reboot the VMs.

Thanks!

John Nielsen