Date: Wed, 28 Oct 2015 21:57:21 -0400
From: "Michael W. Lucas"
To: fs@freebsd.org
Subject: iSCSI/ZFS strangeness
Message-ID: <20151029015721.GA95057@mail.michaelwlucas.com>

Hi,

I'm experimenting with iSCSI HA with FreeBSD 10.2 amd64. I know people
do this sort of thing, but I can't figure out what I'm missing. (Most
of the tutorials cover HAST instead.) I suspect the real problem is
"Lucas doesn't know the right search terms."

The goal is to make an iSCSI-based ZFS pool that's available to two
separate hosts, and remains available even if one of the iSCSI servers
fails. Instead, the pool hangs when either of the iSCSI servers goes
down.

My test environment has two iSCSI servers, iscsi1 and iscsi2. They
each export three drives as a single target.

There are two iSCSI initiators, zfs1 and zfs2. Both of them have
active connections to the iSCSI targets.

On another host I've created a ZFS pool of striped mirrors. Each
mirror has one drive from each iSCSI server.

The initiators can both access the iSCSI-based pool--not
simultaneously, of course. But CARP, devd, and some shell scripting
should get me a highly available pool that can withstand the demise of
any one iSCSI server and any one initiator.

The hope is that the pool would continue to work even if an iSCSI host
shuts down.
When the downed iSCSI host returns, the initiators should log back in
and the pool auto-resilver.

Some ten minutes ago, I killed iscsi2. The pool is live on zfs1. And
the drives really have disappeared.

# iscsictl
Target name                 Target portal                State
iqn.2013-11.io.mwl:target0  iscsi2.blackhelicopters.org  Operation timed out
iqn.2013-11.io.mwl:target0  iscsi1.blackhelicopters.org  Connected: da2 da3 da4

I would expect to see the pool appear degraded. But instead, I have:

# zpool status iscsi
  pool: iscsi
 state: ONLINE
  scan: resilvered 1.16G in 0h3m with 0 errors on Wed Oct 28 14:13:08 2015
config:

        NAME              STATE     READ WRITE CKSUM
        iscsi             ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/iscsi1-0  ONLINE       0     0     0
            gpt/iscsi2-0  ONLINE       0     0     0
          mirror-1        ONLINE       0     0     0
            gpt/iscsi1-1  ONLINE       0     0     0
            gpt/iscsi2-1  ONLINE       0     0     0
          mirror-2        ONLINE       0     0     0
            gpt/iscsi1-2  ONLINE       0     0     0
            gpt/iscsi2-2  ONLINE       0     0     0

errors: No known data errors

To try to make ZFS realize the pool is degraded, I write to the iSCSI
pool. (tar -xvpf ports.tar.gz)

Each time, the extract gets to a certain point and hangs. Can't ^C or
^Z out of it. This latest time, the extract reaches:

x ports/www/firefox-esr/files/patch-media-mtransport-third_party-nICEr-src-util-mbslen.c

I can still SSH into the machine, but if I try to look in any
directories under /iscsi/ports/* my terminal hangs.

So I restart the downed iSCSI server. The initiators log back into the
target. And the hung tar extract picks up where it left off.

So, I haven't achieved HA. The pool stays up, but it's not exactly
usable.

Any hints on what I'm missing?

Thanks,
==ml

-- 
Michael W. Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
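
Since zpool status keeps reporting ONLINE while iscsictl shows the
timed-out session, the shell-scripting piece of this setup could watch
the iscsictl listing instead. What follows is only a rough sketch under
stated assumptions: it assumes the three-column layout quoted in this
message (target name, target portal, state), the function name
down_portals is hypothetical, and it merely reports sessions that are
not connected. It does nothing about the hang itself.

```shell
#!/bin/sh
# Hypothetical helper: print the portal of every session that is not connected.
# Reads an iscsictl session listing on stdin. Assumes the layout quoted in
# this message: target name, target portal, then a free-form state.
down_portals() {
    # Skip the header line; field 3 is the first word of the state column.
    awk 'NR > 1 && $3 !~ /^Connected/ { print $2 }'
}

# Sample input copied from the listing in this message, so the sketch runs
# without any live iSCSI sessions.
down_portals <<'EOF'
Target name                 Target portal                State
iqn.2013-11.io.mwl:target0  iscsi2.blackhelicopters.org  Operation timed out
iqn.2013-11.io.mwl:target0  iscsi1.blackhelicopters.org  Connected: da2 da3 da4
EOF
```

On a live initiator the same function could be fed from
"iscsictl -L | down_portals"; what to do with the result (alerting,
offlining the affected vdevs, or something else) is left open here.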