From: Jan Bramkamp
To: freebsd-fs@freebsd.org
Subject: Re: iSCSI/ZFS strangeness
Date: Thu, 29 Oct 2015 19:17:40 +0100
Message-ID: <563262C4.1040706@rlwinm.de>
In-Reply-To: <20151029015721.GA95057@mail.michaelwlucas.com>

On 29/10/15 02:57, Michael W. Lucas wrote:
> The initiators can both access the iSCSI-based pool--not
> simultaneously, of course. But CARP, devd, and some shell scripting
> should get me a highly available pool that can withstand the demise of
> any one iSCSI server and any one initiator.
>
> The hope is that the pool would continue to work even if an iSCSI host
> shuts down. When the downed iSCSI host returns, the initiators should
> log back in and the pool auto-resilver.

I would recommend against using CARP for this because CARP is prone to
split-brain situations, and in this case they could destroy your whole
storage pool. If the current head node fails, the replacement has to
`zpool import -f` the pool, and in a split-brain situation both head
nodes would continue writing to the iSCSI targets.

I would move the leader election to an external service like consul,
etcd or zookeeper. This is one case where the added complexity is worth
it. If you can't run an external service for this, e.g. because it would
exceed the scope of the chapter you're writing, please simplify the
setup with more reliable hardware, good monitoring and manual failover
for maintenance. CARP isn't designed to implement reliable (enough)
master election for your storage cluster.

Adding iSCSI to your storage stack adds complexity and overhead. For
setups which still fit inside a single rack, SAS (with geom_multipath)
is normally faster and cheaper. On the other hand, you can't spread out
SAS storage far enough to implement disaster tolerance, should you
really need it. It certainly is an interesting setup.
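
To make the leader-election point more concrete, here is a minimal
sketch of the lease idea that services like consul, etcd or zookeeper
provide. It does not use any of their real APIs: `LeaseStore`,
`acquire` and the node names are hypothetical, and the in-memory store
only stands in for a properly replicated, consistent one. The point is
that a head node would run `zpool import -f` only while it holds the
lease, and a second node cannot get the lease until the first one's has
expired.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str        # name of the head node that owns the lease
    expires_at: float  # time after which the lease is no longer valid


class LeaseStore:
    """In-memory stand-in (hypothetical) for a consistent external store."""

    def __init__(self) -> None:
        self._lease: Optional[Lease] = None

    def acquire(self, node: str, ttl: float, now: float) -> bool:
        """Grant or renew the lease if it is free, expired, or held by node."""
        lease = self._lease
        if lease is None or lease.expires_at <= now or lease.holder == node:
            self._lease = Lease(holder=node, expires_at=now + ttl)
            return True
        # Another node holds an unexpired lease: do not import the pool.
        return False


store = LeaseStore()
print(store.acquire("head-a", ttl=10, now=0))   # head-a becomes leader
print(store.acquire("head-b", ttl=10, now=5))   # refused, lease still valid
print(store.acquire("head-b", ttl=10, now=11))  # head-a's lease has expired
```

A lease alone is not enough: the old head node also has to stop writing
before its lease runs out (or be fenced off), otherwise both nodes can
still end up writing to the iSCSI targets.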