From: Freddie Cash <fjwcash@gmail.com>
To: Pawel Jakub Dawidek
Cc: freebsd-fs@freebsd.org
Date: Thu, 19 May 2011 16:22:57 -0700
Subject: Re: HAST + ZFS self healing? Hot spares?

On Thu, May 19, 2011 at 4:09 PM, Pawel Jakub Dawidek wrote:
> On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
>> Very well, that is how failures are handled. But how do we *recover*
>> from a disk failure? Without taking the entire server down, that is.
>
> HAST opens the local disk only when changing role to primary, or when
> changing role to secondary and accepting a connection from the primary.
> If your disk fails, switch that HAST device to init, replace your disk,
> call 'hastctl create <resource>', and switch back to primary or
> secondary.
>
>> I already know how to deal with my HBA to hot-add and hot-remove
>> devices. And how to deal with hardware failures on the *secondary*
>> node seems fairly straightforward; after all, it doesn't really matter
>> if the mirroring becomes degraded for a few seconds while I futz
>> around with restarting hastd and such. The primary sees the secondary
>> disappear for a few seconds, and when it comes back, it will just
>> truck all of the dirty data over. Big deal.
>
> You don't need to restart hastd or stop the secondary. Just use hastctl
> to change the role to init for the failing resource.

This process works exceedingly well. Just went through it a week or so ago.
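For context, each disk here is a HAST resource defined in /etc/hast.conf on
both nodes. An entry looks something like this (the hostnames "hasta"/"hastb",
the device /dev/da5, and the resource name "disk5" are made up for the
example, not my actual config):

  resource disk5 {
          # "hasta"/"hastb" and /dev/da5 are invented example names
          on hasta {
                  local /dev/da5
                  remote hastb
          }
          on hastb {
                  local /dev/da5
                  remote hasta
          }
  }

The hastctl commands below operate on that resource name ("disk5" here),
while zpool only ever sees the /dev/hast/disk5 device that hastd provides.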
You just need to think in layers, the way GEOM works:

Non-HAST setup
--------------
The non-HAST process for replacing a disk in a ZFS pool is:

- zpool offline poolname diskname
- remove the dead disk
- insert the new disk
- partition, label, etc. as needed
- zpool replace poolname olddisk newdisk
- wait for the resilver to complete

HAST setup
----------
With HAST, only a couple of small changes are needed (a concrete
run-through with made-up names is in the P.S. below):

- zpool offline poolname diskname         <-- takes the /dev/hast device offline in the pool
- hastctl role init diskname              <-- removes the /dev/hast node
- remove the dead disk
- insert the new disk
- partition, label, etc. as needed
- hastctl create diskname                 <-- creates the HAST resource metadata on the new disk
- hastctl role primary diskname           <-- creates the new /dev/hast node
- zpool replace poolname olddisk newdisk  <-- adds the /dev/hast node back to the pool
- wait for the resilver to complete

The downside to this setup is that the data on the disk in the secondary
node is lost, as the resilver on the primary node recreates all of the
data on the secondary node. But at least then you know the data is good
on both disks in the HAST resource.

-- 
Freddie Cash
fjwcash@gmail.com
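P.S. For completeness, here's roughly what that sequence looks like typed
out on the primary node. Treat it as a sketch: the pool name ("tank"), the
resource name ("disk5"), and the physical disk ("da5") are made-up example
names, and the partitioning step depends entirely on your layout:

  # pool "tank", HAST resource "disk5", new disk "da5" -- all invented names
  zpool offline tank hast/disk5    # take the failed /dev/hast/disk5 offline in the pool
  hastctl role init disk5          # tears down the /dev/hast/disk5 node
  # ...physically swap the dead disk for the new one...
  # ...partition/label the new disk here, if your layout needs it...
  hastctl create disk5             # writes fresh HAST metadata onto the new disk
  hastctl role primary disk5       # brings /dev/hast/disk5 back
  zpool replace tank hast/disk5    # device name is unchanged, so one argument is enough
  zpool status tank                # watch the resilver

The one-argument 'zpool replace' should be enough here because the
/dev/hast/disk5 name doesn't change across the swap.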