From owner-freebsd-fs@FreeBSD.ORG Sat Apr 21 00:08:23 2007
Date: Sat, 21 Apr 2007 02:08:05 +0200
From: Bernd Walter
To: Mike Wolman
Cc: freebsd-fs@freebsd.org
Reply-To: ticso@cicely.de
Subject: Re: lazy mirror / live backup
Message-ID: <20070421000805.GA64413@cicely12.cicely.de>
In-Reply-To: <20070421000029.N4559@nux.eros.office>
References: <20070420232209.G4559@nux.eros.office> <20070421000029.N4559@nux.eros.office>

On Sat, Apr 21, 2007 at 12:12:57AM +0100, Mike Wolman wrote:
> > According to Mike Wolman:
> >> The problem with zfs is you cannot layer it as you can with the geom
> >> classes.
> >
> > Yes and no. It is not as versatile as all the geom classes, I believe,
> > but on FreeBSD it is a geom provider.
>
> I haven't played with zfs that much and was unaware that zfs created a
> device like the geom classes.
>
> >> For example, if you want to create a failover zfs storage pool, you
> >> can make the zfs pool out of gmirror devices, with one being a local
> >> device and the other being a ggatec device. You would then have your
> >> zfs raidz pool replicated on a remote host. I do not think you can do
> >> this with zfs by itself, as you are not able to layer a raidz pool on
> >> top of a load of zfs mirrors.
> >
> > On plain zfs, yes, you can have that:
> >
> > zpool create tank raidz mirror da0 da1 mirror da2 da3 mirror da4 da5
>
> So could you do:
>
> zpool create tank raidz mirror ad0 ggatec0 mirror ad1 ggatec1 mirror ad2
> ggatec2

AFAIK you can't do raidz over mirror, unless you mirror at the disk
layer (e.g. with gmirror below ZFS). A plain pool of mirrors would work
for sure.
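Something like this should do it (untested; the device names are just
examples following yours):

    zpool create tank mirror ad0 ggatec0 mirror ad1 ggatec1 mirror ad2 ggatec2

Each mirror vdev then pairs a local disk with a remote ggate device, and
ZFS stripes the data across the three mirrors.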
> Then should the primary fileserver fail, would the backup machine be
> able to import this zfs pool?

This is obviously the primary reason for such a mirror, but it is also
the most insane point about it. While you can import the pool on another
machine, you completely lose the sync state information, since the
backup side has been modified in an unknown way. That's the case with
your gmirror enhancements as well: you completely trust the sync state
of an offline disk.

> Would you also be able to tune zfs to prefer the local disks to the
> remote ones?

No, unless you offline them.

But the point with ZFS is that you don't need such mirror hacks at all.
Just create a pool for your data and one or more pools for backup. The
backup pools can be on different machines as well - no need to ggate the
disks. Then set up a cron job that takes snapshots at short, regular
intervals and pipes a "zfs send" into a "zfs receive" on the backup
host, e.g. over ssh (a rough example is sketched below). This can be
done incrementally between two snapshots, so there is no need to do a
full transfer every time.

In case you need to switch to the backup side, you can easily roll back
either side to get back in sync. You can also just replace the disks on
your primary system with the disks from the backup and import the pool.
This even works over a slow link, because writers don't need to wait for
the backup system to keep up, which would be the case for ggate-based
mirrors over slow links. And if you still want to disconnect/reconnect
drives on the main system, you can zpool import/export your backup pool
disks and run the zfs send/receive locally.

The backup disks don't need to be the same size, because you copy data,
not disk blocks. E.g. your backup pool can be much bigger than the
primary and keep more snapshots, or it can be smaller and keep fewer
snapshots. You can use cheap SATA disks to back up a SCSI system. You
can compress the data on the backup pool while not using compression on
the primary pool - or use stronger compression.
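A rough sketch of such a job (untested; the hostname, pool and snapshot
names are just examples):

    # on the primary host: snapshot, then send the delta to the backup
    zfs snapshot tank/data@2007-04-21
    zfs send -i tank/data@2007-04-20 tank/data@2007-04-21 | \
        ssh backuphost zfs receive backup/data

After an initial full "zfs send", the -i form only transfers the
difference between the two snapshots, so each run stays cheap even over
a slow link.

--
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de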