From: Borja Marcos <borjam@sarenet.es>
Date: Wed, 3 Aug 2011 13:22:23 +0200
To: freebsd-fs@freebsd.org
Message-Id: <19D8728E-6C92-4882-BDEB-8DDC4918B997@sarenet.es>
Subject: ZFS v28 issues with "zfs" command

Hello,

I've been doing tests with FreeBSD 8-STABLE, cvsupped yesterday.

First, I haven't been able to reproduce the deadlock I observed several times when receiving a snapshot on a dataset on which there was some reading activity. So that bug seems to be solved.

However, I've seen something worrisome.

I'm using a small, simple replication program. At given intervals (right now I'm using 20-second intervals) it sends an incremental snapshot to a secondary machine. The algorithm is this:

(time to replicate a new snapshot)

ssh destination zfs list -t snapshot...
zfs list -t snapshot
determine most recent snapshot in common
zfs snapshot pool/dataset@thenew (name format is pool/dataset@YYYYMMDDHHMMSS)
zfs send -i most_recent_snapshot_in_common new_snapshot > /var/tmp/temp_filename
scp /var/tmp/temp_filename destination:/var/tmp
ssh destination zfs receive -d pool < /var/tmp/temp_filename
ssh destination zfs destroy pool/most_recent_snapshot_in_common

The program works; it's pretty simple. (A rough sketch in sh is appended below.)

However, I've found a problem. While it was running, I ran "zfs list -t snapshot" several times on the destination machine. I can't recall whether it was during the zfs receive or the zfs destroy, but after that something went wrong: destroying a snapshot returned an error message, even though the snapshots were actually destroyed.

Inspecting the pool with zdb -d (I found that trick doing some Google searches) I noticed that I had developed a "hidden clone" problem. And I saw this snapshot which, apparently, came from nowhere:

rpool/tachan@newsrc-23608-1   1.33K   -   786M   -

It seems that there is some contention issue affecting the zfs command. In my case it was triggered by a "zfs list -t snapshot" command running during either a "zfs receive -d -F" or a "zfs destroy".

I'm wondering how to capture useful data regarding this...

Borja.
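
P.S. For concreteness, here is a minimal sh sketch of the replication loop described above. It is an illustration only, not the actual program: POOL, DATASET, DEST and the temporary file paths are placeholders, the "most recent snapshot in common" step is simplified to taking the last name present on both sides (which works because the snapshot names are timestamps), and it assumes at least one snapshot is already common to both machines.

#!/bin/sh
# Minimal replication sketch; POOL, DATASET and DEST are placeholders.
POOL=pool
DATASET=dataset
DEST=destination

FS="${POOL}/${DATASET}"
NEW="${FS}@$(date +%Y%m%d%H%M%S)"     # pool/dataset@YYYYMMDDHHMMSS

# List snapshots on both sides and pick the most recent one in common.
# Timestamp names sort chronologically, so "sort | tail -1" is enough.
LOCAL=/tmp/snaps.local.$$
REMOTE=/tmp/snaps.remote.$$
zfs list -H -t snapshot -o name | grep "^${FS}@" | sort > "$LOCAL"
ssh "$DEST" zfs list -H -t snapshot -o name | grep "^${FS}@" | sort > "$REMOTE"
COMMON=$(comm -12 "$LOCAL" "$REMOTE" | tail -1)
rm -f "$LOCAL" "$REMOTE"

# Take the new snapshot and ship the increment through a temporary file,
# exactly as in the scheme above.
zfs snapshot "$NEW"
TMP="/var/tmp/repl.$$"
zfs send -i "$COMMON" "$NEW" > "$TMP"
scp "$TMP" "${DEST}:${TMP}"
ssh "$DEST" "zfs receive -d ${POOL} < ${TMP}"

# Retire the now-obsolete snapshot on the destination, as in the original.
ssh "$DEST" zfs destroy "$COMMON"
ssh "$DEST" rm -f "$TMP"
rm -f "$TMP"

(The temporary file in /var/tmp mirrors the original scheme; piping zfs send straight into ssh would avoid the extra copy, but the file makes it easier to see where in the sequence things were when the problem hits.)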