From: Borja Marcos <borjam@sarenet.es>
Date: Wed, 3 Aug 2011 13:22:23 +0200
To: freebsd-fs@freebsd.org
Message-Id: <19D8728E-6C92-4882-BDEB-8DDC4918B997@sarenet.es>
Subject: ZFS v28 issues with "zfs" command

Hello,

I've been doing tests with FreeBSD 8-STABLE, cvsupped yesterday.

First, I haven't been able to reproduce the deadlock I observed several times when receiving a snapshot on a dataset on which there was some reading activity. So that bug seems to be solved.

However, I've seen something worrisome.

I'm using a small, simple replication program. At given intervals (right now I'm using 20-second intervals) it sends an incremental snapshot to a secondary machine. The algorithm is this:

(time to replicate a new snapshot)

ssh destination zfs list -t snapshot...
zfs list -t snapshot
determine most recent snapshot in common
zfs snapshot pool/dataset@thenew (name format is pool/dataset@YYYYMMDDHHMMSS)
zfs send -i most_recent_snapshot_in_common new_snapshot > /var/tmp/temp_filename
scp /var/tmp/temp_filename destination:/var/tmp
ssh destination zfs receive -d pool < /var/tmp/temp_filename
ssh destination zfs destroy pool/most_recent_snapshot_in_common

The program works; it's pretty simple. (A rough sketch in sh is appended below.)

However, I've found a problem. While it was running, I ran "zfs list -t snapshot" several times on the destination machine. I can't recall whether it was during the zfs receive or the zfs destroy, but after that something went wrong: destroying a snapshot returned an error message, even though the snapshots were actually destroyed.

Inspecting the pool with zdb -d (I found that trick doing some Google searches) I noticed that I had developed a "hidden clone" problem. And I saw this snapshot which, apparently, came from nowhere:

rpool/tachan@newsrc-23608-1   1.33K   -   786M   -

It seems that there is some contention issue affecting the zfs command. In my case it was triggered by a "zfs list -t snapshot" command running during either a "zfs receive -d -F" or a "zfs destroy".

I'm wondering how to capture useful data regarding this...

Borja.
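
P.S. For concreteness, here is a minimal sh sketch of the replication loop described above. It is an illustration only, not the actual program: POOL, DATASET, DEST and the temporary file paths are placeholders, the "most recent snapshot in common" step is simplified to taking the last name present on both sides (which works because the snapshot names are timestamps), and it assumes at least one snapshot is already common to both machines.

#!/bin/sh
# Minimal replication sketch; POOL, DATASET and DEST are placeholders.
POOL=pool
DATASET=dataset
DEST=destination

FS="${POOL}/${DATASET}"
NEW="${FS}@$(date +%Y%m%d%H%M%S)"     # pool/dataset@YYYYMMDDHHMMSS

# List snapshots on both sides and pick the most recent one in common.
# Timestamp names sort chronologically, so "sort | tail -1" is enough.
LOCAL=/tmp/snaps.local.$$
REMOTE=/tmp/snaps.remote.$$
zfs list -H -t snapshot -o name | grep "^${FS}@" | sort > "$LOCAL"
ssh "$DEST" zfs list -H -t snapshot -o name | grep "^${FS}@" | sort > "$REMOTE"
COMMON=$(comm -12 "$LOCAL" "$REMOTE" | tail -1)
rm -f "$LOCAL" "$REMOTE"

# Take the new snapshot and ship the increment through a temporary file,
# exactly as in the scheme above.
zfs snapshot "$NEW"
TMP="/var/tmp/repl.$$"
zfs send -i "$COMMON" "$NEW" > "$TMP"
scp "$TMP" "${DEST}:${TMP}"
ssh "$DEST" "zfs receive -d ${POOL} < ${TMP}"

# Retire the now-obsolete snapshot on the destination, as in the original.
ssh "$DEST" zfs destroy "$COMMON"
ssh "$DEST" rm -f "$TMP"
rm -f "$TMP"

(The temporary file in /var/tmp mirrors the original scheme; piping zfs send straight into ssh would avoid the extra copy, but the file makes it easier to see where in the sequence things were when the problem hits.)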