Date:      Mon, 22 Sep 2008 13:12:50 -0400
From:      Scott Burns <scott@bqinternet.com>
To:        freebsd-current@freebsd.org
Subject:   Re: ZFS panic in zone_dataset_visible
Message-ID:  <48D7D212.7090908@bqinternet.com>
In-Reply-To: <48D4E974.2020008@bqinternet.com>
References:  <48D4E974.2020008@bqinternet.com>

Scott Burns wrote:
> Hello,
> 
> I am running several servers using Pawel's July 27 ZFS patchset, applied 
> against 8-current source from the same day.  I have seen a similar panic 
> on two different servers:
...
> Stopped at      _mtx_lock_flags+0x15:   lock cmpxchgq   %rsi,0x18(%rdi)
> db> bt
> Tracing pid 95276 tid 100432 td 0xffffff010b3cc000
> _mtx_lock_flags() at _mtx_lock_flags+0x15
> zone_dataset_visible() at zone_dataset_visible+0x94
> zfs_mount() at zfs_mount+0x3e5
...

With a bit of testing, I found that this panic is easily reproducible. 
Simply try to list the contents of a snapshot from within a jail, as 
long as the snapshot isn't already mounted, and the system panics.  If I 
mount the snapshot from outside of the jail first, and then list it 
inside the jail, it does not panic.

I spent a bit of time debugging this weekend.  Trying to list an 
unmounted snapshot triggers a zfs_mount() for the snapshot, which calls 
zone_dataset_visible() to determine if the snapshot should be visible in 
the current zone.  When it is run outside of a jail, it returns true 
early on because INGLOBALZONE(curproc) is true; inside a jail, it takes 
another code path.

The panic is happening after that check, at mtx_lock(&pr->pr_mtx), 
because (pr = curthread->td_ucred->cr_prison) is NULL.  Interestingly, 
it's not NULL if zone_dataset_visible() is triggered by a "zfs list" 
command, but it is NULL if zone_dataset_visible() is called from 
zfs_mount().
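
To make the failure mode concrete, here is a rough userland model of the 
control flow described above. It is not the kernel code: the names 
model_prison, model_cred and model_dataset_visible are made up for 
illustration, and returning -1 for a missing prison is only one possible 
way to flag the condition, not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct model_prison {
	int pr_id;
};

struct model_cred {
	/* NULL models the state observed when called via zfs_mount(). */
	struct model_prison *cr_prison;
};

/*
 * Returns 1 if the dataset is visible, 0 if it is not, and -1 if the
 * credential has no prison attached (the case that panics in the real
 * code, because the mutex is reached through that pointer).
 */
int
model_dataset_visible(const struct model_cred *cred, int in_global_zone)
{
	assert(cred != NULL);

	/* Early return, modelling the INGLOBALZONE(curproc) check. */
	if (in_global_zone)
		return (1);

	/* The real code locks through this pointer without checking it. */
	if (cred->cr_prison == NULL)
		return (-1);

	/* The real code would lock and search the jail's dataset list here. */
	return (0);
}
```

In this model, a jailed caller with a NULL cr_prison yields -1 instead 
of dereferencing the pointer, which is the point at which the real 
function faults.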

As a temporary workaround, I modified my copy of 
cddl/compat/opensolaris/kern/opensolaris_zone.c to have 
zone_dataset_visible() return true if it is being called for a snapshot 
(that is, if the dataset name contains '@').  I modified it as below:

-if (INGLOBALZONE(curproc))
+if (INGLOBALZONE(curproc) || strchr(dataset, '@'))

This is obviously not ideal, since it allows the manipulation of the 
snapshot from another jail if the caller knows that it exists.  Since I 
am the only one with root access to any of the jails, I am not concerned 
with that. "zfs list" continues to behave normally.
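
As a small illustration of what the modified condition does, the sketch 
below models it in userland. The helper names is_snapshot_name and 
workaround_short_circuits are hypothetical, as are the dataset names in 
the usage note; only the strchr(dataset, '@') test comes from the change 
above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: nonzero if the dataset name refers to a snapshot. */
int
is_snapshot_name(const char *dataset)
{
	assert(dataset != NULL);
	/* Snapshot names have the form "pool/fs@snapname". */
	return (strchr(dataset, '@') != NULL);
}

/*
 * Models the patched condition: the early "visible" return is taken
 * either in the global zone or for any snapshot name, regardless of
 * which jail the caller is in.
 */
int
workaround_short_circuits(int in_global_zone, const char *dataset)
{
	return (in_global_zone || is_snapshot_name(dataset));
}
```

For example, workaround_short_circuits(0, "tank/other@secret") is true 
even for a jailed caller, which is the exposure described above, while 
workaround_short_circuits(0, "tank/other") is false and falls through to 
the normal per-jail check.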

I will continue looking at this, but since my main goal of working 
around the panic has been taken care of, I am not sure how long my 
attention span will last.  If the cause of 
curthread->td_ucred->cr_prison being NULL under these conditions is 
obvious to anyone, please let me know.

--
Scott Burns
System Administrator
BQ Internet Corporation


