Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 27 Aug 2013 10:59:27 +0200
From:      Davide Italiano <davide@freebsd.org>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        src-committers@freebsd.org, Xin LI <d@delphij.net>, svn-src-all@freebsd.org, Xin LI <delphij@freebsd.org>, svn-src-head@freebsd.org, Konstantin Belousov <kib@freebsd.org>, Xin Li <delphij@delphij.net>
Subject:   Re: svn commit: r254585 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID:  <CACYV=-G0i3%2BU3ubWnPSGQpEvKsuJShYJZOMvKQnjBA1aLftJsA@mail.gmail.com>
In-Reply-To: <521C5CAC.2060400@FreeBSD.org>
References:  <201308202231.r7KMVERi068300@svn.freebsd.org> <20130825221517.GM24767@caravan.chchile.org> <521B75CE.70103@FreeBSD.org> <521BDEAC.9080909@delphij.net> <521C5CAC.2060400@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Aug 27, 2013 at 10:00 AM, Andriy Gapon <avg@freebsd.org> wrote:
> on 27/08/2013 02:03 Xin Li said the following:
>> On 08/26/13 08:35, Andriy Gapon wrote:
>>> on 26/08/2013 01:15 Jeremie Le Hen said the following:
>>>> Hi Xin,
>>>>
>>>> On Tue, Aug 20, 2013 at 10:31:14PM +0000, Xin LI wrote:
> [snip]
>>>>> @@ zfs_rename(vnode_t *sdvp, char *snm, vno if
>>>>> (VOP_REALVP(tdvp, &realvp, ct) == 0) tdvp = realvp;
>>>>>
>>>>> -  if (tdvp->v_vfsp != sdvp->v_vfsp || zfsctl_is_node(tdvp)) { +
>>>>> tdzp = VTOZ(tdvp);
>>
>>> The problem with this change, at least on FreeBSD, is that tdvp may
>>> not belong to ZFS.  In that case VTOZ(tdvp) does not make any sense
>>> and would produce garbage or trigger an assert.  Previously
>>> tdvp->v_vfsp != sdvp->v_vfsp check would catch that situations. Two
>>> side notes: - v_vfsp is actually v_mount on FreeBSD
>>
>> Ah that's good point.  Any objection in putting a change to their
>> _freebsd_ counterpart (zfs_freebsd_rename and zfs_freebsd_link) as
>> attached?
>
> I think that at least the change to zfs_freebsd_rename as it is now would break
> locking and reference counting semantics for the vnodes involved -- vreles and
> vputs have to be done even in the error case.
>
> Also, look at the check at the start of ufs_rename, it also considers tvp when
> it's not NULL.  I am not sure if it is possible to have a situation where fvp
> and tdvp are from the same fs, but tvp is from a different one.
>
> I've been also tempted to place the check in the common code in kern_renameat.
> That should cover the system calls.  But could be insufficient for direct calls
> to VOP_RENAME in other places.
>
> diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
> index a7cb87a..cfa4d93 100644
> --- a/sys/kern/vfs_syscalls.c
> +++ b/sys/kern/vfs_syscalls.c
> @@ -3608,6 +3608,14 @@ kern_renameat(struct thread *td, int oldfd, char *old,
> int newfd, char *new,
>                 error = EINVAL;
>                 goto out;
>         }
> +
> +       /* Check for cross-device rename. */
> +       if ((fvp->v_mount != tdvp->v_mount) ||
> +           (tvp && (fvp->v_mount != tvp->v_mount))) {
> +               error = EXDEV;
> +               goto out;
> +       }
> +
>         /*
>          * If the source is the same as the destination (that is, if they
>          * are links to the same vnode), then there is nothing to do.
>
>>> - VOP_REALVP is a glorified nop on FreeBSD
>>
>> It's not clear to me what was the intention for having a macro instead
>> of just ifdef'ing the code out -- maybe to prevent unwanted conflict
>> with upstream?  These two VOP's are the only consumers of VOP_REALVP
>> in tree.
>
> Yes.  Personally I would just ifdef out the calls.
>
>>> Another unrelated problem that existed before this change and has
>>> been noted by Davide Italiano is that sdvp is not locked and so it
>>> can potentially be recycled before ZFS_ENTER() enter and for that
>>> reason sdzp and zfsvfs could be invalid. Because sdvp is
>>> referenced, this problem can currently occur only if a forced
>>> unmount runs concurrently to zfs_rename. tdvp should be locked and
>>> thus could be used instead of sdvp as a source for zfsvfs.
>>
>> I think this would need more work, I'll take a look after we have this
>> regression fixed.
>
> Thank you.
>
> --
> Andriy Gapon

I've written a patch that attempts to fix the second problem.
http://people.freebsd.org/~davide/review/zfsrename_recycle.diff
The culprit is that we're dereferencing sdvp without the vnode lock
held, so data-stability is not guaranteed.
tdvp, instead, is locked at the entry point of VOP_RENAME() (at least
according to vop_rename_pre()) so it can be safely used.
As pointed out by Andriy ZFS has some different mechanisms wrt other
file systems that allow the vnode to not be recycled when it's
referenced (ZFS_ENTER), so once we get zfsvfs_t * from tdzp we can
call ZFS_ENTER and dereference sdvp in a safe manner.

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACYV=-G0i3%2BU3ubWnPSGQpEvKsuJShYJZOMvKQnjBA1aLftJsA>