Date:      Fri, 6 Sep 2019 15:43:36 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Florian Schulze <mail@florian-schulze.net>
Cc:        freebsd-fs@freebsd.org, mgj@freebsd.org
Subject:   Re: held  file reference issue with ZFS and nullfs
Message-ID:  <20190906124336.GE2559@kib.kiev.ua>
In-Reply-To: <45B080A9-DE0F-4633-91F8-71438408D4B8@florian-schulze.net>
References:  <45B080A9-DE0F-4633-91F8-71438408D4B8@florian-schulze.net>

On Fri, Sep 06, 2019 at 12:44:46PM +0200, Florian Schulze wrote:
> Hi!
> 
> Since FreeBSD 12 (updated from 10.3, I skipped 11.x completely, the box 
> started around 9.3) I have the issue that ZFS is not freeing up space 
> for some deleted files. The filesystems where this happens are mounted 
> into multiple jails via nullfs. Only one jail has write access, the 
> others are read only. When files are deleted the space for them is not 
> freed. I can still see their objects via zdb. When I unmount one of the 
> read only nullfs mounts the space is freed and the objects released.
> 
> I already used lsof, procstat and fstat to see if any process still has 
> a reference to the file, but that is not the case. But it seems to 
> matter which nullfs mount is unmounted, it is always one of the read 
> only ones. The processes which access the read only mounts are 
> completely different, it only seems to matter that the files are opened 
> at all. Killing the processes doesn't help, only unmounting the nullfs.
There were some bugs in the past where nullfs referenced a lower vnode but
did not dereference it.
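
As a rough illustration of that class of bug, here is a toy model in
Python. It is not the nullfs or ZFS code: the classes LowerVnode and
NullMount, their methods, and the file name are all made up for this
sketch. It only assumes that space for a deleted file is reclaimed once
the last reference to its vnode is dropped.

```python
# Toy model only: these classes are hypothetical and do not mirror the
# real FreeBSD vnode structures or the nullfs implementation.

class LowerVnode:
    """Stands in for a vnode of the lower (e.g. ZFS) filesystem."""

    def __init__(self, name):
        self.name = name
        self.usecount = 0      # number of active references
        self.unlinked = False  # True once the file has been deleted
        self.freed = False     # True once its space has been reclaimed

    def ref(self):
        self.usecount += 1

    def rele(self):
        self.usecount -= 1
        self._maybe_free()

    def unlink(self):
        self.unlinked = True
        self._maybe_free()

    def _maybe_free(self):
        # Space is reclaimed only when the file is deleted AND unreferenced.
        if self.unlinked and self.usecount == 0:
            self.freed = True


class NullMount:
    """Stands in for a nullfs mount that keeps references to lower vnodes."""

    def __init__(self):
        self.held = []

    def lookup(self, lower):
        # The upper layer takes a reference on the lower vnode.
        lower.ref()
        self.held.append(lower)

    def unmount(self):
        # Unmounting drops every reference this mount still holds.
        for lower in self.held:
            lower.rele()
        self.held.clear()


def demo():
    f = LowerVnode("big.file")
    ro_mount = NullMount()
    ro_mount.lookup(f)   # a reader on a read-only mount touches the file
    f.unlink()           # the file is deleted through the writable mount
    before = f.freed     # still referenced, so the space is not reclaimed
    ro_mount.unmount()   # dropping the mount releases the reference
    return before, f.freed
```

In this model the file stays allocated after the delete and is only
freed by the unmount, which matches the symptom described above.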

> 
> Today I noticed an odd message when I used zfs diff: "Unable to 
> determine path or stats for object 6 in 
> ...@zfs-diff-15651-00000001d86eb8cb: Stale NFS file handle". I don't 
> have NFS enabled anywhere (just checked the properties) and it never was 
> enabled!
This is probably unrelated.

> 
> The zdb output for object 6:
> 
> Dataset ... [ZPL], ID 18405, cr_txg 36040566, 10.2G, 43 objects
> 
>      Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
>           6    1    16K    16K    16K     512    32K  100.00  SA attr layouts
> 
> 
> ZFS_DBGMSG(zdb):
> spa_open_common: opening ...
> spa_load(tank2, config trusted): LOADING
> disk vdev '/dev/diskid/DISK-WD-...': best uberblock found for spa tank2. txg 40503599
> spa_load(tank2, config untrusted): using uberblock with txg=40503599
> spa_load(tank2, config trusted): spa_load_verify found 0 metadata errors and 2 data errors
> spa_load(tank2, config trusted): LOADED
> 
> 
> In the zfs diff was also a line "-   ...(on_delete_queue)".
> 
> I have one zfs filesystem where this happens quite often, one where it 
> happens sometimes and a few others which have a similar setup and where 
> I never noticed it (though the average file size on them is smaller).
> 
> I asked in #freebsd about this and koobs said I should write to this 
> list and CC kib@freebsd.org and mgj@freebsd.org
> He also did a quick look at the nullfs changes between 10.3 and 12.0 and 
> spotted the following change, which he said I should mention as well:
> https://github.com/freebsd/freebsd/commit/82f9c275c43da09f404546cceeff187a90ecc573#diff-81e7d6520611101890dd6425324dd8f8
> 
> Is there a known bug there? Could the stale NFS handle cause the leak? 
> Where is that NFS handle coming from?

So what is the exact version of your system?  If 12.0, upgrade the kernel
to the latest stable/12 and see if it helps with the leak.


