Date:      Thu, 9 May 2019 04:13:13 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Conrad Meyer <cem@freebsd.org>
Cc:        Rick Macklem <rmacklem@uoguelph.ca>,  "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: test hash functions for fsid
Message-ID:  <20190509033852.W1574@besplex.bde.org>
In-Reply-To: <CAG6CVpUER0ODF5mfdGFA-soSpGbh=qkjc4rKiJEdaOUaUaUmAw@mail.gmail.com>
References:  <YQBPR0101MB2260D82BAE348FB82902508CDD320@YQBPR0101MB2260.CANPRD01.PROD.OUTLOOK.COM> <CAG6CVpUER0ODF5mfdGFA-soSpGbh=qkjc4rKiJEdaOUaUaUmAw@mail.gmail.com>

On Wed, 8 May 2019, Conrad Meyer wrote:

Another reply.

> (A) FSIDs themselves have poor initial distribution and *very* few
> unique bits (i.e., for filesystems that use vfs_getnewfsid, the int32
> fsid val[1] will be identical across all filesystems of the same type,
> such as NFS mounts).  The remaining val[0] is unique due to an
> incrementing global counter.  You could pretty easily extract the
> vfs_getnewfsid() code out to userspace to simulate creating a bunch of
> fsid_t's and run some tests on that list.

It is a bug to actually use val[1] or 64-bit dev_t.  Using it is either
unnecessary because val[0] is unique, or breaks compat syscalls.  See
my previous reply.  (I have a patch to make the compat syscalls fail
if they would truncate the dev_t, but it is too strict to be the default
and I forget if I committed it).

> It isn't a real world
> distribution but it would probably be pretty close, outside of ZFS.
> ZFS FSIDs have 8 bits of shared vfs type, but the remaining 56 bits
> appears to be generated by arc4rand(9), so ZFS FSIDs will have good
> distribution even without hashing.

Bug in zfs.

Even stat(1) doesn't understand these 64-bit numbers.  It misformats
them in decimal:

$ stat ~
16921688315829575803 2179998 drwxr-xr-x 62 bde bde 18446744073709551615 85 "Nov 25 21:21:07 2016" "May  8 16:33:55 2019" "May  8 16:33:55 2019" "Sep 28 00:06:41 2015" 5632 24 0x800 /home/bde

The first large number is st_dev and the second large number is st_rdev.
The decimal formatting of these is hard enough to read when they are
32 bits.  The number starting with 1844 looks like it is near UINT64_MAX, and is
in fact exactly that, so it is far from unique.  It is apparently just
NODEV = (dev_t)(-1).  This is correct except for the formatting.  In ffs,
by contrast, st_rdev is garbage except for actual devices: ufs_getattr()
sets va_rdev to di_rdev even for non-devices, and for non-devices
di_rdev is an alias for the first block number.
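
A minimal sketch of a more readable formatting for such values, assuming
a 64-bit dev_t.  The names dev64_t, DEV64_NODEV and fmt_dev64() are
hypothetical and are not taken from stat(1) or the kernel; the sketch
only shows that printing in hex and special-casing the all-ones value
makes NODEV recognizable at a glance:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: device numbers are 64 bits wide. */
typedef uint64_t dev64_t;

/* Hypothetical stand-in for NODEV = (dev_t)(-1), i.e. UINT64_MAX. */
#define DEV64_NODEV	((dev64_t)-1)

/*
 * Format a device number into buf: "NODEV" for the all-ones value,
 * otherwise 16 zero-padded hex digits.  Returns buf for convenience.
 */
static const char *
fmt_dev64(dev64_t dev, char *buf, size_t len)
{
	if (dev == DEV64_NODEV)
		snprintf(buf, len, "NODEV");
	else
		snprintf(buf, len, "0x%016" PRIx64, dev);
	return (buf);
}
```

With a 32-byte buffer, fmt_dev64(st.st_rdev, buf, sizeof(buf)) would
yield "NODEV" for the st_rdev value shown above instead of a 20-digit
decimal number.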

I did commit my checks for truncation of dev_t's (r335035 and r335053).
Fixing makedev() put the synthetic major number 255 in vfs_getnewfsid()
back in the low 32 bits, so the checks usually pass for nfs.  However,
for zfs, they would often fail for the large number in st_dev, and always
fail for regular files for the large number NODEV = UINT64_MAX in st_rdev.
They should fail for NODEV for regular files on all file systems.
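
A minimal sketch of the kind of truncation check described above.  It is
an illustration only and is not the code committed in r335035 or
r335053; the helper name dev64_to_dev32() and the choice of EOVERFLOW
as the error are assumptions:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow a 64-bit device number to 32 bits for a
 * compat interface.  Returns 0 and stores the narrowed value on success,
 * or EOVERFLOW (an assumed error choice) if any bits would be lost.
 */
static int
dev64_to_dev32(uint64_t dev, uint32_t *out)
{
	/* The value survives only if a round trip through 32 bits is exact. */
	if (dev != (uint64_t)(uint32_t)dev)
		return (EOVERFLOW);
	*out = (uint32_t)dev;
	return (0);
}
```

Under this check, a value that fits in the low 32 bits passes, while
both the zfs st_dev shown earlier (16921688315829575803) and
NODEV = UINT64_MAX are rejected.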

Bruce


