Date: Thu, 9 May 2019 11:11:00 -0700 From: Conrad Meyer <cem@freebsd.org> To: Rick Macklem <rmacklem@uoguelph.ca> Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org> Subject: Re: test hash functions for fsid Message-ID: <CAG6CVpWpduHKVU9Y_W01sZUcL6zT4uP5iNnF-4Lr3Bv_bt=jUg@mail.gmail.com> In-Reply-To: <YQBPR0101MB226068818D8C0D9591DD3D94DD330@YQBPR0101MB2260.CANPRD01.PROD.OUTLOOK.COM> References: <YQBPR0101MB2260D82BAE348FB82902508CDD320@YQBPR0101MB2260.CANPRD01.PROD.OUTLOOK.COM> <CAG6CVpUER0ODF5mfdGFA-soSpGbh=qkjc4rKiJEdaOUaUaUmAw@mail.gmail.com> <YQBPR0101MB226068818D8C0D9591DD3D94DD330@YQBPR0101MB2260.CANPRD01.PROD.OUTLOOK.COM>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Rick, On Wed, May 8, 2019 at 5:41 PM Rick Macklem <rmacklem@uoguelph.ca> wrote: [liberal snipping throughout :-)] > > I'll admit I had never heard of PCTRIE, but <sys/pctrie.h> seems to be > #ifdef _KERNEL, so it can't be used in userland? Ah, you're completely correct. I had considered using PCTRIE for a userspace application earlier, and had started converting it to be buildable in userspace, but I never finished. (I determined it wouldn't be a good fit for that application, unfortunately.) I do think it might be a good datastructure to expose to base userspace programs eventually, but you probably don't really want to tackle that extra work :-). A hash table is totally fine. > Yes. I just chose 256 for this test program. Unfortunately, the way mount= d.c is > currently coded, the hash table must be sized before the getmntinfo() cal= l, > so it must be sized before it knows how many file systems there are. > Since this is userland and table size isn't a memory issue, I'm just temp= ted to > make it large enough for a large server with something like 25,000 file s= ystems. No objection, so long as people can still run tiny NFS servers on their Raspberry Pis. :-) > (I'd guess somewhere in the 256->1024 range would work. I'm not sure what > you mean by load factor below 0.8?) Load factor is just number of valid entries divided by space in the table. So if your table has 256 spaces, load factor 0.8 would be having about 205 valid entries. > Fortunately, neither ZFS nor UFS uses vfs_getnewfsid() unless there is a = collision > and I don't really care about other file system types. Ah, sorry =E2=80=94 I didn't read carefully enough when looking for fsid in= itialization. > After all, it just does > a linear search down the list of N file systems right and just about any= thing > should be an improvement.) Yes :-). > I added a simple (take the low order bits of val[0]) case to the test. I = actually > suspect any of the hash functions will work well enough, since, as you no= te, most of the values (val[0] and 24bits of val[1]) are from a random # ge= nerator which should > be pretty uniform in distribution for ZFS. > UFS now uses a value from the superblock. It appears that newfs sets val[= 0] to the > creation time of the file system and val[1] to a random value. Hm, it looks like makefs sets val[1] (fs_id[1]) to a non-random value generated by the predictable PRNG random(3), for reproducible build reasons. makefs seeds srandom(3) with either the current time in seconds (low entropy) or some known timestamp (either from a file, also in seconds, or an integer) (no entropy). I guess random(3) may provide better distribution for the table than a plain global counter, though. > If it had been the > reverse, I would be tempted to only use val[0], but I'll see how well the= test goes > for Peter. Seems like you were right =E2=80=94 any old function is good enough :-). Take care, Conrad
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAG6CVpWpduHKVU9Y_W01sZUcL6zT4uP5iNnF-4Lr3Bv_bt=jUg>