Date: Thu, 10 Sep 2015 13:44:18 -0400
From: "Chad J. Milios" <milios@ccsys.com>
To: Erich Dollansky <erichsfreebsdlist@alogt.com>
Cc: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: mdconfig creating file based memory disk
Message-ID: <A646B4C4-04DD-4840-A478-2EB28B0951F9@ccsys.com>
In-Reply-To: <20150910111034.20b97c41@X220.alogt.com>
References: <20150910111034.20b97c41@X220.alogt.com>

> On Sep 9, 2015, at 11:10 PM, Erich Dollansky <erichsfreebsdlist@alogt.com> wrote:
>
> Hi,
>
> I just came across a simple question. What will happen when I create
> two memory disks using the same file?
>
> Example:
>
> mdconfig -f /usr/home/swap/swapfile -u 0
> mdconfig -f /usr/home/swap/swapfile -u 1
>
> and then I do a
>
> swapon /dev/md0
> swapon /dev/md1
>
> It gives me double the size of 'swapfile' as swap space. It is obvious
> to me that this must fail.
>
> Shouldn't there be a note in the documentation?
>
> Erich

Perhaps, but if we documented every way in which FreeBSD allows one to shoot oneself in the foot, the docs would probably more than triple in size. :)

This is an interesting experiment but I can't imagine anyone inviting the danger while actually expecting to get away with such a configuration and I don't imagine happening onto it by accident any more likely than the other infinite potentially dangerous misconfigurations of *nix. I doubt this merits a mention for safety's sake, though as an illustration of how swap actually works internally it has a lot of merit. I'd be curious to see more thorough test results and discussion from those with intimate knowledge of the virtual memory and swapper/pager systems.

Imagine the following analog: a hypothetical database software which mmap()s a file possibly larger than physical memory to rely on the VM system for demand paging. Now imagine two or more instances of the database software being started with hard links to the same underlying file and both/all are allowed to read and write.
If the software is SMP-capable and uses locks or data structures WITHIN the mapped region to handle synchronization (and doesn't go out of its way to in-and-of-itself cache/process the data (beyond the help the kernel already provides) outside that region for moments during which the data could become stale) then the multiple instances could all serve data from, AND modify data in, that same single source of truth and will remain stable and in-sync even without msync()ing to the underlying file or storage. I'm also positive this holds true through any (or an arbitrary and very large) number/combination of indirections through hardlinks, symlinks, mdconfig, nullfs and/or unionfs (or it intends to, so any failure or race should be considered a kernel bug).

So without inspecting the relevant kernel source myself, based on the little experiment you've conducted, I can imagine the swap perhaps having been set up in a way that the data structure(s) that map swapped regions is either fully inside or fully outside the swap partition/file in a way in which any "surprise" data showing up in the "other" swap device (besides the one it was written to) ends up being non-problematic. I am just brainstorming here and would love it if someone with knowledge rather than conjecture chimes in. :)

At the outset of the experiment you describe, my expectation was almost certain spectacular failure. Anything else actually is quite curious and if such a config doesn't just burst right into flames I consider it quite a testament to sound *nix engineering. I'd be interested to hear someone exercise it with more swapping out and paging in of data and verifying the data and semantics.
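
For anyone who wants to poke at the mmap() analogy above, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the file names are hypothetical, both mappings live in a single process rather than in separate database instances, and it relies on Python's mmap.mmap() defaulting to MAP_SHARED on Unix. It maps the same file through two hard-linked names, writes through one mapping, and reads through the other without any flush or msync().

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def shared_mapping_demo(directory):
    """Map one file through two hard-linked names and return what the
    second mapping sees after a write through the first mapping."""
    original = os.path.join(directory, "data.db")       # hypothetical name
    alias = os.path.join(directory, "data-alias.db")    # hypothetical hard link

    # Create a one-page backing file, then give the same inode a second name.
    with open(original, "wb") as f:
        f.write(b"\0" * PAGE)
    os.link(original, alias)

    with open(original, "r+b") as fa, open(alias, "r+b") as fb:
        # On Unix, mmap.mmap() uses MAP_SHARED unless told otherwise,
        # so both mappings refer to the same file-backed pages.
        map_a = mmap.mmap(fa.fileno(), PAGE)
        map_b = mmap.mmap(fb.fileno(), PAGE)
        try:
            map_a[0:5] = b"hello"      # write via the first name
            seen = bytes(map_b[0:5])   # read via the second name, no flush
        finally:
            map_a.close()
            map_b.close()
    return seen


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(shared_mapping_demo(d))
```

On a system with a unified page cache, the second mapping should see the bytes written through the first. This says nothing about how the swap pager behaves with two md devices on one file; it only illustrates the shared-mapping half of the analogy.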