Date: Tue, 11 Apr 2017 17:30:03 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Flavius Anton <f.v.anton@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: On COW memory mapping in d_mmap_single
Message-ID: <20170411143003.GT1788@kib.kiev.ua>
In-Reply-To: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
References: <CANXdjjYajtvWK%2Bq3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com> <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
On Tue, Apr 11, 2017 at 04:55:00PM +0300, Flavius Anton wrote:
>
> >On Tue, Apr 11, 2017 at 04:00:21PM +0300, Konstantin Belousov wrote:
> >>On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
> >> Hi everyone,
> >>
> >> I'll start by giving some context, so you can better understand the
> >> problem I'm trying to solve. I've been working for a while on
> >> bhyve trying to implement save/restore [1]. We've currently managed to
> >> get it working for VMs using a ramdisk and no devices, so just vCPU
> >> and memory states are saved and restored so far.
> >>
> >> Last week I started looking into network devices, specifically
> >> virtio-net devices. The problem was that when I issue a checkpoint
> >> operation, the guest virtio driver stops working. After digging for a
> >> while, I figured out the problem is marking VM memory as COW. If I
> >> don't do this, the driver continues with no problem after
> >> checkpointing.
> >>
> >> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
> >> the user space does a mmap on the /dev device, we would like to mark
> >> VM memory as COW, so the VM can continue touching pages while the
> >> user space is writing the 'frozen', COW-marked memory to persistent
> >> storage. We do this by iterating through all vm_entries in the VM's
> >> vmspace, finding the entry that maps the object holding VM memory,
> >> and then roughly just setting MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY
> >> on that entry. You can see the code here [2].
> >
> >This is a very strange operation, to put it mildly. First, do other vCPUs
> >operate while you do your 'COW'? If yes, you are guaranteed to get an
> >inconsistent snapshot. If not, then you do not need 'COW'.
>
> Yes, all vCPUs are locked before calling mmap(). I agree that we don't
> need 'COW', as long as we keep all vCPUs locked while we copy the
> entire VM memory. But this might take a while: imagine a VM with 32GB
> or more of RAM. This will take maybe minutes to write to disk, so we
> don't actually want the VM to be frozen for so long. That's the
> reason we'd like to map the memory COW and then unlock vCPUs.
>
> >Moreover, what kinds of VM objects are mapped into the vmspace? FreeBSD VM
> >does not support shadowing of device objects (which means, inserting
> >shadow objects into the device object chain breaks VM invariants). One
> >of the main reasons why it did not need to be supported is that a shadow
> >copy cannot see changes which are performed on the shadowed pages,
> >supposedly done by the device. If vmm mmaps some devices into guest vmspace,
> >the devices would kind of 'freeze' from the guest PoV.
>
> It's an OBJT_DEFAULT. It's not a device object, it's the memory object
> given to guest to use as physical memory.

Perhaps add asserts that you only shadow default/swap/vnode objects.
Then you will see if the issue is what I noted above, or not.

>
> >Next, how do you undo the damage done by your 'COW'?
>
> This is one thing that we've thought about, but we don't have a
> solution for now. I agree it is very important, though. I figured that
> it might be possible to 'unmark' the memory object as COW with some
> additional tricks.

You might consider using vm_object_collapse().
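
For readers following the thread, the copy-on-write snapshot idea discussed above can be illustrated with a small userland analogy. This is only a sketch and not the bhyve/vmm mechanism: it relies on fork(2), which shares anonymous memory copy-on-write between parent and child. The parent plays the running guest and the child plays the checkpoint writer. The names cow_snapshot_demo and GUEST_MEM_SIZE are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define GUEST_MEM_SIZE (16 * 4096)	/* hypothetical stand-in for guest RAM */

/*
 * Fill a buffer with `before`, fork, let the parent overwrite byte 0 with
 * `after`, and return the value the child still sees at byte 0.
 * Returns -1 on error.
 */
int
cow_snapshot_demo(unsigned char before, unsigned char after)
{
	int go[2], result[2], ok, status;
	unsigned char seen = 0, token = 0;
	unsigned char *mem;
	pid_t pid;

	mem = malloc(GUEST_MEM_SIZE);
	if (mem == NULL)
		return (-1);
	memset(mem, before, GUEST_MEM_SIZE);

	if (pipe(go) == -1) {
		free(mem);
		return (-1);
	}
	if (pipe(result) == -1) {
		close(go[0]);
		close(go[1]);
		free(mem);
		return (-1);
	}

	pid = fork();	/* from here on the pages are shared copy-on-write */
	if (pid == -1) {
		close(go[0]);
		close(go[1]);
		close(result[0]);
		close(result[1]);
		free(mem);
		return (-1);
	}
	if (pid == 0) {
		/* Child: the "snapshot writer". Wait for the parent's write. */
		close(go[1]);
		close(result[0]);
		if (read(go[0], &token, 1) != 1)
			_exit(1);
		/* A real checkpointer would write the whole buffer to disk. */
		if (write(result[1], &mem[0], 1) != 1)
			_exit(1);
		_exit(0);
	}

	/* Parent: the "running guest". This write gets a private page copy. */
	close(go[0]);
	close(result[1]);
	mem[0] = after;
	assert(mem[0] == after);	/* the parent sees its own write */

	ok = (write(go[1], &token, 1) == 1 && read(result[0], &seen, 1) == 1);
	close(go[1]);
	close(result[0]);
	waitpid(pid, &status, 0);
	free(mem);
	return (ok ? seen : -1);
}
```

Calling cow_snapshot_demo(0xAA, 0x55) should return 0xAA: the child keeps the contents as they were at fork time even though the parent has since written 0x55. The same property is why shadowing is a poor fit for device-backed memory, as noted above: the shadowed side never observes later changes to the original pages.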