Date:      Tue, 11 Apr 2017 17:30:03 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Flavius Anton <f.v.anton@gmail.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: On COW memory mapping in d_mmap_single
Message-ID:  <20170411143003.GT1788@kib.kiev.ua>
In-Reply-To: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
References:  <CANXdjjYajtvWK%2Bq3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com> <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>

On Tue, Apr 11, 2017 at 04:55:00PM +0300, Flavius Anton wrote:
> >On Tue, Apr 11, 2017 at 04:00:21PM +0300, Konstantin Belousov wrote:
> >>On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
> >> Hi everyone,
> >>
> >> I'll start by giving some context, so you can better understand the
> >> problem I'm trying to solve. I've been working for a while on
> >> bhyve trying to implement save/restore [1]. We've currently managed to
> >> get it working for VMs using a ramdisk and no devices, so just vCPU
> >> and memory states are saved and restored so far.
> >>
> >> Last week I started looking into network devices, specifically
> >> virtio-net devices. The problem was that when I issue a checkpoint
> >> operation, the guest virtio driver stops working. After digging for a
> >> while, I figured out the problem is marking VM memory as COW. If I
> >> don't do this, the driver continues with no problem after
> >> checkpointing.
> >>
> >> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
> >> user space calls mmap() on the /dev device, we would like to mark
> >> VM memory as COW, so that the VM can continue touching pages while
> >> user space writes the frozen, COW-marked memory to persistent
> >> storage. We do this by iterating through all vm_entries in the VM's
> >> vmspace to find the entry that maps the object holding VM memory,
> >> and then we roughly just set MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
> >> that entry. You can see the code here [2].
> >
> >This is a very strange operation, to put it mildly.  First, do other vCPUs
> >operate while you do your 'COW'?  If yes, you are guaranteed to get an
> >inconsistent snapshot.  If not, then you do not need 'COW'.
> 
> Yes, all vCPUs are locked before calling mmap(). I agree that we don't
> need 'COW', as long as we keep all vCPUs locked while we copy the
> entire VM memory. But this might take a while: imagine a VM with 32GB
> or more of RAM. Writing that to disk may take minutes, so we
> don't actually want the VM to be frozen for so long. That's the
> reason we'd like to map the memory COW and then unlock vCPUs.
> 
> >Moreover, what kinds of VM objects are mapped into the vmspace? FreeBSD VM
> >does not support shadowing of device objects (which means, inserting
> >shadow objects into the device object chain breaks VM invariants). One
> >of the main reasons why it did not need to be supported is that a shadow
> >copy cannot see changes which are performed on the shadowed pages,
> >supposedly by the device. If vmm mmaps some devices into guest vmspace,
> >the devices would kind of 'freeze' from the guest PoV.
> 
> It's an OBJT_DEFAULT object. It's not a device object, it's the memory
> object given to the guest to use as physical memory.
Perhaps add asserts that you only shadow default/swap/vnode objects.
Then you will see if the issue is what I noted above, or not.
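
A minimal userland sketch of the kind of check meant here. The enum and
struct are stand-ins that only borrow the kernel's OBJT_* names, and
object_may_be_shadowed() is a hypothetical helper, not an existing
kernel function. In the kernel itself the same condition would more
likely be written as a KASSERT() on the object's type at the point
where the entry is marked.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for the kernel's object type enumeration.  The names mirror
 * FreeBSD's OBJT_* constants, but this is a userland model and the
 * numeric values are not meant to match the kernel's.
 */
enum objtype_model {
	OBJT_DEFAULT,
	OBJT_SWAP,
	OBJT_VNODE,
	OBJT_DEVICE,
	OBJT_PHYS
};

/* Hypothetical, heavily reduced model of a VM object. */
struct vm_object_model {
	enum objtype_model type;
};

/*
 * Return true only for the object types that may safely get a shadow
 * object: default, swap and vnode.  Everything else (device objects in
 * particular) is rejected.
 */
static bool
object_may_be_shadowed(const struct vm_object_model *obj)
{
	switch (obj->type) {
	case OBJT_DEFAULT:
	case OBJT_SWAP:
	case OBJT_VNODE:
		return (true);
	default:
		return (false);
	}
}

/*
 * Model of the marking step: refuse (by assertion) to mark an entry
 * COW when its backing object is of a type that must not be shadowed.
 */
static void
mark_entry_cow_checked(const struct vm_object_model *obj)
{
	assert(object_may_be_shadowed(obj));
	/* The real code would set the COW flags on the map entry here. */
}
```

If the assertion fires for some entry, the object being shadowed is not
the plain guest memory object, which would match the concern about
device objects raised above.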

> 
> >Next, how do you undo the damage done by your 'COW' ?
> 
> This is one thing that we've thought about, but we don't have a
> solution for now. I agree it is very important, though. I figured that
> it might be possible to 'unmark' the memory object as COW with some
> additional tricks.
You might consider using vm_object_collapse().
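
To illustrate the general idea only: a toy userland model of folding a
shadow object back into its backing object once nobody else references
the backing object. All names here (toy_object, toy_lookup,
toy_collapse) are hypothetical, and this is not the algorithm
vm_object_collapse() implements, which deals with locking, paging in
progress, swap and other cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4

/* Hypothetical toy object: a few "pages" plus a backing-object link. */
struct toy_object {
	bool present[TOY_NPAGES];	/* page resident in this object? */
	int data[TOY_NPAGES];		/* page contents, one int per page */
	int ref_count;			/* number of references to this object */
	struct toy_object *backing;	/* object shadowed by this one */
};

/* Look a page up by walking the shadow chain from the top object. */
static int
toy_lookup(const struct toy_object *obj, size_t idx)
{
	for (; obj != NULL; obj = obj->backing)
		if (obj->present[idx])
			return (obj->data[idx]);
	return (-1);	/* not resident anywhere in the chain */
}

/*
 * If the shadow holds the only reference to its backing object, move
 * every page the shadow does not already have up into the shadow and
 * unlink the backing object.  Returns true when a collapse happened.
 */
static bool
toy_collapse(struct toy_object *shadow)
{
	struct toy_object *b = shadow->backing;

	if (b == NULL || b->ref_count != 1)
		return (false);
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (!shadow->present[i] && b->present[i]) {
			shadow->data[i] = b->data[i];
			shadow->present[i] = true;
		}
		b->present[i] = false;
	}
	shadow->backing = b->backing;
	b->ref_count = 0;
	return (true);
}
```

In this model every lookup through the shadow returns the same value
before and after the collapse; only the extra level in the chain goes
away.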


