From owner-freebsd-amd64@freebsd.org Tue Jun 26 14:05:55 2018
In-Reply-To: <20180619115030.5491af33@ernst.home>
References: <20180613103535.GP2493@kib.kiev.ua> <20180619115030.5491af33@ernst.home>
From: Eitan Adler
Date: Tue, 26 Jun 2018 07:05:22 -0700
Subject: Re: Ryzen public erratas
To: Gary Jennejohn
Cc: Konstantin Belousov, amd64@freebsd.org, "current@freebsd.org"
X-BeenThere: freebsd-amd64@freebsd.org
List-Id: Porting FreeBSD to the AMD64 platform

On 19 June 2018 at 02:50, Gary Jennejohn wrote:
> On Mon, 18 Jun 2018 22:44:13 -0700
> Eitan Adler wrote:
>
>> On 13 June 2018 at 04:16, Eitan Adler wrote:
>> > On 13 June 2018 at 03:35, Konstantin Belousov wrote:
>> >> Today I noted that AMD published the public errata document for Ryzens,
>> >> https://developer.amd.com/wp-content/resources/55449_1.12.pdf
>> >>
>> >> Some of the issues listed there looks quite relevant to the potential
>> >> hangs that some people still experience with the machines. I wrote
>> >> a script which should apply the recommended workarounds to the erratas
>> >> that I find interesting.
>> >>
>> >> To run it, kldload cpuctl, then apply the latest firmware update to your
>> >> CPU, then run the following shell script. Comments indicate the errata
>> >> number for the workarounds.
>> >>
>> >> Please report the results. If the script helps, I will code the kernel
>> >> change to apply the workarounds.
>> >>
>> >> #!/bin/sh
>> >>
>> >> # Enable workarounds for erratas listed in
>> >> # https://developer.amd.com/wp-content/resources/55449_1.12.pdf
>> >>
>> >> # 1057, 1109
>> >> sysctl machdep.idle_mwait=0
>> >> sysctl machdep.idle=hlt
>> >
>> >
>> > Is this needed if it was previously machdep.idle: acpi ?
>>
>> This might explain why I've never seen the lockup issues mentioned by
>> other people. What would cause my machine to differ from others?
>>
>
> I had sysctl machdep.idle_mwait=1 and machdep.idle=acpi before
> applying the shell script. I had multiple lockups every week,
> sometimes multiple lockups per day.

This makes me curious about why I didn't experience lockups. Perhaps my
BIOS defaulted to something else?
With these settings:

machdep.idle: acpi
machdep.idle_mwait: 1

-- 
Eitan Adler

From owner-freebsd-amd64@freebsd.org Tue Jun 26 16:31:28 2018
Date: Tue, 26 Jun 2018 18:31:24 +0200
From: Gary Jennejohn
To: Eitan Adler
Cc: Konstantin Belousov, amd64@freebsd.org, "current@freebsd.org"
Subject: Re: Ryzen public erratas
Message-ID: <20180626183124.5d2173ec@ernst.home>
References: <20180613103535.GP2493@kib.kiev.ua> <20180619115030.5491af33@ernst.home>
Reply-To: gljennjohn@gmail.com

On Tue, 26 Jun 2018 07:05:22 -0700
Eitan Adler wrote:

> On 19 June 2018 at 02:50, Gary Jennejohn wrote:
> > On Mon, 18 Jun 2018 22:44:13 -0700
> > Eitan Adler wrote:
> >
> >> On 13 June 2018 at 04:16, Eitan Adler wrote:
> >> > On 13 June 2018 at 03:35, Konstantin Belousov wrote:
> >> >> Today I noted that AMD published the public errata document for Ryzens,
> >> >> https://developer.amd.com/wp-content/resources/55449_1.12.pdf
> >> >>
> >> >> Some of the issues listed there looks quite relevant to the potential
> >> >> hangs that some people still experience with the machines. I wrote
> >> >> a script which should apply the recommended workarounds to the erratas
> >> >> that I find interesting.
> >> >>
> >> >> To run it, kldload cpuctl, then apply the latest firmware update to your
> >> >> CPU, then run the following shell script. Comments indicate the errata
> >> >> number for the workarounds.
> >> >>
> >> >> Please report the results. If the script helps, I will code the kernel
> >> >> change to apply the workarounds.
> >> >>
> >> >> #!/bin/sh
> >> >>
> >> >> # Enable workarounds for erratas listed in
> >> >> # https://developer.amd.com/wp-content/resources/55449_1.12.pdf
> >> >>
> >> >> # 1057, 1109
> >> >> sysctl machdep.idle_mwait=0
> >> >> sysctl machdep.idle=hlt
> >> >
> >> >
> >> > Is this needed if it was previously machdep.idle: acpi ?
> >>
> >> This might explain why I've never seen the lockup issues mentioned by
> >> other people. What would cause my machine to differ from others?
> >>
> >
> > I had sysctl machdep.idle_mwait=1 and machdep.idle=acpi before
> > applying the shell script. I had multiple lockups every week,
> > sometimes multiple lockups per day.
>
> This makes me curious about why I didn't experience lockups. Perhaps my
> BIOS defaulted to something else?
>
> With these settings:
>
> machdep.idle: acpi
> machdep.idle_mwait: 1
>

I can only say that after updating the processor's microcode and
applying the errata script my system runs much more stably. No
lockups for days. I suspect that updating the microcode helped
quite a bit.

I have a first-generation Ryzen 5 1600 with all the errata.
-- 
Gary Jennejohn

From owner-freebsd-amd64@freebsd.org Fri Jun 29 08:59:03 2018
From: Elena Mihailescu
Date: Fri, 29 Jun 2018 11:58:31 +0300
Subject: Inspect pages created after a vm_object is marked as copy-on-write
To: freebsd-amd64@freebsd.org, freebsd-virtualization@freebsd.org
Cc: Mihai Carabas

Hello,

I would like to know whether there is a way to inspect which pages/objects
were created after a vm_object (the vm_map_entry associated with the
object) is marked as copy-on-write. More specifically, I'm interested only
in the pages that were copied when a write operation was performed on a
page that belongs to the object marked copy-on-write.

I need this for a live migration feature for bhyve, in order to send the
pages that were modified between the iterations in which I migrate the
guest's memory (the guest's memory will be migrated in rounds: first, all
memory will be sent to the remote host; then, only the pages that were
modified since the previous round; and so on).

What I want to implement is the following:
Step 1: Given a vm_object *obj, mark its associated vm_map_entry *entry as
copy-on-write.
Step 2: After a while (a non-deterministic amount of time),
inspect/retrieve the pages that were created, based on the information
available in the object.
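The two steps above amount to the usual pre-copy loop. As a rough
illustration only, here is a minimal user-space model in C. It assumes a
per-page dirty flag stands in for whatever the kernel would report as
"copied on write"; the names guest, tracking_start, guest_write and
migrate_round are hypothetical and are not part of bhyve or the FreeBSD
VM system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES  8   /* hypothetical guest size, in pages */
#define PAGE_SZ 16  /* tiny page size, to keep the model small */

struct guest {
	unsigned char mem[NPAGES][PAGE_SZ];
	bool dirty[NPAGES];  /* stand-in for "page written since the last round" */
};

/*
 * Step 1: begin tracking. Every page counts as dirty so that the
 * first round sends all of memory.
 */
static void
tracking_start(struct guest *g)
{
	for (size_t i = 0; i < NPAGES; i++)
		g->dirty[i] = true;
}

/* A guest write: store the byte and remember that the page changed. */
static void
guest_write(struct guest *g, size_t page, size_t off, unsigned char val)
{
	assert(page < NPAGES && off < PAGE_SZ);
	g->mem[page][off] = val;
	g->dirty[page] = true;
}

/*
 * Step 2: copy every dirty page to the destination, clear its flag,
 * and return the number of pages sent in this round.
 */
static size_t
migrate_round(struct guest *g, unsigned char dst[NPAGES][PAGE_SZ])
{
	size_t sent = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		if (!g->dirty[i])
			continue;
		memcpy(dst[i], g->mem[i], PAGE_SZ);
		g->dirty[i] = false;
		sent++;
	}
	return (sent);
}
```

Calling tracking_start() once and then migrate_round() repeatedly sends
all NPAGES pages in the first round and only the pages touched by
guest_write() in later rounds. A real implementation would presumably
pause the guest for the final round once a round sends few enough pages.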

What I have tried so far:

I implemented a function in the kernel that:
- gets the vmspace structure pointer for the current process
- gets the vm_map structure pointer for the vmspace
- iterates through each vm_map_entry and, based on vm_offset_start and
  vm_offset_end, determines the vm_map_entry that contains the object I am
  interested in
- for this object, prints some debug information such as shadow_count,
  ref_count, and whether it has a backing_object or not

The code is similar to the code here (the way in which I get the vmspace
for the current process and the way I iterate through the vm_map_entry
structures and objects):
[0] https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L979

I have read the following documentation about FreeBSD's virtual memory
implementation:
[1] https://www.freebsd.org/doc/en/books/arch-handbook/vm.html
[2] https://www.freebsd.org/doc/en_US.ISO8859-1/articles/vm-design/article.html
[3] https://people.freebsd.org/~neel/bhyve/bhyve_nested_paging.pdf
[4] http://www.cse.chalmers.se/edu/year/2011/course/EDA203/unix4.pdf

As far as I could tell after reading the documentation above, I should
look for the object that the object I am interested in is a shadow of, or
an object that my object is a shadow for. To do that, I should inspect the
following fields of the vm_object structure, among others
(https://github.com/freebsd/freebsd/blob/master/sys/vm/vm_object.h#L98):

- int shadow_count; /* how many objects that this is a shadow for */
- struct vm_object *backing_object; /* object that I'm a shadow of */

But in all my tests, for the object I am interested in, shadow_count is
0 and backing_object is NULL.

The code I use to mark the vm_map_entry for the object I am interested in
as copy-on-write is here:
[5] https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L949

Is there anything I am doing wrong?
Maybe I misunderstood something about the way virtual memory works
in FreeBSD.

Is there another way I could inspect which pages were created between the
moment I mark an object (its vm_map_entry) as copy-on-write and a later
moment?

Thank you,
Elena

From owner-freebsd-amd64@freebsd.org Fri Jun 29 22:52:23 2018
Date: Fri, 29 Jun 2018 18:52:15 -0400
From: Mark Johnston
To: Elena Mihailescu
Cc: freebsd-amd64@freebsd.org, freebsd-virtualization@freebsd.org
Subject: Re: Inspect pages created after a vm_object is marked as copy-on-write
Message-ID: <20180629225209.GA4238@pesky.lan>

On Fri, Jun 29, 2018 at 11:58:31AM +0300, Elena Mihailescu wrote:
> Hello,
>
> I am interested if there is a method to inspect what pages/objects were
> created after a vm_object (the vm_map_entry associated with the object) is
> marked as copy-on-write. More specifically, I'm interested only in the
> pages that were copied when a write operation was proceed on a page that
> belongs to the object marked copy-on-write.
>
> I need this for a live migration feature for bhyve in order to send the
> pages that were modified between the iterations in which I migrate the
> guest's memory(the guest's memory will be migrated in rounds - firstly, all
> memory will be sent remote, then, only the pages that were modified and so
> on).
>
> What I want to implement is the following:
> Step 1: Given a vm_object *obj, mark its associated vm_map_entry *entry as
> copy-on-write.
> Step 2: After a while (a non-deterministic amount of time),
> inspect/retrieve the pages that were created based on information existent
> in the object.
>
> What I tried until now:
>
> I implemented a function in kernel that:
> - gets the vmspace structure pointer for the current process
> - gets the vm_map structure pointer for the vmspace
> - iterates through each vm_map_entry and based on the vm_offset_start and
> vm_offset_end determines vm_map_entry that contains the object I am
> interested in.
> - for this object, it prints some debug information such as: shadow_count,
> ref_count, whether if it has a backing_object or not.
> The code written is similar with the code from here (the way in which I get
> vmspace for the current process and the way I am iterating through
> vm_map_entry and objects):
> [0]
> https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L979
>
> I have read the following documentation about FreeBSD's implementation for
> virtual memory:
> [1] https://www.freebsd.org/doc/en/books/arch-handbook/vm.html
> [2]
> https://www.freebsd.org/doc/en_US.ISO8859-1/articles/vm-design/article.html
> [3] https://people.freebsd.org/~neel/bhyve/bhyve_nested_paging.pdf
> [4] http://www.cse.chalmers.se/edu/year/2011/course/EDA203/unix4.pdf
>
> As far as I could tell after reading the documentation presented above, I
> should look for the object that the object I am interested in is a shadow
> of or an object that my object is shadow for.

Right. When a copy-on-write fault results in the creation of a new
object, the new object is said to shadow the original object, which
becomes the backing object for the shadow object. When faults in the
corresponding map entry occur, the fault handler first searches for a
page in the map entry's object, and then falls back to the backing
object if necessary.

> To do that, I should inspect the following fields from the vm_object
> structure (among others)(
> https://github.com/freebsd/freebsd/blob/master/sys/vm/vm_object.h#L98) :
>
> - int shadow_count; /* how many objects that this is a shadow for */
> - struct vm_object *backing_object; /* object that I'm a shadow of */
>
> But in all my tests, for the object I am interested in, the shadow_count is
> 0 and the backing_object is NULL.
>
> The code I use to mark the vm_map_entry for the object I am interested in
> copy-on-write is here:
> [5]
> https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L949

MAP_ENTRY_NEEDS_COPY needs to be set in order for the copy-on-write
machinery to work the way you expect.
Take a look at vm_map_lookup():
when it sees that flag and the caller is attempting a write fault, it
creates a shadow object, updates the entry and clears
MAP_ENTRY_NEEDS_COPY, leaving MAP_ENTRY_COW set.

> Is there anything I am doing wrong? Maybe I misunderstood something about
> the way the virtual memory works in FreeBSD.

I'll note that inspecting and manipulating vm_map_entry and vm_object
structures in the bhyve code constitutes something of an abstraction
violation, though it's reasonable to proceed this way while working on a
prototype of the feature. That is, I think you should keep trying your
current approach, but just be aware that you are using the copy-on-write
mechanism in a way that the VM system isn't really expecting.

> There is another way I could inspect what pages were created between the
> moment I mark an object (its vm_map_entry) as copy-on-write and a later
> moment?

From owner-freebsd-amd64@freebsd.org Sat Jun 30 07:38:25 2018
In-Reply-To: <20180629225209.GA4238@pesky.lan>
References: <20180629225209.GA4238@pesky.lan>
From: Mihai Carabas
Date: Sat, 30 Jun 2018 10:38:21 +0300
Subject: Re: Inspect pages created after a vm_object is marked as copy-on-write
To: Mark Johnston
Cc: Elena Mihailescu, freebsd-virtualization@freebsd.org, freebsd-amd64@freebsd.org

On Sat, Jun 30, 2018 at 1:52 AM, Mark Johnston wrote:
> On Fri, Jun 29, 2018 at 11:58:31AM +0300, Elena Mihailescu wrote:
>> Hello,
>>
>> I am interested if there is a method to inspect what pages/objects were
>> created after a vm_object (the vm_map_entry associated with the object) is
>> marked as copy-on-write. More specifically, I'm interested only in the
>> pages that were copied when a write operation was proceed on a page that
>> belongs to the object marked copy-on-write.
>>
>> I need this for a live migration feature for bhyve in order to send the
>> pages that were modified between the iterations in which I migrate the
>> guest's memory(the guest's memory will be migrated in rounds - firstly, all
>> memory will be sent remote, then, only the pages that were modified and so
>> on).
>>
>> What I want to implement is the following:
>> Step 1: Given a vm_object *obj, mark its associated vm_map_entry *entry as
>> copy-on-write.
>> Step 2: After a while (a non-deterministic amount of time),
>> inspect/retrieve the pages that were created based on information existent
>> in the object.
>>
>> What I tried until now:
>>
>> I implemented a function in kernel that:
>> - gets the vmspace structure pointer for the current process
>> - gets the vm_map structure pointer for the vmspace
>> - iterates through each vm_map_entry and based on the vm_offset_start and
>> vm_offset_end determines vm_map_entry that contains the object I am
>> interested in.
>> - for this object, it prints some debug information such as: shadow_count,
>> ref_count, whether if it has a backing_object or not.
>> The code written is similar with the code from here (the way in which I get
>> vmspace for the current process and the way I am iterating through
>> vm_map_entry and objects):
>> [0]
>> https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L979
>>
>> I have read the following documentation about FreeBSD's implementation for
>> virtual memory:
>> [1] https://www.freebsd.org/doc/en/books/arch-handbook/vm.html
>> [2]
>> https://www.freebsd.org/doc/en_US.ISO8859-1/articles/vm-design/article.html
>> [3] https://people.freebsd.org/~neel/bhyve/bhyve_nested_paging.pdf
>> [4] http://www.cse.chalmers.se/edu/year/2011/course/EDA203/unix4.pdf
>>
>> As far as I could tell after reading the documentation presented above, I
>> should look for the object that the object I am interested in is a shadow
>> of or an object that my object is shadow for.
>
> Right. When a copy-on-write fault results in the creation of a new
> object, the new object is said to shadow the original object, which
> becomes the backing object for the shadow object. When faults in the
> corresponding map entry occur, the fault handler first searches for a
> page in the map entry's object, and then falls back to the backing
> object if necessary.
>
>> To do that, I should inspect the following fields from the vm_object
>> structure (among others)(
>> https://github.com/freebsd/freebsd/blob/master/sys/vm/vm_object.h#L98) :
>>
>> - int shadow_count; /* how many objects that this is a shadow for */
>> - struct vm_object *backing_object; /* object that I'm a shadow of */
>>
>> But in all my tests, for the object I am interested in, the shadow_count is
>> 0 and the backing_object is NULL.
>>
>> The code I use to mark the vm_map_entry for the object I am interested in
>> copy-on-write is here:
>> [5]
>> https://github.com/FreeBSD-UPB/freebsd/blob/projects/bhyve_migration/sys/amd64/vmm/vmm_dev.c#L949
>
> MAP_ENTRY_NEEDS_COPY needs to be set in order for the copy-on-write
> machinery to work the way you expect. Take a look at vm_map_lookup():
> when it sees that flag and the caller is attempting a write fault, it
> creates a shadow object, updates the entry and clears
> MAP_ENTRY_NEEDS_COPY, leaving MAP_ENTRY_COW set.
>
>> Is there anything I am doing wrong? Maybe I misunderstood something about
>> the way the virtual memory works in FreeBSD.
>
> I'll note that inspecting and manipulating vm_map_entry and vm_object
> structures in the bhyve code constitutes something of an abstraction
> violation, though it's reasonable to proceed this way while working on a
> prototype of the feature. That is, I think you should keep trying your
> current approach, but just be aware that you are using the copy-on-write
> mechanism in a way that the VM system isn't really expecting.
>

Can you point out the right approach in our case?

Thanks,
Mihai

>> There is another way I could inspect what pages were created between the
>> moment I mark an object (its vm_map_entry) as copy-on-write and a later
>> moment?
> _______________________________________________
> freebsd-virtualization@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
> To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org"

From owner-freebsd-amd64@freebsd.org Sat Jun 30 22:00:03 2018
Date: Sat, 30 Jun 2018 17:59:56 -0400
From: Mark Johnston
To: Mihai Carabas
Cc: Elena Mihailescu, freebsd-virtualization@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: Inspect pages created after a vm_object is marked as copy-on-write
Message-ID: <20180630215956.GA1282@pesky.lan>
References: <20180629225209.GA4238@pesky.lan>

On Sat, Jun 30, 2018 at 10:38:21AM +0300, Mihai Carabas wrote:
> On Sat, Jun 30, 2018 at 1:52 AM, Mark Johnston wrote:
> > On Fri, Jun 29, 2018 at 11:58:31AM +0300, Elena Mihailescu wrote:
> >> Is there anything I am doing wrong? Maybe I misunderstood something about
> >> the way the virtual memory works in FreeBSD.
> >
> > I'll note that inspecting and manipulating vm_map_entry and vm_object
> > structures in the bhyve code constitutes something of an abstraction
> > violation, though it's reasonable to proceed this way while working on a
> > prototype of the feature. That is, I think you should keep trying your
> > current approach, but just be aware that you are using the copy-on-write
> > mechanism in a way that the VM system isn't really expecting.
> >
>
> Can you point out the right approach in our case?

I am merely suggesting that once the required VM interactions are fully
understood, the mechanism implemented for bhyve should be generalized
and lifted into the VM code. It's hard to say what the "right" approach
is, since I don't fully understand the proposed algorithm. It sounds
like you might be attempting something like:

1. mark the mappings of to-be-migrated objects as NEEDS_COW, so that a
   subsequent write fault triggers creation of a shadow object
2. invalidate all physical mappings of pages in the object to be copied,
   so that subsequent writes trigger a fault
3. copy pages from the backing object to the destination
4. copy any pages from the shadow object to the destination
5. collapse the backing object into the shadow
6. if the shadow object exists and was non-empty before the collapse,
   goto 1

Is that at all accurate? I'm not familiar with the mechanisms used to
implement live migration in other hypervisors.
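
The six steps above can be modelled in userspace to see how the loop
converges. What follows is only a toy sketch of one reading of those
steps, not bhyve or kernel code: struct sim_object, copy_present(),
collapse(), migrate() and the scripted guest_writes[] table are all
hypothetical names invented for the illustration, a "page" is a single
byte, and the guest's writes are replayed from a table instead of
happening concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES 8	/* toy guest memory: eight one-byte "pages" */

/* Hypothetical stand-in for a VM object; not the kernel's struct vm_object. */
struct sim_object {
	unsigned char data[NPAGES];
	bool present[NPAGES];	/* true if this object holds its own copy */
};

/* Steps 3 and 4: copy every page the object holds; return how many. */
static int
copy_present(const struct sim_object *obj, unsigned char *dest)
{
	int i, n = 0;

	for (i = 0; i < NPAGES; i++) {
		if (obj->present[i]) {
			dest[i] = obj->data[i];
			n++;
		}
	}
	return (n);
}

/* Step 5: merge the two layers into one object and empty the shadow. */
static void
collapse(struct sim_object *backing, struct sim_object *shadow)
{
	(void)copy_present(shadow, backing->data);
	memset(shadow, 0, sizeof(*shadow));
}

/* Scripted guest: the page written in each round, -1 meaning idle. */
static const int guest_writes[] = { 3, 5, 3, -1 };

/* Run rounds until one of them captures no writes; return the round count. */
int
migrate(struct sim_object *backing, unsigned char *dest)
{
	struct sim_object shadow;
	int page, round, sent;

	memset(&shadow, 0, sizeof(shadow));
	(void)copy_present(backing, dest);	/* step 3, first pass only */
	for (round = 0; ; round++) {
		/*
		 * Steps 1-2: the mapping is armed, so a guest write faults
		 * and lands in the shadow instead of the backing object.
		 */
		page = guest_writes[round];
		if (page >= 0) {
			shadow.data[page] = (unsigned char)(100 + round);
			shadow.present[page] = true;
		}
		sent = copy_present(&shadow, dest);	/* step 4 */
		collapse(backing, &shadow);		/* step 5 */
		if (sent == 0)				/* step 6 */
			return (round + 1);
	}
}
```

With the table above, migrate() runs four rounds: three that each pick
up one dirtied page and a final idle round that ends the loop, after
which the destination buffer matches the collapsed object byte for byte.
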

From owner-freebsd-amd64@freebsd.org Sat Jun 30 22:34:13 2018
Date: Sun, 1 Jul 2018 01:34:01 +0300
From: Konstantin Belousov
To: Mark Johnston
Cc: Mihai Carabas, freebsd-amd64@freebsd.org, freebsd-virtualization@freebsd.org
Subject: Re: Inspect pages created after a vm_object is marked as copy-on-write
Message-ID: <20180630223401.GW2430@kib.kiev.ua>
References: <20180629225209.GA4238@pesky.lan> <20180630215956.GA1282@pesky.lan>
In-Reply-To: <20180630215956.GA1282@pesky.lan>

On Sat, Jun 30, 2018 at 05:59:56PM -0400, Mark Johnston wrote:
> On Sat, Jun 30, 2018 at 10:38:21AM +0300, Mihai Carabas wrote:
> > On Sat, Jun 30, 2018 at 1:52 AM, Mark Johnston wrote:
> > > On Fri, Jun 29, 2018 at 11:58:31AM +0300, Elena Mihailescu wrote:
> > >> Is there anything I am doing wrong? Maybe I misunderstood something about
> > >> the way the virtual memory works in FreeBSD.
> > >
> > > I'll note that inspecting and manipulating vm_map_entry and vm_object
> > > structures in the bhyve code constitutes something of an abstraction
> > > violation, though it's reasonable to proceed this way while working on a
> > > prototype of the feature. That is, I think you should keep trying your
> > > current approach, but just be aware that you are using the copy-on-write
> > > mechanism in a way that the VM system isn't really expecting.
> > >
> >
> > Can you point out the right approach in our case?
>
> I am merely suggesting that once the required VM interactions are fully
> understood, the mechanism implemented for bhyve should be generalized
> and lifted into the VM code. It's hard to say what the "right" approach
> is, since I don't fully understand the proposed algorithm. It sounds
> like you might be attempting something like:
>
> 1. mark the mappings of to-be-migrated objects as NEEDS_COW, so that a
>    subsequent write fault triggers creation of a shadow object

It is actually MAP_ENTRY_COW | MAP_ENTRY_NEEDS_COPY. Note that setting an
entry to COW changes the behaviour of mprotect(2), at least.

> 2. invalidate all physical mappings of pages in the object to be copied,
>    so that subsequent writes trigger a fault

I do not think this is needed to detect writes after the COW is set.
It is enough to remove the write permissions. Same as fork() does, see
the vm_map_copy_entry() code for the handling of the MAP_ENTRY_NEEDS_COPY
case.

> 3. copy pages from the backing object to the destination

As I understand, this is done right after the entry is marked as COW.

> 4. copy any pages from the shadow object to the destination

And this is done after all backing data is copied and the process is
suspended.

> 5. collapse the backing object into the shadow
> 6. if the shadow object exists and was non-empty before the collapse,
>    goto 1

Are you trying to describe how to undo the COW marking? Marking an entry
as COW really changes its semantics, and we do not need the undo
operation in the base so far. Collapsing the objects would lessen the
pressure on the system from object pollution, but it does not change
back the meaning of mappings, e.g. their behaviour on inheritance on
fork.

>
> Is that at all accurate? I'm not familiar with the mechanisms used to
> implement live migration in other hypervisors.
> _______________________________________________
> freebsd-amd64@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-amd64
> To unsubscribe, send any mail to "freebsd-amd64-unsubscribe@freebsd.org"
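
The flag handling and the shadow lookup discussed in this thread can be
illustrated with a small userspace model. This is a sketch under stated
assumptions and not the kernel's implementation: the SIM_ENTRY_* flags
only borrow the names of MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY (their
values here are made up), and struct sim_object, struct sim_entry,
sim_mark_cow(), sim_lookup() and sim_write() are hypothetical helpers.
The model write-protects the pages instead of unmapping them, interposes
a shadow on the first write fault, and resolves reads through the
shadow before falling back to the backing object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 4	/* toy mapping: four one-byte "pages" */

/* Names echo the thread; the values are illustrative, not the kernel's. */
#define SIM_ENTRY_COW		0x1
#define SIM_ENTRY_NEEDS_COPY	0x2

/* Hypothetical stand-in for a VM object with a backing chain. */
struct sim_object {
	unsigned char data[NPAGES];
	bool present[NPAGES];		/* page is resident in this object */
	struct sim_object *backing;	/* object this one shadows, or NULL */
};

/* Hypothetical stand-in for a map entry. */
struct sim_entry {
	int flags;
	bool write_ok[NPAGES];		/* per-page write permission */
	struct sim_object *object;
};

/* Arm copy-on-write: set both flags and remove write permission only. */
void
sim_mark_cow(struct sim_entry *e)
{
	e->flags |= SIM_ENTRY_COW | SIM_ENTRY_NEEDS_COPY;
	memset(e->write_ok, 0, sizeof(e->write_ok));
}

/* Read path: search the entry's object first, then the backing chain. */
const unsigned char *
sim_lookup(const struct sim_entry *e, int page)
{
	for (const struct sim_object *o = e->object; o != NULL; o = o->backing)
		if (o->present[page])
			return (&o->data[page]);
	return (NULL);
}

/*
 * Write path. A write to a protected page "faults": if NEEDS_COPY is
 * set, the caller-supplied shadow is interposed and NEEDS_COPY is
 * cleared while COW stays set. The store then goes to the top object.
 */
void
sim_write(struct sim_entry *e, struct sim_object *shadow, int page,
    unsigned char val)
{
	if (!e->write_ok[page]) {
		if (e->flags & SIM_ENTRY_NEEDS_COPY) {
			memset(shadow, 0, sizeof(*shadow));
			shadow->backing = e->object;
			e->object = shadow;
			e->flags &= ~SIM_ENTRY_NEEDS_COPY;
		}
		e->write_ok[page] = true;
	}
	e->object->data[page] = val;
	e->object->present[page] = true;
}
```

In this model the pages written since sim_mark_cow() are exactly those
with present[] set in the shadow, while the original object keeps the
contents it had when the entry was marked.
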