Date:      Tue, 13 Feb 2018 07:03:14 -0500
From:      Mike Tancsa <mike@sentex.net>
To:        Konstantin Belousov <kib@freebsd.org>, Elliott.Rabe@dell.com
Cc:        alc@freebsd.org, freebsd-hackers@freebsd.org, markj@freebsd.org, Eric.Van.Gyzen@dell.com
Subject:   Re: Stale memory during post fork cow pmap update
Message-ID:  <51a330e1-10fa-e5cb-e8a9-c519680fdbcd@sentex.net>
In-Reply-To: <20180210225608.GM33564@kib.kiev.ua>
References:  <5A7E7F2B.80900@dell.com> <20180210111848.GL33564@kib.kiev.ua> <5A7F6A7C.80607@dell.com> <20180210225608.GM33564@kib.kiev.ua>

On 2/10/2018 5:56 PM, Konstantin Belousov wrote:
> On Sat, Feb 10, 2018 at 09:56:20PM +0000, Elliott.Rabe@dell.com wrote:
>> On 02/10/2018 05:18 AM, Konstantin Belousov wrote:
>>> On Sat, Feb 10, 2018 at 05:12:11AM +0000, Elliott.Rabe@dell.com wrote:
>>>> Greetings-
>>>>
>>>> I've been hunting for the root cause of elusive, slight memory
>>>> corruptions in a large, complex process that manages many threads. All
>>>> failures and experimentation thus far have been on x86_64 architecture
>>>> machines, and pmap_pcid is not in use.
>>>>


The patch below seems to fix the issues I was seeing in

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584

At least, I have not been able to reproduce the problem with it applied.
It would normally take 2-3 builds of net/samba47 to manifest, but I was
able to do 70 builds overnight without a failure.  For some reason, this
issue was far more acute on AMD Ryzen CPUs than on any of the Intel CPUs
I had been testing on.


> So I agree that doing two-stage COW, with the first stage copying the page
> but keeping it read-only, is the right solution. Below is my take.
> During the smoke boot, I noted that there is a somewhat related issue in
> reevaluation of the map entry permissions.
> 
> diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
> index 83e12a588ee..149a15f1d9d 100644
> --- a/sys/vm/vm_fault.c
> +++ b/sys/vm/vm_fault.c
> @@ -1135,6 +1157,10 @@ RetryFault:;
>  				 */
>  				pmap_copy_page(fs.m, fs.first_m);
>  				fs.first_m->valid = VM_PAGE_BITS_ALL;
> +				if ((fault_flags & VM_FAULT_WIRE) == 0) {
> +					prot &= ~VM_PROT_WRITE;
> +					fault_type &= ~VM_PROT_WRITE;
> +				}
>  				if (wired && (fault_flags &
>  				    VM_FAULT_WIRE) == 0) {
>  					vm_page_lock(fs.first_m);
> @@ -1219,6 +1245,12 @@ RetryFault:;
>  			 * write-enabled after all.
>  			 */
>  			prot &= retry_prot;
> +			fault_type &= retry_prot;
> +			if (prot == 0) {
> +				release_page(&fs);
> +				unlock_and_deallocate(&fs);
> +				goto RetryFault;
> +			}
>  		}
>  	}
>  
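
For anyone following along, the permission handling in the two hunks
above can be sketched in plain userland C.  This is only an illustration
under stated assumptions, not kernel code: the constant values are
stand-ins, and struct fault_perms, cow_first_stage(),
must_retry_after_relookup() and cow_sketch_selfcheck() are hypothetical
names that do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values for illustration only; the kernel has its own definitions. */
#define VM_PROT_READ   0x01
#define VM_PROT_WRITE  0x02
#define VM_FAULT_WIRE  0x01

/* Hypothetical container for the two values the patch adjusts. */
struct fault_perms {
	int prot;        /* permissions the new mapping will get */
	int fault_type;  /* access type the fault is handled as */
};

/*
 * First hunk: once the page has been copied, drop write permission
 * unless the fault is wiring the page.  The copy is mapped read-only
 * in this stage.
 */
static struct fault_perms
cow_first_stage(struct fault_perms p, int fault_flags)
{
	if ((fault_flags & VM_FAULT_WIRE) == 0) {
		p.prot &= ~VM_PROT_WRITE;
		p.fault_type &= ~VM_PROT_WRITE;
	}
	return (p);
}

/*
 * Second hunk: after the map entry permissions are re-evaluated, mask
 * both values with the result.  Returns true when no permission is
 * left, which is where the patch does "goto RetryFault".
 */
static bool
must_retry_after_relookup(struct fault_perms *p, int retry_prot)
{
	p->prot &= retry_prot;
	p->fault_type &= retry_prot;
	return (p->prot == 0);
}

/* Small self-check of the two helpers above. */
static void
cow_sketch_selfcheck(void)
{
	struct fault_perms p = { VM_PROT_READ | VM_PROT_WRITE, VM_PROT_WRITE };

	/* A non-wiring write fault ends the first stage read-only. */
	p = cow_first_stage(p, 0);
	assert(p.prot == VM_PROT_READ);
	assert(p.fault_type == 0);

	/* If the re-evaluated permissions leave nothing, retry. */
	assert(must_retry_after_relookup(&p, VM_PROT_WRITE));
}
```

Calling cow_sketch_selfcheck() from a main() runs the checks.  The
sketch only mirrors the bit masking; the locking, page release and
retry path in vm_fault.c are not modelled.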


-- 
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada


