From owner-freebsd-arch@FreeBSD.ORG Mon Sep 29 15:51:34 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4DF679B5; Mon, 29 Sep 2014 15:51:34 +0000 (UTC) Received: from mail-qc0-x229.google.com (mail-qc0-x229.google.com [IPv6:2607:f8b0:400d:c01::229]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EBEC1D69; Mon, 29 Sep 2014 15:51:33 +0000 (UTC) Received: by mail-qc0-f169.google.com with SMTP id r5so8942680qcx.14 for ; Mon, 29 Sep 2014 08:51:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=X+aRJKon7p8YETci/bYYaaAzqvpqNFlbjQOp69HhGDs=; b=gJ5N3REr3NqBpyecVsmQie9SkMXSGVZpk5YTJRTVN8Jufnnd7ZMG+7Flf5ryhxAmbo 3n1kY9WAp7jmmBlJgH2Fhw7tiHwq20qncVgeSS76Fq8uKxLm9DVlsptnJT+Lyu3nendn r4AjdZZIzq1OMjU2vEsmzPDBRps6OZphfi1GBlx4NZ9xtSDsSaZGKkxzw46WzsTlOjjT vMmtUZiY5tEpJx+0m5Z/RDKWOZsqS0AX2yLdZTx7uT6un26fZliAdw6nJGY/rPilVgRS e+dP8FUv/rUoaXh5paMJ/TMaK4DwgsoGpMROfr8KmH4ypZAyuqk67OglVhPCBK+UPwVo fhng== MIME-Version: 1.0 X-Received: by 10.224.46.6 with SMTP id h6mr53590476qaf.45.1412005892861; Mon, 29 Sep 2014 08:51:32 -0700 (PDT) Received: by 10.140.23.242 with HTTP; Mon, 29 Sep 2014 08:51:32 -0700 (PDT) In-Reply-To: <5428AF3B.1030906@rice.edu> References: <5428AF3B.1030906@rice.edu> Date: Mon, 29 Sep 2014 17:51:32 +0200 Message-ID: Subject: Re: vm_page_array and VM_PHYSSEG_SPARSE From: Svatopluk Kraus To: Alan Cox Content-Type: multipart/mixed; boundary=001a11c35a9caa54240504363d05 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: alc@freebsd.org, FreeBSD Arch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Sep 2014 15:51:34 -0000 --001a11c35a9caa54240504363d05 Content-Type: text/plain; charset=UTF-8 On Mon, Sep 29, 2014 at 3:00 AM, Alan Cox wrote: > On 09/27/2014 03:51, Svatopluk Kraus wrote: > > > On Fri, Sep 26, 2014 at 8:08 PM, Alan Cox wrote: > >> >> >> On Wed, Sep 24, 2014 at 7:27 AM, Svatopluk Kraus >> wrote: >> >>> Hi, >>> >>> I and Michal are finishing new ARM pmap-v6 code. There is one problem >>> we've >>> dealt with somehow, but now we would like to do it better. It's about >>> physical pages which are allocated before vm subsystem is initialized. >>> While later on these pages could be found in vm_page_array when >>> VM_PHYSSEG_DENSE memory model is used, it's not true for >>> VM_PHYSSEG_SPARSE >>> memory model. And ARM world uses VM_PHYSSEG_SPARSE model. >>> >>> It really would be nice to utilize vm_page_array for such preallocated >>> physical pages even when VM_PHYSSEG_SPARSE memory model is used. Things >>> could be much easier then. In our case, it's about pages which are used >>> for >>> level 2 page tables. In VM_PHYSSEG_SPARSE model, we have two sets of such >>> pages. First ones are preallocated and second ones are allocated after vm >>> subsystem was inited. We must deal with each set differently. So code is >>> more complex and so is debugging. >>> >>> Thus we need some method how to say that some part of physical memory >>> should be included in vm_page_array, but the pages from that region >>> should >>> not be put to free list during initialization. We think that such >>> possibility could be utilized in general. There could be a need for some >>> physical space which: >>> >>> (1) is needed only during boot and later on it can be freed and put to vm >>> subsystem, >>> >>> (2) is needed for something else and vm_page_array code could be used >>> without some kind of its duplication. >>> >>> There is already some code which deals with blacklisted pages in >>> vm_page.c >>> file. So the easiest way how to deal with presented situation is to add >>> some callback to this part of code which will be able to either exclude >>> whole phys_avail[i], phys_avail[i+1] region or single pages. As the >>> biggest >>> phys_avail region is used for vm subsystem allocations, there should be >>> some more coding. (However, blacklisted pages are not dealt with on that >>> part of region.) >>> >>> We would like to know if there is any objection: >>> >>> (1) to deal with presented problem, >>> (2) to deal with the problem presented way. >>> Some help is very appreciated. Thanks >>> >>> >> >> As an experiment, try modifying vm_phys.c to use dump_avail instead of >> phys_avail when sizing vm_page_array. On amd64, where the same problem >> exists, this allowed me to use VM_PHYSSEG_SPARSE. Right now, this is >> probably my preferred solution. The catch being that not all architectures >> implement dump_avail, but my recollection is that arm does. >> > > Frankly, I would prefer this too, but there is one big open question: > > What is dump_avail for? > > > > dump_avail[] is solving a similar problem in the minidump code, hence, the > prefix "dump_" in its name. In other words, the minidump code couldn't use > phys_avail[] either because it didn't describe the full range of physical > addresses that might be included in a minidump, so dump_avail[] was created. > > There is already precedent for what I'm suggesting. dump_avail[] is > already (ab)used outside of the minidump code on x86 to solve this same > problem in x86/x86/nexus.c, and on arm in arm/arm/mem.c. > > > Using it for vm_page_array initialization and segmentation means that > phys_avail must be a subset of it. And this must be stated and be visible > enough. Maybe it should be even checked in code. I like the idea of > thinking about dump_avail as something what desribes all memory in a > system, but it's not how dump_avail is defined in archs now. > > > > When you say "it's not how dump_avail is defined in archs now", I'm not > sure whether you're talking about the code or the comments. In terms of > code, dump_avail[] is a superset of phys_avail[], and I'm not aware of any > code that would have to change. In terms of comments, I did a grep looking > for comments defining what dump_avail[] is, because I couldn't remember > any. I found one ... on arm. So, I don't think it's a onerous task > changing the definition of dump_avail[]. :-) > > Already, as things stand today with dump_avail[] being used outside of the > minidump code, one could reasonably argue that it should be renamed to > something like phys_exists[]. > > > > I will experiment with it on monday then. However, it's not only about how > memory segments are created in vm_phys.c, but it's about how vm_page_array > size is computed in vm_page.c too. > > > > Yes, and there is also a place in vm_reserv.c that needs to change. I've > attached the patch that I developed and tested a long time ago. It still > applies cleanly and runs ok on amd64. > > I took your patch and added some changes to vm_page.c to make - IMHO - things more consistent across dense and sparse cases. It runs ok on arm (odroid-xu). New patch is attached. I've investigated dump_avail and phys_avail in other archs. In mips, dump_avail is equal to phys_avail, so it should run with no difference there. However, sys\mips\atheros\ar71xx_machdep.c should be fixed probably. There is no dump_avail definition in sparc64 and powerpc. There, dump_avail could be defined same way like in mips. So, it should run with no problem too. The involved files are: sys\powerpc\aim\mmu_oea.c sys\powerpc\aim\mmu_oea64.c sys\powerpc\booke\pmap.c sys\sparc64\sparc64\pmap.c There are some files where I can imagine that phys_avail could be replaced by dump_avail as a matter of purity. sys\arm\arm\busdma_machdep.c sys\mips\mips\busdma_machdep.c sys\arm\arm\busdma_machdep-v6.c sys\sparc64\sparc64\mem.c sys\mips\mips\minidump_machdep.c I agree that dump_avail should be renamed if this change happen. I'm prepared to work on full patch. Svata --001a11c35a9caa54240504363d05 Content-Type: application/octet-stream; name="vm_page_array.path" Content-Disposition: attachment; filename="vm_page_array.path" Content-Transfer-Encoding: base64 X-Attachment-Id: f_i0nyk4771 SW5kZXg6IHN5cy92bS92bV9wYWdlLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL3ZtL3ZtX3BhZ2UuYwko cmV2aXNpb24gMjcyMjgxKQorKysgc3lzL3ZtL3ZtX3BhZ2UuYwkod29ya2luZyBjb3B5KQpAQCAt MzAxLDMxICszMDEsMzggQEAKIAliaWdnZXN0b25lID0gMDsKIAl2YWRkciA9IHJvdW5kX3BhZ2Uo dmFkZHIpOwogCi0JZm9yIChpID0gMDsgcGh5c19hdmFpbFtpICsgMV07IGkgKz0gMikgewotCQlw aHlzX2F2YWlsW2ldID0gcm91bmRfcGFnZShwaHlzX2F2YWlsW2ldKTsKLQkJcGh5c19hdmFpbFtp ICsgMV0gPSB0cnVuY19wYWdlKHBoeXNfYXZhaWxbaSArIDFdKTsKKwlmb3IgKGkgPSAwOyBkdW1w X2F2YWlsW2kgKyAxXTsgaSArPSAyKSB7CisJCWR1bXBfYXZhaWxbaV0gPSByb3VuZF9wYWdlKGR1 bXBfYXZhaWxbaV0pOworCQlkdW1wX2F2YWlsW2kgKyAxXSA9IHRydW5jX3BhZ2UoZHVtcF9hdmFp bFtpICsgMV0pOwogCX0KIAotCWxvd193YXRlciA9IHBoeXNfYXZhaWxbMF07Ci0JaGlnaF93YXRl ciA9IHBoeXNfYXZhaWxbMV07CisJbG93X3dhdGVyID0gZHVtcF9hdmFpbFswXTsKKwloaWdoX3dh dGVyID0gZHVtcF9hdmFpbFsxXTsKIAorCWZvciAoaSA9IDA7IGR1bXBfYXZhaWxbaSArIDFdOyBp ICs9IDIpIHsKKwkJaWYgKGR1bXBfYXZhaWxbaV0gPCBsb3dfd2F0ZXIpCisJCQlsb3dfd2F0ZXIg PSBkdW1wX2F2YWlsW2ldOworCQlpZiAoZHVtcF9hdmFpbFtpICsgMV0gPiBoaWdoX3dhdGVyKQor CQkJaGlnaF93YXRlciA9IGR1bXBfYXZhaWxbaSArIDFdOworCX0KKworI2lmZGVmIFhFTgorCWxv d193YXRlciA9IDA7CisjZW5kaWYKKwogCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdOyBp ICs9IDIpIHsKLQkJdm1fcGFkZHJfdCBzaXplID0gcGh5c19hdmFpbFtpICsgMV0gLSBwaHlzX2F2 YWlsW2ldOworCQl2bV9wYWRkcl90IHNpemU7CiAKKwkJcGh5c19hdmFpbFtpXSA9IHJvdW5kX3Bh Z2UocGh5c19hdmFpbFtpXSk7CisJCXBoeXNfYXZhaWxbaSArIDFdID0gdHJ1bmNfcGFnZShwaHlz X2F2YWlsW2kgKyAxXSk7CisKKwkJc2l6ZSA9IHBoeXNfYXZhaWxbaSArIDFdIC0gcGh5c19hdmFp bFtpXTsKIAkJaWYgKHNpemUgPiBiaWdnZXN0c2l6ZSkgewogCQkJYmlnZ2VzdG9uZSA9IGk7CiAJ CQliaWdnZXN0c2l6ZSA9IHNpemU7CiAJCX0KLQkJaWYgKHBoeXNfYXZhaWxbaV0gPCBsb3dfd2F0 ZXIpCi0JCQlsb3dfd2F0ZXIgPSBwaHlzX2F2YWlsW2ldOwotCQlpZiAocGh5c19hdmFpbFtpICsg MV0gPiBoaWdoX3dhdGVyKQotCQkJaGlnaF93YXRlciA9IHBoeXNfYXZhaWxbaSArIDFdOwogCX0K IAotI2lmZGVmIFhFTgotCWxvd193YXRlciA9IDA7Ci0jZW5kaWYJCi0KIAllbmQgPSBwaHlzX2F2 YWlsW2JpZ2dlc3RvbmUrMV07CiAKIAkvKgpAQCAtMzkzLDggKzQwMCw4IEBACiAJZmlyc3RfcGFn ZSA9IGxvd193YXRlciAvIFBBR0VfU0laRTsKICNpZmRlZiBWTV9QSFlTU0VHX1NQQVJTRQogCXBh Z2VfcmFuZ2UgPSAwOwotCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdICE9IDA7IGkgKz0g MikKLQkJcGFnZV9yYW5nZSArPSBhdG9wKHBoeXNfYXZhaWxbaSArIDFdIC0gcGh5c19hdmFpbFtp XSk7CisJZm9yIChpID0gMDsgZHVtcF9hdmFpbFtpICsgMV0gIT0gMDsgaSArPSAyKQorCQlwYWdl X3JhbmdlICs9IGF0b3AoZHVtcF9hdmFpbFtpICsgMV0gLSBkdW1wX2F2YWlsW2ldKTsKICNlbGlm IGRlZmluZWQoVk1fUEhZU1NFR19ERU5TRSkKIAlwYWdlX3JhbmdlID0gaGlnaF93YXRlciAvIFBB R0VfU0laRSAtIGZpcnN0X3BhZ2U7CiAjZWxzZQpJbmRleDogc3lzL3ZtL3ZtX3BoeXMuYwo9PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09Ci0tLSBzeXMvdm0vdm1fcGh5cy5jCShyZXZpc2lvbiAyNzIyODEpCisrKyBzeXMvdm0v dm1fcGh5cy5jCSh3b3JraW5nIGNvcHkpCkBAIC0zNjUsMTcgKzM2NSwxNyBAQAogCXN0cnVjdCB2 bV9mcmVlbGlzdCAqZmw7CiAJaW50IGRvbSwgZmxpbmQsIGksIG9pbmQsIHBpbmQ7CiAKLQlmb3Ig KGkgPSAwOyBwaHlzX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKKwlmb3IgKGkgPSAwOyBk dW1wX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKICNpZmRlZglWTV9GUkVFTElTVF9JU0FE TUEKLQkJaWYgKHBoeXNfYXZhaWxbaV0gPCAxNjc3NzIxNikgewotCQkJaWYgKHBoeXNfYXZhaWxb aSArIDFdID4gMTY3NzcyMTYpIHsKLQkJCQl2bV9waHlzX2NyZWF0ZV9zZWcocGh5c19hdmFpbFtp XSwgMTY3NzcyMTYsCisJCWlmIChkdW1wX2F2YWlsW2ldIDwgMTY3NzcyMTYpIHsKKwkJCWlmIChk dW1wX2F2YWlsW2kgKyAxXSA+IDE2Nzc3MjE2KSB7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1 bXBfYXZhaWxbaV0sIDE2Nzc3MjE2LAogCQkJCSAgICBWTV9GUkVFTElTVF9JU0FETUEpOwotCQkJ CXZtX3BoeXNfY3JlYXRlX3NlZygxNjc3NzIxNiwgcGh5c19hdmFpbFtpICsgMV0sCisJCQkJdm1f cGh5c19jcmVhdGVfc2VnKDE2Nzc3MjE2LCBkdW1wX2F2YWlsW2kgKyAxXSwKIAkJCQkgICAgVk1f RlJFRUxJU1RfREVGQVVMVCk7CiAJCQl9IGVsc2UgewotCQkJCXZtX3BoeXNfY3JlYXRlX3NlZyhw aHlzX2F2YWlsW2ldLAotCQkJCSAgICBwaHlzX2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSVNB RE1BKTsKKwkJCQl2bV9waHlzX2NyZWF0ZV9zZWcoZHVtcF9hdmFpbFtpXSwKKwkJCQkgICAgZHVt cF9hdmFpbFtpICsgMV0sIFZNX0ZSRUVMSVNUX0lTQURNQSk7CiAJCQl9CiAJCQlpZiAoVk1fRlJF RUxJU1RfSVNBRE1BID49IHZtX25mcmVlbGlzdHMpCiAJCQkJdm1fbmZyZWVsaXN0cyA9IFZNX0ZS RUVMSVNUX0lTQURNQSArIDE7CkBAIC0zODIsMjEgKzM4MiwyMSBAQAogCQl9IGVsc2UKICNlbmRp ZgogI2lmZGVmCVZNX0ZSRUVMSVNUX0hJR0hNRU0KLQkJaWYgKHBoeXNfYXZhaWxbaSArIDFdID4g Vk1fSElHSE1FTV9BRERSRVNTKSB7Ci0JCQlpZiAocGh5c19hdmFpbFtpXSA8IFZNX0hJR0hNRU1f QUREUkVTUykgewotCQkJCXZtX3BoeXNfY3JlYXRlX3NlZyhwaHlzX2F2YWlsW2ldLAorCQlpZiAo ZHVtcF9hdmFpbFtpICsgMV0gPiBWTV9ISUdITUVNX0FERFJFU1MpIHsKKwkJCWlmIChkdW1wX2F2 YWlsW2ldIDwgVk1fSElHSE1FTV9BRERSRVNTKSB7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1 bXBfYXZhaWxbaV0sCiAJCQkJICAgIFZNX0hJR0hNRU1fQUREUkVTUywgVk1fRlJFRUxJU1RfREVG QVVMVCk7CiAJCQkJdm1fcGh5c19jcmVhdGVfc2VnKFZNX0hJR0hNRU1fQUREUkVTUywKLQkJCQkg ICAgcGh5c19hdmFpbFtpICsgMV0sIFZNX0ZSRUVMSVNUX0hJR0hNRU0pOworCQkJCSAgICBkdW1w X2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSElHSE1FTSk7CiAJCQl9IGVsc2UgewotCQkJCXZt X3BoeXNfY3JlYXRlX3NlZyhwaHlzX2F2YWlsW2ldLAotCQkJCSAgICBwaHlzX2F2YWlsW2kgKyAx XSwgVk1fRlJFRUxJU1RfSElHSE1FTSk7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1bXBfYXZh aWxbaV0sCisJCQkJICAgIGR1bXBfYXZhaWxbaSArIDFdLCBWTV9GUkVFTElTVF9ISUdITUVNKTsK IAkJCX0KIAkJCWlmIChWTV9GUkVFTElTVF9ISUdITUVNID49IHZtX25mcmVlbGlzdHMpCiAJCQkJ dm1fbmZyZWVsaXN0cyA9IFZNX0ZSRUVMSVNUX0hJR0hNRU0gKyAxOwogCQl9IGVsc2UKICNlbmRp ZgotCQl2bV9waHlzX2NyZWF0ZV9zZWcocGh5c19hdmFpbFtpXSwgcGh5c19hdmFpbFtpICsgMV0s CisJCXZtX3BoeXNfY3JlYXRlX3NlZyhkdW1wX2F2YWlsW2ldLCBkdW1wX2F2YWlsW2kgKyAxXSwK IAkJICAgIFZNX0ZSRUVMSVNUX0RFRkFVTFQpOwogCX0KIAlmb3IgKGRvbSA9IDA7IGRvbSA8IHZt X25kb21haW5zOyBkb20rKykgewpJbmRleDogc3lzL3ZtL3ZtX3Jlc2Vydi5jCj09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0K LS0tIHN5cy92bS92bV9yZXNlcnYuYwkocmV2aXNpb24gMjcyMjgxKQorKysgc3lzL3ZtL3ZtX3Jl c2Vydi5jCSh3b3JraW5nIGNvcHkpCkBAIC04MjQsOSArODI0LDkgQEAKIAkgKiBJbml0aWFsaXpl IHRoZSByZXNlcnZhdGlvbiBhcnJheS4gIFNwZWNpZmljYWxseSwgaW5pdGlhbGl6ZSB0aGUKIAkg KiAicGFnZXMiIGZpZWxkIGZvciBldmVyeSBlbGVtZW50IHRoYXQgaGFzIGFuIHVuZGVybHlpbmcg c3VwZXJwYWdlLgogCSAqLwotCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdICE9IDA7IGkg Kz0gMikgewotCQlwYWRkciA9IHJvdW5kdXAyKHBoeXNfYXZhaWxbaV0sIFZNX0xFVkVMXzBfU0la RSk7Ci0JCXdoaWxlIChwYWRkciArIFZNX0xFVkVMXzBfU0laRSA8PSBwaHlzX2F2YWlsW2kgKyAx XSkgeworCWZvciAoaSA9IDA7IGR1bXBfYXZhaWxbaSArIDFdICE9IDA7IGkgKz0gMikgeworCQlw YWRkciA9IHJvdW5kdXAyKGR1bXBfYXZhaWxbaV0sIFZNX0xFVkVMXzBfU0laRSk7CisJCXdoaWxl IChwYWRkciArIFZNX0xFVkVMXzBfU0laRSA8PSBkdW1wX2F2YWlsW2kgKyAxXSkgewogCQkJdm1f cmVzZXJ2X2FycmF5W3BhZGRyID4+IFZNX0xFVkVMXzBfU0hJRlRdLnBhZ2VzID0KIAkJCSAgICBQ SFlTX1RPX1ZNX1BBR0UocGFkZHIpOwogCQkJcGFkZHIgKz0gVk1fTEVWRUxfMF9TSVpFOwo= --001a11c35a9caa54240504363d05--