Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Sep 2014 17:51:32 +0200
From:      Svatopluk Kraus <onwahe@gmail.com>
To:        Alan Cox <alc@rice.edu>
Cc:        alc@freebsd.org, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: vm_page_array and VM_PHYSSEG_SPARSE
Message-ID:  <CAFHCsPXE8FTBeii2wQDmkqGBZ-jJta0cQSVQBACs95ert-hVqw@mail.gmail.com>
In-Reply-To: <5428AF3B.1030906@rice.edu>
References:  <CAFHCsPWkq09_RRDz7fy3UgsRFv8ZbNKdAH2Ft0x6aVSwLPi6BQ@mail.gmail.com> <CAJUyCcPXBuLu0nvaCqpg8NJ6KzAX9BA1Rt%2BooD%2B3pzq%2BFV%2B%2BTQ@mail.gmail.com> <CAFHCsPWq9WqeFnx1a%2BStfSxj=jwcE9GPyVsoyh0%2Bazr3HmM6vQ@mail.gmail.com> <5428AF3B.1030906@rice.edu>

next in thread | previous in thread | raw e-mail | index | archive | help
--001a11c35a9caa54240504363d05
Content-Type: text/plain; charset=UTF-8

On Mon, Sep 29, 2014 at 3:00 AM, Alan Cox <alc@rice.edu> wrote:

>   On 09/27/2014 03:51, Svatopluk Kraus wrote:
>
>
> On Fri, Sep 26, 2014 at 8:08 PM, Alan Cox <alan.l.cox@gmail.com> wrote:
>
>>
>>
>>  On Wed, Sep 24, 2014 at 7:27 AM, Svatopluk Kraus <onwahe@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I and Michal are finishing new ARM pmap-v6 code. There is one problem
>>> we've
>>> dealt with somehow, but now we would like to do it better. It's about
>>> physical pages which are allocated before vm subsystem is initialized.
>>> While later on these pages could be found in vm_page_array when
>>> VM_PHYSSEG_DENSE memory model is used, it's not true for
>>> VM_PHYSSEG_SPARSE
>>> memory model. And ARM world uses VM_PHYSSEG_SPARSE model.
>>>
>>> It really would be nice to utilize vm_page_array for such preallocated
>>> physical pages even when VM_PHYSSEG_SPARSE memory model is used. Things
>>> could be much easier then. In our case, it's about pages which are used
>>> for
>>> level 2 page tables. In VM_PHYSSEG_SPARSE model, we have two sets of such
>>> pages. First ones are preallocated and second ones are allocated after vm
>>> subsystem was inited. We must deal with each set differently. So code is
>>> more complex and so is debugging.
>>>
>>> Thus we need some method how to say that some part of physical memory
>>> should be included in vm_page_array, but the pages from that region
>>> should
>>> not be put to free list during initialization. We think that such
>>> possibility could be utilized in general. There could be a need for some
>>> physical space which:
>>>
>>> (1) is needed only during boot and later on it can be freed and put to vm
>>> subsystem,
>>>
>>> (2) is needed for something else and vm_page_array code could be used
>>> without some kind of its duplication.
>>>
>>> There is already some code which deals with blacklisted pages in
>>> vm_page.c
>>> file. So the easiest way how to deal with presented situation is to add
>>> some callback to this part of code which will be able to either exclude
>>> whole phys_avail[i], phys_avail[i+1] region or single pages. As the
>>> biggest
>>> phys_avail region is used for vm subsystem allocations, there should be
>>> some more coding. (However, blacklisted pages are not dealt with on that
>>> part of region.)
>>>
>>> We would like to know if there is any objection:
>>>
>>> (1) to deal with presented problem,
>>> (2) to deal with the problem presented way.
>>> Some help is very appreciated. Thanks
>>>
>>>
>>
>> As an experiment, try modifying vm_phys.c to use dump_avail instead of
>> phys_avail when sizing vm_page_array.  On amd64, where the same problem
>> exists, this allowed me to use VM_PHYSSEG_SPARSE.  Right now, this is
>> probably my preferred solution.  The catch being that not all architectures
>> implement dump_avail, but my recollection is that arm does.
>>
>
> Frankly, I would prefer this too, but there is one big open question:
>
> What is dump_avail for?
>
>
>
> dump_avail[] is solving a similar problem in the minidump code, hence, the
> prefix "dump_" in its name.  In other words, the minidump code couldn't use
> phys_avail[] either because it didn't describe the full range of physical
> addresses that might be included in a minidump, so dump_avail[] was created.
>
> There is already precedent for what I'm suggesting.  dump_avail[] is
> already (ab)used outside of the minidump code on x86 to solve this same
> problem in x86/x86/nexus.c, and on arm in arm/arm/mem.c.
>
>
>  Using it for vm_page_array initialization and segmentation means that
> phys_avail must be a subset of it. And this must be stated and be visible
> enough. Maybe it should be even checked in code. I like the idea of
> thinking about dump_avail as something what desribes all memory in a
> system, but it's not how dump_avail is defined in archs now.
>
>
>
> When you say "it's not how dump_avail is defined in archs now", I'm not
> sure whether you're talking about the code or the comments.  In terms of
> code, dump_avail[] is a superset of phys_avail[], and I'm not aware of any
> code that would have to change.  In terms of comments, I did a grep looking
> for comments defining what dump_avail[] is, because I couldn't remember
> any.  I found one ... on arm.  So, I don't think it's a onerous task
> changing the definition of dump_avail[].  :-)
>
> Already, as things stand today with dump_avail[] being used outside of the
> minidump code, one could reasonably argue that it should be renamed to
> something like phys_exists[].
>
>
>
> I will experiment with it on monday then. However, it's not only about how
> memory segments are created in vm_phys.c, but it's about how vm_page_array
> size is computed in vm_page.c too.
>
>
>
> Yes, and there is also a place in vm_reserv.c that needs to change.   I've
> attached the patch that I developed and tested a long time ago.  It still
> applies cleanly and runs ok on amd64.
>
>

 I took your patch and added some changes to vm_page.c to make - IMHO -
things more consistent across dense and sparse cases. It runs ok on arm
(odroid-xu). New patch is attached.

I've investigated dump_avail and phys_avail in other archs. In mips,
dump_avail is equal to phys_avail, so it should run with no difference
there. However, sys\mips\atheros\ar71xx_machdep.c should be fixed
probably. There is no dump_avail definition in sparc64 and powerpc. There,
dump_avail could be defined same way like in mips. So, it should run with
no problem too. The involved files are:

sys\powerpc\aim\mmu_oea.c
sys\powerpc\aim\mmu_oea64.c
sys\powerpc\booke\pmap.c
sys\sparc64\sparc64\pmap.c

There are some files where I can imagine that phys_avail could be replaced
by dump_avail as a matter of purity.

sys\arm\arm\busdma_machdep.c
sys\mips\mips\busdma_machdep.c
sys\arm\arm\busdma_machdep-v6.c
sys\sparc64\sparc64\mem.c
sys\mips\mips\minidump_machdep.c

I agree that dump_avail should be renamed if this change happen.
I'm prepared to work on full patch.

Svata

--001a11c35a9caa54240504363d05
Content-Type: application/octet-stream; name="vm_page_array.path"
Content-Disposition: attachment; filename="vm_page_array.path"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_i0nyk4771

SW5kZXg6IHN5cy92bS92bV9wYWdlLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL3ZtL3ZtX3BhZ2UuYwko
cmV2aXNpb24gMjcyMjgxKQorKysgc3lzL3ZtL3ZtX3BhZ2UuYwkod29ya2luZyBjb3B5KQpAQCAt
MzAxLDMxICszMDEsMzggQEAKIAliaWdnZXN0b25lID0gMDsKIAl2YWRkciA9IHJvdW5kX3BhZ2Uo
dmFkZHIpOwogCi0JZm9yIChpID0gMDsgcGh5c19hdmFpbFtpICsgMV07IGkgKz0gMikgewotCQlw
aHlzX2F2YWlsW2ldID0gcm91bmRfcGFnZShwaHlzX2F2YWlsW2ldKTsKLQkJcGh5c19hdmFpbFtp
ICsgMV0gPSB0cnVuY19wYWdlKHBoeXNfYXZhaWxbaSArIDFdKTsKKwlmb3IgKGkgPSAwOyBkdW1w
X2F2YWlsW2kgKyAxXTsgaSArPSAyKSB7CisJCWR1bXBfYXZhaWxbaV0gPSByb3VuZF9wYWdlKGR1
bXBfYXZhaWxbaV0pOworCQlkdW1wX2F2YWlsW2kgKyAxXSA9IHRydW5jX3BhZ2UoZHVtcF9hdmFp
bFtpICsgMV0pOwogCX0KIAotCWxvd193YXRlciA9IHBoeXNfYXZhaWxbMF07Ci0JaGlnaF93YXRl
ciA9IHBoeXNfYXZhaWxbMV07CisJbG93X3dhdGVyID0gZHVtcF9hdmFpbFswXTsKKwloaWdoX3dh
dGVyID0gZHVtcF9hdmFpbFsxXTsKIAorCWZvciAoaSA9IDA7IGR1bXBfYXZhaWxbaSArIDFdOyBp
ICs9IDIpIHsKKwkJaWYgKGR1bXBfYXZhaWxbaV0gPCBsb3dfd2F0ZXIpCisJCQlsb3dfd2F0ZXIg
PSBkdW1wX2F2YWlsW2ldOworCQlpZiAoZHVtcF9hdmFpbFtpICsgMV0gPiBoaWdoX3dhdGVyKQor
CQkJaGlnaF93YXRlciA9IGR1bXBfYXZhaWxbaSArIDFdOworCX0KKworI2lmZGVmIFhFTgorCWxv
d193YXRlciA9IDA7CisjZW5kaWYKKwogCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdOyBp
ICs9IDIpIHsKLQkJdm1fcGFkZHJfdCBzaXplID0gcGh5c19hdmFpbFtpICsgMV0gLSBwaHlzX2F2
YWlsW2ldOworCQl2bV9wYWRkcl90IHNpemU7CiAKKwkJcGh5c19hdmFpbFtpXSA9IHJvdW5kX3Bh
Z2UocGh5c19hdmFpbFtpXSk7CisJCXBoeXNfYXZhaWxbaSArIDFdID0gdHJ1bmNfcGFnZShwaHlz
X2F2YWlsW2kgKyAxXSk7CisKKwkJc2l6ZSA9IHBoeXNfYXZhaWxbaSArIDFdIC0gcGh5c19hdmFp
bFtpXTsKIAkJaWYgKHNpemUgPiBiaWdnZXN0c2l6ZSkgewogCQkJYmlnZ2VzdG9uZSA9IGk7CiAJ
CQliaWdnZXN0c2l6ZSA9IHNpemU7CiAJCX0KLQkJaWYgKHBoeXNfYXZhaWxbaV0gPCBsb3dfd2F0
ZXIpCi0JCQlsb3dfd2F0ZXIgPSBwaHlzX2F2YWlsW2ldOwotCQlpZiAocGh5c19hdmFpbFtpICsg
MV0gPiBoaWdoX3dhdGVyKQotCQkJaGlnaF93YXRlciA9IHBoeXNfYXZhaWxbaSArIDFdOwogCX0K
IAotI2lmZGVmIFhFTgotCWxvd193YXRlciA9IDA7Ci0jZW5kaWYJCi0KIAllbmQgPSBwaHlzX2F2
YWlsW2JpZ2dlc3RvbmUrMV07CiAKIAkvKgpAQCAtMzkzLDggKzQwMCw4IEBACiAJZmlyc3RfcGFn
ZSA9IGxvd193YXRlciAvIFBBR0VfU0laRTsKICNpZmRlZiBWTV9QSFlTU0VHX1NQQVJTRQogCXBh
Z2VfcmFuZ2UgPSAwOwotCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdICE9IDA7IGkgKz0g
MikKLQkJcGFnZV9yYW5nZSArPSBhdG9wKHBoeXNfYXZhaWxbaSArIDFdIC0gcGh5c19hdmFpbFtp
XSk7CisJZm9yIChpID0gMDsgZHVtcF9hdmFpbFtpICsgMV0gIT0gMDsgaSArPSAyKQorCQlwYWdl
X3JhbmdlICs9IGF0b3AoZHVtcF9hdmFpbFtpICsgMV0gLSBkdW1wX2F2YWlsW2ldKTsKICNlbGlm
IGRlZmluZWQoVk1fUEhZU1NFR19ERU5TRSkKIAlwYWdlX3JhbmdlID0gaGlnaF93YXRlciAvIFBB
R0VfU0laRSAtIGZpcnN0X3BhZ2U7CiAjZWxzZQpJbmRleDogc3lzL3ZtL3ZtX3BoeXMuYwo9PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09Ci0tLSBzeXMvdm0vdm1fcGh5cy5jCShyZXZpc2lvbiAyNzIyODEpCisrKyBzeXMvdm0v
dm1fcGh5cy5jCSh3b3JraW5nIGNvcHkpCkBAIC0zNjUsMTcgKzM2NSwxNyBAQAogCXN0cnVjdCB2
bV9mcmVlbGlzdCAqZmw7CiAJaW50IGRvbSwgZmxpbmQsIGksIG9pbmQsIHBpbmQ7CiAKLQlmb3Ig
KGkgPSAwOyBwaHlzX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKKwlmb3IgKGkgPSAwOyBk
dW1wX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKICNpZmRlZglWTV9GUkVFTElTVF9JU0FE
TUEKLQkJaWYgKHBoeXNfYXZhaWxbaV0gPCAxNjc3NzIxNikgewotCQkJaWYgKHBoeXNfYXZhaWxb
aSArIDFdID4gMTY3NzcyMTYpIHsKLQkJCQl2bV9waHlzX2NyZWF0ZV9zZWcocGh5c19hdmFpbFtp
XSwgMTY3NzcyMTYsCisJCWlmIChkdW1wX2F2YWlsW2ldIDwgMTY3NzcyMTYpIHsKKwkJCWlmIChk
dW1wX2F2YWlsW2kgKyAxXSA+IDE2Nzc3MjE2KSB7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1
bXBfYXZhaWxbaV0sIDE2Nzc3MjE2LAogCQkJCSAgICBWTV9GUkVFTElTVF9JU0FETUEpOwotCQkJ
CXZtX3BoeXNfY3JlYXRlX3NlZygxNjc3NzIxNiwgcGh5c19hdmFpbFtpICsgMV0sCisJCQkJdm1f
cGh5c19jcmVhdGVfc2VnKDE2Nzc3MjE2LCBkdW1wX2F2YWlsW2kgKyAxXSwKIAkJCQkgICAgVk1f
RlJFRUxJU1RfREVGQVVMVCk7CiAJCQl9IGVsc2UgewotCQkJCXZtX3BoeXNfY3JlYXRlX3NlZyhw
aHlzX2F2YWlsW2ldLAotCQkJCSAgICBwaHlzX2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSVNB
RE1BKTsKKwkJCQl2bV9waHlzX2NyZWF0ZV9zZWcoZHVtcF9hdmFpbFtpXSwKKwkJCQkgICAgZHVt
cF9hdmFpbFtpICsgMV0sIFZNX0ZSRUVMSVNUX0lTQURNQSk7CiAJCQl9CiAJCQlpZiAoVk1fRlJF
RUxJU1RfSVNBRE1BID49IHZtX25mcmVlbGlzdHMpCiAJCQkJdm1fbmZyZWVsaXN0cyA9IFZNX0ZS
RUVMSVNUX0lTQURNQSArIDE7CkBAIC0zODIsMjEgKzM4MiwyMSBAQAogCQl9IGVsc2UKICNlbmRp
ZgogI2lmZGVmCVZNX0ZSRUVMSVNUX0hJR0hNRU0KLQkJaWYgKHBoeXNfYXZhaWxbaSArIDFdID4g
Vk1fSElHSE1FTV9BRERSRVNTKSB7Ci0JCQlpZiAocGh5c19hdmFpbFtpXSA8IFZNX0hJR0hNRU1f
QUREUkVTUykgewotCQkJCXZtX3BoeXNfY3JlYXRlX3NlZyhwaHlzX2F2YWlsW2ldLAorCQlpZiAo
ZHVtcF9hdmFpbFtpICsgMV0gPiBWTV9ISUdITUVNX0FERFJFU1MpIHsKKwkJCWlmIChkdW1wX2F2
YWlsW2ldIDwgVk1fSElHSE1FTV9BRERSRVNTKSB7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1
bXBfYXZhaWxbaV0sCiAJCQkJICAgIFZNX0hJR0hNRU1fQUREUkVTUywgVk1fRlJFRUxJU1RfREVG
QVVMVCk7CiAJCQkJdm1fcGh5c19jcmVhdGVfc2VnKFZNX0hJR0hNRU1fQUREUkVTUywKLQkJCQkg
ICAgcGh5c19hdmFpbFtpICsgMV0sIFZNX0ZSRUVMSVNUX0hJR0hNRU0pOworCQkJCSAgICBkdW1w
X2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSElHSE1FTSk7CiAJCQl9IGVsc2UgewotCQkJCXZt
X3BoeXNfY3JlYXRlX3NlZyhwaHlzX2F2YWlsW2ldLAotCQkJCSAgICBwaHlzX2F2YWlsW2kgKyAx
XSwgVk1fRlJFRUxJU1RfSElHSE1FTSk7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1bXBfYXZh
aWxbaV0sCisJCQkJICAgIGR1bXBfYXZhaWxbaSArIDFdLCBWTV9GUkVFTElTVF9ISUdITUVNKTsK
IAkJCX0KIAkJCWlmIChWTV9GUkVFTElTVF9ISUdITUVNID49IHZtX25mcmVlbGlzdHMpCiAJCQkJ
dm1fbmZyZWVsaXN0cyA9IFZNX0ZSRUVMSVNUX0hJR0hNRU0gKyAxOwogCQl9IGVsc2UKICNlbmRp
ZgotCQl2bV9waHlzX2NyZWF0ZV9zZWcocGh5c19hdmFpbFtpXSwgcGh5c19hdmFpbFtpICsgMV0s
CisJCXZtX3BoeXNfY3JlYXRlX3NlZyhkdW1wX2F2YWlsW2ldLCBkdW1wX2F2YWlsW2kgKyAxXSwK
IAkJICAgIFZNX0ZSRUVMSVNUX0RFRkFVTFQpOwogCX0KIAlmb3IgKGRvbSA9IDA7IGRvbSA8IHZt
X25kb21haW5zOyBkb20rKykgewpJbmRleDogc3lzL3ZtL3ZtX3Jlc2Vydi5jCj09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0K
LS0tIHN5cy92bS92bV9yZXNlcnYuYwkocmV2aXNpb24gMjcyMjgxKQorKysgc3lzL3ZtL3ZtX3Jl
c2Vydi5jCSh3b3JraW5nIGNvcHkpCkBAIC04MjQsOSArODI0LDkgQEAKIAkgKiBJbml0aWFsaXpl
IHRoZSByZXNlcnZhdGlvbiBhcnJheS4gIFNwZWNpZmljYWxseSwgaW5pdGlhbGl6ZSB0aGUKIAkg
KiAicGFnZXMiIGZpZWxkIGZvciBldmVyeSBlbGVtZW50IHRoYXQgaGFzIGFuIHVuZGVybHlpbmcg
c3VwZXJwYWdlLgogCSAqLwotCWZvciAoaSA9IDA7IHBoeXNfYXZhaWxbaSArIDFdICE9IDA7IGkg
Kz0gMikgewotCQlwYWRkciA9IHJvdW5kdXAyKHBoeXNfYXZhaWxbaV0sIFZNX0xFVkVMXzBfU0la
RSk7Ci0JCXdoaWxlIChwYWRkciArIFZNX0xFVkVMXzBfU0laRSA8PSBwaHlzX2F2YWlsW2kgKyAx
XSkgeworCWZvciAoaSA9IDA7IGR1bXBfYXZhaWxbaSArIDFdICE9IDA7IGkgKz0gMikgeworCQlw
YWRkciA9IHJvdW5kdXAyKGR1bXBfYXZhaWxbaV0sIFZNX0xFVkVMXzBfU0laRSk7CisJCXdoaWxl
IChwYWRkciArIFZNX0xFVkVMXzBfU0laRSA8PSBkdW1wX2F2YWlsW2kgKyAxXSkgewogCQkJdm1f
cmVzZXJ2X2FycmF5W3BhZGRyID4+IFZNX0xFVkVMXzBfU0hJRlRdLnBhZ2VzID0KIAkJCSAgICBQ
SFlTX1RPX1ZNX1BBR0UocGFkZHIpOwogCQkJcGFkZHIgKz0gVk1fTEVWRUxfMF9TSVpFOwo=
--001a11c35a9caa54240504363d05--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFHCsPXE8FTBeii2wQDmkqGBZ-jJta0cQSVQBACs95ert-hVqw>