Date:      Sun, 28 Sep 2014 20:00:43 -0500
From:      Alan Cox <alc@rice.edu>
To:        Svatopluk Kraus <onwahe@gmail.com>, alc@freebsd.org
Cc:        FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: vm_page_array and VM_PHYSSEG_SPARSE
Message-ID:  <5428AF3B.1030906@rice.edu>
In-Reply-To: <CAFHCsPWq9WqeFnx1a%2BStfSxj=jwcE9GPyVsoyh0%2Bazr3HmM6vQ@mail.gmail.com>
References:  <CAFHCsPWkq09_RRDz7fy3UgsRFv8ZbNKdAH2Ft0x6aVSwLPi6BQ@mail.gmail.com>	<CAJUyCcPXBuLu0nvaCqpg8NJ6KzAX9BA1Rt%2BooD%2B3pzq%2BFV%2B%2BTQ@mail.gmail.com> <CAFHCsPWq9WqeFnx1a%2BStfSxj=jwcE9GPyVsoyh0%2Bazr3HmM6vQ@mail.gmail.com>

This is a multi-part message in MIME format.
--------------030707070306010907080705
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 09/27/2014 03:51, Svatopluk Kraus wrote:
>
> On Fri, Sep 26, 2014 at 8:08 PM, Alan Cox <alan.l.cox@gmail.com> wrote:
>
>     On Wed, Sep 24, 2014 at 7:27 AM, Svatopluk Kraus <onwahe@gmail.com>
>     wrote:
>
>         Hi,
>
>         I and Michal are finishing new ARM pmap-v6 code. There is one
>         problem we've dealt with somehow, but now we would like to do
>         it better. It's about physical pages which are allocated
>         before vm subsystem is initialized. While later on these pages
>         could be found in vm_page_array when VM_PHYSSEG_DENSE memory
>         model is used, it's not true for VM_PHYSSEG_SPARSE memory
>         model. And ARM world uses VM_PHYSSEG_SPARSE model.
>
>         It really would be nice to utilize vm_page_array for such
>         preallocated physical pages even when VM_PHYSSEG_SPARSE memory
>         model is used. Things could be much easier then. In our case,
>         it's about pages which are used for level 2 page tables. In
>         VM_PHYSSEG_SPARSE model, we have two sets of such pages. First
>         ones are preallocated and second ones are allocated after vm
>         subsystem was inited. We must deal with each set differently.
>         So code is more complex and so is debugging.
>
>         Thus we need some method how to say that some part of physical
>         memory should be included in vm_page_array, but the pages from
>         that region should not be put to free list during
>         initialization. We think that such possibility could be
>         utilized in general. There could be a need for some physical
>         space which:
>
>         (1) is needed only during boot and later on it can be freed
>         and put to vm subsystem,
>
>         (2) is needed for something else and vm_page_array code could
>         be used without some kind of its duplication.
>
>         There is already some code which deals with blacklisted pages
>         in vm_page.c file. So the easiest way how to deal with
>         presented situation is to add some callback to this part of
>         code which will be able to either exclude whole phys_avail[i],
>         phys_avail[i+1] region or single pages. As the biggest
>         phys_avail region is used for vm subsystem allocations, there
>         should be some more coding. (However, blacklisted pages are
>         not dealt with on that part of region.)
>
>         We would like to know if there is any objection:
>
>         (1) to deal with presented problem,
>         (2) to deal with the problem presented way.
>         Some help is very appreciated. Thanks
>
>     As an experiment, try modifying vm_phys.c to use dump_avail
>     instead of phys_avail when sizing vm_page_array.  On amd64, where
>     the same problem exists, this allowed me to use
>     VM_PHYSSEG_SPARSE.  Right now, this is probably my preferred
>     solution.  The catch being that not all architectures implement
>     dump_avail, but my recollection is that arm does.
>
> Frankly, I would prefer this too, but there is one big open question:
>
> What is dump_avail for?


dump_avail[] is solving a similar problem in the minidump code, hence,
the prefix "dump_" in its name.  In other words, the minidump code
couldn't use phys_avail[] either because it didn't describe the full
range of physical addresses that might be included in a minidump, so
dump_avail[] was created.

There is already precedent for what I'm suggesting.  dump_avail[] is
already (ab)used outside of the minidump code on x86 to solve this same
problem in x86/x86/nexus.c, and on arm in arm/arm/mem.c.


> Using it for vm_page_array initialization and segmentation means that
> phys_avail must be a subset of it. And this must be stated and be
> visible enough. Maybe it should be even checked in code. I like the
> idea of thinking about dump_avail as something that describes all
> memory in a system, but it's not how dump_avail is defined in archs now.


When you say "it's not how dump_avail is defined in archs now", I'm not
sure whether you're talking about the code or the comments.  In terms of
code, dump_avail[] is a superset of phys_avail[], and I'm not aware of
any code that would have to change.  In terms of comments, I did a grep
looking for comments defining what dump_avail[] is, because I couldn't
remember any.  I found one ... on arm.  So, I don't think it's an onerous
task changing the definition of dump_avail[].  :-)
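
On the suggestion that the subset relationship "should be even checked in
code", here is a minimal, standalone sketch of what such a check could
look like.  It is not part of the attached patch and is not existing
kernel code.  The names paddr_t, range_is_covered() and avail_is_subset()
are hypothetical, and the sketch assumes that both arrays use the same
layout of (start, end) pairs terminated by a zero end address, and that
adjacent ranges in the outer array have already been coalesced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;	/* hypothetical stand-in for vm_paddr_t */

/*
 * Return true if [start, end) lies entirely inside a single range of
 * "outer".  "outer" is a list of (start, end) pairs that is terminated
 * by a pair whose end address is 0.
 */
bool
range_is_covered(const paddr_t *outer, paddr_t start, paddr_t end)
{
	int j;

	for (j = 0; outer[j + 1] != 0; j += 2) {
		if (outer[j] <= start && end <= outer[j + 1])
			return (true);
	}
	return (false);
}

/*
 * Return true if every range in "inner" (e.g. a phys_avail-style list)
 * is covered by some range in "outer" (e.g. a dump_avail-style list).
 */
bool
avail_is_subset(const paddr_t *inner, const paddr_t *outer)
{
	int i;

	for (i = 0; inner[i + 1] != 0; i += 2) {
		/* Each range is expected to be non-empty. */
		assert(inner[i] < inner[i + 1]);
		if (!range_is_covered(outer, inner[i], inner[i + 1]))
			return (false);
	}
	return (true);
}
```

In a kernel, something along these lines would more likely be a KASSERT
run once during early VM startup; the sketch only shows the range logic.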

Already, as things stand today with dump_avail[] being used outside of
the minidump code, one could reasonably argue that it should be renamed
to something like phys_exists[].


> I will experiment with it on Monday then. However, it's not only about
> how memory segments are created in vm_phys.c, but it's about how
> vm_page_array size is computed in vm_page.c too.


Yes, and there is also a place in vm_reserv.c that needs to change.
I've attached the patch that I developed and tested a long time ago.  It
still applies cleanly and runs ok on amd64.
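
For readers who don't want to decode the attachment, the sizing change in
vm_page.c amounts to summing the page counts of the dump_avail[] ranges
instead of the phys_avail[] ranges.  The following is a standalone sketch
of that computation, not the patch itself.  The names paddr_t,
SKETCH_PAGE_SHIFT and count_pages() are hypothetical, and 4 KiB pages are
an assumption made only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

typedef uint64_t paddr_t;	/* hypothetical stand-in for vm_paddr_t */

/*
 * Count the pages described by a list of (start, end) pairs that is
 * terminated by a pair whose end address is 0.  This mirrors the loop
 * that sizes vm_page_array under VM_PHYSSEG_SPARSE.
 */
size_t
count_pages(const paddr_t *avail)
{
	size_t npages = 0;
	int i;

	for (i = 0; avail[i + 1] != 0; i += 2) {
		assert(avail[i] <= avail[i + 1]);
		npages += (size_t)((avail[i + 1] - avail[i]) >>
		    SKETCH_PAGE_SHIFT);
	}
	return (npages);
}
```

If the first four pages of a 32 KiB region were allocated before the VM
system started, a phys_avail-style list { 0x4000, 0x8000, 0, 0 } would
yield 4 pages, while a dump_avail-style list { 0x0, 0x8000, 0, 0 } would
yield 8, so the preallocated pages would also get vm_page structures.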



--------------030707070306010907080705
Content-Type: text/plain; charset=ISO-8859-15;
 name="dump_avail_sparse.patch"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="dump_avail_sparse.patch"

SW5kZXg6IHZtL3ZtX3BoeXMuYwo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSB2bS92bV9waHlzLmMJKHJl
dmlzaW9uIDIyMDEwMikKKysrIHZtL3ZtX3BoeXMuYwkod29ya2luZyBjb3B5KQpAQCAtMjg5
LDM4ICsyODksMzggQEAgdm1fcGh5c19pbml0KHZvaWQpCiAJaW50IG5kb21haW5zLCBqOwog
I2VuZGlmCiAKLQlmb3IgKGkgPSAwOyBwaHlzX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIp
IHsKKwlmb3IgKGkgPSAwOyBkdW1wX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKICNp
ZmRlZglWTV9GUkVFTElTVF9JU0FETUEKLQkJaWYgKHBoeXNfYXZhaWxbaV0gPCAxNjc3NzIx
NikgewotCQkJaWYgKHBoeXNfYXZhaWxbaSArIDFdID4gMTY3NzcyMTYpIHsKLQkJCQl2bV9w
aHlzX2NyZWF0ZV9zZWcocGh5c19hdmFpbFtpXSwgMTY3NzcyMTYsCisJCWlmIChkdW1wX2F2
YWlsW2ldIDwgMTY3NzcyMTYpIHsKKwkJCWlmIChkdW1wX2F2YWlsW2kgKyAxXSA+IDE2Nzc3
MjE2KSB7CisJCQkJdm1fcGh5c19jcmVhdGVfc2VnKGR1bXBfYXZhaWxbaV0sIDE2Nzc3MjE2
LAogCQkJCSAgICBWTV9GUkVFTElTVF9JU0FETUEpOwotCQkJCXZtX3BoeXNfY3JlYXRlX3Nl
ZygxNjc3NzIxNiwgcGh5c19hdmFpbFtpICsgMV0sCisJCQkJdm1fcGh5c19jcmVhdGVfc2Vn
KDE2Nzc3MjE2LCBkdW1wX2F2YWlsW2kgKyAxXSwKIAkJCQkgICAgVk1fRlJFRUxJU1RfREVG
QVVMVCk7CiAJCQl9IGVsc2UgewotCQkJCXZtX3BoeXNfY3JlYXRlX3NlZyhwaHlzX2F2YWls
W2ldLAotCQkJCSAgICBwaHlzX2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSVNBRE1BKTsK
KwkJCQl2bV9waHlzX2NyZWF0ZV9zZWcoZHVtcF9hdmFpbFtpXSwKKwkJCQkgICAgZHVtcF9h
dmFpbFtpICsgMV0sIFZNX0ZSRUVMSVNUX0lTQURNQSk7CiAJCQl9CiAJCQlpZiAoVk1fRlJF
RUxJU1RfSVNBRE1BID49IHZtX25mcmVlbGlzdHMpCiAJCQkJdm1fbmZyZWVsaXN0cyA9IFZN
X0ZSRUVMSVNUX0lTQURNQSArIDE7CiAJCX0gZWxzZQogI2VuZGlmCiAjaWZkZWYJVk1fRlJF
RUxJU1RfSElHSE1FTQotCQlpZiAocGh5c19hdmFpbFtpICsgMV0gPiBWTV9ISUdITUVNX0FE
RFJFU1MpIHsKLQkJCWlmIChwaHlzX2F2YWlsW2ldIDwgVk1fSElHSE1FTV9BRERSRVNTKSB7
Ci0JCQkJdm1fcGh5c19jcmVhdGVfc2VnKHBoeXNfYXZhaWxbaV0sCisJCWlmIChkdW1wX2F2
YWlsW2kgKyAxXSA+IFZNX0hJR0hNRU1fQUREUkVTUykgeworCQkJaWYgKGR1bXBfYXZhaWxb
aV0gPCBWTV9ISUdITUVNX0FERFJFU1MpIHsKKwkJCQl2bV9waHlzX2NyZWF0ZV9zZWcoZHVt
cF9hdmFpbFtpXSwKIAkJCQkgICAgVk1fSElHSE1FTV9BRERSRVNTLCBWTV9GUkVFTElTVF9E
RUZBVUxUKTsKIAkJCQl2bV9waHlzX2NyZWF0ZV9zZWcoVk1fSElHSE1FTV9BRERSRVNTLAot
CQkJCSAgICBwaHlzX2F2YWlsW2kgKyAxXSwgVk1fRlJFRUxJU1RfSElHSE1FTSk7CisJCQkJ
ICAgIGR1bXBfYXZhaWxbaSArIDFdLCBWTV9GUkVFTElTVF9ISUdITUVNKTsKIAkJCX0gZWxz
ZSB7Ci0JCQkJdm1fcGh5c19jcmVhdGVfc2VnKHBoeXNfYXZhaWxbaV0sCi0JCQkJICAgIHBo
eXNfYXZhaWxbaSArIDFdLCBWTV9GUkVFTElTVF9ISUdITUVNKTsKKwkJCQl2bV9waHlzX2Ny
ZWF0ZV9zZWcoZHVtcF9hdmFpbFtpXSwKKwkJCQkgICAgZHVtcF9hdmFpbFtpICsgMV0sIFZN
X0ZSRUVMSVNUX0hJR0hNRU0pOwogCQkJfQogCQkJaWYgKFZNX0ZSRUVMSVNUX0hJR0hNRU0g
Pj0gdm1fbmZyZWVsaXN0cykKIAkJCQl2bV9uZnJlZWxpc3RzID0gVk1fRlJFRUxJU1RfSElH
SE1FTSArIDE7CiAJCX0gZWxzZQogI2VuZGlmCi0JCXZtX3BoeXNfY3JlYXRlX3NlZyhwaHlz
X2F2YWlsW2ldLCBwaHlzX2F2YWlsW2kgKyAxXSwKKwkJdm1fcGh5c19jcmVhdGVfc2VnKGR1
bXBfYXZhaWxbaV0sIGR1bXBfYXZhaWxbaSArIDFdLAogCQkgICAgVk1fRlJFRUxJU1RfREVG
QVVMVCk7CiAJfQogCWZvciAoZmxpbmQgPSAwOyBmbGluZCA8IHZtX25mcmVlbGlzdHM7IGZs
aW5kKyspIHsKSW5kZXg6IHZtL3ZtX3Jlc2Vydi5jCj09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIHZtL3Zt
X3Jlc2Vydi5jCShyZXZpc2lvbiAyMjAxMDIpCisrKyB2bS92bV9yZXNlcnYuYwkod29ya2lu
ZyBjb3B5KQpAQCAtNDg3LDkgKzQ4Nyw5IEBAIHZtX3Jlc2Vydl9pbml0KHZvaWQpCiAJICog
SW5pdGlhbGl6ZSB0aGUgcmVzZXJ2YXRpb24gYXJyYXkuICBTcGVjaWZpY2FsbHksIGluaXRp
YWxpemUgdGhlCiAJICogInBhZ2VzIiBmaWVsZCBmb3IgZXZlcnkgZWxlbWVudCB0aGF0IGhh
cyBhbiB1bmRlcmx5aW5nIHN1cGVycGFnZS4KIAkgKi8KLQlmb3IgKGkgPSAwOyBwaHlzX2F2
YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKLQkJcGFkZHIgPSByb3VuZHVwMihwaHlzX2F2
YWlsW2ldLCBWTV9MRVZFTF8wX1NJWkUpOwotCQl3aGlsZSAocGFkZHIgKyBWTV9MRVZFTF8w
X1NJWkUgPD0gcGh5c19hdmFpbFtpICsgMV0pIHsKKwlmb3IgKGkgPSAwOyBkdW1wX2F2YWls
W2kgKyAxXSAhPSAwOyBpICs9IDIpIHsKKwkJcGFkZHIgPSByb3VuZHVwMihkdW1wX2F2YWls
W2ldLCBWTV9MRVZFTF8wX1NJWkUpOworCQl3aGlsZSAocGFkZHIgKyBWTV9MRVZFTF8wX1NJ
WkUgPD0gZHVtcF9hdmFpbFtpICsgMV0pIHsKIAkJCXZtX3Jlc2Vydl9hcnJheVtwYWRkciA+
PiBWTV9MRVZFTF8wX1NISUZUXS5wYWdlcyA9CiAJCQkgICAgUEhZU19UT19WTV9QQUdFKHBh
ZGRyKTsKIAkJCXBhZGRyICs9IFZNX0xFVkVMXzBfU0laRTsKSW5kZXg6IHZtL3ZtX3BhZ2Uu
Ywo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09Ci0tLSB2bS92bV9wYWdlLmMJKHJldmlzaW9uIDIyMDEwMikKKysr
IHZtL3ZtX3BhZ2UuYwkod29ya2luZyBjb3B5KQpAQCAtMzg5LDggKzM4OSw4IEBAIHZtX3Bh
Z2Vfc3RhcnR1cCh2bV9vZmZzZXRfdCB2YWRkcikKIAlmaXJzdF9wYWdlID0gbG93X3dhdGVy
IC8gUEFHRV9TSVpFOwogI2lmZGVmIFZNX1BIWVNTRUdfU1BBUlNFCiAJcGFnZV9yYW5nZSA9
IDA7Ci0JZm9yIChpID0gMDsgcGh5c19hdmFpbFtpICsgMV0gIT0gMDsgaSArPSAyKQotCQlw
YWdlX3JhbmdlICs9IGF0b3AocGh5c19hdmFpbFtpICsgMV0gLSBwaHlzX2F2YWlsW2ldKTsK
Kwlmb3IgKGkgPSAwOyBkdW1wX2F2YWlsW2kgKyAxXSAhPSAwOyBpICs9IDIpCisJCXBhZ2Vf
cmFuZ2UgKz0gYXRvcChkdW1wX2F2YWlsW2kgKyAxXSAtIGR1bXBfYXZhaWxbaV0pOwogI2Vs
aWYgZGVmaW5lZChWTV9QSFlTU0VHX0RFTlNFKQogCXBhZ2VfcmFuZ2UgPSBoaWdoX3dhdGVy
IC8gUEFHRV9TSVpFIC0gZmlyc3RfcGFnZTsKICNlbHNlCkluZGV4OiBhbWQ2NC9pbmNsdWRl
L3ZtcGFyYW0uaAo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBhbWQ2NC9pbmNsdWRlL3ZtcGFyYW0uaAko
cmV2aXNpb24gMjIwMTAyKQorKysgYW1kNjQvaW5jbHVkZS92bXBhcmFtLmgJKHdvcmtpbmcg
Y29weSkKQEAgLTc5LDcgKzc5LDcgQEAKIC8qCiAgKiBUaGUgcGh5c2ljYWwgYWRkcmVzcyBz
cGFjZSBpcyBkZW5zZWx5IHBvcHVsYXRlZC4KICAqLwotI2RlZmluZQlWTV9QSFlTU0VHX0RF
TlNFCisjZGVmaW5lCVZNX1BIWVNTRUdfU1BBUlNFCiAKIC8qCiAgKiBUaGUgbnVtYmVyIG9m
IFBIWVNTRUcgZW50cmllcyBtdXN0IGJlIG9uZSBncmVhdGVyIHRoYW4gdGhlIG51bWJlcgo=
--------------030707070306010907080705--