Date:      Mon, 15 Dec 2014 16:33:04 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Peter Wemm <peter@wemm.org>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, Ian Lepore <ian@freebsd.org>
Subject:   Re: i386 PAE kernel works fine on 10-stable
Message-ID:  <A33FE253-B9EE-46FE-9229-CFBC21A80024@mu.org>
In-Reply-To: <1641407.80FsgLC8bS@overcee.wemm.org>
References:  <1418579278.2026.9.camel@freebsd.org> <1418580756.2026.12.camel@freebsd.org> <847BD158-0867-4F5F-83A9-1651E77D29EF@mu.org> <1641407.80FsgLC8bS@overcee.wemm.org>


> On Dec 15, 2014, at 3:42 PM, Peter Wemm <peter@wemm.org> wrote:
>
>> On Sunday, December 14, 2014 10:53:14 AM Alfred Perlstein wrote:
>>> On Dec 14, 2014, at 10:12 AM, Ian Lepore wrote:
>>>> On Sun, 2014-12-14 at 10:09 -0800, Alfred Perlstein wrote:
>>>>> On Dec 14, 2014, at 9:47 AM, Ian Lepore wrote:
>>>>> This is an out of the blue FYI post to let people know that despite all
>>>>> the misinformation you'll run across if you search for information on
>>>>> FreeBSD PAE support, it (still) works just fine.  I've been using it
>>>>> (for reasons related to our build system and products at $work) since
>>>>> 2006, and I can say unequivocally that it works fine on 6.x, 8.x, and
>>>>> now 10.x (and presumably on the odd-numbered releases too but I've never
>>>>> tried those).
>>>>>
>>>>> In my most recent testing with 10-stable, I found it was compatible with
>>>>> drm2 and radeonkms drivers and I was able to run Xorg and gnome just
>>>>> fine.  All my devices, and apps, and even the linuxulator worked just
>>>>> fine.
>>>>>
>>>>> One thing that changed somewhere between 8.4 and 10.1 is that I had to
>>>>> add a kernel tuning option to my kernel config:
>>>>>
>>>>> option  KVA_PAGES=768        # Default is 512
>>>>>
>>>>> I suspect that the most frequent use of PAE is on laptops that have 4gb
>>>>> and the default tuning is adequate for that.  My desktop machine has
>>>>> 12gb and I needed to bump up that value to avoid errors related to being
>>>>> unable to create new kernel stacks.
>>>>
>>>> There already is a #define that is bifurcated based on PAE in pmap.h:
>>>>
>>>> #ifndef KVA_PAGES
>>>> #ifdef PAE
>>>> #define KVA_PAGES       512
>>>> #else
>>>> #define KVA_PAGES       256
>>>> #endif
>>>> #endif
>>>>
>>>> Do you think it will harm things to apply your suggested default to this
>>>> file?
>>> I would have to defer to someone who actually understands just what that
>>> parm is tuning.  It was purely speculation on my part that the current
>>> default is adequate for less memory than I have, and I don't know what
>>> that downside might be for setting it too high.
>>
>> KVA pages is the amount of pages reserved for kernel address space:
>>
>> * Size of Kernel address space.  This is the number of page table pages
>> * (4MB each) to use for the kernel.  256 pages == 1 Gigabyte.
>> * This **MUST** be a multiple of 4 (eg: 252, 256, 260, etc).
>> * For PAE, the page table page unit size is 2MB.  This means that 512 pages
>> * is 1 Gigabyte.  Double everything.  It must be a multiple of 8 for PAE.
>>
>> It appears that our default for PAE leaves 1GB for kernel address to play
>> with?  That's an interesting default.  Wonder if it really makes sense for
>> PAE since the assumption is that you'll have >4GB ram in the box, wiring
>> down 1.5GB for kernel would seem to make sense…  Probably make sense to ask
>> Peter or Alan on this.
>
> It's always been a 1GB/3GB split.  It was never a problem until certain
> scaling defaults were changed to scale solely based on physical ram without
> regard for kva limits.

Hmm the original patch I gave for that only changed scaling for machines with
64 bit pointers. Why was it that the 32 bit stuff was made to change?

>
> With the current settings and layout of the userland address space between the
> zero-memory hole, the reservation for maxdsiz, followed by the ld-elf.so.1
> space and shared libraries, there's just enough room to mmap a 2GB file and
> have a tiny bit of wiggle room left.
>
> With changing the kernel/user split to 1.5/2.5 then userland is more
> restricted and is typically around the 1.8/1.9GB range.
>
> You can get a large memory PAE system to boot with default settings by
> seriously scaling things down like kern.maxusers, mbufs limits, etc.
>
> However, we have run ref11-i386 and ref10-i386 in the cluster for 18+ months
> with a 1.5/2.5 split and even then we've run out of kva and we've hit a few
> pmap panics and things that appear to be fallout of bounce buffer problems.
>
> While yes, you can make it work, I am personally not convinced that it is
> reliable.
>
> My last i386 PAE machine died earlier this year with a busted scsi backplane
> for the drives.  It went to the great server crusher.

Oh, I made the dumb assumption that PAE was 4/4, basically not split. OK, thanks.

>
>> Also wondering how bad it would be to make these tunables, I see they
>> trickle down quite a bit into the system, hopefully not defining some
>> static arrays, but I haven't dived down that far.
>
> They cause extensive compile time macro expansion variations that are exported
> to assembler code via genassym.  KVA_PAGES is not a good candidate for a
> runtime tunable unless you like the pain of i386/locore.s and friends.

Ouch. Ok.

-Alfred.