Date:      Mon, 7 Jun 2010 23:47:47 +0530
From:      "C. Jayachandran" <c.jayachandran@gmail.com>
To:        Alan Cox <alc@cs.rice.edu>
Cc:        "Jayachandran C." <jchandra@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>, mips@freebsd.org
Subject:   Re: svn commit: r208589 - head/sys/mips/mips
Message-ID:  <AANLkTikZxx_30H9geHvZYkYd0sE-wiuZljEd0PAi14ca@mail.gmail.com>
In-Reply-To: <4C0D2BEA.6060103@cs.rice.edu>
References:  <201005271005.o4RA5eVu032269@svn.freebsd.org> <4C058145.3000707@cs.rice.edu> <AANLkTil955Ek-a3tek4Ony9NqrK5l2j7lNA4baRVPBzb@mail.gmail.com> <4C05F9BC.40409@cs.rice.edu> <AANLkTikdnvXsTwm8onl3MZPQmVfV-2GovB9--KMNnMgC@mail.gmail.com> <4C0731A5.7090301@cs.rice.edu> <AANLkTimIa3jmBPMhWIOcY6DenGpZ2ZYmqwDTWspVx0-u@mail.gmail.com> <AANLkTil2gE1niUWCHnsTlQvibhxBh7QYwD0TTWo0rj5c@mail.gmail.com> <AANLkTinA2D5iTDGPbflHVzLyAZW-ZewjJkUWWL8FVskr@mail.gmail.com> <4C07E07B.9060802@cs.rice.edu> <AANLkTimjyPc_AXKP1yaJaF1BN7CAGBeNikVzcp9OCb4P@mail.gmail.com> <4C09345F.9040300@cs.rice.edu> <AANLkTinmFOZY3OlaoKStxlNIRBt2G2I4ILkQ1P0CjozG@mail.gmail.com> <4C0D2BEA.6060103@cs.rice.edu>

On Mon, Jun 7, 2010 at 10:57 PM, Alan Cox <alc@cs.rice.edu> wrote:
> C. Jayachandran wrote:
>>
>> On Fri, Jun 4, 2010 at 10:44 PM, Alan Cox <alc@cs.rice.edu> wrote:
>>
>>>
>>> C. Jayachandran wrote:
>>>
>>>>
>>>> On Thu, Jun 3, 2010 at 10:33 PM, Alan Cox <alc@cs.rice.edu> wrote:
>>>>
>>>>
>>>
>>> [snip]
>>>
>>>>>
>>>>> Add a static counter to pmap_ptpgzone_allocf().  When the counter
>>>>> reaches some not-too-small prime number, don't call
>>>>> vm_phys_alloc_contig(), instead set "m" to NULL and reset the counter
>>>>> to zero.  Setting "m" to NULL should then trigger the
>>>>> vm_contig_grow_cache() call.  Make sense?
>>>>>
>>>>>
>>>>
>>>> Is this to move the vm_contig_grow_cache() out of the
>>>> pmap_ptpgzone_allocf() and into the function calling uma_zalloc()?  I
>>>> am wondering why the prime number is needed.
>>>>
>>>> Another implementation I had thought about was to have a kernel
>>>> process maintain a pool of direct mapped pages for MIPS. This will be
>>>> woken up if the pool goes below a low water mark - any comments on
>>>> this?
>>>>
>>>>
>>>>
>>>
>>> Here is how the page table page allocation should be done.  It's not
>>> much harder than the vm_contig_grow_cache() change.
>>>
>>> 1. Look at vm/vm_phys.c, in particular, vm_phys_init().  Create a new
>>> freelist, like VM_FREELIST_ISADMA, for the direct-mapped memory on
>>> MIPS.  Perhaps, call it VM_FREELIST_LOWMEM.  The controlling macros
>>> should be defined in mips/include/vmparam.h.
>>>
>>> 2. Trivially refactor vm_phys_alloc_pages().  Specifically, take the
>>> body of the outer for loop and make it a new function,
>>> vm_phys_alloc_freelist(), that takes the variable "flind" as a
>>> parameter.
>>>
>>> 3. Eliminate the UMA zone for page table pages.  In place of
>>> vm_phys_alloc_contig(), call your new function with VM_FREELIST_LOWMEM.
>>> (The vm_contig_grow_cache() remains unchanged.)  Go back to calling
>>> vm_page_free() to release page table pages.
>>>
>>> For the record, the changes that you made to start using a zone for
>>> page table page allocation introduced at least one possible race
>>> condition between pmap_remove_pages() and pmap_remove_all().
>>> Specifically, by dropping and reacquiring the page queues and pmap
>>> locks when you free a page table page, you allow a pmap_remove_all()
>>> call, made while pmap_remove_pages() is running, to leave the variable
>>> "npv" in pmap_remove_pages() pointing at a freed pv entry.
>>>
>>
>> My first attempt is attached, it comes up multiuser but crashes if I
>> stress it a bit (panic: Bad link elm 0xc0078f00 prev->next != elm).
>> Will look at this tomorrow, and see if I can find the cause.
>>
>>
>
> Where exactly is this occurring?
>
>> In the meantime, it may be worth looking at the patch to see if this
>> is in line with the approach you had suggested. I have tried to use
>> VM_FREELIST_HIGHMEM which is already there, instead of adding
>> VM_FREELIST_LOWMEM.
>>
>>
>
> Basically, yes.  At first, I was taken aback by defining
> VM_FREELIST_DEFAULT as other than zero, but it shouldn't break anything.

I can change this to add VM_FREELIST_LOWMEM. This works okay so far,
even though I haven't really checked the code to see if it works when
there is no memory in HIGHMEM.
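
For reference, the vmparam.h side I'm testing looks roughly like this
(a sketch only - the names follow the existing VM_FREELIST_* convention,
and VM_HIGHMEM_ADDRESS is an assumed name for the 512MB KSEG0
direct-map boundary):

---
/* mips/include/vmparam.h (sketch; names and boundary still open) */
#define	VM_NFREELIST		2
#define	VM_FREELIST_DEFAULT	0	/* direct-mappable (KSEG0) memory */
#define	VM_FREELIST_HIGHMEM	1	/* memory above the direct map */
#define	VM_HIGHMEM_ADDRESS	((vm_paddr_t)0x20000000)	/* assumed */
---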

> The one issue that I see with the patch is that you'll need to release and
> reacquire the pmap and page queues locks around the call to
> vm_contig_grow_cache().  However, if you release these locks, you need to
> restart the pmap_allocpte(), i.e., perform the "goto retry;", because
> while you slept another processor/thread might have already allocated and
> installed the page table page that you are trying to allocate into the
> page table.
>
> For what it's worth, this same race condition exists in the current code
> in subversion using an UMA zone, but not in the code that predates use of
> an UMA zone.

I have been testing with an updated version of the patch - my -j128
buildworld works with it. The sys/vm side changes are simple enough
(the refactoring is roughly the sketch below), but the page table page
allocation part is still not complete.
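
A sketch of the vm_phys.c side (assuming the refactored function keeps
the vm_phys_alloc_freelist_pages() name my patch uses; the body of the
new function is just the outer loop of vm_phys_alloc_pages() pulled
out, as you suggested, so the old entry point reduces to a loop over
the freelists):

---
/*
 * Sketch only: vm_phys_alloc_pages() after the per-freelist body has
 * been moved into vm_phys_alloc_freelist_pages().  Caller holds
 * vm_page_queue_free_mtx, as before.
 */
vm_page_t
vm_phys_alloc_pages(int pool, int order)
{
	vm_page_t m;
	int flind;

	for (flind = 0; flind < vm_nfreelists; flind++) {
		m = vm_phys_alloc_freelist_pages(flind, pool, order);
		if (m != NULL)
			return (m);
	}
	return (NULL);
}
---

The pmap-side version I have now is: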

---
static vm_page_t
pmap_alloc_pte_page(pmap_t pmap, unsigned int index, int wait, vm_offset_t *vap)
{
	vm_paddr_t paddr;
	vm_page_t m;
	int tries, flags;

	tries = 0;
retry:
	mtx_lock(&vm_page_queue_free_mtx);
	m = vm_phys_alloc_freelist_pages(VM_FREELIST_DEFAULT,
	    VM_FREEPOOL_DEFAULT, 0);
	if (m == NULL) {
		/* Try to grow the cache of direct-mapped pages and retry. */
		mtx_unlock(&vm_page_queue_free_mtx);
		if (tries < ((wait & M_NOWAIT) != 0 ? 1 : 3)) {
			vm_contig_grow_cache(tries, 0, MIPS_KSEG0_LARGEST_PHYS);
			tries++;
			goto retry;
		} else {
			printf("[%d] %s wait %x, fail!\n", tries, __func__,
			    wait);
			return (NULL);
		}
	}

	/* Initialize the page, matching what vm_page_alloc() does. */
	m->pindex = index;
	m->valid = VM_PAGE_BITS_ALL;
	atomic_add_int(&cnt.v_wire_count, 1);
	m->wire_count = 1;
	if (m->flags & PG_ZERO)
		vm_page_zero_count--;
	else
		pmap_zero_page(m);
	/* Check PG_CACHED before the old flags are overwritten below. */
	if ((m->flags & PG_CACHED) != 0)
		printf("Not yet! handle cached page %p flags 0x%x\n",
		    m, m->flags);
	else
		cnt.v_free_count--;
	flags = PG_ZERO | PG_UNMANAGED;
	m->flags = flags;
	m->oflags = 0;
	m->act_count = 0;
	mtx_unlock(&vm_page_queue_free_mtx);
	return (m);
}


----

I tried to match the vm_page_alloc() code here. I'm not sure whether I
have to handle the PG_CACHED case, or whether it is okay to update the
VM counters here.

Also, I will change the code to return NULL all the way up and retry
the allocation (similar to the old code, which called VM_WAIT and then
returned NULL).

Probably I should bring the whole thing in line with the old code:
replace the old vm_page_alloc() call with the function above minus the
retry loop, and replace VM_WAIT with a new function that does the
vm_contig_grow_cache() - roughly the sketch below.
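
Something like this in pmap_allocpte(), with a hypothetical
pmap_grow_direct_page_cache() as the new function (a sketch, not the
final patch - the point is dropping both locks around the cache grow
and redoing the lookup afterwards, per your earlier note):

---
/* Hypothetical helper replacing the old VM_WAIT. */
static void
pmap_grow_direct_page_cache(void)
{

	vm_contig_grow_cache(3, 0, MIPS_KSEG0_LARGEST_PHYS);
}

	/* In pmap_allocpte(), when pmap_alloc_pte_page() fails: */
	if (m == NULL && (flags & M_WAITOK) != 0) {
		PMAP_UNLOCK(pmap);
		vm_page_unlock_queues();
		pmap_grow_direct_page_cache();
		vm_page_lock_queues();
		PMAP_LOCK(pmap);
		/*
		 * Retry from the top: while the locks were dropped,
		 * another thread may have already allocated and
		 * installed this page table page.
		 */
		goto retry;
	}
---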

Thanks,
JC.


