Date:      Mon, 18 Oct 2010 23:01:33 +0200
From:      Giovanni Trematerra <gianni@freebsd.org>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        alc@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: panic in uma_startup for many-core amd64 system
Message-ID:  <AANLkTim-z-rNvPa%2BFDGAb7oroKB2DWxZSECti=ioH8GD@mail.gmail.com>
In-Reply-To: <4CBC5719.1020807@freebsd.org>
References:  <4C9B9B9C.6000807@freebsd.org> <4CBBEBDF.3060905@freebsd.org> <AANLkTi=O4GtAKDqEr%2BR27E5Xe%2BdGBZc0d2_=KpobtuSW@mail.gmail.com> <4CBC5719.1020807@freebsd.org>

On Mon, Oct 18, 2010 at 4:18 PM, Andriy Gapon <avg@freebsd.org> wrote:
> on 18/10/2010 16:40 Giovanni Trematerra said the following:
>> On Mon, Oct 18, 2010 at 8:40 AM, Andriy Gapon <avg@freebsd.org> wrote:
>>> on 23/09/2010 21:25 Andriy Gapon said the following:
>>>>
>>>> Jeff,
>>>>
>>>> just for the kicks I tried to emulate a machine with 64 logical CPUs using
>>>> qemu-devel port:
>>>> qemu-system-x86_64 -smp sockets=4,cores=8,threads=2 ...
>>>>
>>>> It seems that FreeBSD agreed to recognize only the first 32 CPUs, but it panicked anyway.
>>>>
>>>> Here's a backtrace:
>>>> #34 0xffffffff804fe7f5 in zone_alloc_item (zone=0xffffffff80be1554,
>>>> udata=0xffffffff80be1550, flags=1924) at /usr/src/sys/vm/uma_core.c:2506
>>>> #35 0xffffffff804ff35d in hash_alloc (hash=0xffffff001ffdb030) at
>>>> /usr/src/sys/vm/uma_core.c:483
>>>> #36 0xffffffff804ff642 in keg_ctor (mem=Variable "mem" is not available.
>>>> ) at /usr/src/sys/vm/uma_core.c:1396
>>>> #37 0xffffffff804fe91b in zone_alloc_item (zone=0xffffffff80a1f300,
>>>> udata=0xffffffff80be1b60, flags=2) at /usr/src/sys/vm/uma_core.c:2544
>>>> #38 0xffffffff804ff92e in zone_ctor (mem=Variable "mem" is not available.
>>>> ) at /usr/src/sys/vm/uma_core.c:1832
>>>> #39 0xffffffff804ffca4 in uma_startup (bootmem=0xffffff001ffac000, boot_pages=48)
>>>> at /usr/src/sys/vm/uma_core.c:1741
>>>> #40 0xffffffff80514822 in vm_page_startup (vaddr=18446744071576817664) at
>>>> /usr/src/sys/vm/vm_page.c:360
>>>> #41 0xffffffff805060c5 in vm_mem_init (dummy=Variable "dummy" is not available.
>>>> ) at /usr/src/sys/vm/vm_init.c:118
>>>> #42 0xffffffff803258b9 in mi_startup () at /usr/src/sys/kern/init_main.c:253
>>>> #43 0xffffffff8017177c in btext () at /usr/src/sys/amd64/amd64/locore.S:81
>>>> [[[
>>>> Note:
>>>> 1. Frame numbers are high because the backtrace is obtained via gdb remotely
>>>> connected to qemu and also there is a bunch of extra frames from DDB, etc.
>>>> 2. Line numbers in uma_core.c won't match those in FreeBSD tree, because I've been doing
>>>> some unrelated hacking in the file.
>>>> ]]]
>>>>
>>>> The problem seems to be with creation of "UMA Zones" zone and keg.
>>>> Because of the large number of processors, size argument in the following snippet
>>>> is set to a value of 4480:
>>>>
>>>> args.name = "UMA Zones";
>>>> args.size = sizeof(struct uma_zone) +
>>>>     (sizeof(struct uma_cache) * (mp_maxid + 1));
>>>>
>>>> Because of this, keg_ctor() calls keg_large_init():
>>>>
>>>> else if ((keg->uk_size+UMA_FRITM_SZ) >
>>>>     (UMA_SLAB_SIZE - sizeof(struct uma_slab)))
>>>>         keg_large_init(keg);
>>>> else
>>>>         keg_small_init(keg);
>>>>
>>>> keg_large_init sets UMA_ZONE_OFFPAGE and UMA_ZONE_HASH flags for this keg.
>>>> This leads to hash_alloc() being invoked from keg_ctor():
>>>>
>>>> if (keg->uk_flags & UMA_ZONE_HASH)
>>>>         hash_alloc(&keg->uk_hash);
>>>>
>>>> But the problem is that "UMA Hash" zone is not created yet and thus the call leads
>>>> to the panic.  "UMA Hash" zone is the last of system zones created.
>>>>
>>>> Not sure what the proper fix here could/should be.
>>>> Would it work to simply not set UMA_ZONE_HASH flag when UMA_ZFLAG_INTERNAL is set?
>>>>
>>>>
>>>> And some final calculations.
>>>> On the test system sizeof(struct uma_cache) is 128 bytes and (mp_maxid + 1) is 32,
>>>> so it's already UMA_SLAB_SIZE = PAGE_SIZE = 4096.
>>>>
>>>
>>> Here is a simple solution that seems to work:
>>> http://people.freebsd.org/~avg/uma-many-cpus.diff
>>> Not sure if it's the best we can do.
>>>
>>
>> I don't know if it makes sense; I only want to raise a flag.
>> Is it safe to call kmem_malloc() before bucket_init() during
>> uma_startup() to reserve room for CPU caches?
>
> Hmm, not sure what exactly you mean.

Sorry, never mind.

>
>> Reading the top uma_int.h comment, it seems that the best way to
>> handle this issue
>> would be to implement and allow for dynamic slab sizes.
>
> Again, not sure if I follow you, I don't see the relation between per-cpu caches and
> dynamic slab size.

Your patch seems to be just a workaround for the initial slab size on
which the keg is backed.
Having dynamic slab sizes would allow the keg to be backed by a larger slab
without going OFFPAGE.
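
To make the arithmetic concrete, here is a rough userland sketch, not code
from the tree. The 128-byte cache size, the 32 CPUs, the 4480-byte item and
the 4096-byte slab are the numbers quoted above. SLAB_HDR_SZ and FRITM_SZ
are placeholders I made up for sizeof(struct uma_slab) and UMA_FRITM_SZ, and
needs_offpage() and pages_for_inline() are hypothetical helpers that only
mirror the size test in keg_ctor().

```c
#include <assert.h>
#include <stddef.h>

/* Values reported in this thread for the test system. */
#define CACHE_SZ     128u   /* sizeof(struct uma_cache) */
#define NCPUS        32u    /* mp_maxid + 1 */
#define ZONE_ARG_SZ  4480u  /* args.size computed for "UMA Zones" */
#define PAGE_SZ      4096u  /* UMA_SLAB_SIZE = PAGE_SIZE */

/* Hypothetical placeholders, NOT taken from the FreeBSD tree. */
#define SLAB_HDR_SZ  64u    /* stand-in for sizeof(struct uma_slab) */
#define FRITM_SZ     1u     /* stand-in for UMA_FRITM_SZ */

/*
 * Same shape as the size test quoted from keg_ctor(): returns 1 when an
 * item of 'size' bytes does not fit inline in a slab of 'slab_size' bytes.
 */
int
needs_offpage(size_t size, size_t slab_size)
{
	/* Guard the unsigned subtraction below against wrapping. */
	assert(slab_size > SLAB_HDR_SZ);
	return ((size + FRITM_SZ) > (slab_size - SLAB_HDR_SZ));
}

/* Smallest slab, in whole pages, that would hold the item inline. */
unsigned
pages_for_inline(size_t size)
{
	unsigned pages = 1;

	while (needs_offpage(size, (size_t)pages * PAGE_SZ))
		pages++;
	return (pages);
}
```

With these numbers the per-CPU caches alone take CACHE_SZ * NCPUS = 4096
bytes, so a one-page slab can never hold the zone inline, while a two-page
slab could under the placeholder sizes above.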


--
Giovanni Trematerra


