Date:      Thu, 23 Sep 2010 21:25:32 +0300
From:      Andriy Gapon <avg@freebsd.org>
To:        Jeff Roberson <jeff@freebsd.org>, freebsd-current@freebsd.org
Subject:   panic in uma_startup for many-core amd64 system
Message-ID:  <4C9B9B9C.6000807@freebsd.org>


Jeff,

just for kicks I tried to emulate a machine with 64 logical CPUs using the
qemu-devel port:
qemu-system-x86_64 -smp sockets=4,cores=8,threads=2 ...

It seems that FreeBSD agreed to recognize only the first 32 CPUs, but it panicked anyway.

Here's a backtrace:
#34 0xffffffff804fe7f5 in zone_alloc_item (zone=0xffffffff80be1554,
udata=0xffffffff80be1550, flags=1924) at /usr/src/sys/vm/uma_core.c:2506
#35 0xffffffff804ff35d in hash_alloc (hash=0xffffff001ffdb030) at
/usr/src/sys/vm/uma_core.c:483
#36 0xffffffff804ff642 in keg_ctor (mem=Variable "mem" is not available.
) at /usr/src/sys/vm/uma_core.c:1396
#37 0xffffffff804fe91b in zone_alloc_item (zone=0xffffffff80a1f300,
udata=0xffffffff80be1b60, flags=2) at /usr/src/sys/vm/uma_core.c:2544
#38 0xffffffff804ff92e in zone_ctor (mem=Variable "mem" is not available.
) at /usr/src/sys/vm/uma_core.c:1832
#39 0xffffffff804ffca4 in uma_startup (bootmem=0xffffff001ffac000, boot_pages=48)
at /usr/src/sys/vm/uma_core.c:1741
#40 0xffffffff80514822 in vm_page_startup (vaddr=18446744071576817664) at
/usr/src/sys/vm/vm_page.c:360
#41 0xffffffff805060c5 in vm_mem_init (dummy=Variable "dummy" is not available.
) at /usr/src/sys/vm/vm_init.c:118
#42 0xffffffff803258b9 in mi_startup () at /usr/src/sys/kern/init_main.c:253
#43 0xffffffff8017177c in btext () at /usr/src/sys/amd64/amd64/locore.S:81
[[[
Note:
1. Frame numbers are high because the backtrace was obtained via gdb remotely
connected to qemu, and there are also a bunch of extra frames from DDB, etc.
2. Line numbers in uma_core.c won't match those in the FreeBSD tree, because I've
been doing some unrelated hacking in the file.
]]]

The problem seems to be with the creation of the "UMA Zones" zone and keg.
Because of the large number of processors, the size argument in the following
snippet is set to 4480:

args.name = "UMA Zones";
args.size = sizeof(struct uma_zone) +
    (sizeof(struct uma_cache) * (mp_maxid + 1));

Because of this, keg_ctor() calls keg_large_init():

else if ((keg->uk_size+UMA_FRITM_SZ) >
    (UMA_SLAB_SIZE - sizeof(struct uma_slab)))
        keg_large_init(keg);
else
        keg_small_init(keg);

keg_large_init() sets the UMA_ZONE_OFFPAGE and UMA_ZONE_HASH flags for this keg.
This leads to hash_alloc() being invoked from keg_ctor():

if (keg->uk_flags & UMA_ZONE_HASH)
        hash_alloc(&keg->uk_hash);

But the "UMA Hash" zone has not been created yet at that point, so the call leads
to the panic.  "UMA Hash" is the last of the system zones to be created.

Not sure what the proper fix here could/should be.
Would it work to simply not set the UMA_ZONE_HASH flag when UMA_ZFLAG_INTERNAL is set?
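
To make the question concrete, here is a userland toy model of the idea, not
kernel code and not a tested patch. The TOY_* bit values and both function
names are made up for illustration; only the flag names they stand in for
(UMA_ZONE_OFFPAGE, UMA_ZONE_HASH, UMA_ZFLAG_INTERNAL) come from uma_core.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real definitions live in the UMA headers. */
#define TOY_ZONE_OFFPAGE   0x0001u
#define TOY_ZONE_HASH      0x0002u
#define TOY_ZFLAG_INTERNAL 0x0004u

/*
 * Toy model of the flag update done for a large keg, with the proposed
 * change applied: the hash is never requested for internal (bootstrap) kegs.
 */
static uint32_t
toy_large_init_flags(uint32_t flags)
{
	flags |= TOY_ZONE_OFFPAGE;
	if ((flags & TOY_ZFLAG_INTERNAL) == 0)
		flags |= TOY_ZONE_HASH;
	return (flags);
}

/* Nonzero when the keg constructor would go on to call hash_alloc(). */
static int
toy_needs_hash_alloc(uint32_t flags)
{
	return ((flags & TOY_ZONE_HASH) != 0);
}
```

In this model an internal keg never reaches hash_alloc(), which would avoid
the early call. Whether an OFFPAGE keg can then still find its slabs without
the hash is a separate question that the sketch does not answer.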


And some final calculations.
On the test system sizeof(struct uma_cache) is 128 bytes and (mp_maxid + 1) is 32,
so the per-CPU caches alone take 128 * 32 = 4096 bytes, which is already
UMA_SLAB_SIZE = PAGE_SIZE = 4096.
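
The same arithmetic as a small userland sketch. The 128-byte cache size, the
CPU count of 32 and the totals 4096 and 4480 are the numbers from this message.
The 384 bytes for sizeof(struct uma_zone) is an assumption: it is simply what
remains of 4480 after subtracting 4096. The helper name is made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values reported in this message for the test system. */
#define CACHE_SIZE    128u   /* sizeof(struct uma_cache) */
#define NCPUS         32u    /* mp_maxid + 1 */
#define SLAB_SIZE     4096u  /* UMA_SLAB_SIZE = PAGE_SIZE */

/* Assumption: derived as 4480 - 128 * 32, not measured directly. */
#define ZONE_HDR_SIZE 384u

/* Hypothetical helper mirroring the args.size expression for "UMA Zones". */
static size_t
uma_zones_item_size(size_t zone_hdr, size_t cache_size, size_t ncpus)
{
	assert(ncpus > 0);
	return (zone_hdr + cache_size * ncpus);
}
```

With these inputs the helper returns 4480, and the cache array by itself
already fills a whole slab, so the item cannot fit no matter how small the
zone header is.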

-- 
Andriy Gapon


