Date:      Sat, 27 Jan 2018 17:52:28 +0100
From:      Hans Petter Selasky <hps@selasky.org>
To:        Konstantin Belousov <kostikbel@gmail.com>, Hartmut Brandt <hartmut.brandt@dlr.de>
Cc:        hackers@freebsd.org
Subject:   Re: allocating large piece of contiguous kernel memory
Message-ID:  <bf4fca1f-81a3-6598-ddf6-e8cbdbe1d77a@selasky.org>
In-Reply-To: <20180127135013.GA55707@kib.kiev.ua>
References:  <alpine.BSF.2.20.1801251747460.70702@KNOP-BEAGLE.kn.op.dlr.de> <20180125172116.GS55707@kib.kiev.ua> <alpine.BSF.2.20.1801260957190.73798@KNOP-BEAGLE.kn.op.dlr.de> <20180127135013.GA55707@kib.kiev.ua>

On 01/27/18 14:50, Konstantin Belousov wrote:
> On Fri, Jan 26, 2018 at 10:05:14AM +0100, Hartmut Brandt wrote:
>> Hi,
>>
>> On Thu, 25 Jan 2018, Konstantin Belousov wrote:
>>
>> KB>On Thu, Jan 25, 2018 at 05:52:57PM +0100, Hartmut Brandt wrote:
>> KB>> Hi all,
>> KB>>
>> KB>> for a device driver communicating with an FPGA I would like to allocate
>> KB>> an 8GByte piece or two 4GByte pieces of contiguous memory in the device
>> KB>> driver. The machine is amd64 and has 256GByte.
>> KB>>
>> KB>> The maximum I can allocate is 512MByte. If I try more (in one piece or in
>> KB>> two pieces) contigfree() crashes in pmap_resident_count_dec() with an
>> KB>> "pmap resident count underflow".
>> KB>>
>> KB>> Is this something which is not supposed to work? Or is there something
>> KB>> tunable or is this a bug?
>> KB>
>> KB>I suspect a bug. Print out the value of
>> KB>kernel_pmap->pm_stats.resident_count before and after contigmalloc. Also
>> KB>interesting are the values of the resident count and decrement at the
>> KB>panic time (but they are already printed by the panic, you did not show
>> KB>them).
>>
>> This is the panic:
>>
>> knot_attach: attaching
>> kernel_pmap->pm_stats.resident_count=57322
>> cmem=0xfffffe0101000000
>> kernel_pmap->pm_stats.resident_count=57712
>> kernel_pmap->pm_stats.resident_count=57712
>> panic: pmap 0xffffffff80e40188 resident count underflow 368 512
>> cpuid = 3
>> time = 1516956458
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0100d0d1c0
>> vpanic() at vpanic+0x19c/frame 0xfffffe0100d0d240
>> kassert_panic() at kassert_panic+0x126/frame 0xfffffe0100d0d2b0
>> pmap_remove_pde() at pmap_remove_pde+0x645/frame 0xfffffe0100d0d340
>> pmap_remove() at pmap_remove+0x484/frame 0xfffffe0100d0d3c0
>> kmem_unback() at kmem_unback+0x3a/frame 0xfffffe0100d0d400
>> kmem_free() at kmem_free+0x3d/frame 0xfffffe0100d0d430
>> contigfree() at contigfree+0x27/frame 0xfffffe0100d0d460
>> knot_attach() at knot_attach+0xec/frame 0xfffffe0100d0d4c0
>> device_attach() at device_attach+0x3f7/frame 0xfffffe0100d0d510
>> pci_driver_added() at pci_driver_added+0xe9/frame 0xfffffe0100d0d550
>> devclass_driver_added() at devclass_driver_added+0x7d/frame 0xfffffe0100d0d590
>> devclass_add_driver() at devclass_add_driver+0x144/frame 0xfffffe0100d0d5d0
>> module_register_init() at module_register_init+0xc0/frame 0xfffffe0100d0d600
>> linker_load_module() at linker_load_module+0xb78/frame 0xfffffe0100d0d910
>> kern_kldload() at kern_kldload+0xa9/frame 0xfffffe0100d0d950
>> sys_kldload() at sys_kldload+0x5b/frame 0xfffffe0100d0d980
>> amd64_syscall() at amd64_syscall+0x271/frame 0xfffffe0100d0dab0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0100d0dab0
>> --- syscall (304, FreeBSD ELF64, sys_kldload), rip = 0x80087228a, rsp =
>> 0x7fffffffd508, rbp = 0x7fffffffda80 ---
>> KDB: enter: panic
>> [ thread pid 1150 tid 100146 ]
>> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
>> db>
>>
>> The first printf() is just before contigmalloc(), then it prints the
>> returned pointer. The next two printf()s are just before the contigfree().
>>
>> Doesn't the change of the count by 390 look strange when allocating
>> 8GByte?
> I cannot reproduce your issue. I suspect this is something local to your
> system.  With the patch below, if I do
> 	# sysctl debug.a=1
> I see
> # sysctl debug.a=1
> debug.a: before 31310
> 0after alloc 2128462
> after free 31310
>   -> 0
> 
> The numbers are exactly what I expect.
> 
> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
> index 12c5b695e23..3c707444169 100644
> --- a/sys/amd64/amd64/pmap.c
> +++ b/sys/amd64/amd64/pmap.c
> @@ -8040,3 +8040,26 @@ DB_SHOW_COMMAND(phys2dmap, pmap_phys2dmap)
>   	}
>   }
>   #endif
> +
> +static int
> +pmap_contigalloc(SYSCTL_HANDLER_ARGS)
> +{
> +	void *ptr;
> +	size_t sz;
> +	int error, i;
> +
> +	i = 0;
> +	error = sysctl_handle_int(oidp, &i, 0, req);
> +	if (error != 0 || req->newptr == NULL)
> +		return (error);
> +	sz = 8UL * 1024 * 1024 * 1024;
> +	printf("before %ld\n", kernel_pmap->pm_stats.resident_count);
> +	ptr = contigmalloc(sz, M_TEMP, M_WAITOK, 0, ~0UL, 0, 0);
> +	printf("after alloc %ld\n", kernel_pmap->pm_stats.resident_count);
> +	contigfree(ptr, sz, M_TEMP);
> +	printf("after free %ld\n", kernel_pmap->pm_stats.resident_count);
> +	return (0);
> +}
> +SYSCTL_PROC(_debug, OID_AUTO, a, CTLTYPE_INT | CTLFLAG_RW |
> +    CTLFLAG_MPSAFE, NULL, 0, pmap_contigalloc, "I",
> +    "");

Hi,

What is the earliest point in the SYSINIT sequence at which contigmalloc()
can be called? Is contigmalloc() being called too early for such a big
allocation?
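
For reference, the counts quoted above can be cross-checked with a bit of
arithmetic. The following is only a userland sketch, assuming 4 KiB base
pages and 2 MiB superpages on amd64; the helper names are made up for
illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Assumption: amd64 base page size (4 KiB) and superpage size (2 MiB). */
#define BASE_PAGE_SIZE	4096ULL
#define SUPERPAGE_SIZE	(2ULL * 1024 * 1024)

/* Number of base pages needed to back `bytes`, rounded up. */
static uint64_t
pages_for(uint64_t bytes)
{
	return ((bytes + BASE_PAGE_SIZE - 1) / BASE_PAGE_SIZE);
}

/* Number of base pages covered by a single superpage mapping. */
static uint64_t
pages_per_superpage(void)
{
	return (SUPERPAGE_SIZE / BASE_PAGE_SIZE);
}

/* Change in a resident count between two samples. */
static int64_t
count_delta(int64_t before, int64_t after)
{
	return (after - before);
}
```

With these, pages_for(8ULL << 30) is 2097152, which equals
count_delta(31310, 2128462) from Konstantin's test. Hartmut's log gives
count_delta(57322, 57712) = 390 for the same request size. The decrement
of 512 in the panic message equals pages_per_superpage(), which would be
consistent with a whole 2 MiB mapping being removed at once, though that
reading is my assumption.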

--HPS





