Date:      Wed, 30 Nov 2011 15:59:14 -0600
From:      Alan Cox <alan.l.cox@gmail.com>
To:        Andreas Tobler <andreast-list@fgznet.ch>
Cc:        Kostik Belousov <kostikbel@gmail.com>, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: powerpc64 malloc limit?
Message-ID:  <CAJUyCcO%2B76VirEbArnRPKx8JVGRfy8mknN4A7p1XGnug=vo8yw@mail.gmail.com>
In-Reply-To: <4ED6A36D.1050107@fgznet.ch>
References:  <4ED5BE19.70805@fgznet.ch> <20111130162236.GA50300@deviant.kiev.zoral.com.ua> <4ED65F70.7050700@fgznet.ch> <20111130170936.GB50300@deviant.kiev.zoral.com.ua> <4ED66B75.3060409@fgznet.ch> <20111130200103.GE50300@deviant.kiev.zoral.com.ua> <4ED698EB.8090904@fgznet.ch> <20111130212439.GF50300@deviant.kiev.zoral.com.ua> <4ED6A36D.1050107@fgznet.ch>

On Wed, Nov 30, 2011 at 3:43 PM, Andreas Tobler <andreast-list@fgznet.ch> wrote:

> On 30.11.11 22:24, Kostik Belousov wrote:
>
>> On Wed, Nov 30, 2011 at 09:58:19PM +0100, Andreas Tobler wrote:
>>
>>> On 30.11.11 21:01, Kostik Belousov wrote:
>>>
>>>> On Wed, Nov 30, 2011 at 06:44:21PM +0100, Andreas Tobler wrote:
>>>>
>>>>> On 30.11.11 18:09, Kostik Belousov wrote:
>>>>>
>>>>>> On Wed, Nov 30, 2011 at 05:53:04PM +0100, Andreas Tobler wrote:
>>>>>>
>>>>>>> On 30.11.11 17:22, Kostik Belousov wrote:
>>>>>>>
>>>>>>>> On Wed, Nov 30, 2011 at 06:24:41AM +0100, Andreas Tobler wrote:
>>>>>>>>
>>>>>>>>> All,
>>>>>>>>>
>>>>>>>>> while working on gcc I found a very strange situation which renders my
>>>>>>>>> powerpc64 machine unusable.
>>>>>>>>> The test case below tries to allocate as much memory as 'wanted'. The
>>>>>>>>> same test case on amd64 returns w/o trying to allocate mem because the
>>>>>>>>> size is far too big.
>>>>>>>>>
>>>>>>>>> I couldn't find the reason so far, that's why I'm here.
>>>>>>>>>
>>>>>>>>> As Nathan pointed out, VM_MAXUSER_ADDRESS is the biggest on powerpc64:
>>>>>>>>> #define VM_MAXUSER_ADDRESS      (0x7ffffffffffff000UL)
>>>>>>>>>
>>>>>>>>> So, I'd expect a system to return an allocation error when a user tries
>>>>>>>>> to allocate too much memory, rather than really trying it and becoming
>>>>>>>>> unusable. Iow, I'd expect the same situation on powerpc64 as I see on
>>>>>>>>> amd64.
>>>>>>>>>
>>>>>>>>> Can anybody explain the situation to me? Why do I not have a working
>>>>>>>>> limit on powerpc64?
>>>>>>>>>
>>>>>>>>> The machine itself has 7GB RAM and 12GB swap. The amd64 where I compared
>>>>>>>>> has around 4GB/4GB RAM/swap.
>>>>>>>>>
>>>>>>>>> TIA,
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>> #include <stdlib.h>
>>>>>>>>> #include <stdio.h>
>>>>>>>>>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>>          void *p;
>>>>>>>>>
>>>>>>>>>          p = (void*) malloc (1152921504606846968ULL);
>>>>>>>>>          if (p != NULL)
>>>>>>>>>                  printf("p = %p\n", p);
>>>>>>>>>
>>>>>>>>>          printf("p = %p\n", p);
>>>>>>>>>          return (0);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>
>>>>>>>> First, you should provide details of what constitutes 'the unusable
>>>>>>>> machine situation' on powerpc.
>>>>>>>>
>>>>>>>
>>>>>>> I cannot log in anymore; everything is stuck except the core control
>>>>>>> mechanisms, for example the fan controller.
>>>>>>>
>>>>>>> Top reports 'ugly' figures, below from an earlier try:
>>>>>>>
>>>>>>> last pid:  6790;  load averages:  0.78,  0.84,  0.86    up 0+00:34:52 22:42:29
>>>>>>> 47 processes:  1 running, 46 sleeping
>>>>>>> CPU:  0.0% user,  0.0% nice, 15.4% system, 11.8% interrupt, 72.8% idle
>>>>>>> Mem: 5912M Active, 570M Inact, 280M Wired, 26M Cache, 104M Buf, 352K Free
>>>>>>> Swap: 12G Total, 9904M Used, 2383M Free, 80% Inuse, 178M Out
>>>>>>>
>>>>>>>    PID USERNAME    THR PRI NICE   SIZE         RES   STATE   C   TIME   WCPU   COMMAND
>>>>>>>   6768 andreast      1  52    0   1073741824G  6479M pfault  1   0:58 18.90%  31370.
>>>>>>>
>>>>>>> And after my mem and swap are full I see swap_pager_getswapspace(16) failed.
>>>>>>>
>>>>>>> In this state I can only power-cycle the machine.
>>>>>>>
>>>>>>>> That said, on amd64 the user map is between 0 and 0x7fffffffffff, which
>>>>>>>> is obviously less than the requested allocation size 0x1000000000000000.
>>>>>>>> If you look at the kdump output on amd64, you will see that malloc()
>>>>>>>> tries to mmap() the area, fails and retries with obreak(). Default
>>>>>>>> virtual memory limit is unlimited, so my best guess is that on amd64
>>>>>>>> vm_map_findspace() returns immediately.
>>>>>>>>
>>>>>>>> On powerpc64, I see no reason why vm_map_entry cannot be allocated, but
>>>>>>>> please note that vm object and pages shall be only allocated on demand.
>>>>>>>> So I am curious how your machine breaks and where.
>>>>>>>>
>>>>>>>
>>>>>>> I would expect that the 'system' does not allow me to allocate that much
>>>>>>> of ram.
>>>>>>>
>>>>>>
>>>>>> Is the issue with the machine going into limbo reproducible with the
>>>>>> code you posted?
>>>>>>
>>>>>
>>>>> If I understand you correctly, yes. I can launch the test case and the
>>>>> machine is immediately unusable. Means I can not kill the process nor
>>>>> can I log in. Also, top does not show anything useful.
>>>>>
>>>> Again, let me restate my question: the single mmap() of the huge size is
>>>> enough for powerpc64 machine to break apart ?
>>>>
>>>
>>> I can't answer. I don't know yet.
>>>
>>>> What happens if you insert a sleep(1000000); call before return? Do not
>>>> kill the process, I want to know whether the machine is dead while the
>>>> process sleeps.
>>>>
>>>
>>> Ok, during the 'sleep' the machine is usable. top is reporting figures,
>>> I can log in and edit files. The process runs now for about 30'.
>>>
>>> When I kill the process, I do not get back to the shell nor can I log
>>> in. Also top stops reporting.
>>> But as you said, I didn't kill in this run.
>>>
>> Then, as Alan Cox pointed out, this is caused by the approach taken in the
>> powerpc64 pmap to handle pmap_remove(). It is definitely arch-specific.
>>
>
> Ok. I think you mean moea64_remove which is pmap_remove, right?
>
> Where did Alan point this out?
>
>
I was in a rush earlier, so I sent a short, cryptic note to Kostik
privately.

Alan
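
Kostik's point earlier in the thread is that the default virtual memory
limit is unlimited, so nothing rejects the huge request before the kernel
reserves the range. Below is a minimal sketch, assuming a POSIX system
that enforces RLIMIT_AS, of how a test program could cap its own address
space so that an oversized malloc() returns NULL. The helper names
cap_address_space() and malloc_fails() are hypothetical; they do not come
from the thread or from any library.

```c
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: lower the soft address-space limit of the calling
 * process to `bytes`, clamped to the current hard limit.
 * Returns 0 on success, -1 on error.
 */
static int cap_address_space(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    /* The soft limit may not exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_AS, &rl);
}

/*
 * Hypothetical helper: returns 1 if malloc(request) fails, 0 if it
 * succeeds. The volatile pointer keeps the compiler from optimizing
 * the malloc/free pair away.
 */
static int malloc_fails(size_t request)
{
    void *volatile p = malloc(request);

    if (p == NULL)
        return 1;
    free(p);
    return 0;
}
```

Calling cap_address_space((rlim_t)1 << 30) and then
malloc_fails((size_t)1 << 31) should report a failure, since a 2 GB
request cannot fit under a 1 GB cap. This only guards the test program
itself; it does not change the pmap_remove() behaviour discussed above.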


