From owner-freebsd-arch@FreeBSD.ORG Wed Nov 30 22:21:40 2011
From: Alan Cox
Reply-To: alc@freebsd.org
To: Peter Wemm
Cc: Kostik Belousov, Andreas Tobler, FreeBSD Arch
Date: Wed, 30 Nov 2011 15:50:52 -0600
Subject: Re: powerpc64 malloc limit?
List-Id: Discussion related to FreeBSD architecture

On Wed, Nov 30, 2011 at 12:12 PM, Peter Wemm wrote:

> On Wed, Nov 30, 2011 at 9:44 AM, Andreas Tobler wrote:
> > On 30.11.11 18:09, Kostik Belousov wrote:
> >>
> >> On Wed, Nov 30, 2011 at 05:53:04PM +0100, Andreas Tobler wrote:
> >>>
> >>> On 30.11.11 17:22, Kostik Belousov wrote:
> >>>>
> >>>> On Wed, Nov 30, 2011 at 06:24:41AM +0100, Andreas Tobler wrote:
> >>>>>
> >>>>> All,
> >>>>>
> >>>>> while working on gcc I found a very strange situation which renders
> >>>>> my powerpc64 machine unusable.
> >>>>> The test case below tries to allocate as much memory as 'wanted'. The
> >>>>> same test case on amd64 returns w/o trying to allocate mem because the
> >>>>> size is far too big.
> >>>>>
> >>>>> I couldn't find the reason so far, that's why I'm here.
> >>>>>
> >>>>> As Nathan pointed out, VM_MAXUSER_ADDRESS is the biggest on powerpc64:
> >>>>> #define VM_MAXUSER_ADDRESS (0x7ffffffffffff000UL)
> >>>>>
> >>>>> So, I'd expect the system to return an allocation error when a user
> >>>>> tries to allocate too much memory, not to really try it and become
> >>>>> unusable. Iow, I'd expect the same situation on powerpc64 as I see on
> >>>>> amd64.
> >>>>>
> >>>>> Can anybody explain the situation to me: why do I not have a working
> >>>>> limit on powerpc64?
> >>>>>
> >>>>> The machine itself has 7GB RAM and 12GB swap. The amd64 where I
> >>>>> compared has around 4GB/4GB RAM/swap.
> >>>>>
> >>>>> TIA,
> >>>>> Andreas
> >>>>>
> >>>>> #include <stdlib.h>
> >>>>> #include <stdio.h>
> >>>>>
> >>>>> int main()
> >>>>> {
> >>>>>     void *p;
> >>>>>
> >>>>>     p = (void*) malloc (1152921504606846968ULL);
> >>>>>     if (p != NULL)
> >>>>>         printf("p = %p\n", p);
> >>>>>
> >>>>>     printf("p = %p\n", p);
> >>>>>     return (0);
> >>>>> }
> >>>>
> >>>> First, you should provide details of what constitutes 'the unusable
> >>>> machine situation' on powerpc.
> >>>
> >>> I cannot log in anymore; everything is stuck except the core control
> >>> mechanisms, for example the fan controller.
> >>>
> >>> Top reports 'ugly' figures, below from an earlier try:
> >>>
> >>> last pid:  6790;  load averages:  0.78,  0.84,  0.86   up 0+00:34:52  22:42:29
> >>> 47 processes:  1 running, 46 sleeping
> >>> CPU:  0.0% user,  0.0% nice, 15.4% system, 11.8% interrupt, 72.8% idle
> >>> Mem: 5912M Active, 570M Inact, 280M Wired, 26M Cache, 104M Buf, 352K Free
> >>> Swap: 12G Total, 9904M Used, 2383M Free, 80% Inuse, 178M Out
> >>>
> >>>   PID USERNAME  THR PRI NICE        SIZE    RES STATE   C   TIME   WCPU COMMAND
> >>>  6768 andreast    1  52    0 1073741824G  6479M pfault  1   0:58 18.90% 31370.
> >>>
> >>> And after my mem and swap are full I see swap_pager_getswapspace(16)
> >>> failed.
> >>>
> >>> In this state I can only power-cycle the machine.
> >>>
> >>>> That said, on amd64 the user map is between 0 and 0x7fffffffffff, which
> >>>> is obviously less than the requested allocation size 0x100000000000000.
> >>>> If you look at the kdump output on amd64, you will see that malloc()
> >>>> tries to mmap() the area, fails, and retries with obreak(). The default
> >>>> virtual memory limit is unlimited, so my best guess is that on amd64
> >>>> vm_map_findspace() returns immediately.
> >>>>
> >>>> On powerpc64, I see no reason why the vm_map_entry cannot be allocated,
> >>>> but please note that the vm object and pages shall only be allocated on
> >>>> demand. So I am curious how your machine breaks, and where.
> >>>
> >>> I would expect that the 'system' does not allow me to allocate that much
> >>> RAM.
> >>
> >> Is the issue with the machine going into limbo reproducible with the code
> >> you posted?
> >
> > If I understand you correctly, yes. I can launch the test case and the
> > machine is immediately unusable. That means I can neither kill the process
> > nor log in. Also, top does not show anything useful.
> >
> > The original test case where I discovered this behavior behaves a bit
> > differently:
> > http://gcc.gnu.org/viewcvs/trunk/libstdc%2B%2B-v3/testsuite/23_containers/vector/bool/modifiers/insert/31370.cc?revision=169421&view=markup
> >
> > Here I can follow how the RAM and swap are eaten up. Top reports the
> > figures. Once everything is 'full', the swap pager errors start to appear
> > on the console.
> >
> >> Or, do you need to actually touch the pages in the allocated region?
> >
> > If I have to, how would I do that?
> >
> >> If the latter (and I do expect that), then how many pages do you need
> >> to touch before the machine breaks? Is it a single access that causes the
> >> havoc, or do you need to touch an amount approximately equal to RAM+swap?
> >
> > Andreas
>
> ia64 had some vaguely related excitement earlier in its life.  If
> you created a 1TB sparse file and mmap'ed it over and over, tens,
> maybe hundreds of thousands of times, certain VM internal state got
> way out of hand.  mmap'ing was fine, but unmapping took 36 hours of CPU
> runtime when I killed the process.
> It got so far out of hand because of the way ia64 handled just-in-time
> mappings on VHPT misses.

There is a fundamental scalability problem with the powerpc64/aim pmap.
See revision 212360.  In a nutshell, unlike amd64, ia64, and most other
pmap implementations, the powerpc64/aim pmap implementation doesn't link
together all of the pv entries belonging to a pmap into a list.  So, the
powerpc64/aim implementations of range operations, like pmap_remove(),
don't handle large, sparsely populated ranges as efficiently as ia64
does.  Moreover, powerpc64/aim can't effectively implement
pmap_remove_pages(), and so it doesn't even try.

Alan
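
For readers unfamiliar with pv entries, here is a minimal sketch of the
bookkeeping Alan describes.  It is not the actual FreeBSD pmap code: the
toy_* names are invented for illustration, and the list macros come from
FreeBSD's <sys/queue.h>.  The point is that when each pv entry is also
linked into a per-pmap list, an operation like pmap_remove_pages() only
has to visit the mappings that actually exist, rather than scan an
enormous, sparsely populated virtual address range.

/*
 * Hypothetical sketch of pv-entry bookkeeping with a per-pmap list.
 * Each pv entry records one mapping and is linked both into the list
 * of mappings of a physical page and into the list of mappings
 * belonging to a pmap.
 */
#include <stdlib.h>
#include <sys/queue.h>

struct toy_pv_entry {
        TAILQ_ENTRY(toy_pv_entry) pv_page_link; /* mappings of one page */
        TAILQ_ENTRY(toy_pv_entry) pv_pmap_link; /* mappings in one pmap */
        unsigned long             pv_va;        /* mapped virtual address */
};

struct toy_page {
        TAILQ_HEAD(, toy_pv_entry) pv_list;     /* every mapping of this page */
};

struct toy_pmap {
        /* Initialize with TAILQ_INIT(&pm->pv_list) before use. */
        TAILQ_HEAD(, toy_pv_entry) pv_list;     /* every mapping in this pmap */
};

/*
 * Tear down all of a pmap's mappings by walking its own pv list, so the
 * cost is proportional to the number of mappings that exist.  Without
 * such a list, the only alternative is to iterate over the address
 * range itself, which is hopeless for a range approaching 2^60 bytes.
 */
static void
toy_pmap_remove_pages(struct toy_pmap *pm)
{
        struct toy_pv_entry *pv, *next;

        TAILQ_FOREACH_SAFE(pv, &pm->pv_list, pv_pmap_link, next) {
                TAILQ_REMOVE(&pm->pv_list, pv, pv_pmap_link);
                /* A real pmap would also unlink pv from its page's
                 * pv_list and invalidate the hardware mapping here. */
                free(pv);
        }
}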