Date:      Mon, 25 Feb 2013 12:17:52 -0800
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        Ian Lepore <ian@FreeBSD.org>
Cc:        Michael Ross <gmx@ross.cx>, freebsd-stable@FreeBSD.org, John Mehr <jcm@visi.com>
Subject:   Re: svn - but smaller?
Message-ID:  <20130225201752.GA38298@icarus.home.lan>
In-Reply-To: <1361807679.16937.61.camel@revolution.hippie.lan>
References:  <20130224031509.GA47838@icarus.home.lan> <op.wszrv9k5g7njmm@michael-think> <20130224041638.GA51493@icarus.home.lan> <op.wszt3wh2g7njmm@michael-think> <20130224063110.GA53348@icarus.home.lan> <1361726397.16937.4.camel@revolution.hippie.lan> <20130224212436.GA13670@icarus.home.lan> <1361749413.16937.16.camel@revolution.hippie.lan> <20130225000454.GA17342@icarus.home.lan> <1361807679.16937.61.camel@revolution.hippie.lan>

On Mon, Feb 25, 2013 at 08:54:39AM -0700, Ian Lepore wrote:
> On Sun, 2013-02-24 at 16:04 -0800, Jeremy Chadwick wrote:
> > On Sun, Feb 24, 2013 at 04:43:33PM -0700, Ian Lepore wrote:
> > > On Sun, 2013-02-24 at 13:24 -0800, Jeremy Chadwick wrote:
> > > > On Sun, Feb 24, 2013 at 10:19:57AM -0700, Ian Lepore wrote:
> > > > > On Sat, 2013-02-23 at 22:31 -0800, Jeremy Chadwick wrote:
> > > > > > 
> > > > > > Also, John, please consider using malloc(3) instead of heap-allocated
> > > > > > buffers like file_buffer[6][] (196608 bytes) and command[] (32769
> > > > > > bytes).  I'm referring to this: 
> > > > > 
> > > > > Why?  I absolutely do not understand why people are always objecting to
> > > > > large stack-allocated arrays in userland code (sometimes with the
> > > > > definition of "large" being as small as 2k for some folks).
> > > > 
> > > > This is incredibly off-topic, but I'll bite.
> > > > 
> > > > I should not have said "heap-allocated", I was actually referring to
> > > > statically-allocated.
> > > > 
> > > > The issues as I see them:
> > > > 
> > > > 1. Such buffers exist during the entire program's lifetime even if they
> > > > aren't actively used/needed by the program.  With malloc(3) and friends,
> > > > you're allocating memory dynamically, and you can free(3) when done with
> > > > it, rather than just having a gigantic portion of memory allocated
> > > > sitting around potentially doing nothing.
> > > > 
> > > > 2. If the length of the buffer exceeds the amount of stack space
> > > > available at the time, the program will run but the behaviour is unknown
> > > > (I know that on FreeBSD if it exceeds "stacksize" per resource limits,
> > > > the program segfaults at runtime).  With malloc and friends you can
> > > > gracefully handle allocation failures.
> > > > 
> > > > 3. Statically-allocated buffers can't grow; meaning what you've
> > > > requested size-wise is all you get.  Compare this to something that's
> > > > dynamic -- think a linked list containing pointers to malloc'd memory,
> > > > which can even be realloc(3)'d if needed.
> > > > 
> > > > The definition of what's "too large" is up to the individual and the
> > > > limits of the underlying application.  For some people, sure, anything
> > > > larger than 2048 might warrant use of malloc.  I tend to use malloc for
> > > > anything larger than 4096.  That "magic number" comes from some piece of
> > > > information I was told long ago about what size pages malloc internally
> > > > uses, but looking at the IMPLEMENTATION NOTES section in malloc(3) it
> > > > appears to be a lot more complex than that.
> > > > 
> > > > If you want me to break down #1 for you with some real-world output and
> > > > a very small C program, showing you the effects on RES/RSS and SIZE/VIRT
> > > > of static vs. dynamic allocation, just ask.
> > > > 
> > > 
> > > Actually, after seeing that the userland limit for an unprivileged user
> > > on freebsd is a mere 64k, I'd say the only valid reason to not allocate
> > > big things on the stack is because freebsd has completely broken
> > > defaults.
> > 
> > The limits (i.e. what's shown via limits(1)) differs per architecture
> > (ex. i386 vs. amd64) and may adjust based on amount of physical memory
> > available (not sure on the latter part).  The "64k" value you're talking
> > about, I think, is "memorylocked" -- I'm referring to "stacksize".
> > 
> > > I see no reason why there should even be a distinction
> > > between stack size and memory use limits in general.  Pages are pages,
> > > it really doesn't matter what part of your virtual address space they
> > > live in.
> > 
> > You're thinking purely of SIZE/VIRT.
> > 
> > I guess I'd best break the C program out.  Apologies in advance for the
> > crappy code (system(3)!), but I wanted something that made the task
> > easy.
> > 
> > $ limits -a
> > Resource limits (current):
> >   cputime              infinity secs
> >   filesize             infinity kB
> >   datasize              2621440 kB
> >   stacksize              262144 kB
> >   coredumpsize         infinity kB
> >   memoryuse            infinity kB
> >   memorylocked               64 kB
> >   maxprocesses             5547
> >   openfiles               11095
> >   sbsize               infinity bytes
> >   vmemoryuse           infinity kB
> >   pseudo-terminals     infinity
> >   swapuse              infinity kB
> > 
> > $ cat x.c
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <sys/types.h>
> > #include <unistd.h>
> > #include <string.h>
> > 
> > #define SIZE_MFATTY	512*1024*1024	/* 512MB */
> > #define SIZE_SFATTY	128*1024*1024	/* 128MB; must be smaller than limits stacksize! */
> > 
> > int main(int argc, char *argv[]) {
> > 	char procstat[BUFSIZ];
> > 	char topgrep[BUFSIZ];
> > 	pid_t mypid;
> > 	char *mfatty;
> > 	char sfatty[SIZE_SFATTY];
> > 
> > 	sfatty[0] = '\0';		/* squelch gcc unused var warning */
> > 
> > 	mypid = getpid();
> > 
> > 	snprintf(procstat, sizeof(procstat),
> > 		"procstat -v %u", mypid);
> > 	snprintf(topgrep, sizeof(topgrep),
> > 		"top -b 99999 | egrep '^(Mem:|[ ]+PID|[ ]*%u)'", mypid);
> > 
> > 	printf("at startup\n");
> > 	printf("============\n");
> > 	system(topgrep);
> > 	printf("-----\n");
> > 	system(procstat);
> > 	sleep(5);
> > 
> > 	mfatty = malloc(SIZE_MFATTY);
> > 	printf("\n");
> > 	printf("after malloc mfatty\n");
> > 	printf("=====================\n");
> > 	system(topgrep);
> > 	printf("-----\n");
> > 	system(procstat);
> > 	sleep(5);
> > 
> > 	memset(mfatty, 0x0, SIZE_MFATTY);
> > 	memset(&sfatty, 0x0, SIZE_SFATTY);
> > 	printf("\n");
> > 	printf("after memset mfatty and sfatty\n");
> > 	printf("  (e.g. pages are marked used/written to)\n");
> > 	printf("===========================================\n");
> > 	system(topgrep);
> > 	printf("-----\n");
> > 	system(procstat);
> > 	sleep(5);
> > 
> > 	free(mfatty);
> > 	printf("\n");
> > 	printf("after free mfatty\n");
> > 	printf("===================\n");
> > 	system(topgrep);
> > 	printf("-----\n");
> > 	system(procstat);
> > 
> > 	return(0);
> > }
> > 
> > $ gcc -Wall -o x x.c
> > $ ./x
> > at startup
> > ============
> > Mem: 97M Active, 221M Inact, 1530M Wired, 825M Buf, 6045M Free
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > 17567 jdc           1  27    0   138M  1524K wait    1   0:00  0.00% ./x
> > -----
> >   PID              START                END PRT  RES PRES REF SHD   FL TP PATH
> > 17567           0x400000           0x401000 r-x    1    0   1   0 CN-- vn /home/jdc/x
> > 17567           0x600000           0x800000 rw-    2    0   1   0 CN-- df
> > 17567         0xa0600000         0xa0618000 r-x   24    0  33   0 CN-- vn /libexec/ld-elf.so.1
> > 17567         0xa0618000         0xa0639000 rw-   21    0   1   0 CN-- df
> > 17567         0xa0818000         0xa081a000 rw-    2    0   1   0 CN-- df
> > 17567         0xa081a000         0xa094b000 r-x  283    0  57  24 CN-- vn /lib/libc.so.7
> > 17567         0xa094b000         0xa0b4a000 ---    0    0   1   0 CN-- df
> > 17567         0xa0b4a000         0xa0b55000 rw-   11    0   1   0 CN-- vn /lib/libc.so.7
> > 17567         0xa0b55000         0xa0b6f000 rw-    6    0   1   0 CN-- df
> > 17567         0xa0c00000         0xa1000000 rw-    9    0   1   0 CN-- df
> > 17567     0x7ffff7fdf000     0x7ffffffdf000 rw-    2    0   1   0 C--D df
> > 17567     0x7ffffffdf000     0x7ffffffff000 rw-    3    0   1   0 CN-- df
> > 17567     0x7ffffffff000     0x800000000000 r-x    0    0  36   0 CN-- ph
> > 
> > after malloc mfatty
> > =====================
> > Mem: 97M Active, 221M Inact, 1530M Wired, 825M Buf, 6045M Free
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > 17567 jdc           1  25    0   650M  1524K wait    3   0:00  0.00% ./x
> > -----
> >   PID              START                END PRT  RES PRES REF SHD   FL TP PATH
> > 17567           0x400000           0x401000 r-x    1    0   1   0 CN-- vn /home/jdc/x
> > 17567           0x600000           0x800000 rw-    2    0   1   0 CN-- df
> > 17567         0xa0600000         0xa0618000 r-x   24    0  33   0 CN-- vn /libexec/ld-elf.so.1
> > 17567         0xa0618000         0xa0639000 rw-   21    0   1   0 CN-- df
> > 17567         0xa0818000         0xa081a000 rw-    2    0   1   0 CN-- df
> > 17567         0xa081a000         0xa094b000 r-x  283    0  57  24 CN-- vn /lib/libc.so.7
> > 17567         0xa094b000         0xa0b4a000 ---    0    0   1   0 CN-- df
> > 17567         0xa0b4a000         0xa0b55000 rw-   11    0   1   0 CN-- vn /lib/libc.so.7
> > 17567         0xa0b55000         0xa0b6f000 rw-    6    0   1   0 CN-- df
> > 17567         0xa0c00000         0xa1000000 rw-    9    0   1   0 CN-- df
> > 17567         0xa1000000         0xc1000000 rw-    0    0   1   0 CN-- df
> > 17567     0x7ffff7fdf000     0x7ffffffdf000 rw-    2    0   1   0 C--D df
> > 17567     0x7ffffffdf000     0x7ffffffff000 rw-    3    0   1   0 CN-- df
> > 17567     0x7ffffffff000     0x800000000000 r-x    0    0  36   0 CN-- ph
> > 
> > after memset mfatty and sfatty
> >   (e.g. pages are marked used/written to)
> > ===========================================
> > Mem: 737M Active, 221M Inact, 1531M Wired, 825M Buf, 5404M Free
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > 17567 jdc           1  31    0   650M   643M wait    2   0:01  5.27% ./x
> > -----
> >   PID              START                END PRT  RES PRES REF SHD   FL TP PATH
> > 17567           0x400000           0x401000 r-x    1    0   1   0 CN-- vn /home/jdc/x
> > 17567           0x600000           0x800000 rw-    2    0   1   0 CN-- df
> > 17567         0xa0600000         0xa0618000 r-x   24    0  33   0 CN-- vn /libexec/ld-elf.so.1
> > 17567         0xa0618000         0xa0639000 rw-   21    0   1   0 CN-- df
> > 17567         0xa0818000         0xa081a000 rw-    2    0   1   0 CN-- df
> > 17567         0xa081a000         0xa094b000 r-x  283    0  57  24 CN-- vn /lib/libc.so.7
> > 17567         0xa094b000         0xa0b4a000 ---    0    0   1   0 CN-- df
> > 17567         0xa0b4a000         0xa0b55000 rw-   11    0   1   0 CN-- vn /lib/libc.so.7
> > 17567         0xa0b55000         0xa0b6f000 rw-    6    0   1   0 CN-- df
> > 17567         0xa0c00000         0xa1000000 rw-    9    0   1   0 CN-- df
> > 17567         0xa1000000         0xc1000000 rw- 131072    0   1   0 CNS- df
> > 17567     0x7ffff7fdf000     0x7ffffffdf000 rw- 32739    0   1   0 C-SD df
> > 17567     0x7ffffffdf000     0x7ffffffff000 rw-   32    0   1   0 CN-- df
> > 17567     0x7ffffffff000     0x800000000000 r-x    0    0  36   0 CN-- ph
> > 
> > after free mfatty
> > ===================
> > Mem: 229M Active, 222M Inact, 1531M Wired, 825M Buf, 5913M Free
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> > 17567 jdc           1  27    0   138M   130M wait    3   0:01  2.78% ./x
> > -----
> >   PID              START                END PRT  RES PRES REF SHD   FL TP PATH
> > 17567           0x400000           0x401000 r-x    1    0   1   0 CN-- vn /home/jdc/x
> > 17567           0x600000           0x800000 rw-    2    0   1   0 CN-- df
> > 17567         0xa0600000         0xa0618000 r-x   24    0  38   0 CN-- vn /libexec/ld-elf.so.1
> > 17567         0xa0618000         0xa0639000 rw-   21    0   1   0 CN-- df
> > 17567         0xa0818000         0xa081a000 rw-    2    0   1   0 CN-- df
> > 17567         0xa081a000         0xa094b000 r-x  283    0  65  27 CN-- vn /lib/libc.so.7
> > 17567         0xa094b000         0xa0b4a000 ---    0    0   1   0 CN-- df
> > 17567         0xa0b4a000         0xa0b55000 rw-   11    0   1   0 CN-- vn /lib/libc.so.7
> > 17567         0xa0b55000         0xa0b6f000 rw-    6    0   1   0 CN-- df
> > 17567         0xa0c00000         0xa1000000 rw-    9    0   1   0 CN-- df
> > 17567     0x7ffff7fdf000     0x7ffffffdf000 rw- 32739    0   1   0 C-SD df
> > 17567     0x7ffffffdf000     0x7ffffffff000 rw-   32    0   1   0 CN-- df
> > 17567     0x7ffffffff000     0x800000000000 r-x    0    0  41   0 CN-- ph
> > 
> > Look very carefully at the RES column for that process, particularly
> > after the memset(), and then again after the free().
> > 
> > You'll see quite clearly that the sfatty[] array remains in use, wasting
> > memory that could otherwise be used by other processes on the system, up
> > until exit.  This is also quite apparent in the procstat output.
> > 
> > Moral of the story: it matters (and malloc is your friend).  :-)
> > 
> > > Almost everything I've ever done with freebsd runs as root on an
> > > embedded system, so I'm not used to thinking in terms of limits at all.
> > 
> > I would think an embedded platform would do the exact opposite -- force
> > you to think in terms of limits *all the time*.
> > 
> > For example: I partake in the TomatoUSB project (a Linux-based firmware
> > that runs mainly on MIPSR1/R2 boxes); my overall thought process when
> > developing or fixing something relating to TomatoUSB is significantly
> > more strict than on a FreeBSD amd64 system with 8GB RAM and an 8-core CPU.
> > 
> > But generally speaking I tend to write code/develop things with
> > "minimal" in mind at all times, and that all stems from growing up doing
> > assembly on 65xxx systems (Apple II series, Nintendo/Famicom and the
> > Super Nintendo/Super Famicom) and PIC -- where CPU time and memory are
> > highly limited.  I always design things assuming it'll be used on very
> > minimal architectures.
> > 
> > I am in no way saying John (or you!) should have the same mentality
> > (like I said it varies on environment and application goal and so on),
> > but with regards to what I do, KISS principle combined with "minimalist"
> > approach has yet to fail me.  But there are certainly cases with complex
> > applications where you can't design something within those limits (e.g.
> > "okay, to do this efficiently, we're going to need to have about
> > 128MBytes of non-swapped RAM available just for this process at all
> > times"), and I respect that.  But my point is that static allocation
> > vs. malloc matters more than you think.
> > 
> 
> So your point is that if a program is going to call deep down into some
> chain that uses big stack-allocated buffers, then it continues to run
> for some non-trivial amount of time after that, never again descending
> that particular call chain, then the memory would have been recovered by
> the system faster if it had been malloc'd and free'd rather than stack
> allocated.  

No, that isn't what my point is.  I used large numbers to make it more
obvious (plus to deal with the KB->MB rounding situation in top).

My point is that once the pages of the statically-allocated buffer
(sfatty, the 128MByte buffer) are used, they appear to be unreleased
until the program exits.  Look closely at RES/RSS in the above output.

> But if the program uses the big buffers for most of its duration, then
> there's no real difference in behavior at all.

That's borderline political -- the statement doesn't take into
consideration what happens when stacksize is exceeded.

Did you see Stephen Montgomery-Smith's reply mentioning mkctm in the
base system?  That's a real-world example of this happening, where
someone in the past said "screw it" and really didn't think about the
bigger picture (other platforms, memory limitations, etc.).

> If the big buffers are smaller than the limit within the malloc
> implementation that triggers a MADV_DONTNEED immediately, (for example,
> if they were less than... I think it's 4MB in our current
> implementation) then there's no real difference.  I'm guessing you had
> to use such extreme sized buffers to demonstrate your point because with
> something more reasonable (we were talking about ~200K of IO buffers
> originally) there was no difference.

I changed the declared sizes of mfatty and sfatty to 8MBytes and 2MBytes
respectively -- I wanted sfatty under this 4MByte number you're talking
about.

I threw in a sleep(300) (5 minutes) followed by a couple more system()s to
see if pagedaemon (or whatever piece of the kernel VM) reaped or even
swapped out the memory associated with sfatty.  It didn't:

startup:  38827 jdc           1  26    0 11976K  1392K wait    3   0:00  0.00% ./x
malloc:   38827 jdc           1  25    0 20168K  1392K wait    0   0:00  0.00% ./x
memset:   38827 jdc           1  23    0 20168K 11640K wait    2   0:00  0.00% ./x
free:     38827 jdc           1  22    0 11976K  3432K wait    0   0:00  0.00% ./x
sleep300: 38827 jdc           1  20    0 11976K  3432K wait    2   0:00  0.00% ./x

The procstat output for the relevant pages (sfatty) also did not change
(meaning the TP field did not change, nor did FL).  You can clearly see
the 2MBytes "laying around".

As I understand it, this is because free(), via libc, can immediately
tell the kernel "I'm done with this", while such isn't the case for
something allocated at executable load time.  Again, as I understand it,
such memory can only be released when the program exits.

> I guess the moral of the story is don't use huge stack-allocated buffers
> once at startup in a daemon, especially if megabytes are involved.  But
> if you're writing some normal program that needs some memory for a while
> for its major tasks, and when its done with those tasks it exits,
> then... there's no real difference.

Sure, this matters significantly more for daemons than programs which do
exit regularly, but that's splitting hairs.  Think about multiple
instances of a program, or a multi-user machine (e.g.  shell boxes), or
a program that simply runs for a while, or said program being used on a
system with smaller resources than you yourself are used to.

There was a thread somewhat recently on one of the lists, I forget
which/where, where people were talking about using present-day FreeBSD
on systems with either 64MB or 128MB of RAM (something "really small" by
today's standards).  You might think 32768*6 wouldn't matter to them,
but what if their box is a squid proxy, where most of the RAM is for
content cache?  Suddenly it matters more.

When I write software, I try to think minimally.  I grew up developing
on the original Apple II series (64KB RAM barring the IIGS) and later
the 286/386, along with some PICs, and the thought process that went
along with the architectural/system limitations has stuck with me since.

I don't get exceedingly pedantic unless the software plans on being used
on a very minimal architecture (I've done a lot of development on the
NES/Famicom which has only 2KB RAM, for example).

If I don't know what architecture it'll be used on, I try to be
reasonable without being overly pedantic (e.g. worrying about every
single last bit); worrying about what fits into L1/L2 cache lines and
so on is something I generally tend to leave to the compiler on
present-day platforms (it matters though, even in our malloc(3)
implementation!).

> Which I guess means the real bottom-line moral of the story is that
> there are no simple mantras.  "Don't use big stack allocated buffers" is
> going to be wrong as many times as "Always allocate everything on the
> stack" -- you can come up with examples that refute each assertion.

Absolutely (with regards to your last point), and I don't want to make
it sound like I'm just going to pull arguments out of my ass solely for
the sake of arguing.

As with everything in programming there are many variables involved in
software design; there are absolutely cases where a statically-allocated
buffer has its merits.  Example off the top of my head: a program where
CPU time matters and a tight loop is involved (the overhead of calling
malloc/free repeatedly chews up too much time, and using a
statically-allocated buffer relieves that pain).  The trade-off is worth it.
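There's also a middle ground worth noting (a hypothetical sketch of my
own; process_records and its parameters are made up): hoist a single
malloc out of the tight loop.  You get the per-iteration speed of a
static buffer, but the memory still goes back to the allocator when the
work is done, and an allocation failure can be handled gracefully.

```c
#include <stdlib.h>
#include <string.h>

/* Process `n` records of `reclen` bytes each.  One malloc outside the
   loop avoids per-iteration malloc/free overhead, but unlike a static
   buffer the memory is released once we're finished.  Returns 0 on
   success, -1 if the buffer can't be allocated. */
static int process_records(size_t n, size_t reclen)
{
	char *buf = malloc(reclen);
	if (buf == NULL)
		return -1;	/* graceful failure; a static buffer can't offer this */

	for (size_t i = 0; i < n; i++) {
		memset(buf, 0, reclen);	/* stand-in for real per-record work */
		/* ... fill and consume buf ... */
	}

	free(buf);		/* pages can go back to the system */
	return 0;
}
```
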

So logically the next question would be "so where do you draw the line?"

And my answer would be this: "if you're unsure where to draw the line,
or unaware of the benefits and drawbacks of statically-allocated
buffers, or don't know when they're applicable, use malloc/free".
*shrug*  That's just how I operate.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


