To: Alexander Leidinger
From: "Poul-Henning Kamp"
cc: Peter Jeremy
cc: stable@freebsd.org
cc: current@freebsd.org
Subject: Re: perl malloc slow?
In-Reply-To: Your message of "Wed, 07 Jan 2004 13:39:30 +0100."
             <20040107133930.47eb851b@Magellan.Leidinger.net>
Date: Wed, 07 Jan 2004 14:11:09 +0100
Message-ID: <31178.1073481069@critter.freebsd.dk>

In message <20040107133930.47eb851b@Magellan.Leidinger.net>, Alexander Leidinger writes:

>> It's not clear why the builtin perl malloc is so much faster in this
>> case. A quick check of the perl malloc code suggests that it uses a
>> geometric-progression bucket arrangement (whereas phkmalloc appears to
>> use page-sized buckets for large allocations) - this would
>> significantly reduce the number of realloc() copies.
>
>This is IMHO the right allocation algorithm for such programs (at least
>I don't know of a better one and I've seen it in several places where
>you can't guess the amount of memory you need). I'm sure the perl
>developers tuned the perl_malloc() with real world perl programs.
>Maybe this kind of behavior is typical for a lot of perl programs.

One of the assumptions I made ten years ago was that we would expose
more of the possible VM gymnastics to userland, and in particular it
was my expectation that it would be cheap for a process to do some
sort of page-flipping or page-exchange.

This has not materialized in the meantime, and VM wizards have generally
been a lot less than enthusiastic about it when I have tried to coax
them into providing this sort of thing.

The result is that applications which use realloc() a lot suffer.

The interesting catch-22 is that the reason why they use realloc() a
lot is no longer valid, and hasn't been since virtual memory came into
use 15-20 years ago:

If I need to read a string in, and I don't know how long it is, I can
do it two ways:

    l = 80;         /* User is probably using punched cards */
    p = malloc(l);
    for (;;) {
        [...]
        /* Damn, add another card */
        l += 80;
        p = realloc(p, l);
        [...]
    }

This kind of code is based on the assumption that the amount of
address-space in my process is important for performance. This was
true before VM systems, because the entire address-space of the process
got swapped in and out.

VM systems, on the other hand, operate at the page level, and modern
code would be much better off like this:

    l = PAGE_SIZE;
    p = malloc(l);
    for (;;) {
        [...]
        /* Damn */
        l *= 16;
        p = realloc(p, l);
        [...]
    }
    /* Now trim, keeping room for the terminating NUL */
    p = realloc(p, strlen(p) + 1);

(For some value of 16.)

The important thing to be aware of is that under VM systems, having
a page allocated but unused is very cheap. If the system comes under
memory pressure, those pages will get paged out (if they have been
written into) and never be paged in again.

This, btw, might be FAQ fodder.

Poul-Henning

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
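
For the archive, here is a minimal, self-contained sketch of the "grow geometrically, then trim" pattern described in the message above, with the elided loop body filled in one plausible way. The function name read_string() and its parameters initial and factor are hypothetical and do not come from the original post; the growth factor is left as a parameter because the message only says "for some value of 16". Error handling is an added assumption as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original message): read everything
 * from fp into a NUL-terminated heap buffer.  The buffer starts at
 * `initial` bytes and is multiplied by `factor` whenever it fills up,
 * so the number of realloc() calls grows only logarithmically with the
 * input size.  Returns NULL on bad arguments or allocation failure.
 */
char *
read_string(FILE *fp, size_t initial, size_t factor)
{
    size_t cap = initial, len = 0;
    char *p, *q;
    int c;

    if (initial == 0 || factor < 2)
        return NULL;
    p = malloc(cap);
    if (p == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF) {
        if (len + 1 >= cap) {           /* keep one byte for the NUL */
            if (cap > SIZE_MAX / factor) {
                free(p);                /* size would overflow */
                return NULL;
            }
            cap *= factor;
            q = realloc(p, cap);
            if (q == NULL) {
                free(p);                /* realloc leaves p valid on failure */
                return NULL;
            }
            p = q;
        }
        assert(len < cap);
        p[len++] = (char)c;
    }
    p[len] = '\0';

    /* Now trim: hand back everything beyond the string and its NUL. */
    q = realloc(p, len + 1);
    return q != NULL ? q : p;
}
```

Called as read_string(fp, 4096, 16), this roughly mirrors the second snippet in the message, with 4096 standing in for PAGE_SIZE. Assigning realloc()'s result to a temporary first avoids leaking the old buffer if the call fails.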