From owner-freebsd-stable@FreeBSD.ORG Thu Jul 24 16:18:12 2003
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8CF0537B401
	for ; Thu, 24 Jul 2003 16:18:12 -0700 (PDT)
Received: from out005.verizon.net (out005pub.verizon.net [206.46.170.143])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A342843FB1
	for ; Thu, 24 Jul 2003 16:18:11 -0700 (PDT)
	(envelope-from cswiger@mac.com)
Received: from mac.com ([141.149.47.46]) by out005.verizon.net
	(InterMail vM.5.01.05.33 201-253-122-126-133-20030313) with ESMTP
	id <20030724231810.ZKNU20032.out005.verizon.net@mac.com>
	for ; Thu, 24 Jul 2003 18:18:10 -0500
Message-ID: <3F20692E.2060107@mac.com>
Date: Thu, 24 Jul 2003 19:18:06 -0400
From: Chuck Swiger
Organization: The Courts of Chaos
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624
X-Accept-Language: en-us, en
MIME-Version: 1.0
Cc: freebsd-stable@freebsd.org
References: <20030724155926.7305F231C11@smithers.nildram.co.uk>
In-Reply-To: <20030724155926.7305F231C11@smithers.nildram.co.uk>
X-Enigmail-Version: 0.76.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Authentication-Info: Submitted using SMTP AUTH at out005.verizon.net from [141.149.47.46] at Thu, 24 Jul 2003 18:18:10 -0500
Subject: Re: malloc does not return null when out of memory
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 24 Jul 2003 23:18:12 -0000

Muttley wrote:
> Yes, I thought briefly about something like this.
>
> Then I thought 'there's a race condition'.

Where?  The FreeBSD implementation is wrapped in a THREAD_LOCK()...?

> Then I realised that other processes might not link against this malloc.

Perhaps.

> Then I realised the race condition doesn't even matter; processes will
> still be killed, as the kernel doesn't care that you're still in
> malloc() when the overcommitted memory is touched, it just knows you've
> touched it and there's no actual memory there. This will result in far
> more processes being killed. I believe that's a bad thing.

Someone stated that it was a problem that malloc() returned pointers to
virtual address space that had been mapped but not allocated.  This patch
does not guarantee that malloc() will return, but if malloc() does return
a pointer, using the memory being pointed to will refer to memory that is
allocated.

As Barny Wolff said:
> Won't this merely die in malloc, not return 0?

True.  This isn't a perfect solution, but given the choice between:

1) malloc(LOTS) returning a pointer, and then sometime later the program
   dies with a bus error when using that memory because no more VM is
   available, or

2) malloc(LOTS) causing an immediate failure in malloc(),

...choice #2 appears to be significantly better.  With #2, the signal
happens in malloc(), so what went wrong should be obvious from a coredump
or backtrace; with #1, a crash in the middle of referencing memory in some
large buffer is potentially misleading.

Programs which take care to preallocate the regions of memory they need
before they start a transaction or some other operation that needs to be
atomic would also prefer #2; the patch I proposed could have a beneficial
impact on data integrity for such programs.

--

People who encounter programs crashing in malloc() are likely going to
continue to complain about malloc() not returning NULL when the system is
out of memory.  If malloc() references memory before returning the
pointer, that means the system is going to reserve VM resources with
temporal locality towards memory _allocation_ rather than memory
_reference_.
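
To make the "reference the memory before returning the pointer" idea
concrete, here is a minimal sketch.  It is not the patch from this thread
(which is not reproduced in this message); touching_malloc() is a
hypothetical wrapper name, and the 4096-byte fallback page size is an
assumption used only if sysconf() fails.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical wrapper, for illustration only: allocate with malloc()
 * and write one byte into every page of the region before returning,
 * so that any failure to back the memory shows up here rather than at
 * some later, unrelated memory reference.
 */
static void *
touching_malloc(size_t size)
{
	volatile unsigned char *p;
	size_t i, pagesize;
	long ps;

	p = malloc(size);
	if (p == NULL)
		return (NULL);

	ps = sysconf(_SC_PAGESIZE);
	pagesize = (ps > 0) ? (size_t)ps : 4096;	/* assumed fallback */

	/* One write per page-sized step forces the pages to be backed. */
	for (i = 0; i < size; i += pagesize)
		p[i] = 0;
	/* The region may not be page-aligned; cover the final page too. */
	if (size > 0)
		p[size - 1] = 0;

	return ((void *)p);
}
```

A caller would use it exactly like malloc(), e.g.
"buf = touching_malloc(len);", and release the result with free().  On an
overcommitted system this sketch can still die inside the loop instead of
returning NULL, which is the trade-off discussed above.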

Having the program crash at memory allocation time rather than at usage
helps identify when and where this problem actually happens more clearly,
if only by a little bit.

I'm not sure whether allocating memory sooner that way will make it more
likely that brk()/sbrk() or mmap() will return ENOMEM to the libc malloc()
implementation, but if it does not help, perhaps that means something and
we've identified the location of the problem more precisely.

-- 
-Chuck