Date: Fri, 4 Jan 2008 11:38:23 +0000 (GMT)
From: Robert Watson
To: Igor Mozolevsky
Cc: Dag-Erling Smørgrav, freebsd-current@freebsd.org, Jason Evans
Subject: Re: sbrk(2) broken
Message-ID: <20080104113426.T77222@fledge.watson.org>
References: <477C82F0.5060809@freebsd.org> <863ateemw2.fsf@ds4.des.no> <20080104002002.L30578@fledge.watson.org> <86bq81c12d.fsf@ds4.des.no> <20080104111938.N77222@fledge.watson.org>
List-Id: Discussions about the use of FreeBSD-current

On Fri, 4 Jan 2008, Igor Mozolevsky wrote:

> On 04/01/2008, Robert Watson wrote:
>> On Fri, 4 Jan 2008, Igor Mozolevsky wrote:
>>
>>>> Of course, if you're afraid of memory overcommit and you know in advance
>>>> how much memory you need, you can simply allocate a sufficient amount of
>>>> address space at startup and touch it all.  This way, you will either be
>>>> killed right away, or be guaranteed to have sufficient memory for the
>>>> rest of your (process) lifetime.
>>>> Alternatively, do what Varnish does:
>>>> create a large file, mmap it, and allocate everything you need from that
>>>> area, so you have your own private swap space.  Just make sure to
>>>> actually allocate the disk space you need (by filling the file with
>>>> zeroes, or at the minimum writing a zero to the file every sb.st_blksize
>>>> bytes, preferably sequentially to avoid excessive fragmentation)
>>>
>>> Surely you can just fseek() on the file at the correct length?
>>
>> That will create a sparse file without file system blocks to back it, and
>> is effectively also over-commit.  When the file system runs out of room,
>> you will get SIGSEGV when the vnode pager discovers it can't write a page
>> to disk.  If you zero-fill it, the blocks are pre-allocated.
>
> Surely you should not be allowed to overcommit on fseek() followed by
> write(,,1); zeroing out gigs of hdd space seems rather silly...

Sparse files are a feature.  They only become inconvenient here because you
discover the lack of space asynchronously, at a point unrelated to any
useful event in the user process: when memory pressure gets high, the vnode
pager decides it's time to push a dirty page to disk, and only then
discovers that there are no free blocks on the file system to write to.

As I mentioned in my e-mail, it would be nice if our file system supported a
way to reserve blocks for a file without hooking them up to the file's
visible address space.  That would avoid zeroing them, which is required if
you do want to hook them up for an unprivileged process.  However, that
feature doesn't currently exist.

Many systems that are sensitive to on-demand allocation costs and have no
security requirements allow files to be extended without zeroing.  On
systems with security requirements, this becomes a privileged operation
(such as on Mac OS X), because exposing unzeroed pages from other files or
processes that were not explicitly shared is Not Allowed.

Robert N M Watson
Computer Laboratory
University of Cambridge
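
For readers who want to try the two approaches discussed in this thread,
here is a minimal sketch, written in Python rather than C for brevity.  It
is not Varnish's code: the helper names (make_sparse, make_preallocated,
allocated_bytes, map_arena), the file names, and the sizes are all made up
for illustration.  It also assumes, following the quoted advice, that one
write per st_blksize bytes is enough to make the file system allocate every
block, which may not hold on every file system.

```python
import mmap
import os
import tempfile


def make_sparse(path: str, size: int) -> None:
    """Seek to the last byte and write it: full length, few blocks reserved."""
    if size <= 0:
        raise ValueError("size must be positive")
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def make_preallocated(path: str, size: int) -> None:
    """Write a zero every st_blksize bytes, sequentially.

    If the file system is full, this raises OSError (ENOSPC) right here,
    synchronously, rather than failing later when a dirty page is paged out.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        step = os.fstat(fd).st_blksize or 4096  # fall back if reported as 0
        for offset in range(0, size, step):
            os.pwrite(fd, b"\0", offset)
        os.pwrite(fd, b"\0", size - 1)  # ensure the file reaches full length
        os.fsync(fd)
    finally:
        os.close(fd)


def allocated_bytes(path: str) -> int:
    """Disk space backing the file; st_blocks counts 512-byte units."""
    return os.stat(path).st_blocks * 512


def map_arena(path: str) -> mmap.mmap:
    """Map the whole file shared and read/write, to allocate from it."""
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, 0)  # length 0 maps the entire file
    finally:
        os.close(fd)  # the mapping holds its own duplicate descriptor


if __name__ == "__main__":
    size = 8 * 1024 * 1024
    with tempfile.TemporaryDirectory() as tmp:
        sparse = os.path.join(tmp, "sparse.bin")
        dense = os.path.join(tmp, "dense.bin")
        make_sparse(sparse, size)
        make_preallocated(dense, size)
        for name in (sparse, dense):
            print(name, os.stat(name).st_size, allocated_bytes(name))
        arena = map_arena(dense)
        arena[0:5] = b"hello"
        arena.close()
```

Both files report the same length; on a file system that supports sparse
files, the difference shows up only in the allocated-bytes figure.  That
difference is why the seek-and-write variant defers the out-of-space failure
to page-out time.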