Date: Wed, 13 Aug 2008 18:56:13 +0200
From: cpghost
To: Laszlo Nagy
Cc: freebsd-questions@freebsd.org
Subject: Re: Max. number of opened files, efficiency

On Wed, Aug 13, 2008 at 04:12:39PM +0200, Laszlo Nagy wrote:
> How many files can I open under FreeBSD, at the same time?

% sysctl -a | grep maxfiles
kern.maxfiles: 7880
kern.maxfilesperproc: 7092

But remember that you're already using a few hundred file descriptors,
so usually you won't have more than 6800 or so left for your
application... unless you crank up those values (in /etc/sysctl.conf,
IIRC; see the P.S. below).

Your shell may also limit the number of open files (cf. openfiles below):

% limits
Resource limits (current):
  cputime          infinity secs
  filesize         infinity kB
  datasize           524288 kB
  stacksize           65536 kB
  coredumpsize     infinity kB
  memoryuse        infinity kB
  memorylocked     infinity kB
  maxprocesses         3546
  openfiles            7092
  sbsize           infinity bytes
  vmemoryuse       infinity kB

> Problem: I'm making a pivot table, and when I drill down the facts, I
> would like to create a new temporary file for each possible dimension
> value. In most cases, there will be fewer than 1000 dimension values.
> I tried to open 1000 temporary files and I could do so within one
> second.
>
> But how efficient is that? What happens when I open 1000 temporary
> files and write data into them randomly, 10 million times (avg.
> 10,000 write operations per file)? Will this be handled efficiently
> by the OS? Is efficiency affected by the underlying filesystem?

Wouldn't it be more efficient to use a DBM file (anydbm, bsddb),
indexed by dimension value, for this? (There's a rough sketch of what
I mean in the P.P.S. below.)

You may also want to consider numpy and some modules in scipy for this
kind of computation: IIRC they have functions to efficiently store
binary data to files and read it back, and numpy's ndarray has a nice
slice-like syntax, too.

> I also tried to create 10,000 temporary files, but performance
> dropped.
>
> Example in Python:
>
> import tempfile
> import time
>
> N = 10000
> start = time.time()
> files = [tempfile.TemporaryFile() for i in range(N)]
> stop = time.time()
> print "created %s files/second" % int(N / (stop - start))
>
> On my computer this program prints "3814 files/second" for N=1000,
> and "1561 files/second" for N=10000.
>
> Thanks,
>
> Laszlo

Regards,
-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/
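
P.S. If you do decide to raise the limits, something along these lines
in /etc/sysctl.conf should work (the numbers are only examples, pick
whatever suits your workload; keep kern.maxfilesperproc below
kern.maxfiles):

  kern.maxfiles=25000
  kern.maxfilesperproc=22000

You can also change them on the running system with sysctl(8), e.g.
"sysctl kern.maxfiles=25000".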
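
P.P.S. Here's a rough, untested sketch of the DBM idea: one anydbm
file keyed by dimension value instead of ~1000 temporary files. The
record format and the file location are just made up for illustration;
anydbm simply picks the best backend it finds (usually bsddb):

  import anydbm
  import os
  import tempfile

  # one DBM file instead of one temporary file per dimension value
  path = os.path.join(tempfile.gettempdir(), "facts.db")
  db = anydbm.open(path, "c")

  def add_fact(dimension_value, record):
      # DBM keys and values must be strings; append the new record
      # to whatever is already stored under this dimension value
      key = str(dimension_value)
      if db.has_key(key):
          db[key] = db[key] + record + "\n"
      else:
          db[key] = record + "\n"

  add_fact("color=red", "42;3.14")
  add_fact("color=blue", "7;2.71")
  db.close()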