From: Chuck Swiger
To: Kirk Strauser
Cc: freebsd-questions@freebsd.org
Subject: Re: Poor read() performance, and I can't profile it
Date: Wed, 11 Jun 2008 20:20:12 -0700
In-Reply-To: <200806111442.50935.kirk@strauser.com>
References: <200806051508.29424.kirk@strauser.com> <200806111442.50935.kirk@strauser.com>

On Jun 11, 2008, at 12:42 PM, Kirk Strauser wrote:
> I'm almost ready to give up on this.
> I've gone as far as completely rewriting the original C++ program into
> straightforward C, and still the performance is terrible on FreeBSD
> versus Linux.

On Linux, GNU libc buffers file data much more extensively than
FreeBSD's libc does. It means that doing things like reading a dozen
bytes or so at a time is not intolerably slow on the former system, but
that doesn't mean that it's a great idea either.

If your data files are small enough to fit into 2GB of address space,
try using mmap() and then treat the file(s) as an array of records or
memoblocks or whatever, and let the VM system deal with paging in the
parts of the file you need.

Otherwise, don't fread() one record at a time; read in at least a
(VM page / sizeof(record)) number of records at a time into a bigger
buffer, and then process that in RAM rather than trying to fseek in
little increments. (This is the opposite of calling setvbuf() to set
the I/O buffer to, say, 13 bytes...)

Also, if you're malloc'ing and freeing buf & memohead with every
iteration of the loop, you're just thrashing the malloc system; instead,
allocate your buffers once before the loop, and reuse them (zeroize or
copy new data over the previous results).

Regards,
-- 
-Chuck

> Also note that on the FreeBSD machine, I have enough RAM to buffer the
> entire file, and in practice gstat shows that the drives are idle for
> subsequent runs after the first one.
>
> Right now my code looks a lot like:
>
> for(recordnum = 0; recordnum < recordcount; recordnum++) {
>     buf = malloc(recordlength);
>     fread(buf, recordlength, 1, dbffile);
>
>     /* Do stuff with buf */
>
>     memoblock = getmemoblock(buf);
>     /* Skip to the requested block if we're not already there */
>     if(memoblock != currentmemofileblock) {
>         currentmemofileblock = memoblock;
>         fseek(memofile, currentmemofileblock * memoblocksize, SEEK_SET);
>     }
>     memohead = malloc(memoblocksize);
>     fread(memohead, memoblocksize, 1, memofile);
>     currentmemofileblock++;
>
>     /* Do stuff with memohead */
>
>     free(memohead);
>     free(buf);
> }
>
> ...where recordlength == 13 in this one case. Given that the whole file
> is buffered in RAM, the small reads shouldn't make a difference, should
> they? I've played with setvbuf() and it shaves off a few percent of
> runtime, but nothing to write home about.
>
> Now, memofile gets quite a lot of seeks. Again, that shouldn't make too
> much of a difference if it's already buffered in RAM, should it?
> setvbuf() on that file that gets lots of random access actually made
> performance worse.
>
> What else can I do to make my code run as well on FreeBSD as it does on
> a much wimpier Linux machine? I'm almost to the point of throwing in
> the towel and making a Linux server to do nothing more than run this
> one program if I can't get FreeBSD's performance more on parity, and I
> honestly never thought I'd be considering that.
>
> I'll gladly give shell access with my code and sample data files if
> anyone is interested in testing it.
> -- 
> Kirk Strauser
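
For illustration, here is a rough sketch of the "read many records at
once, allocate the buffer once" advice from the reply above. It is not
taken from Kirk's program: process_records(), record_fn, count_record()
and the 64 KB chunk size are hypothetical names and values chosen for
the example, and it only covers the sequential record file, not the
seek-heavy memo file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type: invoked once per record, in file order. */
typedef void (*record_fn)(const unsigned char *record, size_t recordlength,
                          void *ctx);

/* Hypothetical example callback: counts the records it is handed. */
void count_record(const unsigned char *record, size_t recordlength, void *ctx)
{
    (void)record;
    (void)recordlength;
    ++*(long *)ctx;
}

/*
 * Read up to recordcount fixed-length records from dbffile in large
 * batches instead of one fread() per record. The buffer is allocated
 * once, before the loop, and reused for every batch.
 * Returns the number of records processed, or -1 on bad arguments or
 * allocation failure.
 */
long process_records(FILE *dbffile, size_t recordlength, long recordcount,
                     record_fn fn, void *ctx)
{
    const size_t chunk_bytes = 64 * 1024; /* assumption: several VM pages */
    size_t per_batch;
    unsigned char *buf;
    long done = 0;

    if (dbffile == NULL || fn == NULL || recordlength == 0 || recordcount < 0)
        return -1;

    per_batch = chunk_bytes / recordlength;
    if (per_batch == 0)
        per_batch = 1; /* a single record is larger than the chunk */

    buf = malloc(per_batch * recordlength);
    if (buf == NULL)
        return -1;

    while (done < recordcount) {
        size_t want = per_batch;
        size_t got, i;

        if ((long)want > recordcount - done)
            want = (size_t)(recordcount - done);

        /* One library call fetches a whole batch of records. */
        got = fread(buf, recordlength, want, dbffile);

        /* Process the batch entirely in RAM. */
        for (i = 0; i < got; i++)
            fn(buf + i * recordlength, recordlength, ctx);

        done += (long)got;
        if (got < want)
            break; /* end of file or read error */
    }

    free(buf);
    return done;
}
```

With recordlength == 13, a call such as
process_records(dbffile, 13, recordcount, count_record, &seen) would
issue one fread() per few thousand records rather than one per record.
Whether that closes the gap with Linux would need to be measured.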