From: Kirk Strauser
To: Chuck Swiger
Cc: freebsd-questions@freebsd.org
Date: Thu, 12 Jun 2008 14:17:13 -0500
Subject: Re: Poor read() performance, and I can't profile it
Message-Id: <200806121417.19885.kirk@strauser.com>
References: <200806051508.29424.kirk@strauser.com> <200806111442.50935.kirk@strauser.com>

On Wednesday 11 June 2008, Chuck Swiger wrote:
> If your data files are small enough to fit into 2GB of address space,
> try using mmap() and then treat the file(s) as an array of records or
> memoblocks or whatever, and let the VM system deal with paging in the
> parts of the file you need.  Otherwise, don't fread() 1 record at a
> time, read in at least a (VM page / sizeof(record)) number of records
> at a time into a bigger buffer, and then process that in RAM rather
> than trying to fseek in little increments.

During a marathon session last night, I did just that.  I changed the
sequential reads in the "outer" file to fread() many records at a time.
Then I switched to mmap() for the random-access file.  The results were
much better, with good CPU usage and only about 3 times the wall-clock
runtime of the Linux box:

kirk@linux$ date; time /tmp/cdbf /tmp/invoice.dbf >/dev/null; date
Thu Jun 12 13:56:49 CDT 2008
/tmp/cdbf /tmp/invoice.dbf > /dev/null  29.00s user 11.16s system 56% cpu 1:11.03 total
Thu Jun 12 13:58:00 CDT 2008

kirk@freebsd$ date; time /tmp/cdbf ~pgsql/data/frodumps/xbase/invoice.dbf invid ln >/dev/null; date
Thu Jun 12 14:10:57 CDT 2008
/tmp/cdbf ~pgsql/data/frodumps/xbase/invoice.dbf invid ln > /dev/null  38.14s user 6.21s system 23% cpu 3:05.13 total
Thu Jun 12 14:14:02 CDT 2008

> Also, if you're malloc'ing and freeing buf & memohead with every
> iteration of the loop, you're just thrashing the malloc system;
> instead, allocate your buffers once before the loop, and reuse them
> (zeroize or copy new data over the previous results) instead.

Also done.
I'd gotten some technical advice from Slashdot (which speaks volumes
for my cluelessness, granted) that made the malloc-per-iteration
approach sound like a good idea.  I changed almost all the mallocs into
static buffers.

I'm still offering that shell account to anyone who wants to take a peek. :-)

-- 
Kirk Strauser
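
For anyone reading this in the archives, here is a minimal sketch of the
two changes discussed in this message: batched fread() into a buffer
that is allocated once, and mmap() for random access by record number.
It is not the cdbf source.  REC_SIZE, BATCH, process_record(),
scan_sequential(), map_file(), record_at() and unmap_file() are all
made-up names for illustration, and it assumes fixed-length records and
a file small enough to map in one piece.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define REC_SIZE 64    /* hypothetical fixed record length, in bytes */
#define BATCH    1024  /* hypothetical number of records per fread() */

/* Hypothetical per-record work: just return the record's first byte. */
static unsigned long process_record(const unsigned char *rec)
{
    return rec[0];
}

/* Sequential pass over the "outer" file: the buffer is allocated once,
 * before the loop, and refilled with up to BATCH records per fread(). */
unsigned long scan_sequential(const char *path, size_t *nrecs_out)
{
    unsigned long sum = 0;
    size_t total = 0, got;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    unsigned char *buf = malloc((size_t)REC_SIZE * BATCH);
    if (buf == NULL) {
        fclose(fp);
        return 0;
    }
    /* fread() returns the number of complete records it read. */
    while ((got = fread(buf, REC_SIZE, BATCH, fp)) > 0) {
        for (size_t i = 0; i < got; i++)
            sum += process_record(buf + i * REC_SIZE);
        total += got;
    }
    free(buf);
    fclose(fp);
    if (nrecs_out != NULL)
        *nrecs_out = total;
    return sum;
}

/* Read-only mapping of the random-access file. */
typedef struct {
    const unsigned char *base;
    size_t len;
} mapped_file;

int map_file(const char *path, mapped_file *mf)
{
    struct stat st;
    assert(mf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;
    mf->base = p;
    mf->len = (size_t)st.st_size;
    return 0;
}

/* Treat the mapping as an array of records; NULL if idx is out of range. */
const unsigned char *record_at(const mapped_file *mf, size_t idx)
{
    if (idx >= mf->len / REC_SIZE)
        return NULL;
    return mf->base + idx * REC_SIZE;
}

void unmap_file(mapped_file *mf)
{
    munmap((void *)mf->base, mf->len);
    mf->base = NULL;
    mf->len = 0;
}
```

Typical use would be to call scan_sequential() on the outer file, and
for the lookup file call map_file() once, record_at() for each lookup,
and unmap_file() at the end.  The VM system then pages in only the parts
of the file that are actually touched, as Chuck described.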