From owner-freebsd-questions@FreeBSD.ORG Sun Jan 7 23:16:04 2007 Return-Path: X-Original-To: freebsd-questions@freebsd.org Delivered-To: freebsd-questions@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1FCD516A415 for ; Sun, 7 Jan 2007 23:16:04 +0000 (UTC) (envelope-from freebsd-questions@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id A308513C442 for ; Sun, 7 Jan 2007 23:16:03 +0000 (UTC) (envelope-from freebsd-questions@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1H3hF1-0008TR-SH for freebsd-questions@freebsd.org; Mon, 08 Jan 2007 00:15:52 +0100 Received: from 89-172-46-216.adsl.net.t-com.hr ([89.172.46.216]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Jan 2007 00:15:51 +0100 Received: from ivoras by 89-172-46-216.adsl.net.t-com.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Jan 2007 00:15:51 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-questions@freebsd.org From: Ivan Voras Date: Mon, 08 Jan 2007 00:15:22 +0100 Lines: 81 Message-ID: References: <200701071301.27423.kirk@strauser.com> <200701071618.11242.kirk@strauser.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig00ECDF92C285B46F597CD921" X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: 89-172-46-216.adsl.net.t-com.hr User-Agent: Thunderbird 1.5.0.9 (Windows/20061207) In-Reply-To: <200701071618.11242.kirk@strauser.com> X-Enigmail-Version: 0.94.1.2 Sender: news Subject: Re: Tuning PostgreSQL for bulk imports X-BeenThere: freebsd-questions@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: User questions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jan 2007 23:16:04 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig00ECDF92C285B46F597CD921 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Kirk Strauser wrote: > On Sunday 07 January 2007 15:15, Ivan Voras wrote: >> - What might help you is to keep the WAL (write-ahead-log, i.e. journa= l) >> files on a completely separate (and fast) drive from the rest of the >> database, to allow parallelism and speed. For best results, format it >> with 32k blocks/8k fragments. >=20 > Thanks for the idea. Assuming I actually get my wish of a matched set = of 4=20 > high speed drives, would I be better off setting one aside for the jour= nal,=20 > or striping them all together so everything benefits? Everything is first written to the WAL, and then copied to the "normal" database. So, speed is important, but maybe striping all 4 drives would be an overkill, though. The logs are written sequentially so seek times are not that important. What are your current IO rates? Since you have only two drives you might be restricted by available disk bandwidth... >> - If you don't specifically need the atomicity of transactions, you >> might divide your import into many small transactions, for example one= >> for every 100,000 rows instead of doing 8 million at once. >=20 > Would that actually make a difference in total elapsed time spent impor= ting? It will certainly lower the final "commit" time and avoid copying large chunks between the WAL and the database, but I don't have my own measurements. Do you have a SMP machine? If so, you can try importing NCPU of the big transactions in parallel. This could help you almost linearly, if your disks allow it. (see http://www.postgresql.org/docs/8.1/interactive/runtime-config-wal.html : you can increase commit_delay and decrease commit_siblings to achieve sort of localized-async operation across several (parallel) transactions)= I've just remembered - do you run VACUUM ANALYZE after your big imports? You should (to get performance on SELECTs afterwards, but it won't help the imports themselves). >> Also, what version of PostgreSQL are you using? As a general rule, the= >> newer the version, the faster it is. This is especially true if you're= >> using 7.x - go to 8.1.5 immediately (but don't use 8.2 until 8.2.1 get= s >> out). >=20 > We're already running 8.2 because it fixed some problems we were having= with=20 > 8.1.5. Other than the excessively long import times, it's absolutely=20 > screaming and we couldn't be more pleased. Ok. Be careful not to use SELECT ... LIMIT ALL ;) --------------enig00ECDF92C285B46F597CD921 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.4 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFoX8QldnAQVacBcgRAvKjAKD7JUUai6aglWT+r0FyyVZfVvWNyACg6a3A JgbKfyIDMTI7XGREDDnORhI= =4sNE -----END PGP SIGNATURE----- --------------enig00ECDF92C285B46F597CD921--