From owner-freebsd-database@FreeBSD.ORG Mon Jul 8 12:41:01 2013
From: Artem Naluzhnyy
Date: Mon, 8 Jul 2013 15:40:19 +0300
Subject: RAID10 stripe size and PostgreSQL performance
To: freebsd-database@freebsd.org, freebsd-fs@freebsd.org
List-Id: Database use and development under FreeBSD

Hi,

I'm benchmarking PostgreSQL using different RAID10 stripe size values for a new server.
Tried bonnie++ and pgbench on two stripe size configurations:

* 32 KB (half of the current UFS bsize) - 254 pgbench tps
* 1 MB (max supported by the RAID controller) - 626 pgbench tps

See OS/hardware configuration, benchmark methodology and raw results here - http://pastebin.com/F8uZEZdm

Is this expected behavior with more than twice higher pgbench tps on 1MB stripe size? Are there any RAID stripe size recommendations for better PostgreSQL performance?

(I cannot change the FS type, standard PG block size etc. - they are locked by the vendor in this commercial FreeBSD distribution)

--
Artem Naluzhnyy

From owner-freebsd-database@FreeBSD.ORG Thu Jul 11 21:08:01 2013
From: Artem Naluzhnyy
Date: Fri, 12 Jul 2013 00:07:19 +0300
Subject: Re: RAID10 stripe size and PostgreSQL performance
To: Ivan Voras
Cc: freebsd-fs@freebsd.org, freebsd-database@freebsd.org

On Mon, Jul 8, 2013 at 6:16 PM, Ivan Voras wrote:
> On 08/07/2013 14:40, Artem Naluzhnyy wrote:
>> Is this expected behavior with more than twice higher pgbench tps on
>> 1MB stripe size?
>
> No, it is not.
>
> For start, can you please repeat your benchmarks but with restarting the
> PostgreSQL server between each pgbench run?

Fresh OS installation without DB warning, reboot after pgbench DB initialization (DB size: 26 GB) before benchmarking:

* 32 KB (half of the UFS bsize) - tps=198
* 64 KB - tps=226
* 128 KB (default for the RAID controller) - tps=298
* 1 MB (max for the RAID controller) - tps=347

> Also, you should make sure that the database is located on the same
> location on the disk platters by e.g. creating a small partition which
> is about 150% larger than your pgbench database (and your pgbench
> database should be at least 2x larger than your RAM, if you are going to
> benchmark IO and not memory caches), which is located at the same
> position (byte offset) in your RAID10 volume.

Unfortunately it's not that easy to make a custom partitioning. However, all tests were done just after the server reinstallation using exactly the same order of commands.
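
To put the four post-reboot results in this message on a common scale, the short Python sketch below expresses each tps figure relative to the 32 KB run. The numbers are the ones reported in this message for the 26 GB database; the variable names are only illustrative.

```python
# pgbench results reported in this message (26 GB database),
# keyed by RAID stripe size in KB.
tps_by_stripe_kb = {32: 198, 64: 226, 128: 298, 1024: 347}

baseline = tps_by_stripe_kb[32]

# Throughput of each stripe size relative to the 32 KB run.
speedup = {kb: tps / baseline for kb, tps in tps_by_stripe_kb.items()}

for kb, ratio in speedup.items():
    print(f"{kb:>4} KB stripe: {tps_by_stripe_kb[kb]} tps, "
          f"{ratio:.2f}x the 32 KB result")
```

On these figures the 1 MB stripe is about 1.75 times faster than 32 KB, compared with roughly 2.5 times (626 vs 254 tps) in the first run, where PostgreSQL was not restarted between tests.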
The server has 24 GB RAM, so with 88 GB DB we have:

* 32 KB stripe - tps=161
* 1 MB stripe - tps=258

The server is used for VoIP billing, and it also writes lots of plain-text log files. Would it still be better to use a 1 MB stripe size, or might that have some side effects on performance?

--
Artem Naluzhnyy

From owner-freebsd-database@FreeBSD.ORG Fri Jul 12 13:55:43 2013
From: Ivan Voras
Date: Fri, 12 Jul 2013 15:55:18 +0200
Subject: Re: RAID10 stripe size and PostgreSQL performance
To: freebsd-database@freebsd.org
Cc: freebsd-fs@freebsd.org

On 11/07/2013 23:07, Artem Naluzhnyy wrote:
> On Mon, Jul 8, 2013 at 6:16 PM, Ivan Voras wrote:
>> On 08/07/2013 14:40, Artem Naluzhnyy wrote:
>>> Is this expected behavior with more than twice higher pgbench tps on
>>> 1MB stripe size?
>>
>> No, it is not.
>>
>> For start, can you please repeat your benchmarks but with restarting the
>> PostgreSQL server between each pgbench run?
>
> Fresh OS installation without DB warning, reboot after pgbench DB
> initialization (DB size: 26 GB) before benchmarking:
>
> * 32 KB (half of the UFS bsize) - tps=198
>
> * 64 KB - tps=226
>
> * 128 KB (default for the RAID controller) - tps=298
>
> * 1 MB (max for the RAID controller) - tps=347

I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm and you have a mirror of stripes (RAID-01), not a stripe of mirrors (RAID-10). And apparently, if I parse your configuration correctly, you have a 1M stripe in the MIRROR part of the RAID, and an unknown stripe size in the STRIPE part.

Mirroring may help your read performance, but will not help your write performance. If you are running pgbench with default settings, and with a test database that can fit in RAM, you probably cache all reads eventually and then writes become the bottleneck.

>> Also, you should make sure that the database is located on the same
>> location on the disk platters by e.g. creating a small partition which
>> is about 150% larger than your pgbench database (and your pgbench
>> database should be at least 2x larger than your RAM, if you are going to
>> benchmark IO and not memory caches), which is located at the same
>> position (byte offset) in your RAID10 volume.
>
> Unfortunately it's not that easy to make a custom partitioning.
> However, all tests were done just after the server reinstallation
> using exactly the same order of commands.

I'm not saying that your production database should be on a custom partition, but your pgbench test database (and the file for the following test) should.

Anyway, could you please do one more test:

1) create a large file with "dd if=/dev/zero of=file bs=1m count=48000"
2) install /usr/ports/benchmarks/randomio
3) run "randomio file 8 0.5 1 8192 10 10"

... and report the results.

From owner-freebsd-database@FreeBSD.ORG Fri Jul 12 19:15:45 2013
From: Artem Naluzhnyy
Date: Fri, 12 Jul 2013 22:15:03 +0300
Subject: Re: RAID10 stripe size and PostgreSQL performance
To: Ivan Voras
Cc: freebsd-database@freebsd.org, freebsd-fs@freebsd.org

On Fri, Jul 12, 2013 at 4:55 PM, Ivan Voras wrote:
> I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm
> and you have a mirror of stripes (RAID-01), not a stripe of mirrors
> (RAID-10). And apparently, if I parse your configuration correctly, you
> have a 1M stripe in the MIRROR part of the RAID, and an unknown stripe
> size in the STRIPE part.

This is probably a bug in mfiutil output. There is no "RAID 01" option in the controller configuration, and its documentation says (http://goo.gl/6X5pe):

"RAID 10, a combination of RAID 0 and RAID 1, consists of striped data across mirrored spans. A RAID 10 drive group is a spanned drive group that creates a striped set from a series of mirrored drives. RAID 10 allows a maximum of eight spans. You must use an even number of drives in each RAID virtual drive in the span. The RAID 1 virtual drives must have the same stripe size."

There are also no options to configure a different stripe size for the mirrors; I can only set it globally for the whole RAID 10 volume.

> Anyway, could you please do one more test:
>
> 1) create a large file with "dd if=/dev/zero of=file bs=1m count=48000"
> 2) install /usr/ports/benchmarks/randomio
> 3) run "randomio file 8 0.5 1 8192 10 10"
>
> ... and report the results.

See results at the end of http://pastebin.com/F8uZEZdm

There is yet another issue that (I guess) makes all previous benchmarks kinda inaccurate and irrelevant - it looks like the UFS partitions are not aligned properly:

$ gpart show
=>        63  1167966145  mfid0  MBR  (557G)
          63  1167957567      1  freebsd  [active]  (556G)
  1167957630        8578         - free -  (4.2M)

=>         0  1167957567  mfid0s1  BSD  (556G)
           0     4194304        1  freebsd-ufs  (2.0G)
     4194304    16777216        2  freebsd-swap  (8.0G)
    20971520  1130217472        5  freebsd-ufs  (539G)
  1151188992    16768575        4  freebsd-ufs  (8G)

I will also try to fix the alignment and run some more tests.

--
Artem Naluzhnyy
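
The alignment concern raised with the gpart output above can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original thread: it assumes 512-byte sectors (the gpart output does not state the sector size) and takes the slice start and BSD label offsets exactly as printed. The function name `misalignment` is hypothetical.

```python
SECTOR_BYTES = 512  # assumption: gpart is reporting 512-byte sectors

SLICE_START = 63  # first sector of mfid0s1 on mfid0, from `gpart show`

# (label index, start offset inside mfid0s1 in sectors, type), from `gpart show`
PARTITIONS = [
    (1, 0, "freebsd-ufs"),
    (2, 4194304, "freebsd-swap"),
    (5, 20971520, "freebsd-ufs"),
    (4, 1151188992, "freebsd-ufs"),
]


def misalignment(offset_sectors, boundary_bytes):
    """Bytes between the partition start and the preceding boundary (0 = aligned)."""
    absolute_bytes = (SLICE_START + offset_sectors) * SECTOR_BYTES
    return absolute_bytes % boundary_bytes


for index, offset, fs_type in PARTITIONS:
    for label, boundary in (("32 KB", 32 * 1024), ("1 MB", 1024 * 1024)):
        print(f"partition {index} ({fs_type}): "
              f"{misalignment(offset, boundary)} bytes past a {label} boundary")
```

Under that sector-size assumption, every offset inside the BSD label is itself a multiple of 1 MB, so the whole misalignment comes from the slice starting at sector 63: each partition begins 63 * 512 = 32256 bytes past a stripe boundary for both the 32 KB and the 1 MB stripe sizes.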