From owner-freebsd-performance@FreeBSD.ORG  Mon Nov 22 00:22:22 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CA54F106566C
	for <freebsd-performance@freebsd.org>;
	Mon, 22 Nov 2010 00:22:22 +0000 (UTC)
	(envelope-from gofp-freebsd-performance@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 53BFF8FC14
	for <freebsd-performance@freebsd.org>;
	Mon, 22 Nov 2010 00:22:21 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <gofp-freebsd-performance@m.gmane.org>)
	id 1PKKAi-0003rn-8D
	for freebsd-performance@freebsd.org; Mon, 22 Nov 2010 01:22:16 +0100
Received: from cpe-188-129-98-75.dynamic.amis.hr ([188.129.98.75])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Mon, 22 Nov 2010 01:22:16 +0100
Received: from ivoras by cpe-188-129-98-75.dynamic.amis.hr with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Mon, 22 Nov 2010 01:22:16 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-performance@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 22 Nov 2010 01:21:58 +0100
Lines: 46
Message-ID: <iccd37$lhh$1@dough.gmane.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: cpe-188-129-98-75.dynamic.amis.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
Cc: freebsd-hackers@freebsd.org
Subject: PostgreSQL performance scaling
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Nov 2010 00:22:22 -0000

This is not a request for help but a report, in case it helps developers 
or someone in the future. The setup is:

AMD64 machine, 24 GB RAM, 2x6-core Xeon CPU + HTT (24 logical CPUs)
FreeBSD 8.1-stable, AMD64
PostgreSQL 9.0.1, 10 GB shared buffers, using pgbench with a scale 
factor of 500 (7.5 GB database)

With pgbench -S (SELECT-only queries, no disk I/O), the performance curve is:

-c#	result (tps)
4	33549
8	64864
12	79491
16	79887
20	66957
24	52576
28	50406
32	49491
40	45535
50	39499
75	29415
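
To put a number on the drop, here is a minimal sketch that finds the peak 
row and expresses any other row as a percentage of it. The arrays are 
copied from the table above; the function names are invented for this 
illustration and are not part of pgbench.

```c
#include <assert.h>
#include <stddef.h>

/* pgbench -S results copied from the table above. */
static const int  clients[] = { 4, 8, 12, 16, 20, 24, 28, 32, 40, 50, 75 };
static const long result[]  = { 33549, 64864, 79491, 79887, 66957, 52576,
                                50406, 49491, 45535, 39499, 29415 };
#define NRESULTS (sizeof(result) / sizeof(result[0]))

/* Index of the row with the highest result. */
size_t peak_index(void)
{
    size_t best = 0;
    for (size_t i = 1; i < NRESULTS; i++)
        if (result[i] > result[best])
            best = i;
    return best;
}

/* Client count (-c) at which the peak was measured. */
int peak_clients(void)
{
    return clients[peak_index()];
}

/* Row i as a whole-number percentage of the peak, rounded down. */
long percent_of_peak(size_t i)
{
    assert(i < NRESULTS);
    return result[i] * 100 / result[peak_index()];
}
```

With these numbers the peak is the 16-client row, and percent_of_peak(10) 
puts the 75-client row at roughly 37% of it (36 after rounding down).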

After 16 clients (which is still good since there are only 12 "real" 
cores in the system), the performance drops sharply. Looking at the 
processes' state, most of them seem to spend their time in system calls 
(i.e. executing in the kernel), in the "semwait" and "sbwait" states, 
i.e. semaphore wait and socket buffer wait, for example:

  3107 pgsql       1  62    0 10533M   439M CPU1    0   0:02 13.57% postgres
  3105 pgsql       1  63    0 10533M   438M CPU9    9   0:02 13.57% postgres
  3109 pgsql       1  62    0 10533M   440M sbwait 13   0:02 13.48% postgres
  3106 pgsql       1  61    0 10533M   445M sbwait  8   0:02 13.48% postgres
  3118 pgsql       1  62    0 10533M   431M sbwait 21   0:02 13.48% postgres
  3114 pgsql       1  63    0 10533M   434M sbwait 19   0:02 13.38% postgres
  3122 pgsql       1  63    0 10533M   428M sbwait 15   0:02 13.28% postgres
  3108 pgsql       1  63    0 10533M   439M sbwait  5   0:02 13.18% postgres
  3116 pgsql       1  62    0 10533M   432M sbwait 11   0:02 13.18% postgres
  3113 pgsql       1  62    0 10533M   430M semwai 20   0:02 13.18% postgres
  3115 pgsql       1  62    0 10533M   428M RUN    14   0:02 13.18% postgres

The "semwait" part is from PostgreSQL - probably shared buffer locking, 
but there's a large number of processes regularly in sbwait - maybe 
something can be optimized here?

This is IPC over Unix sockets.


From owner-freebsd-performance@FreeBSD.ORG  Mon Nov 22 06:14:31 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B634F1065670
	for <freebsd-performance@freebsd.org>;
	Mon, 22 Nov 2010 06:14:31 +0000 (UTC) (envelope-from feld@feld.me)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 875098FC0C
	for <freebsd-performance@freebsd.org>;
	Mon, 22 Nov 2010 06:14:31 +0000 (UTC)
Received: by iwn39 with SMTP id 39so8148767iwn.13
	for <freebsd-performance@freebsd.org>;
	Sun, 21 Nov 2010 22:14:31 -0800 (PST)
Received: by 10.231.19.8 with SMTP id y8mr6455970iba.111.1290404958172;
	Sun, 21 Nov 2010 21:49:18 -0800 (PST)
Received: from skeletor.lan (66-168-54-242.dhcp.mdsn.wi.charter.com
	[66.168.54.242])
	by mx.google.com with ESMTPS id 34sm5089638ibi.20.2010.11.21.21.49.17
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Sun, 21 Nov 2010 21:49:17 -0800 (PST)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-performance@freebsd.org
References: <iccd37$lhh$1@dough.gmane.org>
Date: Sun, 21 Nov 2010 23:49:16 -0600
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Mark Felder" <feld@feld.me>
Message-ID: <op.vmj44dm634t2sn@skeletor.lan>
In-Reply-To: <iccd37$lhh$1@dough.gmane.org>
User-Agent: Opera Mail/11.00 (FreeBSD)
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Mon, 22 Nov 2010 06:14:31 -0000

I recommend posting this on the Postgres performance list, too.


Regards,


Mark

From owner-freebsd-performance@FreeBSD.ORG  Mon Nov 22 08:37:23 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 30680106566B
	for <freebsd-performance@freebsd.org>;
	Mon, 22 Nov 2010 08:37:23 +0000 (UTC)
	(envelope-from davidxu@freebsd.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 053D58FC12;
	Mon, 22 Nov 2010 08:37:23 +0000 (UTC)
Received: from xyf.my.dom (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAM8bL0T093475;
	Mon, 22 Nov 2010 08:37:22 GMT (envelope-from davidxu@freebsd.org)
Message-ID: <4CEA9C46.8010507@freebsd.org>
Date: Mon, 22 Nov 2010 16:37:26 +0000
From: David Xu <davidxu@freebsd.org>
User-Agent: Thunderbird 2.0.0.24 (X11/20100630)
MIME-Version: 1.0
To: Mark Felder <feld@feld.me>
References: <iccd37$lhh$1@dough.gmane.org> <op.vmj44dm634t2sn@skeletor.lan>
In-Reply-To: <op.vmj44dm634t2sn@skeletor.lan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-performance@freebsd.org
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Mon, 22 Nov 2010 08:37:23 -0000

Mark Felder wrote:
> I recommend posting this on the Postgres performance list, too.

I think that if PostgreSQL uses semaphores for inter-process locking,
it might be a good idea to use the POSIX semaphores that exist in our
head branch. The new POSIX semaphore implementation now supports
process-shared semaphores and is more lightweight than SYSV semaphores:
if there is no contention, a process need not enter the kernel to
acquire or release a lock. Note that I have just fixed a bug in the head
branch. However, RELENG_8 does not support process-shared semaphores
yet.
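
A minimal sketch of what a process-shared unnamed POSIX semaphore looks
like when used as a lock between forked processes. This is only an
illustration, not PostgreSQL's code: the struct and function names are
invented, and it assumes a system where sem_init() with a non-zero
second argument works (on some systems the program must be linked with
-pthread).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANON on glibc in strict -std modes */
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared region: one semaphore guarding one counter. */
struct shared_state {
    sem_t lock;
    long  counter;
};

/*
 * Fork nprocs children; each adds 1 to the shared counter iters times
 * while holding the semaphore.  Returns the final counter, or -1 on error.
 */
long run_shared_counter(int nprocs, long iters)
{
    assert(nprocs > 0 && iters >= 0);

    /* The semaphore must live in memory that every process maps. */
    struct shared_state *st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANON, -1, 0);
    if (st == MAP_FAILED)
        return -1;
    st->counter = 0;

    /* Second argument 1 = process-shared; third argument 1 = unlocked. */
    if (sem_init(&st->lock, 1, 1) != 0) {
        munmap(st, sizeof(*st));
        return -1;
    }

    int started = 0, failed = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            for (long n = 0; n < iters; n++) {
                while (sem_wait(&st->lock) != 0)
                    if (errno != EINTR) /* retry only when interrupted */
                        _exit(1);
                st->counter++;
                if (sem_post(&st->lock) != 0)
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was started and note abnormal exits. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    long total = failed ? -1 : st->counter;
    sem_destroy(&st->lock);
    munmap(st, sizeof(*st));
    return total;
}
```

Calling run_shared_counter(4, 1000) should return 4000 if the semaphore
really serialises the increments across processes.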

Regards,
David Xu


From owner-freebsd-performance@FreeBSD.ORG  Tue Nov 23 00:30:07 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ED38F106564A
	for <freebsd-performance@freebsd.org>;
	Tue, 23 Nov 2010 00:30:07 +0000 (UTC)
	(envelope-from gofp-freebsd-performance@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 963DF8FC15
	for <freebsd-performance@freebsd.org>;
	Tue, 23 Nov 2010 00:30:06 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <gofp-freebsd-performance@m.gmane.org>)
	id 1PKglp-0007v4-FE
	for freebsd-performance@freebsd.org; Tue, 23 Nov 2010 01:30:05 +0100
Received: from cpe-188-129-85-205.dynamic.amis.hr ([188.129.85.205])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Tue, 23 Nov 2010 01:30:05 +0100
Received: from ivoras by cpe-188-129-85-205.dynamic.amis.hr with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Tue, 23 Nov 2010 01:30:05 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-performance@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 23 Nov 2010 01:26:27 +0100
Lines: 38
Message-ID: <icf1nk$192$1@dough.gmane.org>
References: <iccd37$lhh$1@dough.gmane.org> <op.vmj44dm634t2sn@skeletor.lan>
	<4CEA9C46.8010507@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: cpe-188-129-85-205.dynamic.amis.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <4CEA9C46.8010507@freebsd.org>
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Tue, 23 Nov 2010 00:30:08 -0000

On 11/22/10 17:37, David Xu wrote:
> I think if PostgreSQL uses semaphore for inter-process locking,
> it might be a good idea to use POSIX semaphore exists in our head
> branch, the new POSIX semaphore implementation now supports
> process-shared, and is more light weight than SYSV semaphore,
> if there is no contention, a process need not enter kernel to
> acquire/release a lock. Note that I have just fixed a bug in head
> branch. However RELENG_8 does not support process-shared semaphore
> yet.

Another possibility is that, although they appear to try to avoid it, 
they have a large number of processes waiting on the same semaphore, 
leading to a thundering herd problem.

There already is code for POSIX semaphores in PostgreSQL. It requires 
some manual fiddling with the configuration to enable 
(USE_UNNAMED_POSIX_SEMAPHORES).

However, I've just tried it on 9-CURRENT and it doesn't work:

Nov 23 01:23:02 biggie postgres[1515]: [1-1] FATAL:  sem_init failed: No space left on device

PostgreSQL calls it as "sem_init(sem, 1, 1);"

One more thing: apparently I had to kldload sem.ko, which looks like an 
error, since it is in GENERIC in 8-STABLE!


From owner-freebsd-performance@FreeBSD.ORG  Tue Nov 23 00:51:36 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 495E91065693
	for <freebsd-performance@freebsd.org>;
	Tue, 23 Nov 2010 00:51:36 +0000 (UTC)
	(envelope-from gofp-freebsd-performance@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id C03578FC26
	for <freebsd-performance@freebsd.org>;
	Tue, 23 Nov 2010 00:51:35 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <gofp-freebsd-performance@m.gmane.org>)
	id 1PKh6b-0006xC-Qa
	for freebsd-performance@freebsd.org; Tue, 23 Nov 2010 01:51:33 +0100
Received: from cpe-188-129-85-205.dynamic.amis.hr ([188.129.85.205])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Tue, 23 Nov 2010 01:51:33 +0100
Received: from ivoras by cpe-188-129-85-205.dynamic.amis.hr with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Tue, 23 Nov 2010 01:51:33 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-performance@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 23 Nov 2010 01:51:21 +0100
Lines: 78
Message-ID: <icf36a$8ik$1@dough.gmane.org>
References: <iccd37$lhh$1@dough.gmane.org>
	<op.vmj44dm634t2sn@skeletor.lan>	<4CEA9C46.8010507@freebsd.org>
	<icf1nk$192$1@dough.gmane.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: cpe-188-129-85-205.dynamic.amis.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <icf1nk$192$1@dough.gmane.org>
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Tue, 23 Nov 2010 00:51:36 -0000

On 11/23/10 01:26, Ivan Voras wrote:
> On 11/22/10 17:37, David Xu wrote:
>> I think if PostgreSQL uses semaphore for inter-process locking,
>> it might be a good idea to use POSIX semaphore exists in our head
>> branch, the new POSIX semaphore implementation now supports
>> process-shared, and is more light weight than SYSV semaphore,
>> if there is no contention, a process need not enter kernel to
>> acquire/release a lock. Note that I have just fixed a bug in head
>> branch. However RELENG_8 does not support process-shared semaphore
>> yet.
>
> Another thing might be that, despite that they appear to try to avoid
> it, they possibly have a large number of processes hanging on the same
> semaphore, leading to thundering herd problem.
>
> There already is code for POSIX semaphores in PostgreSQL. It requires
> some manual fiddling with the configuration to enable
> (USE_UNNAMED_POSIX_SEMAPHORES).
>
> However, I've just tried it on 9-CURRENT and it doesn't work:
>
> Nov 23 01:23:02 biggie postgres[1515]: [1-1] FATAL: sem_init failed: No
> space left on device

Ok, I've found the p1003_1b.sem_nsems_max sysctl.
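
For context, POSIX documents ENOSPC from sem_init() as meaning that a 
required resource is exhausted or the semaphore limit (SEM_NSEMS_MAX) 
has been reached, which matches the "No space left on device" message. 
Below is a small sketch of a wrapper that reports this more helpfully. 
The helper names are hypothetical; only sem_init(), strerror() and the 
sysctl name mentioned above are real.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: a one-line explanation for a sem_init() errno. */
const char *sem_init_hint(int err)
{
    switch (err) {
    case ENOSPC:
        return "semaphore limit reached; check p1003_1b.sem_nsems_max";
    case EINVAL:
        return "initial value exceeds SEM_VALUE_MAX";
    default:
        return strerror(err);
    }
}

/* Initialise a process-shared semaphore, printing a hint on failure. */
int checked_sem_init(sem_t *sem, unsigned int value)
{
    if (sem_init(sem, 1, value) == 0)
        return 0;

    int err = errno; /* save before fprintf can change it */
    fprintf(stderr, "sem_init failed: %s (%s)\n",
            strerror(err), sem_init_hint(err));
    errno = err;
    return -1;
}
```

A caller would use checked_sem_init(sem, 1) in place of the bare 
sem_init(sem, 1, 1) call and still see the original errno on failure.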

It seems to help when used instead of sysv semaphores, but very little:

sysv semaphores:

-c#    result (tps)
4    33549
8    64864
12    79491
16    79887
20    66957
24    52576
28    50406
32    49491
40    45535
50    39499
75    29415

posix semaphores:

16	79125
20	70061
24	55620

After 20 clients, sys time goes sharply up like before

 procs      memory      page                    disks     faults          cpu
 r  b w     avm    fre   flt  re  pi  po    fr  sr mf0 mf1   in     sy     cs us sy id
27 32 0  11887M  3250M 62442   0   0   0     0   0   0   0   10 255078 109047 18 73 10
30 32 0  11887M  3162M 58165   0   0   0    12   0   0   1    7 272540 114416 17 75  9
29 32 0  11887M  3105M 57487   0   0   0     0   0   0   0    8 279475 117891 15 75 10
16 31 0  11887M  3063M 59215   0   0   0     0   0   0   0    6 295342 121090 16 70 13


and the overall behaviour is similar - the processes spend a lot of time 
in "sbwait" and "ksem" states.


From owner-freebsd-performance@FreeBSD.ORG  Tue Nov 23 01:35:39 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B88E51065670;
	Tue, 23 Nov 2010 01:35:39 +0000 (UTC)
	(envelope-from davidxu@freebsd.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C3C68FC13;
	Tue, 23 Nov 2010 01:35:39 +0000 (UTC)
Received: from xyf.my.dom (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAN1Zb7w056682;
	Tue, 23 Nov 2010 01:35:38 GMT (envelope-from davidxu@freebsd.org)
Message-ID: <4CEB8AEF.7030202@freebsd.org>
Date: Tue, 23 Nov 2010 09:35:43 +0000
From: David Xu <davidxu@freebsd.org>
User-Agent: Thunderbird 2.0.0.24 (X11/20100630)
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <iccd37$lhh$1@dough.gmane.org>	<op.vmj44dm634t2sn@skeletor.lan>	<4CEA9C46.8010507@freebsd.org>	<icf1nk$192$1@dough.gmane.org>
	<icf36a$8ik$1@dough.gmane.org>
In-Reply-To: <icf36a$8ik$1@dough.gmane.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-performance@freebsd.org
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Tue, 23 Nov 2010 01:35:39 -0000

Ivan Voras wrote:
> On 11/23/10 01:26, Ivan Voras wrote:
>> On 11/22/10 17:37, David Xu wrote:
>>> I think if PostgreSQL uses semaphore for inter-process locking,
>>> it might be a good idea to use POSIX semaphore exists in our head
>>> branch, the new POSIX semaphore implementation now supports
>>> process-shared, and is more light weight than SYSV semaphore,
>>> if there is no contention, a process need not enter kernel to
>>> acquire/release a lock. Note that I have just fixed a bug in head
>>> branch. However RELENG_8 does not support process-shared semaphore
>>> yet.
>>
>> Another thing might be that, despite that they appear to try to avoid
>> it, they possibly have a large number of processes hanging on the same
>> semaphore, leading to thundering herd problem.
>>
>> There already is code for POSIX semaphores in PostgreSQL. It requires
>> some manual fiddling with the configuration to enable
>> (USE_UNNAMED_POSIX_SEMAPHORES).
>>
>> However, I've just tried it on 9-CURRENT and it doesn't work:
>>
>> Nov 23 01:23:02 biggie postgres[1515]: [1-1] FATAL: sem_init failed: No
>> space left on device
> 
> Ok, I've found the p1003_1b.sem_nsems_max sysctl.
> 
> It seems to help when used instead of sysv semaphores, but very little:
> 
> sysv semaphores:
> 
> -c#    result
> 4    33549
> 8    64864
> 12    79491
> 16    79887
> 20    66957
> 24    52576
> 28    50406
> 32    49491
> 40    45535
> 50    39499
> 75    29415
> 
> posix semaphores:
> 
> 16    79125
> 20    70061
> 24    55620
> 
> After 20 clients, sys time goes sharply up like before
> 
>  procs      memory      page                    disks     faults          cpu
>  r  b w     avm    fre   flt  re  pi  po    fr  sr mf0 mf1   in     sy     cs us sy id
> 27 32 0  11887M  3250M 62442   0   0   0     0   0   0   0   10 255078 109047 18 73 10
> 30 32 0  11887M  3162M 58165   0   0   0    12   0   0   1    7 272540 114416 17 75  9
> 29 32 0  11887M  3105M 57487   0   0   0     0   0   0   0    8 279475 117891 15 75 10
> 16 31 0  11887M  3063M 59215   0   0   0     0   0   0   0    6 295342 121090 16 70 13
> 
> 
> and the overall behaviour is similar - the processes spend a lot of time 
> in "sbwait" and "ksem" states.
> 
Strange: the POSIX semaphore implementation in the head branch does not
use ksem; it is based on umtx. There is no fixed limit on the number of
POSIX semaphores; the only limit is the process's address space, which
bounds how many semaphores can be used.
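
As a rough illustration of why a userland semaphore can avoid the kernel
when there is no contention: the count lives in ordinary (shareable)
memory and is changed with atomic operations, so the kernel is needed
only when a waiter has to sleep. The sketch below is not the libc or
umtx code; all names are hypothetical, and the sleeping path is replaced
by sched_yield() to keep the example portable.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland semaphore: just an atomic counter. */
typedef struct {
    atomic_int count;
} fast_sem;

void fast_sem_init(fast_sem *s, int value)
{
    assert(value >= 0);
    atomic_init(&s->count, value);
}

/* Fast path: take one unit with a compare-and-swap, no system call. */
bool fast_sem_trywait(fast_sem *s)
{
    int c = atomic_load_explicit(&s->count, memory_order_relaxed);
    while (c > 0) {
        /* On failure the CAS reloads c with the current count. */
        if (atomic_compare_exchange_weak_explicit(&s->count, &c, c - 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false; /* count was zero: a real implementation would sleep */
}

/* Slow path stand-in: yield the CPU instead of sleeping in the kernel. */
void fast_sem_wait(fast_sem *s)
{
    while (!fast_sem_trywait(s))
        sched_yield();
}

/* Release one unit; a real implementation would also wake a sleeper. */
void fast_sem_post(fast_sem *s)
{
    atomic_fetch_add_explicit(&s->count, 1, memory_order_release);
}
```

In the uncontended case fast_sem_wait() and fast_sem_post() each cost a
single atomic instruction, which is the property described above; a
ksem-style implementation pays for a system call on every operation.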


From owner-freebsd-performance@FreeBSD.ORG  Tue Nov 23 02:15:34 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C6DA11065674;
	Tue, 23 Nov 2010 02:15:34 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com
	[209.85.216.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 6796B8FC0A;
	Tue, 23 Nov 2010 02:15:34 +0000 (UTC)
Received: by qwg5 with SMTP id 5so531679qwg.13
	for <multiple recipients>; Mon, 22 Nov 2010 18:15:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:sender:received
	:in-reply-to:references:from:date:x-google-sender-auth:message-id
	:subject:to:cc:content-type;
	bh=Pt3d+RqVQv8vj3tMYBfXT6sHnlrcXWU5K2HhOnJmXN4=;
	b=Lfb1dmXN8b3+1KIhQlNIAU3zk2LmEbdSfL//+6gX0n5l+jZuC8puon2NjQ6GHp3qk9
	9dr5XYZZmAuQ7Sd8VciVoQ1YwALGddizs/isoxW1xrv6F+MoBpE5MzASrBeEMBPmeKER
	HYaVWdVDz48bmxaV3MRHc6vIJFSiIDPy2rYAk=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:from:date
	:x-google-sender-auth:message-id:subject:to:cc:content-type;
	b=bAgP+74gPBl4J0wo5yBLV/+N9McnVhiDHQGImpvBKbJFNYSve0UMR2iG40K1egqp5L
	BEBKSiZpRhVjm9Wjv/eFLwSu1V9JT0lRzdNDG14V3KmgaUn70SpW+SHAGUgHauixZvep
	ViGUUcIKlWe4oUOfzutW19EgiAIvvb5T3j+Vs=
Received: by 10.229.212.5 with SMTP id gq5mr5599744qcb.275.1290476759318; Mon,
	22 Nov 2010 17:45:59 -0800 (PST)
MIME-Version: 1.0
Sender: ivoras@gmail.com
Received: by 10.229.231.143 with HTTP; Mon, 22 Nov 2010 17:45:19 -0800 (PST)
In-Reply-To: <4CEB8AEF.7030202@freebsd.org>
References: <iccd37$lhh$1@dough.gmane.org> <op.vmj44dm634t2sn@skeletor.lan>
	<4CEA9C46.8010507@freebsd.org> <icf1nk$192$1@dough.gmane.org>
	<icf36a$8ik$1@dough.gmane.org> <4CEB8AEF.7030202@freebsd.org>
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 23 Nov 2010 02:45:19 +0100
X-Google-Sender-Auth: Fel-KL5ST49mnrAAqD_IhjtgAzc
Message-ID: <AANLkTikyK_q1Uw+SWHB9EMTNRgiQhA9frhdUYy7KoQF_@mail.gmail.com>
To: David Xu <davidxu@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-performance@freebsd.org
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Tue, 23 Nov 2010 02:15:34 -0000

On 23 November 2010 10:35, David Xu <davidxu@freebsd.org> wrote:
> Ivan Voras wrote:

>> and the overall behaviour is similar - the processes spend a lot of time
>> in "sbwait" and "ksem" states.
>>
> Strange, the POSIX semaphore in head branch does not use ksem, it is
> based on umtx, there is no limit on POSIX semaphore, the only limit
> is process's address space which limits how many semaphores can be
> used.

*shrug*; I don't know how it could be wrong - this PostgreSQL was
built from ports after I upgraded & booted 9-current.

If it didn't use POSIX semaphores from HEAD, shared semaphores
wouldn't have worked, right?

From owner-freebsd-performance@FreeBSD.ORG  Tue Nov 23 02:34:59 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0FD87106564A;
	Tue, 23 Nov 2010 02:34:59 +0000 (UTC)
	(envelope-from davidxu@freebsd.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id F05E78FC15;
	Tue, 23 Nov 2010 02:34:58 +0000 (UTC)
Received: from xyf.my.dom (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oAN2YuJl018060;
	Tue, 23 Nov 2010 02:34:57 GMT (envelope-from davidxu@freebsd.org)
Message-ID: <4CEB98D6.40902@freebsd.org>
Date: Tue, 23 Nov 2010 10:35:02 +0000
From: David Xu <davidxu@freebsd.org>
User-Agent: Thunderbird 2.0.0.24 (X11/20100630)
MIME-Version: 1.0
To: Ivan Voras <ivoras@freebsd.org>
References: <iccd37$lhh$1@dough.gmane.org> <op.vmj44dm634t2sn@skeletor.lan>
	<4CEA9C46.8010507@freebsd.org> <icf1nk$192$1@dough.gmane.org>
	<icf36a$8ik$1@dough.gmane.org> <4CEB8AEF.7030202@freebsd.org>
	<AANLkTikyK_q1Uw+SWHB9EMTNRgiQhA9frhdUYy7KoQF_@mail.gmail.com>
In-Reply-To: <AANLkTikyK_q1Uw+SWHB9EMTNRgiQhA9frhdUYy7KoQF_@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-performance@freebsd.org
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Tue, 23 Nov 2010 02:34:59 -0000

Ivan Voras wrote:
> On 23 November 2010 10:35, David Xu <davidxu@freebsd.org> wrote:
>> Ivan Voras wrote:
> 
>>> and the overall behaviour is similar - the processes spend a lot of time
>>> in "sbwait" and "ksem" states.
>>>
>> Strange, the POSIX semaphore in head branch does not use ksem, it is
>> based on umtx, there is no limit on POSIX semaphore, the only limit
>> is process's address space which limits how many semaphores can be
>> used.
> 
> *shrug*; I don't know how it could be wrong - this PostgreSQL was
> built from ports after I upgraded & booted 9-current.
> 
> If it didn't use POSIX semaphores from HEAD, shared semaphores
> wouldn't have worked, right?
> 

It may work, but even though it is shared in memory, it still enters
the kernel to do P/V operations.


From owner-freebsd-performance@FreeBSD.ORG  Wed Nov 24 01:40:07 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 376E71065672
	for <freebsd-performance@freebsd.org>;
	Wed, 24 Nov 2010 01:40:07 +0000 (UTC)
	(envelope-from gofp-freebsd-performance@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id D773B8FC17
	for <freebsd-performance@freebsd.org>;
	Wed, 24 Nov 2010 01:40:06 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <gofp-freebsd-performance@m.gmane.org>)
	id 1PL4L6-00021T-N3
	for freebsd-performance@freebsd.org; Wed, 24 Nov 2010 02:40:04 +0100
Received: from bl13-84-33.dsl.telepac.pt ([85.246.84.33])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Wed, 24 Nov 2010 02:40:04 +0100
Received: from luis.neves by bl13-84-33.dsl.telepac.pt with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Wed, 24 Nov 2010 02:40:04 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-performance@freebsd.org
From: Luis Neves <luis.neves@gmail.com>
Date: Wed, 24 Nov 2010 01:30:01 +0000
Lines: 23
Message-ID: <ichpqp$vr8$1@dough.gmane.org>
References: <iccd37$lhh$1@dough.gmane.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: bl13-84-33.dsl.telepac.pt
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
	rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6
In-Reply-To: <iccd37$lhh$1@dough.gmane.org>
X-Mailman-Approved-At: Wed, 24 Nov 2010 02:57:13 +0000
Cc: freebsd-hackers@freebsd.org
Subject: Re: PostgreSQL performance scaling
X-List-Received-Date: Wed, 24 Nov 2010 01:40:07 -0000

On 11/22/2010 12:21 AM, Ivan Voras wrote:

> The "semwait" part is from PostgreSQL - probably shared buffer locking,
> but there's a large number of processes regularly in sbwait - maybe
> something can be optimized here?

I think this paper has been mentioned before; have you read it?  "An
Analysis of Linux Scalability to Many Cores":
<http://pdos.csail.mit.edu/papers/linux:osdi10.pdf>


ABSTRACT.
"This paper analyzes the scalability of seven system applications (Exim,
memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running
on Linux on a 48-core computer."


The paper is about Linux, but it also focuses on some changes that can
be made to PostgreSQL to achieve better concurrency.


--
Luis Neves


From owner-freebsd-performance@FreeBSD.ORG  Thu Nov 25 09:50:16 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F3291106564A
	for <freebsd-performance@freebsd.org>;
	Thu, 25 Nov 2010 09:50:15 +0000 (UTC)
	(envelope-from yar.tikhiy@gmail.com)
Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com
	[74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 8D7478FC16
	for <freebsd-performance@freebsd.org>;
	Thu, 25 Nov 2010 09:50:15 +0000 (UTC)
Received: by wyf19 with SMTP id 19so708294wyf.13
	for <freebsd-performance@freebsd.org>;
	Thu, 25 Nov 2010 01:50:14 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:date:message-id
	:subject:from:to:content-type;
	bh=NPhOp55UigYWIhd8Qfz8LbuP3WsklTO4Yb3dz+odlck=;
	b=ZVMRjDGij0E29qhRceVCFrAgHgu3aR5lHrSQGG7mDTf43NSf5lfFkQL7iEu0TsTy3S
	sUNXIX7XcXri+dGaHc1Z8Ck75biBwpbo1w5uMZDnNo8w5B5HcIIkHCImAM2ryxMIS0sN
	5E+LUQTlSjuTSrjv1hLx3KDVZ/BQ3GMaFbjA4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:date:message-id:subject:from:to:content-type;
	b=eYOyXw1Xn7VT8vlKwHeO5GNapOT2uHW1D2b0fTQ1VDJot7l8QZLgC9iKD/k4FANveh
	94vY7IlLZ8ZiEAQqFzGMOSeOWsJ/xSfqB20zfFVSdgWZjBt76gv5MITddK2H0n66qF34
	1dyCE1NI0RigCD76D/ayqrWRcZumjykd+WjY0=
MIME-Version: 1.0
Received: by 10.227.132.137 with SMTP id b9mr531094wbt.48.1290676835234; Thu,
	25 Nov 2010 01:20:35 -0800 (PST)
Received: by 10.227.127.143 with HTTP; Thu, 25 Nov 2010 01:20:35 -0800 (PST)
Date: Thu, 25 Nov 2010 20:20:35 +1100
Message-ID: <AANLkTimKQKxn1415bYHivve46r=TJ+cgS0eiGTRXzNqZ@mail.gmail.com>
From: Yar Tikhiy <yar.tikhiy@gmail.com>
To: freebsd-performance@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: Poor RAID performance demystified
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Nov 2010 09:50:16 -0000

Hi all,

This issue has been raised periodically on various lists and forums,
and I recently ran into it myself, so I feel that I should post my
findings here.

Every now and then somebody complains about extremely poor RAID
performance.  What is common in those reports is that they usually
mention FreeBSD and HP RAID controllers, and all of them are about
load patterns from PostgreSQL.  We are about to see why.

People get surprisingly low disk I/O performance (e.g., 1-2MB/s) in
spite of numerous spindles striped in the array when the benchmark
involves a lot of tiny DB transactions.  On the same array, sequential
read and write rates can be more than satisfactory.

That happens because PostgreSQL, in its default configuration, is
*remarkably* stringent about flushing every transaction out to disk
before proceeding to the next one.  The PG folks know that well.

But, as is known from practice, the application flushing its data
would not by itself be sufficient for this effect to be so pronounced.
What _might_ be happening here is that HP RAIDs as driven by FreeBSD
fully comply with flush requests all the way down the disk stack,
whereas other popular RAID / OS combos can effectively ignore them to
a certain extent due to latent write-back caching, e.g., in the
drives.

Why does striping fail to speed things up?  Because the transactions
are tiny and every disk write ends up blocked waiting for a single
spindle to handle it.  No striping can speed up 8K or 16K synchronous
writes because they are limited by seek and rotational latency, not by
bandwidth.  (Likewise, no RAID or cache can speed up highly random
reads of just a few blocks each, as reads are synchronous by nature:
you can't know the data before it has been read in.)
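
For a rough feel of this limit outside the database, a small probe can
issue tiny writes and flush each one before starting the next.  This is
only a sketch under assumptions: the function name, the 8K block size
and the write count are made up for illustration, and the figure it
reports is meaningful only if every layer below really honors the
flush.

```python
import os
import tempfile
import time


def sync_write_rate(directory, block_size=8192, count=200):
    """Return how many small flushed writes per second `directory` sustains."""
    fd, path = tempfile.mkstemp(dir=directory)
    block = b"\0" * block_size
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            # Block until the OS reports this write as stable.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    # Run it from a directory that lives on the array under test.
    print("%.0f synchronous writes/s" % sync_write_rate("."))
```

A result in the low hundreds on a plain disk array would be consistent
with the explanation above; a result in the thousands suggests that a
write-back cache somewhere is absorbing the flushes.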

It is easy to check if you are hitting this kind of bottleneck.  While
running your benchmark, watch the output from iostat or systat -vm or
gstat.  The average I/O size will closely match the FS block size (the
default is 16K now on FFS) and the tps (transfers per second) value
will be quite close to your disks' rotation rate expressed in revs per
second.  E.g., with 10K RPM disks you are going to get 10000 / 60 =
~170 tps and with 15K RPM disks it'll be around 250 tps.  You are just
hitting very basic laws of nature and logic here.
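
The same arithmetic also accounts for the low MB/s figures.  The sketch
below just multiplies the rotation-bound tps by the block size; the
function names are made up for illustration, and the model assumes one
flushed write per disk revolution, as described above.

```python
def rotation_bound_tps(rpm):
    """Upper bound on flushed writes per second: one per revolution."""
    return rpm / 60.0


def write_rate_mb_per_s(rpm, block_kib):
    """Data rate if every transfer carries one block of `block_kib` KiB."""
    return rotation_bound_tps(rpm) * block_kib / 1024.0


if __name__ == "__main__":
    for rpm in (10000, 15000):
        for block_kib in (8, 16):
            print("%5d RPM, %2dK blocks: %3.0f tps, %.1f MB/s" % (
                rpm, block_kib, rotation_bound_tps(rpm),
                write_rate_mb_per_s(rpm, block_kib)))
```

Every combination comes out at a few MB/s at most, regardless of how
many spindles are in the stripe.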

The final question will be, of course, what to do about this issue.
First of all, decide whether 150 or 200 write transactions per
second really aren't going to be enough for your task.  Your actual load
pattern can be quite different from that in the benchmark.  If you
still need greater write performance on tiny transactions, consider
getting a battery backup unit (BBU) for your RAID adapter.  Quite
remarkably, HP refer to them as "Write-back Cache Enablers" because
installing one is the only way to get an HP RAID adapter to do
write-back caching.  A write-back cache with BBU will let the adapter
delay and coalesce tiny writes without jeopardizing the DB integrity.
However, you'll need to trust your BBU as your DB integrity will be
staked on it (the PG folks are somewhat skeptical about BBUs).  On the other
hand, just fiddling with the PG settings to disable transaction
flushing is a certain recipe for disaster.  Fortunately, there is a
trade-off mode in PG where it does transaction coalescing by itself --
search for synchronous_commit.  The downside is that, should the
system crash, a few of the most recent transactions can be lost after
they were reported as successful to the SQL client.  That can be OK or
not OK depending on the task, and synchronous_commit can be toggled on
a per-session or per-transaction basis to fine-tune the trade-off.
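
As an illustration of the two scopes, the sketch below only builds the
SQL text a client would send; it does not talk to a server.  The SET
and SET LOCAL statements are standard PostgreSQL, while the helper
names and the page_hits table are hypothetical.

```python
def relaxed_session():
    """Statement that relaxes commits for the rest of the session."""
    return ["SET synchronous_commit TO off"]


def relaxed_transaction(statements):
    """Wrap statements so that only this one transaction is relaxed.

    SET LOCAL lasts until the end of the enclosing transaction, so
    later transactions go back to the session's setting.
    """
    return (["BEGIN", "SET LOCAL synchronous_commit TO off"]
            + list(statements)
            + ["COMMIT"])


if __name__ == "__main__":
    # page_hits is a made-up table whose last few rows we can afford
    # to lose in a crash.
    for sql in relaxed_transaction(
            ["INSERT INTO page_hits (url) VALUES ('/index.html')"]):
        print(sql + ";")
```

Transactions that must survive a crash simply skip the SET LOCAL and
keep the default behaviour.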

That's it, folks.

Thanks,
Yar

From owner-freebsd-performance@FreeBSD.ORG  Thu Nov 25 11:10:35 2010
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 944AE1065674
	for <freebsd-performance@freebsd.org>;
	Thu, 25 Nov 2010 11:10:35 +0000 (UTC)
	(envelope-from gofp-freebsd-performance@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 477B38FC16
	for <freebsd-performance@freebsd.org>;
	Thu, 25 Nov 2010 11:10:34 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <gofp-freebsd-performance@m.gmane.org>)
	id 1PLZii-0004vA-Rl
	for freebsd-performance@freebsd.org; Thu, 25 Nov 2010 12:10:32 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Thu, 25 Nov 2010 12:10:32 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-performance@freebsd.org>; Thu, 25 Nov 2010 12:10:32 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-performance@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Thu, 25 Nov 2010 12:10:28 +0100
Lines: 19
Message-ID: <iclg6r$pu1$1@dough.gmane.org>
References: <AANLkTimKQKxn1415bYHivve46r=TJ+cgS0eiGTRXzNqZ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6
In-Reply-To: <AANLkTimKQKxn1415bYHivve46r=TJ+cgS0eiGTRXzNqZ@mail.gmail.com>
X-Enigmail-Version: 1.1.2
Subject: Re: Poor RAID performance demystified
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Nov 2010 11:10:35 -0000

On 11/25/10 10:20, Yar Tikhiy wrote:

> If you
> still need greater write performance on tiny transactions, consider
> getting a battery backup unit (BBU) for your RAID adapter.  Quite
> remarkably, HP refer to them as "Write-back Cache Enablers" because
> installing one is the only way to get an HP RAID adapter do write-back
> caching.  A write-back cache with BBU will let the adapter delay and
> coalesce tiny writes without jeopardizing the DB integrity.  However,
> you'll need to trust your BBU as your DB integrity will be staked on
> it (the PG folks are somehow skeptical about BBUs).

HP also has (and probably so do others by now) capacitor-backed flash
caches; the theory is to have a chunk of flash memory with fast random
IO and use the capacitor to keep the power up for as long as the flash
needs to write its large blocks.

I've tried it and the performance is good, but I don't have it in
production yet.