Date:      Tue, 10 Apr 2007 14:43:04 -0400
From:      Kris Kennaway <kris@obsecurity.org>
To:        Mark Kirkwood <markir@paradise.net.nz>
Cc:        pgsql-hackers <pgsql-hackers@postgresql.org>, performance@FreeBSD.org, current@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Anyone interested in improving postgresql scaling?
Message-ID:  <20070410184304.GB44123@xor.obsecurity.org>
In-Reply-To: <461B69C0.4060707@paradise.net.nz>
References:  <20070226002234.GA80974@xor.obsecurity.org> <461B69C0.4060707@paradise.net.nz>

On Tue, Apr 10, 2007 at 10:41:04PM +1200, Mark Kirkwood wrote:
> Kris Kennaway wrote:
> >If so, then your task is the following:
> >
> >Make SYSV semaphores less dumb about process wakeups.  Currently,
> >whenever the semaphore state changes, all processes sleeping on the
> >semaphore are woken, even if we have released only enough resources
> >for one waiting process to claim.  I.e. there is a thundering-herd
> >wakeup situation which destroys performance at high loads.  Fixing
> >this will involve replacing the wakeup() calls with the appropriate
> >number of wakeup_one() calls.
>
> I'm forwarding this to the pgsql-hackers list so that folks more
> qualified than I can comment, but as I understand it, the way postgres
> implements locking, each process has its *own* semaphore that it waits
> on - and who is waiting for what is controlled by a shared-memory hash
> of lock structs (access to these is controlled via platform-dependent
> spinlock code). So a given semaphore state change should only involve
> one process wakeup.
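
As an aside for anyone not familiar with the postgres side, "each
process has its *own* semaphore" means each backend blocks and is woken
roughly like the sketch below (hypothetical names, an illustrative
reconstruction rather than PostgreSQL's actual semaphore code):

/*
 * Per-backend SysV semaphore wait/wake, in the spirit of what Mark
 * describes.  Hypothetical names; not PostgreSQL's actual code.
 */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <errno.h>

/* Block this backend until its own semaphore is V'd. */
static void
backend_sem_lock(int semid, unsigned short semnum)
{
	struct sembuf op = { semnum, -1, 0 };	/* P: decrement by one */

	while (semop(semid, &op, 1) < 0 && errno == EINTR)
		continue;			/* retry if interrupted */
}

/* Wake the one backend sleeping on the given semaphore. */
static void
backend_sem_unlock(int semid, unsigned short semnum)
{
	struct sembuf op = { semnum, 1, 0 };	/* V: increment by one */

	(void)semop(semid, &op, 1);
}

Since each backend does its P on a distinct sem_num, a single V should
in principle need to wake only the one matching sleeper - so a kernel
that wakes everything sleeping on the semaphore is doing more work than
the application requires.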

I have not studied the exact code path, but there are indeed multiple
wakeups happening from the semaphore code (as many as the number of
active postgresql processes).  It is easy to instrument
sleepq_broadcast() and log the wakeups as they happen.
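
For the curious, the instrumentation can be as simple as a pair of
sysctl counters bumped inside the broadcast path - something along
these lines (hypothetical counter names; adapt to subr_sleepqueue.c as
shipped, since the function's internals vary between versions):

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

static u_long broadcast_calls;		/* sleepq_broadcast() invocations */
static u_long broadcast_wakeups;	/* threads resumed by broadcasts */

SYSCTL_ULONG(_debug, OID_AUTO, sleepq_broadcast_calls, CTLFLAG_RD,
    &broadcast_calls, 0, "sleepq_broadcast() invocations");
SYSCTL_ULONG(_debug, OID_AUTO, sleepq_broadcast_wakeups, CTLFLAG_RD,
    &broadcast_wakeups, 0, "threads woken by sleepq_broadcast()");

/*
 * In sleepq_broadcast() itself: bump broadcast_calls once per call and
 * broadcast_wakeups once per thread taken off the queue.  The ratio,
 * read with "sysctl debug.sleepq_broadcast_calls
 * debug.sleepq_broadcast_wakeups", shows how many threads an average
 * broadcast wakes.
 */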

Anyway, mux@ fixed this some time ago, and the fix indeed helped
scaling for traffic over a local domain socket (particularly at higher
loads), but I saw some anomalous results when using loopback TCP
traffic.  I think this is unrelated: in that situation TCP is highly
contended, and fixing one bottleneck can often make a highly contended
workload perform worse, because the old bottleneck was effectively
serializing things a bit and damping the non-linear behaviour.  But I
am still investigating, so the patch has not yet been committed.
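
To make the shape of that fix concrete, the pattern is roughly the
following (an illustrative sketch only - hypothetical code, not the
actual sysv_sem.c or mux@'s patch):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>

struct resource {
	struct mtx	lock;		/* protects avail */
	int		avail;		/* units currently available */
};

/* Sleep until a unit is available, then take it. */
void
resource_wait(struct resource *r)
{
	mtx_lock(&r->lock);
	while (r->avail == 0)
		msleep(r, &r->lock, PVM, "reswait", 0);
	r->avail--;
	mtx_unlock(&r->lock);
}

/* Release one unit and wake a single waiter. */
void
resource_release(struct resource *r)
{
	mtx_lock(&r->lock);
	r->avail++;
	/*
	 * wakeup(r) would wake every thread sleeping on r; all but one
	 * would find avail == 0 again and go straight back to sleep
	 * after contending for r->lock - the thundering herd.  One
	 * released unit only needs one woken waiter:
	 */
	wakeup_one(r);
	mtx_unlock(&r->lock);
}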

Kris
