FreeBSD Mail Archives

Date:      Sun, 11 Jun 2006 16:31:44 -0400
From:      Kris Kennaway <kris@obsecurity.org>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        scrappy@FreeBSD.org, performance@FreeBSD.org
Subject:   Re: Postgresql performance profiling
Message-ID:  <20060611203144.GA34123@xor.obsecurity.org>
In-Reply-To: <20060611174527.GA31119@xor.obsecurity.org>
References:  <20060611174527.GA31119@xor.obsecurity.org>


On Sun, Jun 11, 2006 at 01:45:28PM -0400, Kris Kennaway wrote:
> I set up supersmack against postgresql 8.1 from ports (default config)
> on a 12 CPU E4500.  It scales and performs somewhat better than mysql
> on this machine (which is heavily limited by contention between
> threads in a process), but there are a number of obvious performance
> bottlenecks:

FYI, on a dual P4 with HTT, mysql significantly outperforms pgsql on
select-key.smack: by more than 55% at peak, and probably more if I were
using libthr, which I cannot on this machine for technical reasons.
Both were configured the same way (i.e. transport over IPv4 instead of
the local socket, which supersmack prefers for mysql).
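
To make "transport over IPv4" concrete, here is a minimal sketch of
resolving a server address with the address family pinned to IPv4, so
that a name such as localhost cannot silently resolve to ::1. The helper
name and the port number 5432 are assumptions for illustration; this is
not how supersmack or the benchmark itself was configured.

```python
import socket


def ipv4_only_addresses(host, port):
    """Resolve host to IPv4 socket addresses only.

    Passing family=AF_INET asks the resolver to skip IPv6 results,
    so "localhost" can never come back as ::1.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [sockaddr for _family, _type, _proto, _canon, sockaddr in infos]


if __name__ == "__main__":
    # 5432 is only an illustrative port, not taken from the message.
    print(ipv4_only_addresses("127.0.0.1", 5432))
```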

Contention is still a big issue here (only listing mutexes contended
more than 10% of acquisitions):

     0          0     142969      0       1996      14458   .101 kern/kern_synch.c:218 (Giant)
     0          0     199028      0      11649      27944   .140 kern/kern_condvar.c:208 (sellck)
     0          0     400103      0     111216      91336   .228 kern/kern_sysctl.c:1317 (Giant)
     0          0     303147      0     108735     131237   .432 i386/i386/trap.c:1005 (Giant)
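
The 10% filter can be reproduced from the raw counters. The sketch
below assumes that the third and sixth numeric columns above are count
and cnt_lock (matching the header in the quoted message further down)
and that ratio = cnt_lock/count; the function names are made up for
illustration.

```python
# (count, cnt_lock, name) copied from the rows above. The column
# meaning is an assumption based on the header in the quoted message.
ROWS = [
    (142969, 14458, "kern/kern_synch.c:218 (Giant)"),
    (199028, 27944, "kern/kern_condvar.c:208 (sellck)"),
    (400103, 91336, "kern/kern_sysctl.c:1317 (Giant)"),
    (303147, 131237, "i386/i386/trap.c:1005 (Giant)"),
]


def contention_ratio(count, cnt_lock):
    """Fraction of acquisitions that were contended."""
    return cnt_lock / count if count else 0.0


def heavily_contended(rows, threshold=0.10):
    """Return (ratio, name) pairs above the threshold, least contended first."""
    scored = [(contention_ratio(c, l), name) for c, l, name in rows]
    return sorted(pair for pair in scored if pair[0] > threshold)


if __name__ == "__main__":
    for ratio, name in heavily_contended(ROWS):
        print(f"{ratio:.3f} {name}")
```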

I turned off process title setting and got an 8% performance boost.

Contention is now a bit better but still serious:

     0          0      22952      0       2067       2521   .109 vm/vm_fault.c:987 (vm object)
     0          0     199153      0      12589      31512   .158 kern/kern_condvar.c:208 (sellck)
     0          0     361305      0     124766     130901   .362 i386/i386/trap.c:1005 (Giant)

i.e. semop() (the Giant-locked syscall) is contending with itself a
lot, and select() is a secondary problem.
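
On the select() point, the quoted message below suggests kqueue as the
replacement. As a rough illustration of that style of readiness
notification (not PostgreSQL's code), Python's selectors module picks
the best mechanism the platform offers, which is a kqueue-backed
selector on FreeBSD. The function name and the socketpair setup are
assumptions made only so the sketch is self-contained.

```python
import selectors
import socket


def wait_readable(timeout=1.0):
    """Register one socket, make it readable, and wait for the event."""
    sel = selectors.DefaultSelector()  # kqueue-backed where available
    a, b = socket.socketpair()
    try:
        # Register once; later waits do not need to rebuild a
        # descriptor set on every call the way select() does.
        sel.register(a, selectors.EVENT_READ, data="client")
        b.send(b"ping")
        events = sel.select(timeout)
        ready = [(key.data, key.fileobj.recv(4)) for key, _mask in events]
        return type(sel).__name__, ready
    finally:
        sel.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(wait_readable())
```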

Actually rwatson noticed that semop() is marked MPSAFE, so it was not
clear why Giant is acquired here (though it is).  OK, pjd worked out
that it's because SYSCALL_MODULE_HELPER() *never* sets the mpsafe
flag, so all syscalls registered that way (i.e. those which are part
of subsystems that may be loaded from a kld) are Giant-locked
regardless of what syscalls.master says.

I removed the SYSCALL_MODULE_HELPERs from sysv_sem.c but now
postgresql hangs when trying to start; possibly the locking in
sysv_sem.c is just broken since it was never in fact tested.

Kris

> * The postgres processes seem to change their proctitle hundreds or
> thousands of times per second.  This is currently done via a
> Giant-locked sysctl (kern.proc.args) so there is enormous contention
> for Giant.  Even when this is fixed (thanks to a patch from csjp@),
> each of them requires a syscall and syscalls ain't free.  This is not
> a clever thing to be doing from a performance standpoint.
>
> * pgsql uses select() and this seems to be a major choke point.  I bet
> you'd see fairly impressive performance gains (especially on SMP) if
> it was modified to use kqueue instead of select.
>=20
> * You really want to avoid using IPv6 for transport (since it's
> Giant-locked).  This was an issue at first since I was running against
> localhost, which maps to ::1 by default.  We should reconsider the
> preference for IPv6 over IPv4 until IPv6 is Giant-free - there are
> probably many other situations where IPv6 is being secretly used
> "because it is there" and costing performance.
>
> * The sysv IPC code is still giant-locked.  pgsql makes a lot of
> semop() calls which grab Giant, and it also msleep()s on the Giant
> lock in the semwait channel.
>
> * When semop() wants to wake up some sleeping processes because
> semaphores have been released, it does a wakeup() and wakes them all
> up.  This means a thundering herd (I see up to 11 CPUs being woken
> here).  Since we know exactly how many resources are available, it
> would be better to only wakeup_one() that number of times instead.
>
> Here are what seem to be the relevant heavily-contended mutex
> acquisitions (ratio = cnt_lock/count measures how many times this lock
> was contended by something else while held by this code line):
>
>   count   cnt_hold cnt_lock ratio name
>  106080     7420    19238   .181 kern/kern_synch.c:222 (lockbuilder mtxpool) <-- vfs
>  175435    13952    42365   .241 kern/kern_condvar.c:113 (lockbuilder mtxpool) <-- vfs
> 1075841   271138   419862   .390 kern/kern_synch.c:220 (Giant) <-- msleep with Giant
>  734613   248249   291969   .397 kern/sys_generic.c:1140 (sellck) <-- select
>  800332   379020   326324   .407 kern/sys_generic.c:944 (sellck) <-- select
>  401751    19731   175305   .436 kern/sys_generic.c:1092 (sellck) <-- select
>  400280   198880   176623   .441 kern/sys_generic.c:935 (sellck) <-- select
> 1361163   695637   624171   .458 sparc64/sparc64/trap.c:586 (Giant) <-- semop
>  400190   193112   238578   .596 kern/kern_condvar.c:208 (sellck) <-- select
>
> Kris
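
The thundering-herd point in the quoted message can be shown with a toy
model that only counts wakeups. This is not FreeBSD code: WaitChannel
and release() are hypothetical names, and the model assumes that every
woken sleeper needs exactly one released resource and goes straight
back to sleep if none is left.

```python
from collections import deque


class WaitChannel:
    """Toy stand-in for a kernel sleep channel (hypothetical)."""

    def __init__(self, sleepers):
        self.sleepers = deque(sleepers)

    def wakeup(self):
        """Wake every sleeper on the channel."""
        woken = list(self.sleepers)
        self.sleepers.clear()
        return woken

    def wakeup_one(self):
        """Wake at most one sleeper."""
        return [self.sleepers.popleft()] if self.sleepers else []


def release(channel, released, herd):
    """Release `released` resources and count useful vs wasted wakeups."""
    if herd:
        woken = channel.wakeup()
    else:
        woken = []
        for _ in range(released):
            woken += channel.wakeup_one()
    losers = woken[released:]        # woke up but found nothing to take
    channel.sleepers.extend(losers)  # they go back to sleep
    return {"woken": len(woken), "wasted": len(losers)}


if __name__ == "__main__":
    # 11 sleepers, as in the "up to 11 CPUs" observation; 2 resources freed.
    print(release(WaitChannel(range(11)), 2, herd=True))
    print(release(WaitChannel(range(11)), 2, herd=False))
```

In this model, waking everyone for 2 released resources costs 11
wakeups of which 9 are wasted, while calling wakeup_one() twice costs 2
wakeups and wastes none.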




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060611203144.GA34123>
