Date: Tue, 10 Apr 2007 13:21:50 +0200
From: Maxime Henrion
To: Mark Kirkwood
Cc: pgsql-hackers, performance@FreeBSD.org, current@FreeBSD.org, Kris Kennaway
Subject: Re: Anyone interested in improving postgresql scaling?
Message-ID: <20070410112150.GC39474@elvis.mu.org>
In-Reply-To: <461B69C0.4060707@paradise.net.nz>
References: <20070226002234.GA80974@xor.obsecurity.org> <461B69C0.4060707@paradise.net.nz>

Mark Kirkwood wrote:
> Kris Kennaway wrote:
> >If so, then your task is the following:
> >
> >Make SYSV semaphores less dumb about process wakeups.  Currently
> >whenever the semaphore state changes, all processes sleeping on the
> >semaphore are woken, even if we only have released enough resources
> >for one waiting process to claim.  i.e. there is a thundering herd
> >wakeup situation which destroys performance at high loads.  Fixing
> >this will involve replacing the wakeup() calls with appropriate
> >amounts of wakeup_one().
>
> I'm forwarding this to the pgsql-hackers list so that folks more
> qualified than I can comment, but as I understand the way postgres
> implements locking each process has its *own* semaphore it waits on -
> and who is waiting for what is controlled by an in (shared) memory hash
> of lock structs (access to these is controlled via platform dependent
> spinlock code). So a given semaphore state change should only involve
> one process wakeup.

Yes, but there are still a lot of wakeups to be avoided in the current
System V semaphore code.  More specifically, not only do we wake up all
the processes waiting on a single semaphore every time something changes,
but we also wake up all processes waiting on *any* of the semaphores in
the semaphore *set*, whatever the reason they are sleeping.

I came up with a quick patch so that Kris could do some testing with
it, and it appears to have helped, but only very slightly; apparently
some contention within the netisr code caused problems, so that in some
cases the patch helped slightly, and in others it didn't.

The semaphore code needs a clean rewrite and I hope to take care of
this soon, as time permits, since we are heavy consumers of PostgreSQL
under FreeBSD at my company.

Cheers,
Maxime
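
The difference between the wakeup strategies discussed in this thread
can be made concrete with a small counting model.  The sketch below is a
user-space toy in C, not FreeBSD kernel code: the names wake_policy,
WAKE_WHOLE_SET, WAKE_SAME_SEM, WAKE_ONE and count_wakeups are invented
for illustration.  It assumes that all semaphores belong to one set,
that every sleeper needs exactly one unit, that units are released one
at a time, and that sleeper r is the one that claims the r-th release.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names for this toy model; not kernel identifiers. */
enum wake_policy {
    WAKE_WHOLE_SET, /* wake every sleeper on any semaphore in the set */
    WAKE_SAME_SEM,  /* wake every sleeper on the semaphore that changed */
    WAKE_ONE        /* wake only the sleeper that can claim the unit */
};

/*
 * Count how many process wakeups it takes to drain all sleepers.
 * waits_on[i] is the index of the semaphore that sleeper i blocks on.
 * Release r adds one unit to semaphore waits_on[r]; sleeper r claims
 * it, so sleepers r..nwaiters-1 are still asleep when it happens.
 */
unsigned long
count_wakeups(const int *waits_on, size_t nwaiters, enum wake_policy policy)
{
    unsigned long wakeups = 0;

    assert(waits_on != NULL || nwaiters == 0);

    for (size_t r = 0; r < nwaiters; r++) {
        int changed = waits_on[r];

        /* Visit every sleeper that has not yet claimed a unit. */
        for (size_t i = r; i < nwaiters; i++) {
            if (policy == WAKE_WHOLE_SET ||
                (policy == WAKE_SAME_SEM && waits_on[i] == changed) ||
                (policy == WAKE_ONE && i == r))
                wakeups++;
        }
    }
    return wakeups;
}
```

With four sleepers that each wait on their own semaphore (the layout
Mark describes for PostgreSQL), this model counts 10 wakeups under
WAKE_WHOLE_SET but only 4 under WAKE_SAME_SEM or WAKE_ONE.  With all
four sleepers on a single semaphore it counts 10 under both
WAKE_WHOLE_SET and WAKE_SAME_SEM, and 4 under WAKE_ONE.  In general the
wake-everyone case grows as n(n+1)/2 for n sleepers, while the
wake-one case grows as n.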