Date:      Sun, 02 Apr 2006 23:08:11 -0400
From:      Tom Lane <tgl@sss.pgh.pa.us>
To:        "Marc G. Fournier" <scrappy@postgresql.org>
Cc:        Kris Kennaway <kris@obsecurity.org>, freebsd-stable@freebsd.org, pgsql-hackers@postgresql.org
Subject:   Re: [HACKERS] semaphore usage "port based"? 
Message-ID:  <27417.1144033691@sss.pgh.pa.us>
In-Reply-To: <20060402234459.Y947@ganymede.hub.org> 
References:  <20060402163504.T947@ganymede.hub.org> <25422.1144016604@sss.pgh.pa.us> <25526.1144017388@sss.pgh.pa.us> <20060402213921.V947@ganymede.hub.org> <26524.1144026385@sss.pgh.pa.us> <20060402222843.X947@ganymede.hub.org> <26796.1144028094@sss.pgh.pa.us> <20060402225204.U947@ganymede.hub.org> <26985.1144029657@sss.pgh.pa.us> <20060402231232.C947@ganymede.hub.org> <27148.1144030940@sss.pgh.pa.us> <20060402232832.M947@ganymede.hub.org> <20060402234459.Y947@ganymede.hub.org>

"Marc G. Fournier" <scrappy@postgresql.org> writes:
> 'k, try this one ... looks better, actually has semget() calls in it :)

OK, here's our problem:

84250: semget(0x52e2c1,0x11,0x780)		 ERR#17 'File exists'

This is InternalIpcSemaphoreCreate failing because of key collision.
As it should.

84250: semget(0x52e2c1,0x11,0x0)		 = 1114112 (0x110000)

This is IpcSemaphoreCreate trying to see what's up.  OK.
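For readers following along, the two semget() calls above can be sketched as below. This is only an illustration, not the PostgreSQL source: the helper names create_exclusive and lookup_existing are made up, and the flag decoding assumes that 0x780 in the trace corresponds to IPC_CREAT | IPC_EXCL | 0600.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Hypothetical helper: try to create a brand-new semaphore set.
 * With IPC_EXCL set, semget() fails with errno == EEXIST when a set
 * already exists under this key, which is the first line of the trace.
 */
int create_exclusive(key_t key, int nsems)
{
    return semget(key, nsems, IPC_CREAT | IPC_EXCL | 0600);
}

/*
 * Hypothetical helper: look up the set that already owns the key.
 * Flags of 0 mean "do not create, just return the existing id",
 * which is the second line of the trace.
 */
int lookup_existing(key_t key, int nsems)
{
    return semget(key, nsems, 0);
}
```

A caller would try create_exclusive() first and fall back to lookup_existing() only when errno is EEXIST, so that it can inspect the colliding set before deciding what to do with it.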

84250: __semctl(0x110000,0x10,0x5,0x0)		 = 537 (0x219)

IpcSemaphoreGetValue indicates it has the right "magic number" to be
a Postgres semaphore set.  Still expected.

84250: __semctl(0x110000,0x10,0x4,0x0)		 = 83699 (0x146f3)

IpcSemaphoreGetLastPID says the sema set was last touched by pid 83699.
Looks reasonable (but do you want to double-check that it matches the
first postmaster's PID?)
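The two __semctl() calls above read a semaphore's value and the PID of the last process to operate on it. A rough sketch of that pair of queries follows. The names SKETCH_MAGIC, sketch_semun, mark_set, looks_like_ours and last_toucher are invented for illustration, and the "set to magic minus one, then increment" step is just one way to leave both a known value and a recorded PID behind; it is not claimed to be exactly what the server does.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define SKETCH_MAGIC 537   /* the value returned by GETVAL in the trace */

/* Some platforms expect the caller to supply this union for SETVAL. */
union sketch_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Stamp one semaphore so a later process can recognise the set. */
int mark_set(int semid, int semnum)
{
    union sketch_semun arg;
    struct sembuf op;

    arg.val = SKETCH_MAGIC - 1;
    if (semctl(semid, semnum, SETVAL, arg) < 0)
        return -1;

    /* A semop() records our PID as the last one to touch the semaphore. */
    op.sem_num = (unsigned short) semnum;
    op.sem_op = 1;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}

/* GETVAL: does the semaphore carry the expected magic number? */
int looks_like_ours(int semid, int semnum)
{
    return semctl(semid, semnum, GETVAL) == SKETCH_MAGIC;
}

/* GETPID: which process last performed an operation on it? */
pid_t last_toucher(int semid, int semnum)
{
    return (pid_t) semctl(semid, semnum, GETPID);
}
```

In the trace the set has 0x11 (17) semaphores and both queries use index 0x10, that is, the last semaphore in the set.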

84250: getpid()					 = 84250 (0x1491a)

our pid ... as expected ...

84250: kill(0x146f3,0x0)			 ERR#3 'No such process'

Oops.  Here is the problem: kill() is lying by claiming there is no such
process as 83699.  It looks to me like there in fact is such a process,
but it's in a different jail.

I venture that FBSD 6 has decided to return ESRCH (no such process)
where FBSD 4 returned some other error that acknowledged that the
process did exist (EPERM would be a reasonable guess).
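To make the ESRCH versus EPERM distinction concrete, here is a minimal sketch of a liveness probe built on kill(pid, 0). The names proc_state and probe_pid are hypothetical and this is not the PostgreSQL code; it only shows why the choice of errno matters to a caller.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

typedef enum { PROC_ALIVE, PROC_GONE, PROC_UNKNOWN } proc_state;

/*
 * Signal 0 performs the existence and permission checks without
 * delivering anything.
 */
proc_state probe_pid(pid_t pid)
{
    assert(pid > 0);            /* 0 and negative values address groups */

    if (kill(pid, 0) == 0)
        return PROC_ALIVE;      /* exists and we may signal it */
    if (errno == EPERM)
        return PROC_ALIVE;      /* exists, but we lack permission */
    if (errno == ESRCH)
        return PROC_GONE;       /* kernel says there is no such process */
    return PROC_UNKNOWN;
}
```

A caller that treats PROC_GONE as "the previous owner is dead, the semaphore set can be reused" will reuse a live set if the kernel reports ESRCH for a process that exists but is merely out of reach.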

If this is the story, then FBSD have broken their system and must revert
their change.  They do not have kernel behavior that totally hides the
existence of the other process, and therefore having some calls that
pretend it's not there is simply inconsistent.

			regards, tom lane


