Date:      Tue, 10 Nov 1998 00:07:38 +0000
From:      Brian Somers <brian@Awfulhak.org>
To:        Marc Slemko <marcs@znep.com>
Cc:        Brian Somers <brian@Awfulhak.org>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: bind()/listen() race 
Message-ID:  <199811100007.AAA02529@woof.lan.awfulhak.org>
In-Reply-To: Your message of "Sun, 08 Nov 1998 17:10:10 PST." <Pine.BSF.4.05.9811081702180.8174-100000@alive.znep.com> 

> On Sun, 8 Nov 1998, Brian Somers wrote:
> 
> > The code says something like:
> > 
> >   if (bind(blah) < 0) {
> >     /* we're happy to be the client */
> >     if (connect(blah) < 0) {
> >       /* socket is bound without a listen ! */
> >     }
> >   } else if (listen(blah) < 0) {
> >     /* oops ! */
> >   }
> > 
> > The sockaddrs are local domain sockets used by ppp in multilink mode. 
> > Whoever gets to be the server will survive.  The other ppp will 
> > become the client, pass a file descriptor to the server and hang 
> > around holding the session 'till the other ppp kills it.
> > 
> > However, if the two ppps get unlucky, one bind()s and the second 
> > fails then tries to connect() and fails 'cos the server hasn't 
> > listen()ed yet.  This is bad news.
> > 
> > The only way I can see around it - given that I can't sleep() in ppp 
> > without screwing up with other timing issues - is to detect the error 
> > and do a 1 second timeout, and try again then.  This is a nasty thing 
> > to have to do....  I'd prefer an atomic bind()/listen() facility....
> 
> No, an atomic bind/listen isn't the solution, you simply need some form of
> synchronization between the processes.
> 
> For example, you could use a lockfile and require a write lock around the
> bind and listen.
> 
> Unfortunately, inter process synchronization is more of a pain than intra
> process synchronization.
> 
> If you used a lock file on disk with the server pid, you could also avoid
> mistakenly thinking that something else listening on the port is the
> server.

This is out of the question.  Ppp is only allowed to block in select().  
That's why it's so difficult to ``sleep''.  I have to set up a 
timeout function that'll kick ppp into continuing where it left off, 
then go back and select().
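
To illustrate what that looks like, here is a minimal sketch of a
one-shot timer serviced from a select() loop.  The names (retry_timer,
timer_arm, wait_once, mark_done) are hypothetical and are not ppp's
actual timer code; the only point is that the process never sleeps
anywhere other than in select().

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical one-shot timer; illustrative only, not ppp's own code. */
struct retry_timer {
    struct timeval due;       /* absolute expiry time */
    void (*func)(void *);     /* continuation to run on expiry */
    void *arg;
    int armed;
};

/* Example continuation: just records that the timer fired. */
static void
mark_done(void *arg)
{
    *(int *)arg = 1;
}

static void
timer_arm(struct retry_timer *t, long secs, void (*func)(void *), void *arg)
{
    gettimeofday(&t->due, NULL);
    t->due.tv_sec += secs;
    t->func = func;
    t->arg = arg;
    t->armed = 1;
}

/*
 * Block in select() only, for no longer than the armed timer allows,
 * then run the continuation if it has come due.  Returns select()'s result.
 */
static int
wait_once(struct retry_timer *t, fd_set *rfds, int nfds)
{
    struct timeval now, left, *tvp = NULL;
    int n;

    if (t->armed) {
        gettimeofday(&now, NULL);
        left.tv_sec = t->due.tv_sec - now.tv_sec;
        left.tv_usec = t->due.tv_usec - now.tv_usec;
        if (left.tv_usec < 0) {
            left.tv_sec--;
            left.tv_usec += 1000000;
        }
        if (left.tv_sec < 0)
            left.tv_sec = left.tv_usec = 0;   /* already overdue */
        tvp = &left;
    }

    n = select(nfds, rfds, NULL, NULL, tvp);

    if (n >= 0 && t->armed) {
        gettimeofday(&now, NULL);
        if (now.tv_sec > t->due.tv_sec ||
            (now.tv_sec == t->due.tv_sec && now.tv_usec >= t->due.tv_usec)) {
            t->armed = 0;
            t->func(t->arg);
        }
    }
    return n;
}
```

The main loop would call wait_once() repeatedly; the continuation is
where the bind()/connect() attempt gets retried.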

Also, at this point in the communication process, there's no time to 
muck around.  If I can't bind() and can't connect() pretty quickly, the 
peer's going to give up on us.  At this (rather critical) point, 
we've told the peer that we'll do multilink and we've found that 
another ppp is already talking to the remote machine in multilink 
mode (or has crashed badly).

> Your suggestion is possible as a workaround, and is probably the easiest
> fix.

It may be the only practical one too.  Backing off for a second and 
trying again will give us a situation where we can fail due to a 
previously crashed ppp (or maybe even resurrect the socket), or we 
get a connection....  We'd be extremely unlikely to have had a whole 
second between a bind() and listen() :-)  If another ppp actually 
came up and went down again in that second, then we're probably going 
to do the same thing - due to authentication or IPCP negotiation 
failure (misconfiguration).
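
A rough sketch of that back-off decision follows.  It assumes that
ECONNREFUSED is what connect() reports for a local domain socket that
is bound but not yet listening, or that was left behind by a crashed
process.  claim_role() and the ROLE_* names are made up for
illustration, and the backlog of 5 is arbitrary.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

enum role { ROLE_SERVER, ROLE_CLIENT, ROLE_RETRY, ROLE_FAIL };

/*
 * Try to become the server on `path'; failing that, the client.
 * ROLE_RETRY means "bound by someone, but nobody is listening":
 * the caller should back off (say, one second) and call again.
 * On ROLE_SERVER or ROLE_CLIENT the socket is returned in *fdp.
 */
static enum role
claim_role(const char *path, int *fdp)
{
    struct sockaddr_un addr;
    enum role r = ROLE_FAIL;
    int fd;

    if (strlen(path) >= sizeof addr.sun_path)
        return ROLE_FAIL;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);          /* length checked above */

    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return ROLE_FAIL;

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        if (listen(fd, 5) == 0)
            r = ROLE_SERVER;
    } else if (errno == EADDRINUSE) {
        /* Someone else got there first: we're happy to be the client. */
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0)
            r = ROLE_CLIENT;
        else if (errno == ECONNREFUSED)
            r = ROLE_RETRY;               /* no listen() yet, or stale socket */
    }

    if (r == ROLE_SERVER || r == ROLE_CLIENT)
        *fdp = fd;
    else
        close(fd);
    return r;
}
```

A second ROLE_RETRY after the back-off is the "previously crashed ppp"
case; whether to fail or to take over the socket at that point is a
policy decision this sketch leaves to the caller.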

I appreciate that some sort of elaborate read-lock-promoted-to-write-lock 
scheme on the server side would allow the client to say ``if there's 
a write lock, connect(), otherwise if there's no read lock the 
server's dead'' - but this may be overkill....  The simpler the 
better.
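
For the record, the client side of such a scheme might look roughly
like the sketch below.  It assumes a lock file shared by both ppps,
with the server holding an fcntl() read lock from before bind() and
promoting it to a write lock once listen() has succeeded.
probe_server() and the SRV_* names are hypothetical.  F_GETLK only
queries the lock and never blocks.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

enum srv_state { SRV_LISTENING, SRV_BINDING, SRV_DEAD, SRV_ERROR };

/*
 * Ask what lock, if any, another process holds on the lock file.
 * Locks held by the calling process itself are not reported.
 */
static enum srv_state
probe_server(int lockfd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;      /* conflicts with any lock someone else holds */
    fl.l_whence = SEEK_SET;   /* l_start = l_len = 0: the whole file */

    if (fcntl(lockfd, F_GETLK, &fl) < 0)
        return SRV_ERROR;

    switch (fl.l_type) {
    case F_WRLCK:
        return SRV_LISTENING; /* write lock: safe to connect() */
    case F_RDLCK:
        return SRV_BINDING;   /* read lock: between bind() and listen() */
    default:
        return SRV_DEAD;      /* F_UNLCK: nobody holds the lock */
    }
}
```

SRV_BINDING would still need the timeout-and-retry, so this adds
machinery without removing the back-off.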
-- 
Brian <brian@Awfulhak.org>, <brian@FreeBSD.org>, <brian@OpenBSD.org>
      <http://www.Awfulhak.org>
Don't _EVER_ lose your sense of humour....


