Date:      Thu, 29 Apr 1999 18:04:29 -0600 (MDT)
From:      "David G. Andersen" <danderse@cs.utah.edu>
To:        "David E. Cross" <crossd@cs.rpi.edu>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: ypserv
Message-ID:  <14120.61942.222606.324094@torrey.cs.utah.edu>
In-Reply-To: David E. Cross's message of Mon, April 12 1999 <199904121852.OAA22126@cs.rpi.edu>
References:  <199904121852.OAA22126@cs.rpi.edu>

Lo and Behold, David E. Cross said:
> Our ypserv processes have been dying a great deal lately (luckily they
> restart themselves, but not before all the clients rebind to another
> machine).  I have tracked the problem down to a stack corruption, apparently
> caused by a stack overflow (I am still working on it, don't get excited yet ;).

Sorry for taking so long to reply to this, David.  We've been pounding 
on ypserv lately, and have managed to get it to a fairly stable state
on our servers, but unfortunately, there's one patch we haven't
committed because it breaks the semantics of the databases.

The problem you're most likely seeing is the database concurrent
access problem (bin/10971).  The solution to this is modifying the
Berkeley DB routines to use pread(), which has only been introduced
into -current recently, and I'm not sure if the db routines have been
modified accordingly.
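
For illustration, a minimal C sketch of why pread() helps, assuming the
trouble comes from processes that share one descriptor (and hence one file
offset): an lseek() followed by read() can be interleaved with another
process's seek, while pread() takes the offset as an argument and leaves
the shared offset alone.  The names read_page, pread_demo and PAGESIZE are
made up for this example and are not taken from the Berkeley DB source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGESIZE 16	/* illustrative only; real DB pages are larger */

/* Hypothetical helper: fetch page `pgno` without moving the fd's offset. */
ssize_t
read_page(int fd, void *buf, unsigned long pgno)
{
	return pread(fd, buf, PAGESIZE, (off_t)pgno * PAGESIZE);
}

/* Returns 0 if page 1 was read correctly and the offset did not move. */
int
pread_demo(void)
{
	char path[] = "/tmp/pread_demo.XXXXXX";
	char data[2 * PAGESIZE], buf[PAGESIZE];
	off_t before, after;
	ssize_t n;
	int fd;

	fd = mkstemp(path);
	assert(fd >= 0);
	unlink(path);			/* file disappears once fd is closed */

	/* Page 0 is all 'A', page 1 is all 'B'. */
	memset(data, 'A', PAGESIZE);
	memset(data + PAGESIZE, 'B', PAGESIZE);
	assert(write(fd, data, sizeof(data)) == (ssize_t)sizeof(data));

	before = lseek(fd, 0, SEEK_CUR);	/* offset that would be shared */
	n = read_page(fd, buf, 1);
	after = lseek(fd, 0, SEEK_CUR);
	close(fd);

	if (n != PAGESIZE || buf[0] != 'B' || buf[PAGESIZE - 1] != 'B')
		return -1;
	return (before == after) ? 0 : -1;
}
```

Calling pread_demo() returns 0 when the page comes back intact and the
descriptor's offset is the same before and after the read, which is the
property an lseek()/read() pair cannot offer.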

We patched our DB routines (ONLY for ypserv) to explicitly lock the
database before doing their reads, but since it's a read lock, it
requires *write* access to the database, which changes the semantics
of how ypserv behaves.  It's not a great solution, but it does the
trick for us pretty well.

There's another fix we sent in to stop hangs due to an incorrect use
of the RPC library.  You'll want to apply it.  It's in bin/10970.

If you fix that and get things stable enough, you'll then run into
another problem, a bad length passed to strncmp, which is fixed in
-current and -stable (bin/11122).  Bill Paul committed this fix a few weeks
ago, so if you've got an up to date system you'll be OK on that score.
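
The code behind bin/11122 is not quoted in this message, so what follows is
only a generic C sketch of how a wrong length handed to strncmp can go
wrong when matching a counted key against a name; it is not the ypserv
fix.  The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Buggy: only the first `len` bytes are compared, so a short key
 * matches any longer name that happens to start with it.
 */
int
key_matches_buggy(const char *key, size_t len, const char *name)
{
	return strncmp(key, name, len) == 0;
}

/* Safer: the lengths must agree before the bytes are compared. */
int
key_matches(const char *key, size_t len, const char *name)
{
	return len == strlen(name) && memcmp(key, name, len) == 0;
}

/* Small self-check showing the difference between the two. */
void
key_matches_check(void)
{
	assert(key_matches_buggy("pass", 4, "passwd"));	/* false match */
	assert(!key_matches("pass", 4, "passwd"));
	assert(key_matches("passwd", 6, "passwd"));
}
```

With a length of zero the buggy version matches every name, which is the
sort of failure a bad length tends to produce.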

If you'd like, I can ship you a functioning ypserv binary for 3.0 with 
all of our patches, statically linked against the locking DB routines.

   -Dave Andersen

-- 
work: danderse@cs.utah.edu                     me:  angio@pobox.com
      University of Utah                            http://www.angio.net/
      Computer Science - Flux Research Group   "What's footnote FIVE?"

