From owner-freebsd-bugs  Mon Apr  5 17:32: 0 1999
Delivered-To: freebsd-bugs@freebsd.org
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
	by hub.freebsd.org (Postfix) with ESMTP id F0F0C1553E
	for <freebsd-bugs@FreeBSD.org>; Mon,  5 Apr 1999 17:31:57 -0700 (PDT)
	(envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.9.2/8.9.2) id RAA92554;
	Mon, 5 Apr 1999 17:30:01 -0700 (PDT)
	(envelope-from gnats@FreeBSD.org)
Received: from wrath.cs.utah.edu (wrath.cs.utah.edu [155.99.198.100])
	by hub.freebsd.org (Postfix) with ESMTP id 308551547B
	for <FreeBSD-gnats-submit@freebsd.org>; Mon,  5 Apr 1999 17:31:40 -0700 (PDT)
	(envelope-from sclawson@cs.utah.edu)
Received: from ibapah.cs.utah.edu (ibapah.cs.utah.edu [155.99.212.83])
	by wrath.cs.utah.edu (8.8.8/8.8.8) with ESMTP id SAA27780
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 5 Apr 1999 18:29:43 -0600 (MDT)
Received: (from sclawson@localhost)
	by ibapah.cs.utah.edu (8.9.1/8.9.1) id SAA19824;
	Mon, 5 Apr 1999 18:29:42 -0600 (MDT)
	(envelope-from sclawson@cs.utah.edu)
Message-Id: <199904060029.SAA19824@ibapah.cs.utah.edu>
Date: Mon, 5 Apr 1999 18:29:42 -0600 (MDT)
From: Stephen Clawson <sclawson@cs.utah.edu>
Reply-To: sclawson@cs.utah.edu
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.2
Subject: bin/10971: ypserv segfaults regularly (really: Race condition in the Berkeley db library).
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


>Number:         10971
>Category:       bin
>Synopsis:       ypserv segfaults regularly (really: Race condition in the Berkeley db library).
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr  5 17:30:01 PDT 1999
>Closed-Date:
>Last-Modified:
>Originator:     Stephen Clawson
>Release:        FreeBSD 3.0-CURRENT i386 (jan 27, 1999)
>Organization:
University of Utah Computer Science
>Environment:

We've been running a slave yp server on a dual Pentium II/350 for a
group of twenty or so FreeBSD machines and a few assorted Linux boxes.
The underlying network is switched 100BaseT to the server.  

>Description:

After setting up a slave yp server for our group and switching all of
our machines to use it, I started to notice ypserv.core files in /.
It turns out that there's a race in Berkeley DB that gets tickled
through a combination of the DB_CACHE code in ypserv, running
ypserv on an SMP box and a bug in ypserv (PR bin/10970).

The problem comes down to the DB_CACHE code keeping heavily used
databases open.  This is usually a good thing, but because the
databases are already open, when the child forks it shares a file
descriptor (and thus a file description, and thus a file pointer) for
the database file with its parent.  If a lot of requests for the same
database come in at the same time, this means that multiple children
will also be sharing the same file description, since they all came
from the same parent.
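
The sharing described above can be seen without ypserv at all.  What
follows is only a minimal sketch on a scratch file; the helper names
offset_after_child_seek() and demo_shared_offset() are made up for
illustration and are not part of ypserv or libc.  A child seeks the
inherited descriptor, and the parent, which never seeks, then observes
the moved offset.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that seeks the inherited descriptor
 * to child_off, then report the offset the parent sees afterwards.
 * Returns -1 on error.
 */
off_t offset_after_child_seek(int fd, off_t child_off)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: this moves the offset of the shared file description. */
        _exit(lseek(fd, child_off, SEEK_SET) == -1 ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: query the current offset without moving it. */
    return lseek(fd, 0, SEEK_CUR);
}

/* Demo on a scratch file: the parent starts at 0 and never seeks itself. */
int demo_shared_offset(void)
{
    FILE *f = tmpfile();
    off_t seen;

    if (f == NULL)
        return -1;
    assert(lseek(fileno(f), 0, SEEK_CUR) == 0);
    seen = offset_after_child_seek(fileno(f), 0x8000);
    fclose(f);
    return seen == 0x8000 ? 0 : -1;
}
```

demo_shared_offset() returns 0 when the parent's offset followed the
child's lseek, which is the behaviour the race depends on.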

With all the children accessing the database concurrently, they react
badly to the race in libc/db/hash/hash_page.c:__get_page():

    if ((lseek(fd, (off_t)page << hashp->BSHIFT, SEEK_SET) == -1) ||
        ((rsize = read(fd, p, size)) == -1))
                return (-1);

The problem shows up in this fragment from a ktrace:

 26527 ypserv   CALL  lseek(0x7,0,0x1e000,0,0)
 26527 ypserv   RET   lseek 122880/0x1e000
 26533 ypserv   CALL  lseek(0x7,0,0x8000,0,0)
 26533 ypserv   RET   lseek 32768/0x8000
 26527 ypserv   CALL  read(0x7,0x80a2000,0x1000)
 26527 ypserv   GIO   fd 7 read 4096 bytes
       "0\0\M-{\^O\M-?\^O\M-7\^Oq\^Ol\^O)\^O#\^O\M-_\^N\M-\\^N\M-#\^N\M^\\^NZ\
        \^NR\^N\r\^N\^E\^N"
 26527 ypserv   RET   read 4096/0x1000
 26533 ypserv   CALL  read(0x7,0x80ad000,0x1000)
 26533 ypserv   GIO   fd 7 read 4096 bytes
       "\M-_\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\
        \M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?\M^?"
 26533 ypserv   RET   read 4096/0x1000
 26527 ypserv   CALL  lseek(0x7,0,0x1f000,0,0)
 26527 ypserv   RET   lseek 126976/0x1f000
 26533 ypserv   PSIG  SIGSEGV SIG_DFL
 26533 ypserv   NAMI  "ypserv.core"

Neither process gets the correct data.  The first process (26527)
reads the page at 0x8000 instead of the one at 0x1e000, and the second
(26533) reads whatever lies 0x1000 bytes after the data it really
wants.  In the second process this makes hash_seq() return a pointer
into unallocated memory, and dereferencing it causes the segfault.
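
The interleaving in the trace can be replayed deterministically in one
process.  This is a sketch under stated assumptions, not the ypserv
code path: the function name replay_interleaving(), the four-page
scratch file and the use of dup() to stand in for two forked children
are all illustrative choices.  Page i of the scratch file is filled
with the character '0' + i, so the first byte read shows which page
each reader actually got.

```c
#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

enum { PAGE_SIZE_BYTES = 0x1000, NPAGES = 4 };

/*
 * Reader A wants page want_a, reader B wants page want_b.  Both seeks
 * happen before either read, as in the ktrace.  fd_b is a dup() of
 * fd_a, so the two descriptors share one file pointer.
 * Stores the first byte each reader got; returns 0 on success.
 */
int replay_interleaving(int want_a, int want_b, char *got_a, char *got_b)
{
    char page[PAGE_SIZE_BYTES];
    FILE *f = tmpfile();
    int fd_a, fd_b, i, rc = -1;

    assert(want_a >= 0 && want_a < NPAGES);
    assert(want_b >= 0 && want_b + 1 < NPAGES);
    if (f == NULL)
        return -1;
    fd_a = fileno(f);
    fd_b = dup(fd_a);
    if (fd_b == -1) {
        fclose(f);
        return -1;
    }
    for (i = 0; i < NPAGES; i++) {
        memset(page, '0' + i, sizeof(page));
        if (write(fd_a, page, sizeof(page)) != (ssize_t)sizeof(page))
            goto out;
    }
    /* A seeks, then B seeks before A gets to read. */
    if (lseek(fd_a, (off_t)want_a * PAGE_SIZE_BYTES, SEEK_SET) == -1 ||
        lseek(fd_b, (off_t)want_b * PAGE_SIZE_BYTES, SEEK_SET) == -1)
        goto out;
    /* A reads at B's offset; B then reads one page further on. */
    if (read(fd_a, page, sizeof(page)) != (ssize_t)sizeof(page))
        goto out;
    *got_a = page[0];
    if (read(fd_b, page, sizeof(page)) != (ssize_t)sizeof(page))
        goto out;
    *got_b = page[0];
    rc = 0;
out:
    close(fd_b);
    fclose(f);
    return rc;
}
```

With want_a = 3 and want_b = 0, reader A ends up with page 0 and reader
B with page 1, mirroring the two wrong reads in the trace.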

>How-To-Repeat:

Set up a yp server and barrage it with as many yp_all requests as
you can.  The simplest way to do this is to fork off a bunch of `ypcat
passwd' processes from a client machine (20 is usually sufficient).

You've got to get enough going at a time that the forked children will
respond to a bunch of them at once and interleave their reads from the 
database.
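
One way to start the processes at once is a small launcher run on the
client.  This is only a sketch: run_concurrently() and barrage_ypserv()
are hypothetical names, and the count of 20 is simply the figure
mentioned above.  A shell loop that backgrounds the same command would
do equally well.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Start n copies of argv[0] concurrently and wait for all of them.
 * Returns how many exited with status 0, or -1 if not all n could be
 * forked.
 */
int run_concurrently(char *const argv[], int n)
{
    int i, started = 0, ok = 0, status;

    assert(argv != NULL && argv[0] != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == -1)
            break;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);         /* only reached if exec failed */
        }
        started++;
    }
    /* Reap every child that was started, even after a failed fork. */
    for (i = 0; i < started; i++)
        if (wait(&status) != -1 &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    return started == n ? ok : -1;
}

/* The barrage described above: 20 concurrent `ypcat passwd' runs. */
int barrage_ypserv(void)
{
    char *const cmd[] = { "ypcat", "passwd", NULL };

    return run_concurrently(cmd, 20);
}
```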

>Fix:

NetBSD's fix to this problem is to introduce a new system call,
pread, which takes an offset so that the lseek and read can be done
atomically.
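
On a system that has pread, the page read could be written roughly as
follows.  This is a sketch, not NetBSD's actual change: read_page_at()
and demo_pread() are hypothetical names, and the page and bshift
parameters stand in for the page number and hashp->BSHIFT used in
__get_page().

```c
#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the lseek()/read() pair in __get_page(). */
ssize_t read_page_at(int fd, void *buf, size_t size,
    unsigned int page, int bshift)
{
    /*
     * pread() takes the offset as an argument and does not use or
     * change the shared file pointer.
     */
    return pread(fd, buf, size, (off_t)page << bshift);
}

/* Demo: read page 1 while the file pointer sits at 0 and stays there. */
int demo_pread(void)
{
    enum { BSHIFT = 12, PAGE = 1 << BSHIFT };
    char page0[PAGE], page1[PAGE], got[PAGE];
    FILE *f = tmpfile();
    int fd, ok;

    assert(f != NULL);
    fd = fileno(f);
    memset(page0, 'A', PAGE);
    memset(page1, 'B', PAGE);
    ok = write(fd, page0, PAGE) == PAGE &&
         write(fd, page1, PAGE) == PAGE &&
         lseek(fd, 0, SEEK_SET) == 0 &&
         read_page_at(fd, got, PAGE, 1, BSHIFT) == PAGE &&
         memcmp(got, page1, PAGE) == 0 &&
         lseek(fd, 0, SEEK_CUR) == 0;   /* pointer was not moved */
    fclose(f);
    return ok ? 0 : -1;
}
```

Because no process depends on the shared file pointer any more, the
children can keep sharing the descriptor.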

Other possible fixes are either to add a warning to the manpage about
making concurrent accesses to shared open databases and how you will
eventually lose, or to lock the file descriptor before the lseek and
unlock it after the read.  The problem with the first is that it'll be
easy to miss, and the problem with the second is that you have to have
the fd opened read/write to get an exclusive lock. =)
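
For reference, the locking variant might look like the sketch below.
It uses fcntl() record locks, which belong to the process, rather than
flock(), whose locks belong to the open file and would therefore be
shared by the forked children.  lock_whole_file(), locked_read_at() and
demo_locked_read() are hypothetical names, and the sketch assumes every
process touching the file cooperates by taking the same lock.

```c
#include <sys/types.h>
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Lock or unlock the whole file; type is F_WRLCK or F_UNLCK. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);    /* blocks until granted */
}

/* lseek() and read() under an exclusive lock; fd must be read/write. */
ssize_t locked_read_at(int fd, void *buf, size_t size, off_t offset)
{
    ssize_t rsize = -1;

    if (lock_whole_file(fd, F_WRLCK) == -1)
        return -1;
    if (lseek(fd, offset, SEEK_SET) != -1)
        rsize = read(fd, buf, size);
    (void)lock_whole_file(fd, F_UNLCK);
    return rsize;
}

/* Demo on a scratch file that tmpfile() opens read/write. */
int demo_locked_read(void)
{
    char buf[4];
    FILE *f = tmpfile();
    int fd, ok;

    assert(f != NULL);
    fd = fileno(f);
    ok = write(fd, "0123456789", 10) == 10 &&
         locked_read_at(fd, buf, sizeof(buf), 3) == (ssize_t)sizeof(buf) &&
         memcmp(buf, "3456", sizeof(buf)) == 0;
    fclose(f);
    return ok ? 0 : -1;
}
```

Every page read then costs two extra fcntl() calls, which is another
reason to prefer a pread-style fix where one is available.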

If you only want to fix ypserv, then closing and reopening the
database on a fork should do the trick, since the problem doesn't show
up if you're not sharing a file pointer.
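
At the descriptor level the idea looks like the sketch below.  The
names reopen_private() and demo_reopen() are hypothetical; in ypserv
itself the child would presumably close and reopen the database handle
rather than a raw descriptor.  The demo checks that a child's seek on
its reopened descriptor leaves the parent's offset alone.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: give the calling process a private file pointer
 * by replacing an inherited descriptor with a fresh open() of the same
 * path.  Returns the new descriptor, or -1 on error.
 */
int reopen_private(int inherited_fd, const char *path)
{
    close(inherited_fd);
    return open(path, O_RDONLY);
}

/* Returns 0 if the parent's offset is still 0 after the child seeks. */
int demo_reopen(void)
{
    char path[] = "/tmp/reopenXXXXXX";
    int fd = mkstemp(path);
    int status, result = -1;
    pid_t pid;

    assert(fd != -1);
    pid = fork();
    if (pid == 0) {
        /* Child: reopen first, then seek only the private descriptor. */
        int own = reopen_private(fd, path);

        _exit((own == -1 || lseek(own, 0x8000, SEEK_SET) == -1) ? 1 : 0);
    }
    if (pid != -1 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        lseek(fd, 0, SEEK_CUR) == 0)
        result = 0;
    close(fd);
    unlink(path);
    return result;
}
```

The cost is an extra open per forked child, which gives back part of
what the DB_CACHE code saves.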

>Release-Note:
>Audit-Trail:
>Unformatted:

