From owner-freebsd-current  Mon Feb 22 23:12:54 1999
Delivered-To: freebsd-current@freebsd.org
Received: from lor.watermarkgroup.com (lor.watermarkgroup.com [207.202.73.33])
	by hub.freebsd.org (Postfix) with ESMTP id 0117411A6F
	for <freebsd-current@FreeBSD.ORG>; Mon, 22 Feb 1999 23:12:51 -0800 (PST)
	(envelope-from luoqi@watermarkgroup.com)
Received: (from luoqi@localhost)
	by lor.watermarkgroup.com (8.8.8/8.8.8) id CAA02316;
	Tue, 23 Feb 1999 02:12:50 -0500 (EST)
	(envelope-from luoqi)
Date: Tue, 23 Feb 1999 02:12:50 -0500 (EST)
From: Luoqi Chen <luoqi@watermarkgroup.com>
Message-Id: <199902230712.CAA02316@lor.watermarkgroup.com>
To: green@unixhelp.org, kan@sti.cz
Subject: Re: Filesystem deadlock
Cc: freebsd-current@FreeBSD.ORG
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> On Mon, 22 Feb 1999, Alexander N. Kabaev wrote:
> 
> > The following script reliably causes FreeBSD 4.0-CURRENT (and 3.1-STABLE
> > as of today) to lock up. Shortly after this script is started, all disk
> > activity stops and any attempt to create a new process causes the system to
> > freeze. While in DDB, the ps command shows that all ten fgrep processes are
> > sleeping on inode, all xargs are in waitpid, and all sh processes are in
> > wait.
> 
> You forget about all the processes (just a few, actually) stuck in "kmaw"
> (kmem_alloc_wait). This is definitely reproducible :( Should be simple for
> someone more knowledgeable to diagnose, as it looks to be a straight
> vm/vfs(ufs/ffs) interaction.
> 
This seems to be the good old vnode deadlock during vm_fault() that has been
reported a couple of times, and there's still no satisfactory solution to it.
fgrep does something like this (don't ask me why):

	addr = mmap(0, len, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, offset);
	read(fd, addr, count);

The read() syscall first locks the vnode, reads the data from disk, then
copies the data to the buffer at addr. If addr is not in core, there will be
a page fault, and the fault handler vm_fault() will try to lock the vnode
pager backing the page at addr. That vnode is the one read() already holds
locked, hence the deadlock. The deadlock then propagates all the way back to
the root vnode, and the whole system freezes.
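
A minimal, self-contained sketch of that pattern follows, for anyone who
wants to exercise it in isolation. It is not fgrep's actual code: the
function name read_into_own_mapping and the choice to map from offset 0 are
assumptions made for illustration. On a kernel without the deadlock it simply
returns the number of bytes read.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from fgrep): privately map `len` bytes of
 * the file at `path`, then read() the same file into that mapping.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
read_into_own_mapping(const char *path, size_t len)
{
	assert(len > 0);

	int fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;

	/* MAP_PRIVATE: stores land in private copies, never in the file. */
	char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE,
	    fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/*
	 * The pages at addr have not been touched yet, so the kernel's
	 * copy into the buffer faults on them while read() is still
	 * working on the very file that backs those pages.
	 */
	ssize_t n = read(fd, addr, len);

	munmap(addr, len);
	close(fd);
	return n;
}
```

Calling read_into_own_mapping() on any non-empty regular file, with len no
larger than the file size, is enough to walk the read-then-fault path
described above.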

-lq

> > 
> > Unfortunately, I cannot run a -g kernel on my box
> > at this time, so the amount of useful information I can provide is pretty
> > much limited :(
> > 
> > #!/bin/sh
> > for j in 1 2 3 4 5 6 7 8 9 10; do
> >   echo -n $j
> >     nohup sh -c 'while :; do find /usr -type f | xargs fgrep zukabuka; done' \
> >               >/dev/null 2>&1 &
> >     echo
> > done
> > 

