From owner-freebsd-hackers@FreeBSD.ORG Mon Aug  1 13:22:26 2005
Message-Id: <200508011322.j71DMGYd058644@beastie.mckusick.com>
From: Kirk McKusick
To: Matthew Dillon
Cc: freebsd-hackers@freebsd.org
In-Reply-To: Your message of "Sun, 31 Jul 2005 12:44:16 PDT."
Date: Mon, 01 Aug 2005 06:22:08 -0700
Subject: Re: Possible softupdates bug when an indirect block buffer is reused

This has been a long-standing, nagging problem that was finally tracked
down and fixed by Stephan Uphoff. See revision 1.182 (2005/07/31) of
sys/ufs/ffs/ffs_softdep.c.

	Kirk McKusick

=-=-=-=-=-=-=

Date: Sun, 31 Jul 2005 11:40:32 -0700 (PDT)
From: Matthew Dillon
To: Kirk McKusick
Cc: freebsd-hackers@freebsd.org
Subject: Possible softupdates bug when an indirect block buffer is reused

    Hi Kirk, hackers!

    I'm trying to track down a bug that leaves a buffer in a locked
    state and thereby locks up the filesystem. The symptom is that a
    heavily used filesystem suddenly starts running out of space. It
    isn't due to deleted files with open descriptors; it's due to the
    syncer getting stuck in a "getblk" state. This is on DragonFly,
    but I can't find anything DragonFly-specific wrong, so I'm
    beginning to think it may be an actual bug in softupdates.

    I have wound up with a situation where a getblk()'d bp has been
    associated with an indirdep dependency, i.e. stored in
    indirdep->ir_savebp, but is never released. When something like
    the syncer comes along and tries to access it, it blocks forever.
    This in turn keeps inodes from being cleared, and the filesystem
    eventually runs out of space when a lot of files are being created
    and deleted.

    What has got me really confused is that the buffer in question
    seems to wind up with a D_INDIRDEP dependency that points back to
    itself.
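    For reference, here is roughly what the structures in play look
    like (a trimmed-down field subset from memory; see
    sys/ufs/ffs/softdep.h for the real definitions):

	#include <sys/types.h>	/* caddr_t */
	#include <sys/queue.h>	/* LIST_ENTRY(), LIST_HEAD() */

	struct buf;				/* opaque here */
	LIST_HEAD(allocindirhd, allocindir);	/* list of allocindirs */

	struct worklist {
		LIST_ENTRY(worklist) wk_list;	/* chained on a work list */
		unsigned short wk_type;		/* D_INDIRDEP, D_FREEBLKS, ... */
		unsigned short wk_state;	/* ATTACHED, GOINGAWAY, ... */
	};

	struct indirdep {
		struct worklist ir_list;	/* must come first (casts) */
		caddr_t ir_saveddata;		/* saved buffer contents */
		struct buf *ir_savebp;		/* buffer holding safe copy */
		struct allocindirhd ir_donehd;	/* done dependencies */
		struct allocindirhd ir_deplisthd; /* deps for this block */
	};

    Here's the situation from a live gdb.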
    Here is where the syncer is stuck:

(kgdb) back
#0  lwkt_switch () at thread2.h:95
#1  0xc02a8a79 in tsleep (ident=0x0, flags=0, wmesg=0xc04eadb0 "getblk",
    timo=0) at /usr/src-125beta/sys/kern/kern_synch.c:428
#2  0xc02956bb in acquire (lkp=0xc758b4e0, extflags=33554464, wanted=1536)
    at /usr/src-125beta/sys/kern/kern_lock.c:127
#3  0xc0295a92 in lockmgr (lkp=0xc758b4e0, flags=33620002, interlkp=0x0,
    td=0xd68f6400) at /usr/src-125beta/sys/kern/kern_lock.c:354
#4  0xc02d6828 in getblk (vp=0xc71b3058, blkno=94440240, size=8192,
    slpflag=0, slptimeo=0) at thread.h:79
#5  0xc02d4404 in bread (vp=0xc71b3058, blkno=0, size=0, bpp=0x0)
    at /usr/src-125beta/sys/kern/vfs_bio.c:567
#6  0xc03f24fe in indir_trunc (ip=0xe048fc0c, dbn=94440240, level=1,
    lbn=2060, countp=0xe048fbf8)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2221
#7  0xc03f22df in handle_workitem_freeblocks (freeblks=0xe2fcef98)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2138
#8  0xc03f0462 in process_worklist_item (matchmnt=0x0, flags=0)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:726
#9  0xc03f026c in softdep_process_worklist (matchmnt=0x0)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:625
#10 0xc02e5ff3 in sched_sync () at /usr/src-125beta/sys/kern/vfs_sync.c:244
#11 0xc0294863 in kthread_create_stk (func=0, arg=0x0, tdp=0xff800000,
    stksize=0, fmt=0x0) at /usr/src-125beta/sys/kern/kern_kthread.c:104
(kgdb)

    The buffer it is stuck on:

(kgdb) print bp
$62 = (struct buf *) 0xc758b4b8
(kgdb) print *bp
$63 = {
  b_hash = { le_next = 0x0, le_prev = 0xc7391348 },
  b_vnbufs = { tqe_next = 0xc739b890, tqe_prev = 0xc76f32b8 },
  b_freelist = { tqe_next = 0xc768d610, tqe_prev = 0xc0565ac0 },
  b_act = { tqe_next = 0x0, tqe_prev = 0x0 },
  b_flags = 536870912,	<<<< 0x20000000 (getblk with no bread, etc)
  b_qindex = 0,
  b_xflags = 2 '\002',
  b_lock = {
    lk_interlock = { t_cpu = 0xff800000, t_reqcpu = 0xff800000,
      t_unused01 = 0 },
    lk_flags = 2098176,
    lk_sharecount = 0,
    lk_waitcount = 1,
    lk_exclusivecount = 1,
    lk_prio = 0,
    lk_wmesg = 0xc04eadb0 "getblk",
    lk_timo = 0,
    lk_lockholder = 0xfffffffe
  },
  b_error = 0,
  b_bufsize = 8192,
  b_runningbufspace = 0,
  b_bcount = 8192,
  b_resid = 0,
  b_dev = 0xde0f0e38,
  b_data = 0xcf824000 "¨\205Ð\002",
  b_kvabase = 0xcf824000 "¨\205Ð\002",
  b_kvasize = 16384,
  b_lblkno = 94440240,
  b_blkno = 94440240,
  b_offset = 48353402880,
  b_iodone = 0,
  b_iodone_chain = 0x0,
  b_vp = 0xc71b3058,
  b_dirtyoff = 0,
  b_dirtyend = 0,
  b_pblkno = 87503631,
  b_saveaddr = 0x0,
  b_driver1 = 0x0,
  b_caller1 = 0x0,
  b_pager = { pg_spc = 0x0, pg_reqpage = 0 },
  b_cluster = {
    cluster_head = { tqh_first = 0x0, tqh_last = 0xc768d6bc },
    cluster_entry = { tqe_next = 0x0, tqe_prev = 0xc768d6bc }
  },
  b_xio = {
    xio_pages = 0xc758b584,
    xio_npages = 2,
    xio_offset = 0,
    xio_bytes = 0,
    xio_flags = 0,
    xio_error = 0,
    xio_internal_pages = {0xc34e5870, 0xc4aeb2b4, 0x0 }
  },
  b_dep = { lh_first = 0xc7045040 },
  b_chain = { parent = 0x0, count = 0 }
}

    As you can see from b_flags, which is 0x20000000, the buffer has
    been getblk()'d but never bread() or otherwise filled in. That is
    the typical state of a buffer that has been installed as
    indirdep->ir_savebp by setup_allocindir_phase2().
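    This is consistent with the way setup_allocindir_phase2() obtains
    the safe-copy buffer. Roughly (my paraphrase, not the verbatim
    source; the DragonFly and FreeBSD versions differ in detail):

	/*
	 * Paraphrase of the ir_savebp setup in setup_allocindir_phase2().
	 * The safe copy of the indirect block is held in a buffer that
	 * is only ever getblk()'d -- never bread() -- so its contents
	 * are filled in by hand, and the buffer is left locked until
	 * the dependency is resolved.
	 */
	newindirdep->ir_savebp =
	    getblk(ip->i_devvp, bp->b_blkno, bp->b_bcount, 0, 0);
	BUF_KERNPROC(newindirdep->ir_savebp);
	bcopy(bp->b_data, newindirdep->ir_savebp->b_data, bp->b_bcount);

    Note also frames #4-#6 in the backtrace above: indir_trunc() is
    bread()ing the very block (blkno 94440240) that the stuck buffer
    describes, and bread() must lock the buffer before returning it.
    If that buffer is still locked as somebody's ir_savebp, the lock
    is never released and the syncer sleeps in "getblk" forever.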
    The buffer's dependency looks like this:

(kgdb) print bp
$65 = (struct buf *) 0xc758b4b8
(kgdb) print *(struct indirdep *)bp->b_dep.lh_first
$66 = {
  ir_list = {
    wk_list = { le_next = 0x0, le_prev = 0xc758b604 },
    wk_type = 5,
    wk_state = 33025	<<<< ATTACHED|GOINGAWAY|ONWORKLIST
  },
  ir_saveddata = 0xdeadc0de "",
  ir_savebp = 0xc758b4b8,	<<<< points back to itself
  ir_donehd = { lh_first = 0x0 },
  ir_deplisthd = { lh_first = 0x0 }
}
(kgdb)

    As you can see, the buffer's dependency points back to the buffer
    itself. As you may know, ir_savebp buffers are left in a locked
    state, so a buffer that gets into this situation winds up
    permanently deadlocked.

    This is on DragonFly, but I can't find anything else wrong. The
    filesystem is running a myriad of jails that are constantly
    creating and deleting files in parallel, so buffers are reused
    quite often. It takes about a week of this heavy activity for the
    bug to rear its ugly head, which makes it fairly difficult to
    reproduce. When the problem does occur the system remains
    functional: the disk device continues to work and the filesystem
    continues to work, except for anything that chains down to the
    particular block that has deadlocked (usually just the syncer at
    first, but as time goes by more parts of the system deadlock). It
    IS possible to run a live gdb when the situation is caught early
    enough.

    I have noticed a number of FreeBSD bug reports related to blocking
    in getblk or to softupdates. I don't know whether this is the same
    problem, but it's worth Cc'ing freebsd-hackers on it as well. My
    user reports that the problem also occurs on FreeBSD 4.10 and
    4.11, on uniprocessor builds (other builds and 5.x/6.x have not
    been tested). I took a look at the 4.x and 6.x softupdates code
    and didn't see any commits that might address the problem. This
    adds weight to the likelihood of it being a softupdates bug of
    some sort, possibly still present in 6.x.

    I am trying to figure out how the buffer manages to get into this
    self-locked state. I'm pretty much stuck once I get to the
    deallocate_dependencies() procedure. This procedure seems to move
    a D_INDIRDEP dependency from the passed bp onto the buffer pointed
    to by ir_savebp, and as far as I can tell this is what creates the
    self-referential dependency and deadlocks the syncer.
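    The D_INDIRDEP case there boils down to something like this (again
    my paraphrase, not the verbatim source):

	/*
	 * Paraphrase of the D_INDIRDEP case in deallocate_dependencies():
	 * save the doomed buffer's contents into the safe-copy buffer,
	 * then move the dependency itself onto ir_savebp's b_dep list.
	 */
	case D_INDIRDEP:
		indirdep = WK_INDIRDEP(wk);
		bcopy(bp->b_data, indirdep->ir_savebp->b_data, bp->b_bcount);
		WORKLIST_REMOVE(wk);
		WORKLIST_INSERT(&indirdep->ir_savebp->b_dep, wk);
		continue;

    If bp and indirdep->ir_savebp can ever be the same buffer, for
    example because the buffer was reused, this moves the dependency
    onto the buffer's own b_dep list, producing exactly the
    self-referential, permanently locked buffer shown above. That is
    my best guess at the mechanism, anyway.

					-Matt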