From owner-freebsd-hackers@FreeBSD.ORG Mon Aug  1 13:22:26 2005
Message-Id: <200508011322.j71DMGYd058644@beastie.mckusick.com>
From: Kirk McKusick
To: Matthew Dillon
Cc: freebsd-hackers@freebsd.org
In-Reply-To: Your message of "Sun, 31 Jul 2005 12:44:16 PDT."
Date: Mon, 01 Aug 2005 06:22:08 -0700
Subject: Re: Possible softupdates bug when an indirect block buffer is reused

This has been a long-standing, nagging problem that was finally tracked
down and fixed by Stephan Uphoff. See revision 1.182 (2005/07/31) of
sys/ufs/ffs/ffs_softdep.c.

	Kirk McKusick

=-=-=-=-=-=-=

Date: Sun, 31 Jul 2005 11:40:32 -0700 (PDT)
From: Matthew Dillon
To: Kirk McKusick
Cc: freebsd-hackers@freebsd.org
Subject: Possible softupdates bug when an indirect block buffer is reused

    Hi Kirk, hackers!

    I'm trying to track down a bug that leaves a buffer in a locked
    state and thereby locks up the filesystem. The symptom is that a
    heavily used filesystem suddenly starts running out of space. It
    isn't due to deleted files with open descriptors; it's due to the
    syncer getting stuck in a "getblk" state. This is on DragonFly,
    but I can't find anything DragonFly-specific wrong, so I'm
    beginning to think it may be an actual bug in softupdates.

    I have wound up with a situation where a getblk()'d bp has been
    associated with an indirdep dependency, i.e. stored in
    indirdep->ir_savebp, but is never released. When something like
    the syncer comes along and tries to access it, it blocks forever.
    This in turn keeps inodes from being cleared, and the filesystem
    eventually runs out of space when a lot of files are being created
    and deleted.

    What has got me really confused is that the buffer in question
    seems to wind up with a D_INDIRDEP dependency that points back to
    itself.
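    For reference, here is roughly what the structures in play look
    like (a trimmed-down field subset from memory; see
    sys/ufs/ffs/softdep.h for the real definitions):

	#include <sys/types.h>	/* caddr_t */
	#include <sys/queue.h>	/* LIST_ENTRY(), LIST_HEAD() */

	struct buf;				/* opaque here */
	LIST_HEAD(allocindirhd, allocindir);	/* list of allocindirs */

	struct worklist {
		LIST_ENTRY(worklist) wk_list;	/* chained on a work list */
		unsigned short wk_type;		/* D_INDIRDEP, D_FREEBLKS, ... */
		unsigned short wk_state;	/* ATTACHED, GOINGAWAY, ... */
	};

	struct indirdep {
		struct worklist ir_list;	/* must come first (casts) */
		caddr_t ir_saveddata;		/* saved buffer contents */
		struct buf *ir_savebp;		/* buffer holding safe copy */
		struct allocindirhd ir_donehd;	/* done dependencies */
		struct allocindirhd ir_deplisthd; /* deps for this block */
	};

    Here's the situation from a live gdb.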
    Here is where the syncer is stuck:

(kgdb) back
#0  lwkt_switch () at thread2.h:95
#1  0xc02a8a79 in tsleep (ident=0x0, flags=0, wmesg=0xc04eadb0 "getblk",
    timo=0) at /usr/src-125beta/sys/kern/kern_synch.c:428
#2  0xc02956bb in acquire (lkp=0xc758b4e0, extflags=33554464, wanted=1536)
    at /usr/src-125beta/sys/kern/kern_lock.c:127
#3  0xc0295a92 in lockmgr (lkp=0xc758b4e0, flags=33620002, interlkp=0x0,
    td=0xd68f6400) at /usr/src-125beta/sys/kern/kern_lock.c:354
#4  0xc02d6828 in getblk (vp=0xc71b3058, blkno=94440240, size=8192,
    slpflag=0, slptimeo=0) at thread.h:79
#5  0xc02d4404 in bread (vp=0xc71b3058, blkno=0, size=0, bpp=0x0)
    at /usr/src-125beta/sys/kern/vfs_bio.c:567
#6  0xc03f24fe in indir_trunc (ip=0xe048fc0c, dbn=94440240, level=1,
    lbn=2060, countp=0xe048fbf8)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2221
#7  0xc03f22df in handle_workitem_freeblocks (freeblks=0xe2fcef98)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:2138
#8  0xc03f0462 in process_worklist_item (matchmnt=0x0, flags=0)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:726
#9  0xc03f026c in softdep_process_worklist (matchmnt=0x0)
    at /usr/src-125beta/sys/vfs/ufs/ffs_softdep.c:625
#10 0xc02e5ff3 in sched_sync () at /usr/src-125beta/sys/kern/vfs_sync.c:244
#11 0xc0294863 in kthread_create_stk (func=0, arg=0x0, tdp=0xff800000,
    stksize=0, fmt=0x0) at /usr/src-125beta/sys/kern/kern_kthread.c:104
(kgdb)

    The buffer it is stuck on:

(kgdb) print bp
$62 = (struct buf *) 0xc758b4b8
(kgdb) print *bp
$63 = {
  b_hash = { le_next = 0x0, le_prev = 0xc7391348 },
  b_vnbufs = { tqe_next = 0xc739b890, tqe_prev = 0xc76f32b8 },
  b_freelist = { tqe_next = 0xc768d610, tqe_prev = 0xc0565ac0 },
  b_act = { tqe_next = 0x0, tqe_prev = 0x0 },
  b_flags = 536870912,	<<<< 0x20000000 (getblk with no bread, etc)
  b_qindex = 0,
  b_xflags = 2 '\002',
  b_lock = {
    lk_interlock = { t_cpu = 0xff800000, t_reqcpu = 0xff800000,
      t_unused01 = 0 },
    lk_flags = 2098176,
    lk_sharecount = 0,
    lk_waitcount = 1,
    lk_exclusivecount = 1,
    lk_prio = 0,
    lk_wmesg = 0xc04eadb0 "getblk",
    lk_timo = 0,
    lk_lockholder = 0xfffffffe
  },
  b_error = 0,
  b_bufsize = 8192,
  b_runningbufspace = 0,
  b_bcount = 8192,
  b_resid = 0,
  b_dev = 0xde0f0e38,
  b_data = 0xcf824000 "¨\205Ð\002",
  b_kvabase = 0xcf824000 "¨\205Ð\002",
  b_kvasize = 16384,
  b_lblkno = 94440240,
  b_blkno = 94440240,
  b_offset = 48353402880,
  b_iodone = 0,
  b_iodone_chain = 0x0,
  b_vp = 0xc71b3058,
  b_dirtyoff = 0,
  b_dirtyend = 0,
  b_pblkno = 87503631,
  b_saveaddr = 0x0,
  b_driver1 = 0x0,
  b_caller1 = 0x0,
  b_pager = { pg_spc = 0x0, pg_reqpage = 0 },
  b_cluster = {
    cluster_head = { tqh_first = 0x0, tqh_last = 0xc768d6bc },
    cluster_entry = { tqe_next = 0x0, tqe_prev = 0xc768d6bc }
  },
  b_xio = {
    xio_pages = 0xc758b584,
    xio_npages = 2,
    xio_offset = 0,
    xio_bytes = 0,
    xio_flags = 0,
    xio_error = 0,
    xio_internal_pages = {0xc34e5870, 0xc4aeb2b4, 0x0 }
  },
  b_dep = { lh_first = 0xc7045040 },
  b_chain = { parent = 0x0, count = 0 }
}

    As you can see from b_flags, which is 0x20000000, the buffer has
    been getblk()'d but never bread() or otherwise filled in. That is
    the typical state of a buffer that has been installed as
    indirdep->ir_savebp by setup_allocindir_phase2().
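    This is consistent with the way setup_allocindir_phase2() obtains
    the safe-copy buffer. Roughly (my paraphrase, not the verbatim
    source; the DragonFly and FreeBSD versions differ in detail):

	/*
	 * Paraphrase of the ir_savebp setup in setup_allocindir_phase2().
	 * The safe copy of the indirect block is held in a buffer that
	 * is only ever getblk()'d -- never bread() -- so its contents
	 * are filled in by hand, and the buffer is left locked until
	 * the dependency is resolved.
	 */
	newindirdep->ir_savebp =
	    getblk(ip->i_devvp, bp->b_blkno, bp->b_bcount, 0, 0);
	BUF_KERNPROC(newindirdep->ir_savebp);
	bcopy(bp->b_data, newindirdep->ir_savebp->b_data, bp->b_bcount);

    Note also frames #4-#6 in the backtrace above: indir_trunc() is
    bread()ing the very block (blkno 94440240) that the stuck buffer
    describes, and bread() must lock the buffer before returning it.
    If that buffer is still locked as somebody's ir_savebp, the lock
    is never released and the syncer sleeps in "getblk" forever.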
    The buffer's dependency looks like this:

(kgdb) print bp
$65 = (struct buf *) 0xc758b4b8
(kgdb) print *(struct indirdep *)bp->b_dep.lh_first
$66 = {
  ir_list = {
    wk_list = { le_next = 0x0, le_prev = 0xc758b604 },
    wk_type = 5,
    wk_state = 33025	<<<< ATTACHED|GOINGAWAY|ONWORKLIST
  },
  ir_saveddata = 0xdeadc0de "",
  ir_savebp = 0xc758b4b8,	<<<< points back to itself
  ir_donehd = { lh_first = 0x0 },
  ir_deplisthd = { lh_first = 0x0 }
}
(kgdb)

    As you can see, the buffer's dependency points back to the buffer
    itself. As you may know, ir_savebp buffers are left in a locked
    state, so a buffer that gets into this situation winds up
    permanently deadlocked.

    This is on DragonFly, but I can't find anything else wrong. The
    filesystem is running a myriad of jails that are constantly
    creating and deleting files in parallel, so buffers are reused
    quite often. It takes about a week of this heavy activity for the
    bug to rear its ugly head, which makes it fairly difficult to
    reproduce. When the problem does occur the system remains
    functional: the disk device continues to work and the filesystem
    continues to work, except for anything that chains down to the
    particular block that has deadlocked (usually just the syncer at
    first, but as time goes by more parts of the system deadlock). It
    IS possible to run a live gdb when the situation is caught early
    enough.

    I have noticed a number of FreeBSD bug reports related to blocking
    in getblk or to softupdates. I don't know whether this is the same
    problem, but it's worth Cc'ing freebsd-hackers on it as well. My
    user reports that the problem also occurs on FreeBSD 4.10 and
    4.11, on uniprocessor builds (other builds and 5.x/6.x have not
    been tested). I took a look at the 4.x and 6.x softupdates code
    and didn't see any commits that might address the problem. This
    adds weight to the likelihood of it being a softupdates bug of
    some sort, possibly still present in 6.x.

    I am trying to figure out how the buffer manages to get into this
    self-locked state. I'm pretty much stuck once I get to the
    deallocate_dependencies() procedure. This procedure seems to move
    a D_INDIRDEP dependency from the passed bp onto the buffer pointed
    to by ir_savebp, and as far as I can tell this is what creates the
    self-referential dependency and deadlocks the syncer.
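    The D_INDIRDEP case there boils down to something like this (again
    my paraphrase, not the verbatim source):

	/*
	 * Paraphrase of the D_INDIRDEP case in deallocate_dependencies():
	 * save the doomed buffer's contents into the safe-copy buffer,
	 * then move the dependency itself onto ir_savebp's b_dep list.
	 */
	case D_INDIRDEP:
		indirdep = WK_INDIRDEP(wk);
		bcopy(bp->b_data, indirdep->ir_savebp->b_data, bp->b_bcount);
		WORKLIST_REMOVE(wk);
		WORKLIST_INSERT(&indirdep->ir_savebp->b_dep, wk);
		continue;

    If bp and indirdep->ir_savebp can ever be the same buffer, for
    example because the buffer was reused, this moves the dependency
    onto the buffer's own b_dep list, producing exactly the
    self-referential, permanently locked buffer shown above. That is
    my best guess at the mechanism, anyway.

					-Matt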