Date: Thu, 31 Jan 2008 11:43:10 +0100
From: "Attilio Rao" <attilio@freebsd.org>
To: "Scot Hetzel" <swhetzel@gmail.com>
Cc: Kostik Belousov <kostikbel@gmail.com>, Yar Tikhiy <yar@comp.chem.msu.su>, Doug Barton <dougb@freebsd.org>, freebsd-current@freebsd.org
Subject: Re: panic: System call lstat returning with 1 locks held
Message-ID: <3bbf2fe10801310243tddedfeckbc4c94be87f0a4ca@mail.gmail.com>
In-Reply-To: <790a9fff0801301352xa91a69ci3f08488dfcfc982@mail.gmail.com>
References: <3bbf2fe10801250000k5852c2f2j5d1897c900096818@mail.gmail.com>
 <479BBDAA.6000008@FreeBSD.org>
 <3bbf2fe10801261657x7d7c9de4q71adeaf3a2dd8159@mail.gmail.com>
 <479C0B5B.9030709@FreeBSD.org>
 <3bbf2fe10801270642m5ec609d8xb29add77ced36d8a@mail.gmail.com>
 <479FA3E8.10606@FreeBSD.org>
 <3bbf2fe10801291411v302dd33at54ebe538397e8fac@mail.gmail.com>
 <20080130130820.GA88429@comp.chem.msu.su>
 <3bbf2fe10801300707u3fd121c0k199605c2f0be6cbf@mail.gmail.com>
 <790a9fff0801301352xa91a69ci3f08488dfcfc982@mail.gmail.com>

2008/1/30, Scot Hetzel <swhetzel@gmail.com>:
> On 1/30/08, Attilio Rao <attilio@freebsd.org> wrote:
> > 2008/1/30, Yar Tikhiy <yar@comp.chem.msu.su>:
> > > On Tue, Jan 29, 2008 at 11:11:13PM +0100, Attilio Rao wrote:
> > > >
> > > > I'm committing my WITNESS patch now to perforce so that other people
> > > > can hopefully stress-test it before it is committed.
> > >
> > > Do you think that that patch is applicable in my case? I.e., shall
> > > I use it to get more debug info on my panics?
> > >
> > > If so, where is the patched file in the depot?
> >
> > Sorry but I had to delay the operation so far.
> > In the end, a suitable patch is located here:
> > http://www.freebsd.org/~attilio/witness_lockmgr.diff
> >
> > I tried it and it already reported 4 LORs just when booting the kernel :)
> > So I would reasonably expect LOR cascades with this patch.
> >
> > If all three of you (Scot, Yar and Doug) could try and test it I would
> > appreciate it a lot.
> >
>
> Reading back to Doug's and Yar's messages regarding the NTFS
> filesystem, I noticed that I am also mounting NTFS filesystems at boot
> time. I disabled the mounting of the NTFS filesystems. When 'cd
> /usr/ports ; find . -print' or '/usr/local/etc/cvsup/update.sh' is
> run, the panic doesn't occur.
>
> But when I mount the NTFS filesystem, and rerun the above commands,
> they cause the lstat panic. Even though these commands are not
> touching the NTFS filesystems.
>
> Also mounting/unmounting a NTFS filesystem will cause a panic.
>
> I applied the above patch to sources that were checked out about 2 hrs
> ago. Rebuilt/installed kernel and rebooted.
>
> If I don't mount a NTFS filesystem then the kernel doesn't panic when
> the above commands are run.

So it seems NTFS is definitely busted.

> But when the NTFS filesystem is mounted, the following lock order
> reversal occurs:
>
> lock order reversal:
>  1st 0xffffff0023285288 pseudofs (pseudofs) @ kern/vfs_subr.c:2061
>  2nd 0xffffff00232f2ca0 vfslock (vfslock) @ kern/vfs_subr.c:364
>
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> witness_checkorder() at witness_checkorder+0x606
>
> _lockmgr() at _lockmgr+0x4cb
> vfs_busy() at vfs_busy+0xdf
> vfs_donmount() at vfs_donmount+0x9aa
> nmount() at nmount+0xa4
>
> syscall() at syscall+0x1ce
> Xfast_syscall() at Xfast_syscall+0xab
>
> --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65a9d0 ---
> lock order reversal:
>  1st 0xffffff002347f668 ntfs (ntfs) @ kern/vfs_subr.c:2061
>  2nd 0xffffff00232f2650 vfslock (vfslock) @ kern/vfs_subr.c:364
>
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> witness_checkorder() at witness_checkorder+0x606
>
> _lockmgr() at _lockmgr+0x4cb
> vfs_busy() at vfs_busy+0xdf
> vfs_donmount() at vfs_donmount+0x9aa
> nmount() at nmount+0xa4
>
> syscall() at syscall+0x1ce
> Xfast_syscall() at Xfast_syscall+0xab
>
> --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65ad80 ---
>
> Instead of getting the lstat panic, I am now getting the following
> panic when /usr/local/etc/cvsup/update.sh ran:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x8:0xffffffff80301051
> stack pointer       = 0x10:0xffffffffd6bb0100
> frame pointer       = 0x10:0xffffffffd6bb0190
> code segment        = base 0x0, limit 0xfffff, type 0x1b
>                     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = resume, IOPL = 0
> current process     = 1243 (cvsup)
> panic: Assertion !mtx_owned(&w_mtx) failed at ../../../kern/subr_witness.c:959
> cpuid = 0
> Uptime: 11m14s
> Physical memory: 2031 MB
> Dumping 325 MB: 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

The assertion failure should not happen now.
Could you please hand-add a check in _lockmgr_disown() (kern/kern_lock.c)
so that panicstr is checked before calling WITNESS, and WITNESS is
skipped once the system has panicked?
I cannot access perforce now to produce a suitable diff, so you can
just do this by hand:

if (lkp->lk_lockholder == td) {
        if (panicstr == NULL)
                WITNESS_UNLOCK(&lkp->lk_object, LOP_EXCLUSIVE, file, line);
        td->td_locks--;
}

-- 
Peace can only be achieved by understanding - A. Einstein
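
For readers who want to see the intent of the hand-applied check in isolation, here is a minimal userland sketch. It is only a model under stated assumptions: `mock_witness_unlock`, `mock_lockmgr_disown`, the `witness_calls` counter and the cut-down `struct thread` / `struct lock` are hypothetical stand-ins invented for illustration, not the real kernel definitions. The only behaviour it tries to show is that the WITNESS hook runs while `panicstr` is NULL and is skipped once a panic has started, while the lock count is dropped in both cases.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userland model only. Every name below is a hypothetical stand-in,
 * not the real FreeBSD definition.
 */
static const char *panicstr = NULL; /* non-NULL once a panic has started */
static int witness_calls = 0;       /* times the mock WITNESS hook ran */

struct thread { int td_locks; };                /* cut-down thread */
struct lock { struct thread *lk_lockholder; };  /* cut-down lockmgr lock */

/* Stand-in for WITNESS_UNLOCK(): it only records that it was invoked. */
static void mock_witness_unlock(struct lock *lkp)
{
    (void)lkp;
    witness_calls++;
}

/* Sketch of the guarded path: skip WITNESS bookkeeping during a panic. */
static void mock_lockmgr_disown(struct lock *lkp, struct thread *td)
{
    if (lkp->lk_lockholder == td) {
        if (panicstr == NULL)
            mock_witness_unlock(lkp);
        assert(td->td_locks > 0);   /* sanity check for the model only */
        td->td_locks--;             /* the count drops in both cases */
    }
}
```

Calling `mock_lockmgr_disown()` once with `panicstr` left NULL and once after setting it to some message should bump `witness_calls` only on the first call, which is the property the check is meant to provide.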
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3bbf2fe10801310243tddedfeckbc4c94be87f0a4ca>