Date:      Thu, 31 Jan 2008 11:43:10 +0100
From:      "Attilio Rao" <attilio@freebsd.org>
To:        "Scot Hetzel" <swhetzel@gmail.com>
Cc:        Kostik Belousov <kostikbel@gmail.com>, Yar Tikhiy <yar@comp.chem.msu.su>, Doug Barton <dougb@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: panic: System call lstat returning with 1 locks held
Message-ID:  <3bbf2fe10801310243tddedfeckbc4c94be87f0a4ca@mail.gmail.com>
In-Reply-To: <790a9fff0801301352xa91a69ci3f08488dfcfc982@mail.gmail.com>
References:  <3bbf2fe10801250000k5852c2f2j5d1897c900096818@mail.gmail.com> <479BBDAA.6000008@FreeBSD.org> <3bbf2fe10801261657x7d7c9de4q71adeaf3a2dd8159@mail.gmail.com> <479C0B5B.9030709@FreeBSD.org> <3bbf2fe10801270642m5ec609d8xb29add77ced36d8a@mail.gmail.com> <479FA3E8.10606@FreeBSD.org> <3bbf2fe10801291411v302dd33at54ebe538397e8fac@mail.gmail.com> <20080130130820.GA88429@comp.chem.msu.su> <3bbf2fe10801300707u3fd121c0k199605c2f0be6cbf@mail.gmail.com> <790a9fff0801301352xa91a69ci3f08488dfcfc982@mail.gmail.com>

2008/1/30, Scot Hetzel <swhetzel@gmail.com>:
> On 1/30/08, Attilio Rao <attilio@freebsd.org> wrote:
>  > 2008/1/30, Yar Tikhiy <yar@comp.chem.msu.su>:
>  > > On Tue, Jan 29, 2008 at 11:11:13PM +0100, Attilio Rao wrote:
>  > > >
>  > > > I'm committing my WITNESS patch to perforce now so that other people
>  > > > can hopefully stress-test it before it is committed.
>  > >
>  > > Do you think that that patch is applicable in my case?  I.e., shall
>  > > I use it to get more debug info on my panics?
>  > >
>  > > If so, where is the patched file in the depot?
>  >
>  > Sorry but I had to delay the operation so far.
>  > In the end, a suitable patch is located here:
>  > http://www.freebsd.org/~attilio/witness_lockmgr.diff
>  >
>  > I tried it and it already reported 4 LORs just while booting the kernel :)
>  > So I would reasonably expect cascades of LORs with this patch.
>  >
>  > If all three of you (Scot, Yar and Doug) could try it and test it, I
>  > would appreciate it a lot.
>  >
>
> Reading back through Doug's and Yar's messages regarding the NTFS
>  filesystem, I noticed that I am also mounting NTFS filesystems at boot
>  time.  I disabled the mounting of the NTFS filesystems.  When 'cd
>  /usr/ports ; find . -print' or '/usr/local/etc/cvsup/update.sh' is
>  run, the panic doesn't occur.
>
>  But when I mount the NTFS filesystem and rerun the above commands,
>  they cause the lstat panic, even though these commands do not
>  touch the NTFS filesystems.
>
>  Also, mounting/unmounting an NTFS filesystem will cause a panic.
>
>  I applied the above patch to sources that were checked out about 2 hrs
>  ago.  Rebuilt/installed kernel and rebooted.
>
>  If I don't mount an NTFS filesystem, then the kernel doesn't panic when
>  the above commands are run.

So it seems NTFS is definitely busted.

>  But when the NTFS filesystem is mounted, the following lock order
>  reversal occurs:
>
>  lock order reversal:
>   1st 0xffffff0023285288 pseudofs (pseudofs) @ kern/vfs_subr.c:2061
>   2nd 0xffffff00232f2ca0 vfslock (vfslock) @ kern/vfs_subr.c:364
>
>  KDB: stack backtrace:
>  db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>  witness_checkorder() at witness_checkorder+0x606
>  _lockmgr() at _lockmgr+0x4cb
>  vfs_busy() at vfs_busy+0xdf
>  vfs_donmount() at vfs_donmount+0x9aa
>  nmount() at nmount+0xa4
>  syscall() at syscall+0x1ce
>  Xfast_syscall() at Xfast_syscall+0xab
>  --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65a9d0 ---
>
>  lock order reversal:
>   1st 0xffffff002347f668 ntfs (ntfs) @ kern/vfs_subr.c:2061
>   2nd 0xffffff00232f2650 vfslock (vfslock) @ kern/vfs_subr.c:364
>
>  KDB: stack backtrace:
>  db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>  witness_checkorder() at witness_checkorder+0x606
>  _lockmgr() at _lockmgr+0x4cb
>  vfs_busy() at vfs_busy+0xdf
>  vfs_donmount() at vfs_donmount+0x9aa
>  nmount() at nmount+0xa4
>  syscall() at syscall+0x1ce
>  Xfast_syscall() at Xfast_syscall+0xab
>  --- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65ad80 ---
>
>  Instead of getting the lstat panic, I am now getting the following
>  panic when /usr/local/etc/cvsup/update.sh runs:
>
>  Fatal trap 9: general protection fault while in kernel mode
>  cpuid = 0; apic id = 00
>  instruction pointer     = 0x8:0xffffffff80301051
>  stack pointer           = 0x10:0xffffffffd6bb0100
>  frame pointer           = 0x10:0xffffffffd6bb0190
>  code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>  processor eflags        = resume, IOPL = 0
>  current process         = 1243 (cvsup)
>  panic: Assertion !mtx_owned(&w_mtx) failed at ../../../kern/subr_witness.c:959
>  cpuid = 0
>  Uptime: 11m14s
>  Physical memory: 2031 MB
>  Dumping 325 MB: 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

That assertion failure should not be happening now.
Could you please hand-add a check in _lockmgr_disown()
(kern/kern_lock.c) that tests panicstr before calling WITNESS?
I cannot access perforce right now to produce a proper diff,
so please just make this change by hand:

if (lkp->lk_lockholder == td) {
        /* Only call into WITNESS when no panic is in progress. */
        if (panicstr == NULL)
                WITNESS_UNLOCK(&lkp->lk_object, LOP_EXCLUSIVE, file, line);
        td->td_locks--;
}
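
For anyone who wants to see the effect of that guard outside the kernel,
here is a minimal userland sketch. It assumes the intent is to skip the
WITNESS call once a panic has started (panicstr non-NULL) while still
dropping the per-thread lock count. Every name ending in _sketch, and the
witness_calls counter, is a hypothetical stand-in and not the real
kern_lock.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins for the kernel structures (hypothetical names). */
struct thread_sketch { int td_locks; };
struct lock_sketch   { struct thread_sketch *lk_lockholder; };

/* Mimics the kernel's panicstr: NULL until a panic has started. */
static const char *panicstr_sketch = NULL;

/* Counts how many times the mock lock-order checker was entered. */
static int witness_calls = 0;

static void
witness_unlock_sketch(struct lock_sketch *lkp)
{
        (void)lkp;
        witness_calls++;
}

/* Release path: bookkeeping always runs, the checker only when not panicking. */
static void
disown_sketch(struct lock_sketch *lkp, struct thread_sketch *td)
{
        if (lkp->lk_lockholder == td) {
                if (panicstr_sketch == NULL)
                        witness_unlock_sketch(lkp);
                assert(td->td_locks > 0);
                td->td_locks--;
        }
}
```

The point of the design is that the lock count is decremented on both
paths, so only the debugging hook is bypassed during a panic.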


-- 
Peace can only be achieved by understanding - A. Einstein


