Date:      Sat, 26 Jun 2010 00:52:22 +0400
From:      pluknet <pluknet@gmail.com>
To:        Anton Yuzhaninov <citrin@citrin.ru>
Cc:        freebsd-current@freebsd.org
Subject:   Re: panic in deadlkres
Message-ID:  <AANLkTinam4rwsVtFPSAidVOzdzrlx-whNBId8cMg_ySQ@mail.gmail.com>
In-Reply-To: <i01u5g$39n$1@dough.gmane.org>
References:  <i01u5g$39n$1@dough.gmane.org>

On 25 June 2010 13:50, Anton Yuzhaninov <citrin@citrin.ru> wrote:
> I've got panic on 9-current from Jun 25 2010
>
> May be this is bug in deadlock resolver
>
> panic: blockable sleep lock (sleep mutex) process lock @
> /usr/src/sys/kern/kern_clock.c:203
>
> db> show alllocks
> Process 0 (kernel) thread 0xc4dcd270 (100047)
> shared sx allproc (allproc) r = 0 (0xc0885ebc) locked @
> /usr/src/sys/kern/kern_clock.c:193
>
> db> show lock 0xc4dcd270
>  class: spin mutex
>  name: D
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>
> (kgdb) bt
> #0  doadump () at pcpu.h:248
> #1  0xc05ae59f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
> #2  0xc05ae825 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:590
> #3  0xc048ff45 in db_panic (addr=Could not find the frame base for "db_panic".
> ) at /usr/src/sys/ddb/db_command.c:478
> #4  0xc0490533 in db_command (last_cmdp=0xc086ef1c, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:445
> #5  0xc0490662 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #6  0xc04923ef in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
> #7  0xc05dade6 in kdb_trap (type=3, code=0, tf=0xc4b31bd0) at /usr/src/sys/kern/subr_kdb.c:535
> #8  0xc078696b in trap (frame=0xc4b31bd0) at /usr/src/sys/i386/i386/trap.c:692
> #9  0xc076ca0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
> #10 0xc05daf30 in kdb_enter (why=0xc07ea02d "panic", msg=0xc07ea02d "panic") at cpufunc.h:71
> #11 0xc05ae806 in panic (fmt=0xc07efd94 "blockable sleep lock (%s) %s @ %s:%d") at /usr/src/sys/kern/kern_shutdown.c:573
> #12 0xc05ee30b in witness_checkorder (lock=0xc5148088, flags=9, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203, interlock=0x0)
>    at /usr/src/sys/kern/subr_witness.c:1067
> #13 0xc05a093c in _mtx_lock_flags (m=0xc5148088, opts=0, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203)
>    at /usr/src/sys/kern/kern_mutex.c:200
> #14 0xc05706a9 in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> #15 0xc0588721 in fork_exit (callout=0xc05705ea <deadlkres>, arg=0x0, frame=0xc4b31d38) at /usr/src/sys/kern/kern_fork.c:843
> #16 0xc076ca80 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:270

Hi!

[Throwing in some ideas; just ignore them if they're dumb, I'm not thinking clearly atm.]

AFAIK, that panic indicates that a thread already holds
a spin mutex and then tries to acquire a sleep mutex.

It looks like kern/kern_clock.c v1.213 (SVN rev 206482)
has a regression in its handling of ticks wrap-up:
it doesn't release the thread mutex on that path, does it?

From subr_witness.c:
1062:                 * Since spin locks include a critical section, this check
1063:                 * implicitly enforces a lock order of all sleep locks before
1064:                 * all spin locks.
1065:                 */
1066:                if (td->td_critnest != 0 && !kdb_active)
1067:                        panic("blockable sleep lock (%s) %s @ %s:%d",
1068:                            class->lc_name, lock->lo_name, file, line);
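
To make the effect of that check concrete, here is a minimal userspace sketch of it. It is only a model under stated assumptions: `struct model_thread` and the `model_*` functions are hypothetical names made up for illustration, not kernel APIs, and the only thing modelled is that taking a spin lock enters a critical section (raising `td_critnest`), as the quoted comment says.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct thread; only the nesting count is modelled. */
struct model_thread {
	int td_critnest;	/* > 0 while a spin lock (critical section) is held */
};

/* Model of taking a spin mutex: it enters a critical section. */
static void
model_spin_lock(struct model_thread *td)
{
	td->td_critnest++;
}

/* Model of dropping a spin mutex: it leaves the critical section. */
static void
model_spin_unlock(struct model_thread *td)
{
	assert(td->td_critnest > 0);
	td->td_critnest--;
}

/*
 * Model of the witness test quoted above: returns true when acquiring a
 * sleep lock would pass the check, false where the kernel would panic.
 */
static bool
model_sleep_lock_allowed(const struct model_thread *td, bool kdb_active)
{
	return (!(td->td_critnest != 0 && !kdb_active));
}
```

In this model, calling `model_spin_lock()` without a matching `model_spin_unlock()` leaves `td_critnest` non-zero, so any later `model_sleep_lock_allowed(td, false)` returns false, which corresponds to the "blockable sleep lock" panic.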

From kern_clock.c, v1.213 (in several places, while holding a thread lock):
+					/* Handle ticks wrap-up. */
+					if (ticks < td->td_blktick)
+						continue;

Shouldn't it be something like this instead:
+					/* Handle ticks wrap-up. */
+					if (ticks < td->td_blktick) {
+						thread_unlock(td);
+						continue;
+					}
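
The difference between the two variants can be sketched in userspace as follows. This is a simplified model, not the real deadlkres loop: `struct model_td`, `model_thread_lock()`, `model_thread_unlock()`, `model_scan()` and the `unlock_on_wrap` switch are all hypothetical names introduced for illustration, and the counter merely stands in for the number of thread locks the scanning thread still holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct thread; only td_blktick is modelled. */
struct model_td {
	int td_blktick;
};

/* Number of thread (spin) locks currently held by the scanning thread. */
static int held_spin_locks;

static void
model_thread_lock(struct model_td *td)
{
	(void)td;
	held_spin_locks++;
}

static void
model_thread_unlock(struct model_td *td)
{
	(void)td;
	held_spin_locks--;
}

/*
 * Scan all threads once and return how many thread locks are still held
 * afterwards. unlock_on_wrap selects between the committed code (false)
 * and the proposed variant that unlocks before continuing (true).
 */
static int
model_scan(struct model_td *tds, size_t n, int ticks, bool unlock_on_wrap)
{
	held_spin_locks = 0;
	for (size_t i = 0; i < n; i++) {
		struct model_td *td = &tds[i];

		model_thread_lock(td);
		/* Handle ticks wrap-up. */
		if (ticks < td->td_blktick) {
			if (unlock_on_wrap)
				model_thread_unlock(td);
			continue;
		}
		/* ... the actual deadlock check would go here ... */
		model_thread_unlock(td);
	}
	return (held_spin_locks);
}
```

With one thread whose `td_blktick` is larger than `ticks`, `model_scan(..., false)` ends with one lock still held, while `model_scan(..., true)` ends with none. In the kernel, a leaked thread lock would keep the scanning thread inside a critical section, which is what the witness check then complains about on the next sleep lock acquisition.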

The idea for reproducing it: lock a subject thread in some
deadlkres callout, hit the wrap-up condition (so the thread lock
is not released), then try to lock the process to which the thread
belongs in the (n+m)'th deadlkres callout, or in a different context.

-- 
wbr,
pluknet


