From: pluknet
To: Anton Yuzhaninov
Cc: freebsd-current@freebsd.org
Date: Sat, 26 Jun 2010 00:52:22 +0400
Subject: Re: panic in deadlkres

On 25 June 2010 13:50, Anton Yuzhaninov wrote:
> I've got panic on 9-current from Jun 25 2010
>
> May be this is bug in deadlock resolver
>
> panic: blockable sleep lock (sleep mutex) process lock @
> /usr/src/sys/kern/kern_clock.c:203
>
> db> show alllocks
> Process 0 (kernel) thread 0xc4dcd270 (100047)
> shared sx allproc (allproc) r = 0 (0xc0885ebc) locked @
> /usr/src/sys/kern/kern_clock.c:193
>
> db> show lock 0xc4dcd270
>  class: spin mutex
>  name: D
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>
> (kgdb) bt
> #0  doadump () at pcpu.h:248
> #1  0xc05ae59f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
> #2  0xc05ae825 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:590
> #3  0xc048ff45 in db_panic (addr=Could not find the frame base for "db_panic".
> ) at /usr/src/sys/ddb/db_command.c:478
> #4  0xc0490533 in db_command (last_cmdp=0xc086ef1c, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:445
> #5  0xc0490662 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #6  0xc04923ef in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
> #7  0xc05dade6 in kdb_trap (type=3, code=0, tf=0xc4b31bd0) at /usr/src/sys/kern/subr_kdb.c:535
> #8  0xc078696b in trap (frame=0xc4b31bd0) at /usr/src/sys/i386/i386/trap.c:692
> #9  0xc076ca0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
> #10 0xc05daf30 in kdb_enter (why=0xc07ea02d "panic", msg=0xc07ea02d "panic") at cpufunc.h:71
> #11 0xc05ae806 in panic (fmt=0xc07efd94 "blockable sleep lock (%s) %s @ %s:%d") at /usr/src/sys/kern/kern_shutdown.c:573
> #12 0xc05ee30b in witness_checkorder (lock=0xc5148088, flags=9, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203, interlock=0x0)
>    at
/usr/src/sys/kern/subr_witness.c:1067
> #13 0xc05a093c in _mtx_lock_flags (m=0xc5148088, opts=0, file=0xc07e3b20 "/usr/src/sys/kern/kern_clock.c", line=203)
>    at /usr/src/sys/kern/kern_mutex.c:200
> #14 0xc05706a9 in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> #15 0xc0588721 in fork_exit (callout=0xc05705ea, arg=0x0, frame=0xc4b31d38) at /usr/src/sys/kern/kern_fork.c:843
> #16 0xc076ca80 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:270

Hi!

[throw in ideas (just ignore them if they're dumb, thinking badly atm).]

AFAIK, that panic indicates that a thread already holds a spin mutex
and then tries to acquire a sleep mutex.
It looks like kern/kern_clock.c v1.213 (SVN rev 206482) has a regression
in the ticks wrap-up handling: it doesn't release the thread lock, does it?

From subr_witness.c:
1062:          * Since spin locks include a critical section, this check
1063:          * implicitly enforces a lock order of all sleep locks before
1064:          * all spin locks.
1065:          */
1066:         if (td->td_critnest != 0 && !kdb_active)
1067:                 panic("blockable sleep lock (%s) %s @ %s:%d",
1068:                     class->lc_name, lock->lo_name, file, line);

From kern_clock.c, v1.213 (in several places, while holding a thread lock):
+                               /* Handle ticks wrap-up. */
+                               if (ticks < td->td_blktick)
+                                       continue;

Shouldn't it be something like this:
+                               /* Handle ticks wrap-up. */
+                               if (ticks < td->td_blktick) {
+                                       thread_unlock(td);
+                                       continue;
+                               }

The idea for reproducing it: lock a thread in some deadlkres callout,
hit the ticks wrap-up condition (so the thread lock is never released),
then try to lock the process to which the thread belongs, either in
the (n+m)'th deadlkres callout or in a different context.

-- 
wbr, pluknet
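
P.S. To make the suspected pattern concrete, here is a minimal userland
sketch in C. It is not the kernel code and only models the idea under
stated assumptions: struct mock_thread, mock_thread_lock(),
mock_thread_unlock(), can_take_sleep_lock(), scan_buggy() and
scan_fixed() are hypothetical names, and a plain counter stands in for
td_critnest. The buggy scan takes the early "continue" with the mock
thread lock still held, so the counter stays nonzero and a later sleep
lock would trip the witness-style check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel thread; NOT the real struct thread. */
struct mock_thread {
	int blktick;	/* tick at which the thread blocked */
	int locked;	/* 1 while the mock thread lock is held */
};

/* Models td_critnest: holding a spin lock implies a critical section. */
static int critnest;

static void
mock_thread_lock(struct mock_thread *td)
{
	assert(!td->locked);
	td->locked = 1;
	critnest++;
}

static void
mock_thread_unlock(struct mock_thread *td)
{
	assert(td->locked);
	td->locked = 0;
	critnest--;
}

/* Mirrors the quoted witness check: sleep locks need critnest == 0. */
static int
can_take_sleep_lock(void)
{
	return (critnest == 0);
}

/* Pattern from the quoted diff: "continue" without dropping the lock. */
static void
scan_buggy(struct mock_thread *tds, size_t n, int ticks)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_thread *td = &tds[i];

		mock_thread_lock(td);
		/* Handle ticks wrap-up. */
		if (ticks < td->blktick)
			continue;	/* thread lock is leaked here */
		mock_thread_unlock(td);
	}
}

/* Proposed pattern: unlock before skipping the thread. */
static void
scan_fixed(struct mock_thread *tds, size_t n, int ticks)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_thread *td = &tds[i];

		mock_thread_lock(td);
		/* Handle ticks wrap-up. */
		if (ticks < td->blktick) {
			mock_thread_unlock(td);
			continue;
		}
		mock_thread_unlock(td);
	}
}
```

Calling scan_fixed() over an array where one entry has blktick greater
than ticks leaves critnest at 0; calling scan_buggy() over the same
data leaves it at 1, which is the state in which the real check would
panic on the next sleep lock.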