From: mdf@FreeBSD.org
To: Gleb Smirnoff
Cc: current@freebsd.org
Date: Tue, 17 Jan 2012 07:34:23 -0800
Subject: Re: new panic in cpu_reset() with WITNESS
In-Reply-To: <20120117110242.GD12760@glebius.int.ru>
References: <20120117110242.GD12760@glebius.int.ru>
List-Id: Discussions about the use of FreeBSD-current

2012/1/17 Gleb Smirnoff:
> New panic has been
> introduced somewhere between
> r229851 and r229932, that happens on shutdown if
> kernel has WITNESS and doesn't have WITNESS_SKIPSPIN.
>
> Uptime: 1h0m17s
> Rebooting...
> panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ /usr/src/head/sys/kern/kern_cons.c:500
> cpuid = 0
> KDB: enter: panic
> [ thread pid 1 tid 100001 ]
> Stopped at      kdb_enter+0x3b: movq    $0,0x514d32(%rip)
> db>
> db> bt
> Tracing pid 1 tid 100001 td 0xfffffe0001d5e000
> kdb_enter() at kdb_enter+0x3b
> panic() at panic+0x1c7
> _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x10f
> cnputs() at cnputs+0x7a
> putchar() at putchar+0x11f
> kvprintf() at kvprintf+0x83
> vprintf() at vprintf+0x85
> printf() at printf+0x67
> witness_checkorder() at witness_checkorder+0x773
> _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x99
> uart_cnputc() at uart_cnputc+0x3e
> cnputc() at cnputc+0x4c
> cnputs() at cnputs+0x26
> putchar() at putchar+0x11f
> kvprintf() at kvprintf+0x83
> vprintf() at vprintf+0x85
> printf() at printf+0x67
> cpu_reset() at cpu_reset+0x81
> kern_reboot() at kern_reboot+0x3a5
> sys_reboot() at sys_reboot+0x42
> amd64_syscall() at amd64_syscall+0x39e
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40ea3c, rsp = 0x7fffffffd6d8, rbp = 0x49 ---
> db>
> db> show locks
> exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff809bc560) locked @ /usr/src/head/sys/kern/kern_module.c:101
> exclusive spin mutex smp rendezvous (smp rendezvous) r = 0 (0xffffffff80a08840) locked @ /usr/src/head/sys/kern/kern_shutdown.c:542
> db>
>
> So the problem is that we are holding the smp rendezvous mutex during
> cpu_reset(). No mutexes should be obtained after it. However, since
> cpu_reset() does printing, we obtain cnputs_mtx, and later obtain
> uart_hwmtx. The latter is hardcoded in subr_witness.c as a mutex to
> obtain before smp rendezvous; this triggers yet another printf from
> witness, which finally panics due to recursing on cnputs_mtx.

At $WORK we explicitly marked cnputs_mtx as NO_WITNESS since it didn't
seem possible to fit it into the hierarchy in any sane way, since a
print can come from basically anywhere. If anyone has a better fix,
that'd be great, but I haven't been able to think of one.

Thanks,
matthew