From owner-freebsd-current Sat Jan 4 8: 4:12 2003 Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8D0A637B40E; Sat, 4 Jan 2003 08:04:09 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id DB5E343EC2; Sat, 4 Jan 2003 08:04:07 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id DAA24909; Sun, 5 Jan 2003 03:03:59 +1100 Date: Sun, 5 Jan 2003 03:04:13 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Poul-Henning Kamp Cc: current@FreeBSD.ORG Subject: Re: LOR on SIGINFO... In-Reply-To: <9158.1041683868@critter.freebsd.dk> Message-ID: <20030105021240.F10094-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, 4 Jan 2003, Poul-Henning Kamp wrote: > I pressed ctrl-T to see how far my compile had come and was rewarded > with a LOR: > > llock order reversal > 1st 0xc0347680 sched lock (sched lock) @ ../../../kern/tty.c:2386 > 2nd 0xc039f3e0 sio (sio) @ ../../../dev/sio/sio.c:3200 > Debugger("witness_lock") > Stopped at Debugger+0x54: xchgl %ebx,in_Debugger.0 > db> trace > Debugger(c0308bc1,c039f3e0,c035cf40,c035cf40,c0330b72) at Debugger+0x54 > witness_lock(c039f3e0,8,c0330b72,c80,c0322f08) at witness_lock+0x667 > _mtx_lock_spin_flags(c039f3e0,0,c0330b72,c80,0) at _mtx_lock_spin_flags+0xd1 > siocnputc(c036fca0,63,5,d6985b68,0) at siocnputc+0x81 > cnputc(63,ffffffff,1,c032024c,225) at cnputc+0x5d > putchar(63,d6985b68,d6985aa0,c01c84b4,c14f9800) at putchar+0xcd > kvprintf(c032024b,c01e04f0,d6985b68,a,d6985b88) at kvprintf+0x8d > printf(c032024b,2a51,108f5,c4b06868,c0322f08) at printf+0x57 > calcru(c4b06720,d6985c40,d6985c48,0,22a) at calcru+0x20b > ttyinfo(c14f9800,0,c0324bb5,22a,c01b98bd) at ttyinfo+0x1f9 > ttyinput(14,c14f9800,c0330b72,647,c3fe2740) at ttyinput+0x7c3 > sioinput(c401b000,0,c0330b72,856,0) at sioinput+0x1d3 > siopoll(0,0,c031e372,216,c14fe720) at siopoll+0x116 > ithread_loop(c14fbe80,d6985d48,c031e1ef,361,0) at ithread_loop+0x182 > fork_exit(c01b0680,c14fbe80,d6985d48) at fork_exit+0xc4 > fork_trampoline() at fork_trampoline+0x1a > --- trap 0x1, eip = 0, esp = 0xd6985d7c, ebp = 0 --- > db> This is a variation on the old, unfixed bug that printf() is not safe to call while sched_lock is held. calcru() always holds sched_lock but only calls printf() when the timecounter goes backwards. Timecounters were detected to go backwards much more often in mi_switch(), so the corresponding message there was first turned off and then shot. I thought that this bug didn't affect the sio console driver. Console drivers must not use normal locking for their low-level i/o for various reasons, but the bug here seems to be just that witness doesn't understand sio_lock. I think sio_lock needs to be initialized with MTX_NOWITNESS as well as or instead of MTX_QUIET. I changed my version to do this so long ago that I forget doing it: %%% Index: sio.c =================================================================== RCS file: /home/ncvs/src/sys/dev/sio/sio.c,v retrieving revision 1.382 diff -u -2 -r1.382 sio.c --- sio.c 11 Oct 2002 20:22:20 -0000 1.382 +++ sio.c 17 Oct 2002 20:18:53 -0000 @@ -512,7 +526,7 @@ while (sio_inited != 2) if (atomic_cmpset_int(&sio_inited, 0, 1)) { + /* XXX might need MTX_QUIET instead of MTX_NOWITNESS.*/ mtx_init(&sio_lock, sio_driver_name, NULL, - (comconsole != -1) ? - MTX_SPIN | MTX_QUIET : MTX_SPIN); + MTX_SPIN | MTX_NOWITNESS); atomic_store_rel_int(&sio_inited, 2); } @@ -3197,6 +3295,6 @@ s = spltty(); need_unlock = 0; - if (sio_inited == 2 && !mtx_owned(&sio_lock)) { - mtx_lock_spin(&sio_lock); + if (!db_active && sio_inited == 2 && !mtx_owned(&sio_lock)) { + COM_LOCK(); need_unlock = 1; } %%% Perhaps MTX_QUIET is still required. I think initializing the lock depending on whether comconsole != -1 is broken for at least the case where consoleness is initialized later using a sysctl. MTX_NOWITNESS may make the sio entry in the lock order list bogus. The patch also fixes the style bug of using KNF indentation for continued lines; sio is not in KNF. The changes in the second hunk do the following: - checking db_active stops ddb going near sio_lock to prevent problems when changes of sio_lock are traced. When ddb is active, other CPUs are locked out by a ddb lock and interrupts on the current CPU are locked out by disable_intr() (modulo bugs). - COM_LOCK() is mtx_lock_spin(&sio_lock) in the SMP case and null otherwise. Sio's locking is nothing like I want it to be in either version, but the macro hides some of the details. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message