Date:      Tue, 20 Dec 2011 08:42:25 -0800
From:      mdf@FreeBSD.org
To:        John Baldwin <jhb@freebsd.org>
Cc:        Robert Watson <rwatson@freebsd.org>, freebsd-current@freebsd.org, "O. Hartmann" <ohartman@zedat.fu-berlin.de>
Subject:   Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
Message-ID:  <CAMBSHm-Dvvz7Wq-zCmdddknK8FOKhAfj=sYqm8vv8L%2Bg_yNXXw@mail.gmail.com>
In-Reply-To: <201112200932.21223.jhb@freebsd.org>
References:  <4EED2F1C.2060409@zedat.fu-berlin.de> <201112200852.23300.jhb@freebsd.org> <CAMBSHm_ZcMe2uC6HXL9vazYOxVSVVKJqmfHCHXRta8rgdda65w@mail.gmail.com> <201112200932.21223.jhb@freebsd.org>

On Tue, Dec 20, 2011 at 6:32 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Tuesday, December 20, 2011 9:22:48 am mdf@freebsd.org wrote:
>> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin <jhb@freebsd.org> wrote:
>> > On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote:
>> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev <kabaev@gmail.com> wrote:
>> >> > On Sun, 18 Dec 2011 01:09:00 +0100
>> >> > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
>> >> >
>> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> >> panic: sleeping thread
>> >> >> cpuid = 0
>> >> >>
>> >> >> PID 16 is always USB on my box.
>> >> >
>> >> > You really need to give us a backtrace when you quote panics. It is
>> >> > impossible to make any sense of the above panic message without more
>> >> > context.
>> >>
>> >> In the case of this panic, the stack of the thread which panics is
>> >> useless; it's someone trying to propagate priority that discovered it.
>> >> A backtrace on tid 100033 would be useful.
>> >>
>> >> With WITNESS enabled, it's possible to have this panic display the
>> >> stack of the incorrectly sleeping thread at the time it acquired the
>> >> lock, as well, but this code isn't in CURRENT or any release.  I have
>> >> a patch at $WORK I can dig up on Monday.
>> >
>> > Huh?  The stock kernel dumps a stack trace of the offending thread if you have
>> > DDB enabled:
>> >
>> >                /*
>> >                 * If the thread is asleep, then we are probably about
>> >                 * to deadlock.  To make debugging this easier, just
>> >                 * panic and tell the user which thread misbehaved so
>> >                 * they can hopefully get a stack trace from the truly
>> >                 * misbehaving thread.
>> >                 */
>> >                if (TD_IS_SLEEPING(td)) {
>> >                        printf(
>> >                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>> >                            td->td_tid, td->td_proc->p_pid);
>> > #ifdef DDB
>> >                        db_trace_thread(td, -1);
>> > #endif
>> >                        panic("sleeping thread");
>> >                }
>>
>> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
>> added code to print *which* lock it holds (using WITNESS data).  I do
>> recall that this panic alone was often not sufficient to debug the
>> problem.
>
> I think the db_trace_thread() has been around for a while (since 5 or 6),
> but it is true that we don't tell you which lock is held even with this.
> That might be a useful thing to output before the panic.


This patch isn't quite right since I had to hand-edit it.  There's a
small chance I can commit this in the near future, but if someone else
wants to take it, feel free.  The style hasn't been fixed up to the
FreeBSD standard yet either.


--- /data/sb/bsd.git/sys/kern/subr_turnstile.c	2011-12-12 10:23:12.542196632 -0800
+++ kern/subr_turnstile.c	2011-12-09 10:59:29.882643558 -0800
@@ -165,10 +165,43 @@
 static void	turnstile_dtor(void *mem, int size, void *arg);
 #endif
 static int	turnstile_init(void *mem, int size, int flags);
 static void	turnstile_fini(void *mem, int size);

+#ifdef INVARIANTS
+static void
+sleeping_thread_owns_a_nonsleepable_lock(struct thread *td)
+{
+	printf("Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
+	    td->td_tid, td->td_proc->p_pid);
+#ifdef DDB
+	db_trace_thread(td, -1);
+#endif
+#ifdef WITNESS
+	struct lock_list_entry *lock_list, *lle;
+	int i;
+
+	lock_list = td->td_sleeplocks;
+	if (lock_list == NULL || lock_list->ll_count == 0) {
+		printf("Thread does not appear to hold any mutexes!\n");
+		return;
+	}
+
+	for (lle = lock_list; lle != NULL; lle = lle->ll_next) {
+		for (i = lle->ll_count - 1; i >= 0; i--) {
+			struct lock_instance *li = &lle->ll_children[i];
+
+			printf("Lock %s acquired at %s:%d\n",
+			    li->li_lock->lo_name, li->li_file, li->li_line);
+		}
+	}
+#endif /* WITNESS */
+}
+#else
+#define sleeping_thread_owns_a_nonsleepable_lock(td) do { } while (0)
+#endif /* INVARIANTS */
+
 /*
  * Walks the chain of turnstiles and their owners to propagate the priority
  * of the thread being blocked to all the threads holding locks that have to
  * release their locks before this thread can run again.
  */
@@ -210,19 +243,31 @@
 		 * If the thread is asleep, then we are probably about
 		 * to deadlock.  To make debugging this easier, just
 		 * panic and tell the user which thread misbehaved so
 		 * they can hopefully get a stack trace from the truly
 		 * misbehaving thread.
 		 */
 		if (TD_IS_SLEEPING(td)) {
-			printf(
-		"Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
-			    td->td_tid, td->td_proc->p_pid);
-#ifdef DDB
-			db_trace_thread(td, -1);
-#endif
-			panic("sleeping thread");
+			sleeping_thread_owns_a_nonsleepable_lock(td);
+			panic("sleeping thread %p owns a nonsleepable lock",
+			    td);
 		}

 		/*
 		 * If this thread already has higher priority than the
 		 * thread that is being blocked, we are finished.
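
For anyone who wants to see what the WITNESS part of the patch does without booting a test kernel, here is a minimal userspace sketch of the same nested loop. Everything prefixed with mock_ is a hypothetical stand-in invented for this illustration: the real struct lock_list_entry and struct lock_instance are private to WITNESS and have different fields, and the sketch assumes (as the patch's loop order suggests) that the list head is the newest chunk and that higher indexes within a chunk are more recent. It writes into a buffer instead of calling the kernel printf() so the result can be checked.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical chunk size; the real WITNESS constant is not used here. */
#define MOCK_LOCK_NCHILDREN 3

/* Hypothetical stand-in for one held-lock record. */
struct mock_lock_instance {
	const char *li_name;	/* name of the held lock */
	const char *li_file;	/* file where it was acquired */
	int li_line;		/* line where it was acquired */
};

/* Hypothetical stand-in for one chunk of a thread's held-lock list. */
struct mock_lock_list_entry {
	struct mock_lock_list_entry *ll_next;	/* next (older) chunk */
	int ll_count;				/* used slots in ll_children */
	struct mock_lock_instance ll_children[MOCK_LOCK_NCHILDREN];
};

/*
 * Write one "Lock ... acquired at ..." line per held lock into buf,
 * walking chunks from the head and each chunk from its last used slot
 * down to slot 0, the same order as the loop in the patch.  Output
 * that does not fit in buf is dropped.  Returns the number of locks
 * visited.
 */
static int
mock_report_held_locks(const struct mock_lock_list_entry *head, char *buf,
    size_t len)
{
	const struct mock_lock_list_entry *lle;
	const struct mock_lock_instance *li;
	size_t off = 0;
	int i, n, visited = 0;

	if (len > 0)
		buf[0] = '\0';
	for (lle = head; lle != NULL; lle = lle->ll_next) {
		for (i = lle->ll_count - 1; i >= 0; i--) {
			li = &lle->ll_children[i];
			if (off < len) {
				n = snprintf(buf + off, len - off,
				    "Lock %s acquired at %s:%d\n",
				    li->li_name, li->li_file, li->li_line);
				if (n > 0)
					off += (size_t)n;
			}
			visited++;
		}
	}
	return (visited);
}
```

Calling mock_report_held_locks() on a two-chunk list lists the head chunk's locks first, last slot first, which is why the most recently acquired lock ends up at the top of the report. Passing a NULL head yields an empty string and a count of zero, the case the patch reports as "Thread does not appear to hold any mutexes!".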


Cheers,
matthew


