From: Peter Edwards <peadar@FreeBSD.org>
To: arch@freebsd.org
Date: Fri, 1 Jul 2005 13:21:04 +0000
Message-ID: <20050701132104.GA95135@freefall.freebsd.org>
Subject: ktrace and KTR_DROP

Y'all,

Ever since the introduction of a separate ktrace worker thread for writing output, there's the distinct possibility that ktrace will drop requests. For some processes, it's actually inevitable: as long as a traced process can sustain a rate of generating ktrace events faster than the ktrace thread can write them, you'll eventually run out of ktrace requests.
I'd like to propose that rather than just dropping the request on the floor, we at least configurably allow ktraced threads to block until there are resources available to satisfy their requests.

My rationale is that previously, requests would never get lost: each thread would serialise on the vnode used to write the requests out. The current FIFO cushions the extra latency of writing the requests from a separate thread. Anything beyond that is really just acting as a low-pass filter to stop applications that sporadically generate more traffic than the ktrace thread can handle. Making processes wait for free requests will still allow the queue to absorb that latency, while reverting to the original model of flow-controlling the traced threads to the rate at which the trace file can be written.

Any time I use ktrace, I really don't care if I penalize the performance of the traced process: I _want_ the trace output, above all else. I'm not saying that's a universal constant, but I imagine it's most often the desired behaviour, and currently, no matter how high you set kern.ktrace.request_pool, "for (;;) { getpid(); }" will exhaust its capacity in a second. The following patch is making my debugging a much more productive experience.
Index: sys/kern/kern_ktrace.c
===================================================================
RCS file: /usr/cvs/FreeBSD/src/sys/kern/kern_ktrace.c,v
retrieving revision 1.101
diff -u -r1.101 kern_ktrace.c
--- sys/kern/kern_ktrace.c	24 Jun 2005 12:05:24 -0000	1.101
+++ sys/kern/kern_ktrace.c	1 Jul 2005 12:48:32 -0000
@@ -100,9 +100,15 @@
 SYSCTL_UINT(_kern_ktrace, OID_AUTO, genio_size, CTLFLAG_RW, &ktr_geniosize,
     0, "Maximum size of genio event payload");
 
+static int ktr_neverdrop = 0;
+SYSCTL_UINT(_kern_ktrace, OID_AUTO, neverdrop, CTLFLAG_RW, &ktr_neverdrop,
+    0, "Wait for free resources rather than dropping events");
+
 static int print_message = 1;
 struct mtx ktrace_mtx;
-static struct cv ktrace_cv;
+static struct cv ktrace_todo_cv;	/* service thread waiting for work */
+static struct cv ktrace_freeq_cv;	/* threads waiting for free requests */
+static int ktrace_free_waiters = 0;	/* No. of threads waiting on freeq_cv */
 
 static void ktrace_init(void *dummy);
 static int sysctl_kern_ktrace_request_pool(SYSCTL_HANDLER_ARGS);
@@ -123,7 +129,8 @@
 	int i;
 
 	mtx_init(&ktrace_mtx, "ktrace", NULL, MTX_DEF | MTX_QUIET);
-	cv_init(&ktrace_cv, "ktrace");
+	cv_init(&ktrace_todo_cv, "ktrace.todo");
+	cv_init(&ktrace_freeq_cv, "ktrace.freeq");
 	STAILQ_INIT(&ktr_todo);
 	STAILQ_INIT(&ktr_free);
 	for (i = 0; i < ktr_requestpool; i++) {
@@ -200,6 +207,8 @@
 		    M_WAITOK);
 		mtx_lock(&ktrace_mtx);
 		STAILQ_INSERT_HEAD(&ktr_free, req, ktr_list);
+		if (ktrace_free_waiters)
+			cv_signal(&ktrace_freeq_cv);
 		ktr_requestpool++;
 	}
 	return (ktr_requestpool);
@@ -220,7 +229,16 @@
 		td->td_pflags &= ~TDP_INKTRACE;
 		return (NULL);
 	}
-	req = STAILQ_FIRST(&ktr_free);
+
+	for (;;) {
+		req = STAILQ_FIRST(&ktr_free);
+		if (req != NULL || ktr_neverdrop == 0)
+			break;
+		ktrace_free_waiters++;
+		cv_wait(&ktrace_freeq_cv, &ktrace_mtx);
+		ktrace_free_waiters--;
+	}
+
 	if (req != NULL) {
 		STAILQ_REMOVE_HEAD(&ktr_free, ktr_list);
 		req->ktr_header.ktr_type = type;
@@ -257,7 +275,7 @@
 	mtx_lock(&ktrace_mtx);
 	STAILQ_INSERT_TAIL(&ktr_todo, req, ktr_list);
-	cv_signal(&ktrace_cv);
+	cv_signal(&ktrace_todo_cv);
 	mtx_unlock(&ktrace_mtx);
 	curthread->td_pflags &= ~TDP_INKTRACE;
 }
 
@@ -276,6 +294,8 @@
 		free(req->ktr_header.ktr_buffer, M_KTRACE);
 	mtx_lock(&ktrace_mtx);
 	STAILQ_INSERT_HEAD(&ktr_free, req, ktr_list);
+	if (ktrace_free_waiters)
+		cv_signal(&ktrace_freeq_cv);
 	mtx_unlock(&ktrace_mtx);
 }
 
@@ -292,7 +312,7 @@
 	for (;;) {
 		mtx_lock(&ktrace_mtx);
 		while (STAILQ_EMPTY(&ktr_todo))
-			cv_wait(&ktrace_cv, &ktrace_mtx);
+			cv_wait(&ktrace_todo_cv, &ktrace_mtx);
 		req = STAILQ_FIRST(&ktr_todo);
 		STAILQ_REMOVE_HEAD(&ktr_todo, ktr_list);
 		KASSERT(req != NULL, ("got a NULL request"));
Index: lib/libc/sys/ktrace.2
===================================================================
RCS file: /usr/cvs/FreeBSD/src/lib/libc/sys/ktrace.2,v
retrieving revision 1.24
diff -u -r1.24 ktrace.2
--- lib/libc/sys/ktrace.2	14 Dec 2003 14:54:53 -0000	1.24
+++ lib/libc/sys/ktrace.2	1 Jul 2005 12:28:42 -0000
@@ -146,6 +146,13 @@
 to the trace file.
 .It Va kern.ktrace.request_pool
 bounds the number of trace events being logged at a time.
+.It Va kern.ktrace.neverdrop
+When non-zero, make processes wait for resources to become available
+rather than dropping trace requests. This makes the tracing more
+reliable at the possible expense of slowing down the traced thread(s).
+This will avoid events being flagged with KTR_DROP (see
+.Sx ERRORS
+)
 .El
 .Pp
 Sysctl tunables that control process debuggability (as determined by
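For the record, this is how I've been exercising the patch; the trace file name and the busy-looping program are placeholders, and the new sysctl of course only exists with the patch applied:

```shell
# Turn on the proposed never-drop behaviour (patched kernel only).
sysctl kern.ktrace.neverdrop=1
# A larger pool still helps absorb bursts without stalling the producer.
sysctl kern.ktrace.request_pool=256
# Trace a process that previously exhausted the pool instantly;
# its syscall events now block briefly instead of going missing.
ktrace -f out.ktr ./busyproc
kdump -f out.ktr | head
```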