From owner-freebsd-current  Mon Mar 19 10:12:15 2001
Delivered-To: freebsd-current@freebsd.org
Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11])
	by hub.freebsd.org (Postfix) with SMTP id 390E337B718
	for <current@freebsd.org>; Mon, 19 Mar 2001 10:12:11 -0800 (PST)
	(envelope-from iedowse@maths.tcd.ie)
Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP
          id <aa82109@salmon>; 19 Mar 2001 18:12:10 +0000 (GMT)
To: current@freebsd.org
Subject: reboot(8) delay between SIGTERM and SIGKILL
Date: Mon, 19 Mar 2001 18:12:10 +0000
From: Ian Dowse <iedowse@maths.tcd.ie>
Message-ID:  <200103191812.aa82109@salmon.maths.tcd.ie>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


I have noticed that reboot(8) sometimes appears not to wait long
enough before sending the final SIGKILL to all processes. On a
system that has a lot of processes swapped out, some processes such
as the X server may get a SIGKILL before they have had a chance to
perform their exit cleanup.

The patch below causes reboot to wait up to 60 seconds for paging
activity to end before sending the SIGKILLs. It does this by
monitoring the sysctl `vm.stats.vm.v_swappgsin', and extending
the default 5-second delay if page-in operations are observed.
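
For anyone who wants to look at the waiting logic on its own, here is a
minimal, portable sketch of the same idea. It is not the patch itself:
the names wait_until_quiet, fake_pageins, fake_paging and no_pause are
made up for illustration, and the counter is passed in as a callback so
the loop can be exercised without touching the real sysctl.

```c
#include <stddef.h>

/* Hypothetical callback types, introduced only for this sketch. */
typedef unsigned int (*counter_fn)(void *ctx);
typedef void (*pause_fn)(unsigned int seconds);

/*
 * Poll an activity counter until it is unchanged across one interval,
 * or until max_polls intervals have elapsed.  Returns the number of
 * intervals actually waited.
 */
unsigned int
wait_until_quiet(counter_fn read_counter, void *ctx, pause_fn pause,
    unsigned int interval, unsigned int max_polls)
{
	unsigned int before, polls;

	polls = 0;
	while (polls < max_polls) {
		before = read_counter(ctx);
		pause(interval);
		polls++;
		if (read_counter(ctx) == before)
			break;		/* no activity during this interval */
	}
	return (polls);
}

/* Simulated counter: keeps increasing for the first busy_reads reads. */
struct fake_paging {
	unsigned int reads;
	unsigned int busy_reads;
};

unsigned int
fake_pageins(void *ctx)
{
	struct fake_paging *f = ctx;
	unsigned int value;

	value = f->reads < f->busy_reads ? f->reads : f->busy_reads;
	f->reads++;
	return (value);
}

/* Stand-in for sleep(3) so the simulation runs instantly. */
void
no_pause(unsigned int seconds)
{
	(void)seconds;
}
```

With a 3-second interval and a cap of 20 polls, an idle system leaves
the loop after a single interval (so the total delay stays at the
original 5 seconds once the earlier sleep(2) is counted), while a
system that never stops paging is held for at most 20 * 3 = 60 seconds.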

On my laptop (64MB, IDE disk) with a number of big apps running,
it can take around 20 seconds for all the paging to die down after
the SIGTERMs are sent.

I know the choice of sysctl to monitor is slightly arbitrary, but
it seems to have the right overall effect. Does anyone have any
objections to my committing this?

Ian

Index: reboot.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sbin/reboot/reboot.c,v
retrieving revision 1.9
diff -u -r1.9 reboot.c
--- reboot.c	1999/11/21 21:52:40	1.9
+++ reboot.c	2001/03/19 17:01:37
@@ -47,6 +47,7 @@
 
 #include <sys/reboot.h>
 #include <sys/types.h>
+#include <sys/sysctl.h>
 #include <signal.h>
 #include <err.h>
 #include <errno.h>
@@ -58,6 +59,7 @@
 #include <unistd.h>
 
 void usage __P((void));
+u_int get_pageins __P((void));
 
 int dohalt;
 
@@ -152,13 +154,22 @@
 	/*
 	 * After the processes receive the signal, start the rest of the
 	 * buffers on their way.  Wait 5 seconds between the SIGTERM and
-	 * the SIGKILL to give everybody a chance.
+	 * the SIGKILL to give everybody a chance. If there is a lot of
+	 * paging activity then wait longer, up to a maximum of approx
+	 * 60 seconds.
 	 */
 	sleep(2);
 	if (!nflag)
 		sync();
-	sleep(3);
+	for (i = 0; i < 20; i++) {
+		u_int old_pageins;
 
+		old_pageins = get_pageins();
+		sleep(3);
+		if (get_pageins() == old_pageins)
+			break;
+	}
+
 	for (i = 1;; ++i) {
 		if (kill(-1, SIGKILL) == -1) {
 			if (errno == ESRCH)
@@ -189,4 +200,19 @@
 	(void)fprintf(stderr, "usage: %s [-dnpq]\n",
 	    dohalt ? "halt" : "reboot");
 	exit(1);
+}
+
+u_int
+get_pageins()
+{
+	u_int pageins;
+	size_t len;
+
+	len = sizeof(pageins);
+	if (sysctlbyname("vm.stats.vm.v_swappgsin", &pageins, &len, NULL, 0)
+	    != 0) {
+		warn("v_swappgsin");
+		return (0);
+	}
+	return (pageins);
 }
