Date:      Wed, 18 Apr 2001 00:36:50 -0500 (EST)
From:      ajk@iu.edu
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   bin/26665: [PATCH] syslogd hangs when logging from remote hosts
Message-ID:  <200104180536.f3I5ao631260@kobayashi.uits.iupui.edu>

>Number:         26665
>Category:       bin
>Synopsis:       [PATCH] syslogd hangs when logging from remote hosts
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Apr 17 22:40:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Andrew J. Korty
>Release:        FreeBSD 4.2-RELEASE i386
>Organization:
Information Technology Security Office, Indiana University
>Environment:

FreeBSD 4.1.1-RELEASE and later

>Description:

The syslogd program seems to hang after a few days of logging from
remote hosts.  The problem appears to be similar to one discovered
last year, before the resolver was changed to use kqueue()/kevent()
rather than poll().  Jonathan Lemon posted ktrace output and a
proposed solution to -hackers, and res_send.c was subsequently
patched.

Apparently, the fix was not carried over when kqueue()/kevent() was
introduced.  My ktrace looks similar:

  2920 syslogd  987293451.580333 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293451.580414 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293451.580470 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293451.580518 RET   gettimeofday 0
  2920 syslogd  987293451.580576 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293451.580627 RET   setitimer 0
  2920 syslogd  987293451.580673 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293451.580722 RET   sigreturn JUSTRETURN
  2920 syslogd  987293451.580768 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293481.591332 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293481.591453 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293481.591505 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293481.591556 RET   gettimeofday 0
  2920 syslogd  987293481.591612 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293481.591664 RET   setitimer 0
  2920 syslogd  987293481.591708 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293481.591757 RET   sigreturn JUSTRETURN
  2920 syslogd  987293481.591806 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293511.602331 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293511.602456 RET   kevent -1 errno 4 Interrupted system call

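Earlier in the ktrace, successive kevent calls time out after 5, 10,
and 20 seconds.  Presumably, the timeout is then increased to 40
seconds, which exceeds the 30-second interval between the SIGALRMs
shown above.  Each SIGALRM interrupts kevent with EINTR, the resolver
restarts the wait with the full timeout, and so the call never gets
the chance to time out.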
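
A minimal sketch of the arithmetic behind this, not the resolver's
actual code.  It assumes the timeout simply doubles from a 5-second
base on each retry (which matches the 5, 10, 20 sequence in my
ktrace) and takes the 30-second alarm interval from the timestamps
above.  The names retry_timeout() and first_starved_attempt() are
hypothetical.

```c
#include <assert.h>

/*
 * Timeout in seconds for a given retry, ASSUMING it doubles from
 * base_sec on every attempt (attempt 0 is the first try).
 */
static int
retry_timeout(int base_sec, int attempt)
{
	assert(base_sec > 0 && attempt >= 0 && attempt < 16);
	return (base_sec << attempt);
}

/*
 * Return the index of the first attempt whose timeout is longer than
 * the interval between alarms, or -1 if none of the attempts is.
 * A wait that is restarted with its full timeout after every alarm
 * can never expire once its timeout exceeds that interval.
 */
static int
first_starved_attempt(int base_sec, int alarm_sec, int max_attempts)
{
	int i;

	for (i = 0; i < max_attempts; i++)
		if (retry_timeout(base_sec, i) > alarm_sec)
			return (i);
	return (-1);
}
```

Under these assumptions, first_starved_attempt(5, 30, 8) yields 3,
that is, the fourth try with its 40-second timeout is the first one
that cannot complete.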

>How-To-Repeat:

Run syslogd with heavy traffic from remote clients for several days
in an environment in which DNS times out occasionally.  Passing
several -a options with domain wildcards may aggravate the problem.

>Fix:

This patch is similar to the one committed in version 1.32 of
res_send.c.  It should apply cleanly against 4.2-RELEASE and
-CURRENT, but I have only tested it with the former.

--- res_send.c.orig	Wed Sep 20 16:37:01 2000
+++ res_send.c	Tue Apr 17 23:53:42 2001
@@ -597,6 +597,8 @@
 			 */
 			struct kevent kv;
 			struct timespec timeout;
+			struct timeval timeout_tv;
+			struct timeval target;
 			struct sockaddr_storage from;
 			int fromlen;
 
@@ -706,6 +708,10 @@
 			if ((long) timeout.tv_sec <= 0)
 				timeout.tv_sec = 1;
 			timeout.tv_nsec = 0;
+			(void)gettimeofday(&target, NULL);
+			timeout_tv.tv_sec = timeout.tv_sec;
+			timeout_tv.tv_usec = timeout.tv_nsec / 1000;
+			timeradd(&target, &timeout_tv, &target);
     wait:
 			if (s < 0) {
 				Perror(stderr, "s out-of-bounds", EMFILE);
@@ -719,11 +725,25 @@
 				
 			n = kevent(kq, &kv, 1, &kv, 1, &timeout);
 			if (n < 0) {
-				if (errno == EINTR)
-					goto wait;
-				Perror(stderr, "kevent", errno);
-				res_close();
-				goto next_ns;
+				if (errno == EINTR) {
+					struct timeval current;
+
+					(void)gettimeofday(&current, NULL);
+					if (timercmp(&current, &target,
+					    <)) {
+						timersub(&target, &current,
+						    &current);
+						timeout.tv_sec =
+						    current.tv_sec;
+						timeout.tv_nsec =
+						    current.tv_usec * 1000;
+						goto wait;
+					}
+				} else {
+					Perror(stderr, "kevent", errno);
+					res_close();
+					goto next_ns;
+				}
 			}
 
 			if (n == 0) {
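
The idea behind the patch, stripped of the resolver details, is to
fix an absolute deadline once and, after each EINTR, wait only for
whatever time is left.  The following is a standalone sketch of that
calculation, not part of the patch.  It uses struct timespec
throughout instead of the timeval macros used above, and the helper
name remaining_until() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Compute the time left between now and an absolute deadline.
 * Returns true and fills *remaining if the deadline is still in the
 * future.  Returns false, leaving *remaining untouched, if the
 * deadline has been reached or passed, in which case the caller
 * should treat the wait as timed out rather than retry.
 */
static bool
remaining_until(const struct timespec *deadline,
    const struct timespec *now, struct timespec *remaining)
{
	assert(deadline != NULL && now != NULL && remaining != NULL);

	if (now->tv_sec > deadline->tv_sec ||
	    (now->tv_sec == deadline->tv_sec &&
	    now->tv_nsec >= deadline->tv_nsec))
		return (false);

	remaining->tv_sec = deadline->tv_sec - now->tv_sec;
	remaining->tv_nsec = deadline->tv_nsec - now->tv_nsec;
	if (remaining->tv_nsec < 0) {
		/* Borrow one second to keep tv_nsec in range. */
		remaining->tv_sec--;
		remaining->tv_nsec += 1000000000L;
	}
	return (true);
}
```

A caller would compute the deadline before the first wait, then on
EINTR call remaining_until() with the current time and either retry
with the shortened timeout or give up on this name server.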
>Release-Note:
>Audit-Trail:
>Unformatted:
