Date: Wed, 18 Apr 2001 00:36:50 -0500 (EST)
From: ajk@iu.edu
To: FreeBSD-gnats-submit@freebsd.org
Subject: bin/26665: [PATCH] syslogd hangs when logging from remote hosts
Message-ID: <200104180536.f3I5ao631260@kobayashi.uits.iupui.edu>
>Number: 26665
>Category: bin
>Synopsis: [PATCH] syslogd hangs when logging from remote hosts
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Apr 17 22:40:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator: Andrew J. Korty
>Release: FreeBSD 4.2-RELEASE i386
>Organization: Information Technology Security Office, Indiana University
>Environment:

FreeBSD 4.1.1-RELEASE and later

>Description:

The syslogd program seems to hang after a few days of logging from
remote hosts. The problem appears to be similar to one discovered
last year, before the resolver was changed to use kqueue()/kevent()
rather than poll(). Jonathon Lemon posted output from ktrace and a
proposed solution to -hackers, and res_send.c was subsequently patched.

Apparently, the fix was ignored when kqueue()/kevent() was introduced.
My ktrace looks similar:

  2920 syslogd  987293451.580333 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293451.580414 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293451.580470 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293451.580518 RET   gettimeofday 0
  2920 syslogd  987293451.580576 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293451.580627 RET   setitimer 0
  2920 syslogd  987293451.580673 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293451.580722 RET   sigreturn JUSTRETURN
  2920 syslogd  987293451.580768 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293481.591332 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293481.591453 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293481.591505 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293481.591556 RET   gettimeofday 0
  2920 syslogd  987293481.591612 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293481.591664 RET   setitimer 0
  2920 syslogd  987293481.591708 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293481.591757 RET   sigreturn JUSTRETURN
  2920 syslogd  987293481.591806 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293511.602331 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293511.602456 RET   kevent -1 errno 4 Interrupted system call

Earlier in the ktrace, it is evident that kevent times out after 5,
10, and 20 seconds, respectively. Presumably, the timeout is then
increased to 40 seconds, exceeding the alarm value.

>How-To-Repeat:

Run syslogd with heavy traffic from remote clients for several days
in an environment in which DNS times out occasionally. Using -a
with domain wildcards several times may aggravate the problem.

>Fix:

This patch is similar to the one committed in version 1.32 of
res_send.c. It should apply cleanly against 4.2-RELEASE and
-CURRENT, but I have only tested it with the former.

--- res_send.c.orig	Wed Sep 20 16:37:01 2000
+++ res_send.c	Tue Apr 17 23:53:42 2001
@@ -597,6 +597,8 @@
 			 */
 			struct kevent kv;
 			struct timespec timeout;
+			struct timeval timeout_tv;
+			struct timeval target;
 			struct sockaddr_storage from;
 			int fromlen;
@@ -706,6 +708,10 @@
 			if ((long) timeout.tv_sec <= 0)
 				timeout.tv_sec = 1;
 			timeout.tv_nsec = 0;
+			(void)gettimeofday(&target, NULL);
+			timeout_tv.tv_sec = timeout.tv_sec;
+			timeout_tv.tv_usec = timeout.tv_nsec / 1000;
+			timeradd(&target, &timeout_tv, &target);
     wait:
 			if (s < 0) {
 				Perror(stderr, "s out-of-bounds", EMFILE);
@@ -719,11 +725,25 @@
 			n = kevent(kq, &kv, 1, &kv, 1, &timeout);
 			if (n < 0) {
-				if (errno == EINTR)
-					goto wait;
-				Perror(stderr, "kevent", errno);
-				res_close();
-				goto next_ns;
+				if (errno == EINTR) {
+					struct timeval current;
+
+					(void)gettimeofday(&current, NULL);
+					if (timercmp(&current, &target,
+					    <)) {
+						timersub(&target, &current,
+						    &current);
+						timeout.tv_sec =
+						    current.tv_sec;
+						timeout.tv_nsec =
+						    current.tv_usec * 1000;
+						goto wait;
+					}
+				} else {
+					Perror(stderr, "kevent", errno);
+					res_close();
+					goto next_ns;
+				}
 			}
 			if (n == 0) {
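
For illustration only, the deadline technique the patch relies on can
be sketched outside the resolver. In the trace above, SIGALRM arrives
roughly every 30 seconds, so a wait that restarts with its full
timeout after each EINTR can never expire once that timeout is longer
than the alarm interval. The sketch below is not the res_send.c code:
it uses poll() and a monotonic clock instead of kevent() and
gettimeofday() so that it builds on systems without kqueue, and the
names remaining_ms(), wait_readable(), and start_ticker() are
hypothetical helpers invented for this example.

```c
/* Assumption: a POSIX.1-2008 system; the macro exposes the calls used below. */
#define _XOPEN_SOURCE 700

#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: whole milliseconds left until an absolute deadline. */
static long
remaining_ms(const struct timespec *deadline)
{
	struct timespec now;
	long ms;

	clock_gettime(CLOCK_MONOTONIC, &now);
	ms = (long)(deadline->tv_sec - now.tv_sec) * 1000L +
	    (deadline->tv_nsec - now.tv_nsec) / 1000000L;
	return (ms > 0 ? ms : 0);
}

/*
 * Hypothetical helper: wait until fd is readable or timeout_ms has
 * elapsed in total, no matter how often a signal interrupts the wait.
 * Returns >0 if ready, 0 on timeout, -1 on an error other than EINTR.
 */
static int
wait_readable(int fd, long timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct timespec deadline;
	long left = timeout_ms;
	int n;

	/* Fix the absolute deadline once, before the first wait. */
	clock_gettime(CLOCK_MONOTONIC, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	for (;;) {
		n = poll(&pfd, 1, (int)left);
		if (n >= 0 || errno != EINTR)
			return (n);
		/* Interrupted: retry with only the time that is left. */
		left = remaining_ms(&deadline);
		if (left == 0)
			return (0);	/* deadline passed; report a timeout */
	}
}

/* No-op handler; its only purpose is to make blocking calls fail with EINTR. */
static void
on_alarm(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper: deliver SIGALRM every interval_ms, standing in
 * for the periodic alarm visible in the ktrace output.
 */
static int
start_ticker(long interval_ms)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) < 0)
		return (-1);
	it.it_interval.tv_sec = interval_ms / 1000;
	it.it_interval.tv_usec = (interval_ms % 1000) * 1000;
	it.it_value = it.it_interval;
	return (setitimer(ITIMER_REAL, &it, NULL));
}
```

With start_ticker() set to an interval shorter than the timeout passed
to wait_readable(), the wait should still return 0 after about the
requested time, whereas a loop that retried with the original timeout
on every EINTR would keep waiting indefinitely.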
>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message