Date:      Thu,  4 Oct 2001 04:23:42 -0400 (EDT)
From:      The Anarcat <anarcat@anarcat.dyndns.org>
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   bin/31029: syslogd remote logging back down
Message-ID:  <20011004082342.A6EBC20BE1@shall.anarcat.dyndns.org>


>Number:         31029
>Category:       bin
>Synopsis:       syslogd remote logging back down
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 04 01:30:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     The Anarcat
>Release:        FreeBSD 4.4-STABLE i386
>Organization:
Nada, inc.
>Environment:
System: FreeBSD shall.anarcat.dyndns.org 4.4-STABLE FreeBSD 4.4-STABLE #7: Sat Sep 15 00:41:38 EDT 2001 anarcat@shall.anarcat.dyndns.org:/usr/obj/usr/src/sys/SHALL i386

>Description:

From freebsd-questions:

On Tue, Oct 02, 2001 at 11:57:08AM -0400, The Anarcat wrote:
> Hi.
>  
> I think I noticed what seems to me undesirable (and undocumented?)
> behavior in syslogd. When a remote logging host (@host) is
> unreachable:
> 
> syslogd: sendto: Host is down
> 
> syslogd *never* tries to reach it again, unless it receives a HUP.
> Shouldn't it try to reach it again, from time to time?
> 
> The @host was indeed down, but when it was brought back up, remote
> logging wasn't resumed.

>How-To-Repeat:

*.*			@host

where host is down or unreachable.

>Fix:

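This is a draft of what I would call an "approximate exponential backoff
algorithm". :)

On EHOSTDOWN or EHOSTUNREACH, the forwarding entry is kept instead of
being marked F_UNUSED, and sends to it are skipped until a delay has
passed. The delay starts at 30 seconds and doubles on each further
failure. Any other sendto error still disables the entry as before.

There's a lot of debugging code that can be removed, but it helps to
see what's going on.

There's probably a better way to do this too. :)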
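
For readers who want the idea without the surrounding syslogd code, here
is a minimal standalone sketch of the same backoff logic. It is not part
of the patch: struct backoff, backoff_should_send() and backoff_record()
are hypothetical names made up for illustration, standing in for the
f_unreach/f_delay fields and the inline checks in the diff. It also
resets the state after a successful send, which the draft patch does not
do, so treat that part as an assumption about the desired behavior.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

#define DELAY_MUL	2	/* delay multiplier, same value as the patch */
#define DELAY_INIT	30	/* initial delay in seconds, same as the patch */

/* Hypothetical stand-in for the two fields the patch adds to struct filed. */
struct backoff {
	time_t	unreach;	/* time of first failure, 0 while reachable */
	time_t	delay;		/* current backoff window in seconds */
};

/* Return 1 if a send should be attempted at time `now`, 0 to skip it. */
static int
backoff_should_send(const struct backoff *b, time_t now)
{
	if (b->unreach == 0)
		return (1);		/* no failure on record */
	return ((now - b->unreach) >= b->delay);
}

/*
 * Record the result of a send attempt.  `err` is 0 on success or the
 * errno value from sendto().  Returns 1 if the entry should be kept,
 * 0 if it should be disabled (the old F_UNUSED behavior).
 */
static int
backoff_record(struct backoff *b, time_t now, int err)
{
	switch (err) {
	case 0:
		/* Assumption, not in the patch: success clears the backoff. */
		b->unreach = 0;
		b->delay = 0;
		return (1);
	case EHOSTUNREACH:
	case EHOSTDOWN:
		if (b->unreach)
			b->delay *= DELAY_MUL;	/* repeated failure */
		else {
			b->unreach = now;	/* first failure */
			b->delay = DELAY_INIT;
		}
		return (1);
	default:
		return (0);	/* any other error: give up on the entry */
	}
}
```

Since the delay is always measured from the first failure rather than
from the latest one, retries fall at roughly 30, 60, 120, ... seconds
after the host first went away, which is why I call it "approximate".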

--- syslogd.c.orig	Wed Oct  3 15:56:32 2001
+++ syslogd.c	Thu Oct  4 00:06:49 2001
@@ -142,6 +142,9 @@
 #define MARK		0x008	/* this message is a mark */
 #define ISKERNEL	0x010	/* kernel generated message */
 
+#define DELAY_MUL	2       /* delay multiplier */
+#define DELAY_INIT	30	/* initial delay in seconds */
+
 /*
  * This structure represents the files that will have log
  * copies printed.
@@ -159,6 +162,9 @@
 #define PRI_EQ	0x2
 #define PRI_GT	0x4
 	char	*f_program;		/* program this applies to */
+	/* should this be part of the union? */
+	time_t  f_unreach;	      /* time of first unreachable error, 0 if none */
+	time_t  f_delay;		/* backoff time */
 	union {
 		char	f_uname[MAXUNAMES][UT_NAMESIZE+1];
 		struct {
@@ -999,6 +1005,15 @@
 			l = MAXLINE;
 
 		if (finet) {
+			dprintf("FORW: now: %d f_unreach: %d f_delay: %d\n", (int) now, (int) f->f_unreach, (int) f->f_delay);
+			/* XXX: must make sure this is initialized to 0 */
+			if (f->f_unreach) { /* there was a failure last time */
+				dprintf("another try at host\n");
+				if ( (now - f->f_unreach) < f->f_delay) {
+					dprintf("skipping: now: %d, f_unreach: %d f_delay: %d\n", (int) now, (int) f->f_unreach, (int) f->f_delay);
+					break; /* do not send */
+				}
+			}
 			for (r = f->f_un.f_forw.f_addr; r; r = r->ai_next) {
 				for (i = 0; i < *finet; i++) {
 #if 0 
@@ -1017,12 +1032,38 @@
 				if (lsent == l && !send_to_all) 
 					break;
 			}
+			dprintf("lsent: %d\n", lsent);
 			if (lsent != l) {
 				int e = errno;
-				(void)close(f->f_file);
-				errno = e;
-				f->f_type = F_UNUSED;
+				dprintf("sendto: f_unreach: %d f_delay: %d\n", (int) f->f_unreach, (int) f->f_delay);
 				logerror("sendto");
+				errno = e;
+				switch (errno) {
+				case EHOSTUNREACH:
+				case EHOSTDOWN:
+					if (f->f_unreach)
+						f->f_delay *= DELAY_MUL;
+					else {
+						f->f_unreach = now;
+						f->f_delay = DELAY_INIT;
+					}
+					dprintf("setting: f_unreach: %d f_delay: %d\n", (int) f->f_unreach, (int) f->f_delay);
+					break;
+				/* case EBADF: */
+				/* case EACCES: */
+				/* case ENOTSOCK: */
+				/* case EFAULT: */
+				/* case EMSGSIZE: */
+				/* case EAGAIN: */
+				/* case ENOBUFS: */
+				/* case ECONNREFUSED: */
+				default:
+					dprintf("removing entry: errno %d\n", e);
+					(void)close(f->f_file);
+					errno = e;
+					f->f_type = F_UNUSED;
+					break;
+				}
 			}
 		}
 		break;
@@ -2301,3 +2342,7 @@
 
 	return(socks);
 }
+
+/* Local Variables: *** */
+/* c-basic-offset:8 *** */
+/* End: *** */
>Release-Note:
>Audit-Trail:
>Unformatted:
