From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 24 15:39:24 2005 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6039616A4CE for ; Mon, 24 Jan 2005 15:39:24 +0000 (GMT) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.198]) by mx1.FreeBSD.org (Postfix) with ESMTP id D6CE543D1F for ; Mon, 24 Jan 2005 15:39:23 +0000 (GMT) (envelope-from felixhc@gmail.com) Received: by wproxy.gmail.com with SMTP id 71so355038wra for ; Mon, 24 Jan 2005 07:39:23 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=SEDv3gJUdiD8cXmO5aArewvEzrfSw61TGVLAthBLY1zCTDBv/TRxxm7/diEMh58RsBzgEMbxKLdcVg78hkRXIBZnxkrYqX8iI8hq89SWOUW26/esKZcUKikpwbr0P/PEgh91hKHv4Ea/OTOUdIIuTHQGpbHeQbwytaqP6Ck7FE4= Received: by 10.54.23.3 with SMTP id 3mr32840wrw; Mon, 24 Jan 2005 07:39:23 -0800 (PST) Received: by 10.54.24.13 with HTTP; Mon, 24 Jan 2005 07:39:23 -0800 (PST) Message-ID: Date: Mon, 24 Jan 2005 10:39:23 -0500 From: Felix Hernandez-Campos To: freebsd-hackers@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: syslogd pipelines losing messages X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: at List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jan 2005 15:39:24 -0000 Hi, I've been using syslogd to monitor a few hundred hosts, and I encountered an interesting problem, which seems to be explained by a bug in syslogd. In my work, I'm using a slow Pentium II host running FreeBSD 4.10 to collect syslog messages that arrive at a rate of at most a couple hundred messages per second. A large percentage of these messages is useless and I filter them out with grep using a syslog pipeline, i.e.,/etc/syslog.conf reads local0.* | exec grep -v DUPLEX_MISMATCH >> /usr/home2/wap_syslog_loc= al0 Given the moderate arrival rate and that the machine is only used for this purpose, I would not expect any losses of syslog messages. After I disabled DNS lookups (syslogd -n), syslogd's CPU usage is very small (1%), and grep's one is even smaller. However, I observe a large number of "resource temporarily unavailable" messages in /var/log/messages. Furthermore, I also noticed that each of these errors also implies a restart of the grep process (which worried me). I tracked this down to the failure of a writev call in syslogd. This is the relevant part of the code (see writev in second "if" statement): fprintlog(struct filed *f, int flags, const char *msg) { =E2=80=A6 =09case F_PIPE: =09=09dprintf(" %s\n", f->f_un.f_pipe.f_pname); =09=09v->iov_base =3D lf; =09=09v->iov_len =3D 1; =09=09if (f->f_un.f_pipe.f_pid =3D=3D 0) { =09=09=09if ((f->f_file =3D p_open(f->f_un.f_pipe.f_pname, =09=09=09=09=09=09&f->f_un.f_pipe.f_pid)) < 0) { =09=09=09=09f->f_type =3D F_UNUSED; =09=09=09=09logerror(f->f_un.f_pipe.f_pname); =09=09=09=09break; =09=09=09} =09=09} =09=09if (writev(f->f_file, iov, 7) < 0) { =09=09=09int e =3D errno; =09=09=09(void)close(f->f_file); =09=09=09if (f->f_un.f_pipe.f_pid > 0) =09=09=09=09deadq_enter(f->f_un.f_pipe.f_pid, =09=09=09=09=09 f->f_un.f_pipe.f_pname); =09=09=09f->f_un.f_pipe.f_pid =3D 0; =09=09=09errno =3D e; =09=09=09logerror(f->f_un.f_pipe.f_pname); =09=09} =09=09break; For some reason, the pipeline is often not writeable and writev fails with errno EWOULDBLOCK (f_file is opened in non-blocking mode). After writev fails, the message is quietly discarded, and the pipeline is closed. This means that the process reading the pipeline (grep) does finish its work gracefully, but one message is completely lost. The next message goes through the first "if" statement in the "switch", which creates a new pipeline reader. Interestingly, if I add code inside the second "if" statement to give grep a chance to complete, e.g., if (e =3D=3D EWOULDBLOCK) { usleep(1); if (writev(f->f_file, iov, 7) >=3D 0) { logerror("IT WORKED!"); break; } } the writev call always works. I'm not familiar with the pipeline implementation, but I would guess the reader is locking the pipeline, and sometimes syslogd ends up running in the middle of the read operation, so writev fails. The usleep call gives the pipeline reader a chance to finish, so the second writev call works fine every time after my change. My solution is just a proof of concept that solves my problem, but I wonder whether someone in this list can suggest a better solution, which should go into the source tree. The simplest one would be to make f_file blocking, but there was a good reason for making it non-blocking in the first place, as stated in the code p_open(const char *prog, pid_t *rpid) { =E2=80=A6 =09/* =09 * Avoid blocking on a hung pipe. With O_NONBLOCK, we are =09 * supposed to get an EWOULDBLOCK on writev(2), which is =09 * caught by the logic above anyway, which will in turn close =09 * the pipe, and fork a new logging subprocess if necessary. =09 * The stale subprocess will be killed some time later unless =09 * it terminated itself due to closing its input pipe (so we =09 * get rid of really dead puppies). =09 */ =09if (fcntl(pfd[1], F_SETFL, O_NONBLOCK) =3D=3D -1) {=E2=80=A6} Anyway, I think I did all the homework, and I just need someone to suggest an elegant solution rather than my usleep (is there a yield-type syscall?). I'm more than willing to try out your ideas in our environment and prepare the patch when something good comes up. Best regards, Felix Hernandez-Campos