Date:      Mon, 18 Feb 2002 14:04:04 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        David Malone <dwmalone@maths.tcd.ie>
Cc:        "Todd C. Miller" <Todd.Miller@courtesan.com>, <audit@FreeBSD.ORG>, Chris Johnson <cjohnson@palomine.net>, Brian McDonald <brian@lustygrapes.net>
Subject:   Re: Syslog hangong on console. 
Message-ID:  <20020218131837.K4236-100000@gamplex.bde.org>
In-Reply-To: <200202171759.aa61427@salmon.maths.tcd.ie>

On Sun, 17 Feb 2002, David Malone wrote:

> > Below is the diff I committed to OpenBSD some time ago.  It goes a
> > bit farther and opens all files with O_NONBLOCK and then changes
> > to blocking writes for real files.
> ...
> BTW - some people seemed to be indicating that syslogd was blocking
> at some stage other than the open, which is what I wasn't able to
> reproduce. FreeBSD's syslogd uses ttymsg() to write to the tty,
> which should never block. The only way I could see it happening was
> if isatty() lied after the tty was opened.

I sent David a large private mail which was mostly about this problem.
ttymsg() may block in close(2) or _exit(2) when it clears O_NONBLOCK,
which is eventually the usual case if there is a blockage downstream.
(Usually the first few writes go to driver buffers and write(2) and
writev(2) return successfully, but the data isn't guaranteed to go out
unless you do a tcdrain(3) or equivalent, and this is not practical
in ttymsg() or syslog() (since it might block).)  Blocking in _exit(2)
is especially bad, since it gives unkillable processes.  Since ttymsg()
forks children, these can cause the process table to fill up.  I sent
David some old patches related to limiting the children.
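
For reference, clearing O_NONBLOCK on an open descriptor looks roughly
like the sketch below.  This is not the actual ttymsg() code, and the
helper name clear_nonblock() is made up for illustration.  Once the flag
is cleared on a tty with a blocked device, later writes and (without the
fix described below) the last-close may block.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: clear O_NONBLOCK on fd and leave the other
 * file status flags alone.  Returns 0 on success, -1 on error
 * (errno is set by fcntl).
 */
int
clear_nonblock(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return (-1);
	return (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1 ? -1 : 0);
}
```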

Using O_NONBLOCK without using tcdrain(3) gives a different kind of
brokenness.  Unfortunately, David's change to syslog.c gives a perfect
example of this.  The changed code is essentially:

	fd = open(... O_NONBLOCK);
	write(fd, ...);
	close(fd);

Here the write normally returns successfully at once, after copying
the data to driver buffers, even when the physical device is completely
blocked.  Then the close flushes (i.e., discards) the data in the driver
and hardware buffers because O_NONBLOCK is still set at close time.  I
"fixed" this in FreeBSD.  In 4.4BSD-Lite, ttylclose() checks the
IO_NDELAY flag to decide whether to flush the buffers.  This is nonsense,
since the flags passed to ttylclose() are the open/fcntl flags, not those
flags converted to IO_* flags.  FreeBSD's ttylclose() checks FNONBLOCK
instead.
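
The "write succeeds although nothing has gone out" effect can be seen
without a tty.  The sketch below is only an analogy: it uses a pipe as
a stand-in for the driver buffers, and the function name
write_with_no_reader() is made up.  A short non-blocking write returns
its full length although nothing ever reads the data, and the data is
thrown away when the descriptors are closed.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: write len bytes to a pipe that nobody
 * reads, with O_NONBLOCK set on the write side.  Returns what write(2)
 * returned, or -1 if the setup failed.  The pipe is an assumption
 * standing in for a tty's output buffers.
 */
ssize_t
write_with_no_reader(const char *buf, size_t len)
{
	int p[2];
	int flags;
	ssize_t n;

	if (pipe(p) == -1)
		return (-1);
	flags = fcntl(p[1], F_GETFL);
	if (flags == -1 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) == -1) {
		close(p[0]);
		close(p[1]);
		return (-1);
	}
	n = write(p[1], buf, len);	/* lands in the kernel buffer */
	close(p[1]);
	close(p[0]);			/* unread data is discarded here */
	return (n);
}
```

The caller sees success for any message small enough to fit in the
buffer, so the return value of write(2) says nothing about delivery.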

The result of the fix is that if the close is the last-close, code
like the above drops all the data if the device is completely blocked,
and writes only a few bytes even if the device is completely unblocked
(only those bytes that have reached their destination before the close
flushes the buffers are sure to have gone out).
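
A caller that can afford to block (unlike ttymsg() or syslog()) could
avoid the data loss by draining before the close.  The following is a
minimal sketch under that assumption; write_and_drain() is a made-up
name, and the descriptor is assumed to be in blocking mode so that
write(2) does not fail with EAGAIN.

```c
#include <errno.h>
#include <termios.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all of buf to fd, then wait for the
 * output to be transmitted if fd is a terminal.  Returns 0 on
 * success, -1 on error.  tcdrain(3) may block indefinitely if the
 * device is blocked, which is why it is not usable in syslog().
 */
int
write_and_drain(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		buf += n;
		len -= (size_t)n;
	}
	/* Draining only applies to terminals; skip it for other files. */
	if (isatty(fd) && tcdrain(fd) == -1)
		return (-1);
	return (0);
}
```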

Without the fix, the behaviour is worse: processes may block forever
in close(2) or _exit(2) despite use of O_NONBLOCK.  Perhaps multiple
processes may block for the same device -- there are some races that
may permit first-opens to complete while last-closes are blocked.

Bruce

