Date:      Tue, 31 Oct 2000 08:09:30 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        jandrese@mitre.org (Andresen, Jason R.)
Cc:        jgreco@ns.sol.net (Joe Greco), gjb@gbch.net (Greg Black), hackers@FreeBSD.ORG, ryan@sasknow.com, andrew@ugh.net.au
Subject:   Re: Logging users out
Message-ID:  <200010310809.BAA27414@usr02.primenet.com>
In-Reply-To: <39FDE4D7.1020C4B2@mitre.org> from "Andresen,Jason R." at Oct 30, 2000 04:15:03 PM

> > Uh, well, "foolproof" != "calling ps and awk and grep and looking for
> > processes".  For ANY definition of foolproof.
> > 
> > And it is certainly foolproof from the point of view that there's no way
> > in hell for the session not to be terminated, unlike some ps garbage I've
> > seen.
> 
> Unfortunately, sometimes when processes suddenly lose stdin/stdout,
> they jump into infinite loops and start eating cpu cycles like
> crazy.  I'd hate to see what happens when you kill off a
> significant number of people running these poorly behaved programs.
> FVWM95 Taskbars used to be notorious for this, I remember seeing
> upwards of a dozen of them vying for CPU time on some lab
> machines.

This is because the FreeBSD tty revocation code is broken, though
in technical compliance with POSIX.

The way it's supposed to happen is that "hupcl" (hangup on close)
is supposed to be set on the tty, and the signal is supposed to
be sent to the process group leader, so it can be trampolined.
Being a group signal, not a process signal, it will be delivered
to all children of the leader, as well -- just as group signal
delivery has always been supposed to work.  Only by the
time it gets there, there aren't any children technically in
the group any more.
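
A user-space illustration of the membership point, not the kernel
revoke path itself: a signal sent to a process group only reaches
processes that are still in that group when it is sent.  This is a
minimal sketch in Python; spawn_sleeper and hup_reaches_worker are
hypothetical names, and a POSIX system with fork() is assumed.

```python
import os
import signal
import time


def spawn_sleeper(seconds):
    """Fork a child that takes the default SIGHUP action, then sleeps."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        try:
            os.close(ready_r)
            signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.write(ready_w, b"x")  # tell the parent we are set up
            time.sleep(seconds)
        finally:
            os._exit(0)
    os.close(ready_w)
    os.read(ready_r, 1)  # block until the child is set up
    os.close(ready_r)
    return pid


def hup_reaches_worker(leave_group_first):
    """Return True if a group-wide SIGHUP terminates the worker."""
    leader = spawn_sleeper(0.5)
    os.setpgid(leader, leader)       # leader heads a fresh process group
    worker = spawn_sleeper(0.5)
    os.setpgid(worker, leader)       # worker joins the leader's group
    if leave_group_first:
        os.setpgid(worker, worker)   # worker becomes its own group leader
    os.killpg(leader, signal.SIGHUP)  # signal the leader's group
    _, status = os.waitpid(worker, 0)
    os.waitpid(leader, 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGHUP
```

With leave_group_first false the worker dies from the SIGHUP; with it
true the worker has already left the group, never sees the signal, and
simply runs to completion.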

What happens in the revoke code is that effectively, everyone is
made into a process group leader, so the SIGHUP to the process
group leader is not properly propagated to the other processes
in the group.

The correct order of operation is to revoke, promote, then signal,
which would result in the SIGHUP being delivered to all processes
which have not explicitly blocked it.  FreeBSD does revoke, signal,
then promote, which means the newly promoted processes aren't in
the signalled process group by the time the SIGHUP is delivered.

Traditional BSD (and UNIX) behaviour actually iteratively did
the revocation of the controlling tty on a per process basis,
after signal delivery, but the global "revoke" changed things.

This change went in during the POSIX-me-harder tournament, early
in FreeBSD's infancy.  Before it was POSIX-ized, SIGHUP was
correctly delivered on hangup to _all_ processes for which the
tty was the controlling tty.  After it went in, we started having
runaway processes, which were then labelled as being "broken" for
not noticing that read was returning 0 (which is returned on EOF,
but is also returned on perfectly good non-blocking fds, and in the
case that vmin is set to zero to effect a timed poll via vtime).
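
To make the vmin/vtime case concrete, here is a minimal sketch in
Python that covers only the timed poll, not the non-blocking case.
It assumes a system where pty.openpty() works, and
idle_read_with_timed_poll is a hypothetical name.  The read comes
back empty even though the other end of the terminal is still open,
so a zero-length read by itself cannot tell the program that the tty
has gone away.

```python
import os
import pty
import termios


def idle_read_with_timed_poll():
    """Read from an idle pty slave configured with VMIN=0, VTIME=1."""
    master, slave = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave)
        attrs[3] &= ~termios.ICANON   # lflag: non-canonical input
        attrs[6][termios.VMIN] = 0    # cc: do not wait for any bytes
        attrs[6][termios.VTIME] = 1   # cc: give up after 0.1 seconds
        termios.tcsetattr(slave, termios.TCSANOW, attrs)
        # Nothing has been written and the master is still open, so an
        # empty result here is a timeout, not an EOF or a hangup.
        return os.read(slave, 64)
    finally:
        os.close(slave)
        os.close(master)
```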

Yeah, I basically replay this broken record any time someone
tries to blame the application for not getting out the Ouija
board and trying to contact the dear departed tty, since there
are actually people who really do use non-blocking fds and vmin/vtime
to do things like user space threads and background computation
while waiting for user input.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



