Date:      Sun, 27 Oct 2002 01:50:17 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Juli Mallett <jmallett@FreeBSD.org>
Cc:        Maxim Sobolev <sobomax@FreeBSD.ORG>, Nate Lawson <nate@root.org>, jlemon@FreeBSD.ORG, hackers@FreeBSD.ORG, audit@FreeBSD.ORG
Subject:   Re: New kevent types: NOTE_STARTEXEC and NOTE_STOPEXEC
Message-ID:  <3DBBB6D9.F369A5CD@mindspring.com>
References:  <3DB79DFA.FA719B8F@FreeBSD.org> <Pine.BSF.4.21.0210261715520.78755-100000@root.org> <20021027075043.GA36533@vega.vega.com> <20021027010429.A90908@FreeBSD.org>

Juli Mallett wrote:
> > EVFILT_PROC operates on pids, while NOTE_{START,STOP}EXEC operate on
> > vnodes - it is the main difference. Currently, you can't reliably
> > get a notification when the kernel starts executing some arbitrary
> > executable from your fs.
> 
> This is not a job for the kernel, I don't think.  Implement it in userland
> in terms of having the daemon write to a pidfile at startup, and have SIGUSR1
> make it tell the sender it's alive (using my sigq stuff this is trivial, just
> send SIGUSR2 back), and periodically read the pidfile and try to communicate
> with the daemon, and respawn it if it fails.  This could be racy if done
> poorly.  However if you want this for *any* executable, rather than just
> "some arbitrary executable" rather than some specific job, then while I wonder
> how useful it is in a generic concept, the kq solution might be more
> reasonable.

The problem with daemons is that they are not children of a process
other than "init".  It's only useful to use SIGCHLD as the termination
notification if it's a parent process being notified.

For most daemons, this would mean replacing "init".

It's useful to replace "init", and "syslog", both, to be able to
track state for services; however, the notification on termination
is significantly better if anyone who cares can monitor for it.

The main problem with this approach is that it's not the service
identity that's communicated, but the name of a file, and the file
name is for a path that's probably relative to the current directory
in most cases (or could be), and so isn't globally unique.

It is also not useful to poll, since your daemon could be down for
an entire poll interval; also, it means you have to trust the poll
interval itself will be maintained... and if your failure is a
failure of the process doing the polling itself, that's broken.
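
For concreteness, the pid-file poll being discussed might look roughly
like the sketch below.  The function name and the pid-file format (a
single decimal pid) are assumptions made for illustration, not
anything from the proposed patch.

```python
import os


def pidfile_alive(pidfile_path):
    """Return True if the pid recorded in pidfile_path names a live process.

    Hypothetical helper: assumes the daemon wrote its pid as plain text.
    """
    try:
        with open(pidfile_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed pid file
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a pid
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is sent
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but is owned by another user
    return True
```

A monitor would call this once per poll interval, so a daemon that
dies just after a check goes unnoticed until the next one; and if the
pid has been recycled, the check succeeds for an unrelated process.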

For most services, gross notification is actually only the minimal
level of assurance: a program that's supposed to offer a service may
be caught in an endless loop, and the fact that it has not exited
doesn't mean that it's actually offering the service that it was
intended to offer.

The best check is to actually attach to, and utilize the service.

There is also an intermediate level... which is probably best
served by kevents, actually.  For network services, when a listen
on a socket occurs, it occurs on a socket specific to a service:
a well known port.

One obvious thing to do is to cause events to occur when a listen
is issued, and when a listen socket is closed.  One might also do
an implicit listen in the case of an outstanding listen socket,
for which no listen was executed (default listen queue depth)...
basically a blocking accept vs. a connect and/or a select for
readability on an outstanding socket.

So there are various levels of service availability notification:

0)	A "ping" of the system offering the service indicates that
	the interrupt code and network stack are active, but nothing
	else.

1)	The program is running/stopped, with a notification latency
	(this is your suggested scenario, having to do with pid
	files).

2)	The program is running/stopped, with no notification latency
	(this is the NOTE_STARTEXEC/NOTE_STOPEXEC scenario).

3)	A server is listening on a well known port/has stopped
	listening on a well known port (this is my suggested scenario,
	and has the benefit of accounting for configuration and other
	latencies that #1 and #2 can miss -- it closes race windows).

4)	Active status monitoring, through attempts to actually use a
	service offered by a server (e.g. making a DNS request to a
	DNS server, doing a "HEAD /" on an HTTP server, saying "EHLO"
	to an SMTP server, etc.).
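
Level #3 can be approximated from userland today by probing the well
known port, as in the sketch below.  This is a poll, so it has the
latency that a listen/close kevent would remove; the function name and
the default timeout are assumptions made for illustration.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connect to (host, port) is accepted.

    Hypothetical helper: it only shows that something is listening on
    the port, not that the service behind it answers correctly.
    """
    try:
        # create_connection resolves the host and performs the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name resolution failed.
        return False
```

Moving from this to level #4 means speaking the protocol after the
connect succeeds, rather than closing the socket immediately.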

Obviously, #4 is the best approach.  Barring that, #3 provides much
better identification of services, but only works on network services.
#2 provides information about services by (potentially relative or
non-matching) path name.  Actually, the best approach for this would be
to return a "stat" buffer for the file that was exec'ed or stopped, so
that the caller could use the name information for logging, but could
verify that the "sendmail" being run was, say, /usr/local/bin/sendmail,
instead of /usr/sbin/sendmail: the way this would work is to have the
program which cared stat the executable that it expected to run, and
then compare the "stat" information (dev_t and ino_t) to verify that
the uniquely identified executable was in fact what was running.  This
is *much* better than a (partial) path comparison by (nonunique) file
name (consider also: hard links with different names).
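
The comparison described above could be sketched as follows.  How the
event would deliver the device and inode numbers is not defined
anywhere yet, so "reported_dev" and "reported_ino" are hypothetical
inputs, and the function name is made up for illustration.

```python
import os


def same_executable(expected_path, reported_dev, reported_ino):
    """Return True if expected_path is the file identified by the
    (device, inode) pair taken from the hypothetical exec notification.
    """
    st = os.stat(expected_path)  # follows symlinks, like stat(2)
    return (st.st_dev, st.st_ino) == (reported_dev, reported_ino)
```

Two hard links to one file compare equal here regardless of their
names, while two different files that are both called "sendmail" do
not.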

So, in general:

o	NOTE_STARTEXEC/NOTE_STOPEXEC are a good idea; they close a
	latency window, and they close a race window, and they eliminate
	a single-point-of-failure that would come from having one
	monitoring program ("who monitors the monitors?").

o	The name information being returned is insufficient to
	uniquely identify the executable to an interested program

o	Minimally, the information the interested program really
	wants is the dev_t/ino_t pair

o	It's OK to offer the exec path name information, at least
	as much of it as is known, but that's additional information,
	and is not sufficient for the application for which the
	interface was intended.  At best, it may be useful for logging
	purposes

o	Another event set for NOTE_STARTLISTEN/NOTE_STOPLISTEN would
	be significantly more useful for most purposes for which the
	NOTE_STARTEXEC/NOTE_STOPEXEC events are being introduced

-- Terry
