Date:      Fri, 31 Oct 2003 12:42:51 -0800
From:      andi payn <andi_payn@speedymail.org>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: kevent and related stuff
Message-ID:  <1067632970.825.164.camel@verdammt.falcotronic.net>
In-Reply-To: <3FA22EF1.39B64387@mindspring.com>
References:  <1067529247.36829.2138.camel@verdammt.falcotronic.net> <3FA22EF1.39B64387@mindspring.com>

On Fri, 2003-10-31 at 01:44, Terry Lambert wrote:
> andi payn wrote:
> > First, some background: On Irix and Linux, fam works by asking the
> > kernel to send it a signal whenever the specified accesses occur. On
> > FreeBSD, since there is no imon interface and no dnotify fcntl, it
> > instead works by periodically stating all of the files it's
> > watching--which is obviously not as good. The fam FAQ suggests that
> > FreeBSD users should adapt fam to use the kevent interface.
> 
> Yes.  The "file access monitor" tool is the classic argument.

The "classic argument" for what? As far as I can tell, the fam FAQ is
only arguing that fam needs to be adapted to use kevent. 

Why do I get the impression that I've inadvertently opened a can of
worms that had only recently been shoved back into the can?
 
> > * I think (but I'm not sure) that kevent doesn't notify at all if the
> > only change to a file is its ATIME. If I'm right, this makes kevent
> > completely useless for fam. Adding a NOTE_ACCESS or something similar
> > would fix this. Since I'm pretty new to FreeBSD: What process do I go
> > through to figure out whether anyone else wants this, whether the
> > interface I've come up with is acceptable, etc.? And, once I write the
> > code, do I submit it as a pr?
> 
> You add it, submit it as a PR, if send-pr will work from your
> machine properly, discuss it on the lists, and if someone with
> a commit bit has the time and likes the idea, it will be committed.

OK, that's what I would have guessed/hoped. And I'd assume it's worth
splitting separable issues into separate patches (e.g., NOTE_ACCESS
doesn't rely on anything else changing).

> > * The kevent mechanism doesn't monitor directories in a sufficient way
> > to make fam happy. If you change a file in a directory that you're
> > watching, unlike imon or dnotify, kevent won't see anything worth
> > reporting at all. This means that for directory monitoring, kevent is
> > useless as-is. Again, if I wanted to patch kevent to provide this
> > additional notification, would others want this?
> 
> I'm not sure that this is correct, unless you are talking about
> monitoring all files in a directory by merely monitoring the
> directory.  If you make a modification to the file metadata (e.g.
> add a link or rename it), then you will be notified that the
> directory has changed.

Well, the way imon works, you get notified whenever a file in the
directory changes--and this includes a change to ATIME or MTIME.
Presumably this is because you could have a GUI folder displaying, or
even sorting by, that value, and SGI wanted that to be updated in real
time just like anything else.

In fact, if you look at the way fam works without kernel support (e.g.,
the way it works on FreeBSD right now) you get such notifications.
Therefore, if fam were converted to use kevent, not having such
notifications could break existing FreeBSD usage of fam (as well as
being incompatible with usage on other platforms).

(By the way, dnotify will, in that situation, notify you that something
in the directory has changed, but then you have to go stat all of the
files in the directory to see which one--which is a bit ugly, to say the
least. But the dnotify patch to fam actually does this, so fam can work
the way it's supposed to.)
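
To make the stat-based fallback concrete, here is a minimal sketch of
the comparison step in C. The names file_snap, take_snap, diff_snap and
the CHG_* flags are made up for illustration; they are not taken from
fam's source, and fam's real polling code may well differ.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical change flags; not part of fam, dnotify, or kevent. */
#define CHG_ATIME 0x1u
#define CHG_MTIME 0x2u
#define CHG_SIZE  0x4u

/* The per-file state a poller would remember between scans. */
struct file_snap {
    time_t atime;
    time_t mtime;
    off_t  size;
};

/* Record the current state of path. Returns 0, or -1 if stat() fails. */
int take_snap(const char *path, struct file_snap *out)
{
    struct stat st;

    assert(path != NULL && out != NULL);
    if (stat(path, &st) == -1)
        return -1;
    out->atime = st.st_atime;
    out->mtime = st.st_mtime;
    out->size  = st.st_size;
    return 0;
}

/* Compare two snapshots and report which fields differ. */
unsigned diff_snap(const struct file_snap *a, const struct file_snap *b)
{
    unsigned changed = 0;

    if (a->atime != b->atime) changed |= CHG_ATIME;
    if (a->mtime != b->mtime) changed |= CHG_MTIME;
    if (a->size  != b->size)  changed |= CHG_SIZE;
    return changed;
}
```

A poller would call take_snap() on every watched file each interval and
diff_snap() the result against the previous scan. Note that this sketch
compares whole seconds only, so two changes within the same second look
like one, and a change that leaves all three fields alone is missed.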

> The argument against subhierarchy monitoring is that it will, by
> definition, stop at the directory level, and it can not be
> successfully implemented for all FS types.

If an extra flag were added (say, "EV_CONTAINED" to scan one level
down), and this could not be implemented for some FS type, then trying
to add such a notification would fail for such FS types. In which case
the app would do whatever's appropriate--decide not to monitor files
within that directory, or fall back to explicitly monitoring all files,
or spit out a message and abort. 

In fam's case, it would fall back to monitoring all of the files. As it
already does when, e.g., imon support is missing on some filesystems.

More important, trying to add a vnode notification with _any_ flags
already fails for anything but UFS--so fam had better be prepared to
fall back when kevent fails!

So, the fact that contents-monitoring won't work on every filesystem is
not much of an argument against implementing it where it can work.
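
The fallback I have in mind could be sketched roughly as follows. This
is only an illustration: watch_or_fallback, try_kevent_watch and the
watch_mode enum are names I invented, and the code uses only the NOTE_*
flags that exist today (EV_CONTAINED is still just a proposal, so it
does not appear). On systems without kqueue the registration step is
stubbed out so that it always fails.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif

/* Hypothetical result codes for this sketch. */
enum watch_mode { WATCH_ERROR = -1, WATCH_KERNEL = 0, WATCH_POLL = 1 };

/* Ask the kernel to watch fd on queue kq. Returns 0 on success, -1 on
 * failure (unsupported filesystem, bad queue, or no kqueue at all). */
static int try_kevent_watch(int kq, int fd)
{
#ifdef HAVE_KQUEUE
    struct kevent kev;

    EV_SET(&kev, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
           NOTE_WRITE | NOTE_EXTEND | NOTE_ATTRIB | NOTE_DELETE | NOTE_RENAME,
           0, NULL);
    return kevent(kq, &kev, 1, NULL, 0, NULL);
#else
    (void)kq;
    (void)fd;
    errno = ENOSYS;
    return -1;
#endif
}

/* Try kernel notification first; if registration fails, tell the
 * caller to poll this path with stat() instead. On WATCH_KERNEL the
 * open descriptor is returned in *fd_out and must stay open. */
enum watch_mode watch_or_fallback(int kq, const char *path, int *fd_out)
{
    int fd;

    assert(path != NULL && fd_out != NULL);
    *fd_out = -1;
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return WATCH_ERROR;
    if (try_kevent_watch(kq, fd) == 0) {
        *fd_out = fd;
        return WATCH_KERNEL;
    }
    close(fd);
    return WATCH_POLL;
}
```

The caller would create the queue once with kqueue() and then call
watch_or_fallback() per file, adding anything that comes back as
WATCH_POLL to the list of files it stats periodically.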

> > * When two equivalent events appear in the queue, kevent aggregates
> > them. This means that if there are two updates to a file since the last
> > time you checked, you'll only see the most recent one. For some uses of
> > fam (keeping a folder window up to date), this is what you want; for
> > others (keeping track of how often a file is read), this is useless. The
> > only solution I can think of is to add an additional flag, or some other
> > way to specify that you want duplicated events.
> 
> This is the classic "edge triggered vs. level triggered" argument
> that Linux people bring up every time someone suggests they implement
> kqueue in Linux.

Yes, but the argument can be avoided entirely if it's possible to
provide both edge-triggered and level-triggered behavior (on the basis
of a flag) with almost no additional code complexity. The fact that
Linux people like to insist that there's a right way and a wrong way and
FreeBSD does it wrong is irrelevant; if both ways are possible, why not
provide both?

> This is easily fixable: you separate the flag from the data, adding
> an additional argument to KNOTE().  This also has the side effect
> of removing the restriction on the PID size, which is imposed by
> the limited number of bits left over for representing the PID.
> 
> This is a trivial change, and I've done it several times.

OK, it's a trivial change to the kernel, but doesn't it break every
piece of existing code that uses kevent? A "don't aggregate duplicates"
flag, off by default, would have no effect on existing code, and would
give new code the choice.

"Every piece of existing code that uses kevent" may well be only a
handful of programs, in which case compatibility isn't much of an
argument; if so, feel free to correct me.
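
To show what I mean by aggregation, here is a small user-space model in
C. It is not kernel code and not the real knote structure; model_note,
model_post, model_read and the MODEL_NOTE_* values are all invented for
this sketch. As I understand it, pending flags for one watched object
are ORed together, so repeated identical events collapse into one. The
hits counter stands in for the kind of extra information a "don't
aggregate duplicates" option would have to preserve.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up flag values; the real NOTE_* constants live in <sys/event.h>. */
#define MODEL_NOTE_WRITE  0x1u
#define MODEL_NOTE_ATTRIB 0x2u

/* One watched object in this toy model. */
struct model_note {
    unsigned pending; /* OR of all flags posted since the last read */
    unsigned hits;    /* hypothetical: number of posts since the last read */
};

/* Called when something happens to the watched object. */
void model_post(struct model_note *n, unsigned flags)
{
    assert(n != NULL);
    n->pending |= flags; /* a second identical event adds no new bits */
    n->hits++;
}

/* Called by the reader. Returns the aggregated flags and resets the
 * note; if hits_out is non-NULL it receives the number of posts. */
unsigned model_read(struct model_note *n, unsigned *hits_out)
{
    unsigned flags;

    assert(n != NULL);
    flags = n->pending;
    if (hits_out != NULL)
        *hits_out = n->hits;
    n->pending = 0;
    n->hits = 0;
    return flags;
}
```

Posting MODEL_NOTE_WRITE twice and then reading once yields a single
MODEL_NOTE_WRITE, which is all a folder window needs; only the counter
reveals that there were two writes, which is what a read-frequency
tracker would want.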

> > * Unlike imon and dnotify, kevent doesn't provide any kind of callback
> > mechanism; instead, you have to poll the queue for events. Would it be
> > useful to specify another flag/parameter that would tell the kernel to
> > signal the monitoring process whenever an event is available? (It would
> > certainly make the fam code easier to write--but if it's not useful
> > anywhere else, that's probably not enough.)
> 
> You can SIGPOLL on the event descriptor returned by kqueue().  You
> can use it in a select() or poll() call.  You can pass it to another
> kqueue() as an EVFILT_READ event.
> 
> Sending signals ("callbacks") is probably the absolutely least
> efficient way of getting the notification back.  The presumption
> here (and it's likely a good one) is that, rather than polling,
> your application will be event driven, and get the events by
> blocking and waiting for them.

The point isn't (just) how user code should be written; it's how fam is
already written.

Converting fam's dnotify code to a signaling kevent would be a matter of
half an hour's work; writing code for a mechanism that doesn't work like
either dnotify or imon would be substantially less trivial.

However, adding signaling support to the kernel is also obviously
non-trivial. That's why, as I said before, if this behavior is not
useful anywhere else, that's probably not enough reason to add it.

> > * The kevent vnode stuff apparently only works on UFS. And it looks like
> > it would be a major project to port it to other filesystems.
> 
> It's actually pretty trivial, so long as you know the FS you are
> porting it to.  The notifications for many things should probably
> be migrated to the VFS layer, instead (see Darwin's implementation).

That does make sense; I'll look into it.

> > Would this be useful for anything other than improving fam?
> 
> Sure; it would be as useful for other FS's as it's useful in UFS
> now.  The primary utility (IMO) is for GUI interfaces which want
> to update their displays in as close to real time as possible.

Except that lots of existing file managers and other GUIs already use
fam, and I don't think many use kevent. (Of course if kevent were ported
to Linux, that would probably change....) So, improving kevent for the
benefit of file managers may not be particularly useful.

> > What about a port of
> > the imon kernel interface (and/or the dnotify fcntl) to FreeBSD instead?
> 
> The way Linux people feel about "edge vs. level triggered" kqueue
> is about the same way FreeBSD people feel about "dnotify"... but
> there's no obvious way to fix the complaints about "dnotify".

The complaints being the signaling mechanism that it uses? Or is there
something else wrong with it?

> imon is pretty useless; if it would be implemented at all, the way
> to do it is in terms of kevents.  Either way, you are resolving the
> kqueue "issue", so you might as well use kqueue.

Well, for imon to be implemented in terms of kevents would require the
changes I mentioned above. If people were unwilling to accept
NOTE_ACCESS, EV_CONTAINED, etc., but willing to accept imon, then you
wouldn't implement imon in terms of kqueue. If they were willing to
accept those changes, then you're right, there'd be little or no reason
to implement imon.

> > * The kqueue doesn't appear to have any maximum size. If this is true,
> > the dnotify/fam problem where you get hideous errors from overflowing
> > queues wouldn't be an issue, but you could instead end up wasting
> > massive amounts of memory in the kernel if you didn't get around to
> > reading the queue.... Which is it?
> 
> This is not strictly true; with the non-contract kqueue interface, it
> will aggregate the notes on a per object basis, so it's very hard to
> overflow (you'd have to monitor more things than you have memory
> available to monitor).

OK, so you won't get the overflow issue that fam's dnotify patch had to
deal with. Which is good.

> > Any answers, or pointers to where I can find these answers, would be
> > greatly appreciated.
> 
> I don't pretend that my answers are authoritative; however, having
> done what you want to do for two commercial companies now, I can
> tell you the approach I describe (using a contract between the
> kernel and user space, and separating the parameter from the flag
> bits) will work.  8-).
>
> In fact, if you check the -current archives from about two years
> ago in June or July or so, you will see patches that perform this
> separation, and which add support for System V IPC message queues
> sending events, thereby allowing them to be selected/polled/etc.
> via their kqueue descriptor.

This brings up another question: When I try to scan the archives, they
only go back about 8 months. Where are the old archives?

Actually, it brings up a second question: Presumably these patches were
suggested and rejected; in other words, the FreeBSD community didn't
want this separation. Therefore, it might be worth pitching a different
change even if it doesn't sound as good....

(Maybe FreeBSD developers aren't supposed to have ideas like this one,
or the one about imon above? If so, I apologize; too many years in
linuxland and/or corporate development land have warped my fragile
little mind....)



