Date: Sun, 11 Nov 2007 11:37:17 +0100
From: Alexander Leidinger <Alexander@Leidinger.net>
To: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
Message-ID: <20071111113717.4803b3ab@deskjail>
In-Reply-To: <3274.1194769611@critter.freebsd.dk>
References: <20071111091801.761ba5c5@deskjail>
	<3274.1194769611@critter.freebsd.dk>
Cc: rwatson@freebsd.org, cnst@freebsd.org, imp@freebsd.org, arch@freebsd.org
Subject: Re: sensors framework continued (architecture)

Quoting "Poul-Henning Kamp" <phk@phk.freebsd.dk> (Sun, 11 Nov 2007 08:26:51 +0000):

> In message <20071111091801.761ba5c5@deskjail>, Alexander Leidinger writes:
> >Quoting "Poul-Henning Kamp" <phk@phk.freebsd.dk> (Sat, 10 Nov 2007 22:54:41 +0000):
> >
> 
> >> But you forget that sensors may have considerable "conversion" time.
> >> It is a benefit for us to be able to start the sensor and not block
> >> on the syscall waiting for it to do its thing.
> >
> >How do you know from the approach you propose when to read out the
> >newly polled data without blocking in some syscall? 
> 
> You use the select(2), poll(2) or kqueue(2) systemcall to wait until
> one of the several fd's the sensord(8) process will have to service
> becomes ready.

This sounds like you propose more than one kernel access point for all
sensors. Maybe something like /dev/sensors/senX instead of
the /dev/sensor_interface I initially had in mind? What about the
hierarchic aspect (/dev/sensors/hierlev1/.../hierlevX/senY, ... this is
where I came up with my filesystem comment in the previous mail)? And
then you open all those fd's the sensord process wants to service and
issue a select. And to get the sensor data you issue either one ioctl
for each sensor in the beginning to let the kernel return the data of
each sensor periodically (polling code in the kernel for most sensors
(the simple sensors)), or you issue a "poll now" ioctl each time you
want the data and wait for the return of the select/poll/kqueue. So in
the end you do a blocking wait, with slow sensors coming back later
than faster ones, and in the monitoring software those get attributed
to about the same time slot (if they are all polled at the same time).
In both cases you need to read the data with another syscall. And you
repeat this until all sensors of one poll run delivered the data (this
is true even for one fd instead of multiple ones).
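
To make the "poll now" flow concrete, here is a rough userland sketch.
It is only an illustration under assumptions: no such sensor devices
exist, so each sensor is simulated by a pipe whose write end is fed by
a thread after a "conversion" delay, and the names start_fake_sensor
and poll_run are made up for this mail.

```python
import os
import select
import threading
import time


def start_fake_sensor(conversion_time, value):
    """Hypothetical sensor: stands in for a "poll now" request.

    Returns the read end of a pipe that becomes readable once the
    simulated conversion has finished.
    """
    rfd, wfd = os.pipe()

    def convert():
        time.sleep(conversion_time)   # simulated conversion time
        os.write(wfd, value)
        os.close(wfd)

    threading.Thread(target=convert, daemon=True).start()
    return rfd


def poll_run(sensors):
    """Run one poll cycle over all sensors.

    sensors maps a name to (conversion_time, value_bytes).
    Returns a list of (name, value) in the order the data arrived.
    """
    poller = select.poll()
    pending = {}
    for name, (delay, value) in sensors.items():
        fd = start_fake_sensor(delay, value)   # one "poll now" per sensor
        poller.register(fd, select.POLLIN)
        pending[fd] = name

    results = []
    while pending:
        # Blocking wait until at least one sensor has data.
        for fd, _events in poller.poll():
            data = os.read(fd, 64)             # separate read per sensor
            results.append((pending.pop(fd), data))
            poller.unregister(fd)
            os.close(fd)
    return results
```

Calling poll_run({"slow": (0.3, b"42"), "fast": (0.05, b"17")}) reports
the fast sensor first and the slow one later, but the caller still
blocks until the slowest sensor of the run has delivered.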

As I don't like the generic poll logic for simple sensors (used in the
majority of use cases) in the kernel, let's look at the "poll
now"-case: 1 syscall for the "poll now" for each sensor (N calls), 1-N
syscalls for waiting, and 1 syscall to read for each sensor (for only
one fd for all sensors, this reduces to 1 call in the best case and N
calls in the worst case). So we have between N+2 and 3N calls.
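
The counts above can be checked with a few lines of arithmetic. This is
just a sketch of the accounting in this mail, not a measurement; the
function names are hypothetical.

```python
def fd_syscalls(n, single_fd):
    """Return (best, worst) syscall counts for one "poll now" run.

    n is the number of sensors. The accounting follows the text:
    n "poll now" calls, 1 to n waits, and either n reads (one fd per
    sensor) or 1 to n reads (a single fd for all sensors).
    """
    trigger = n
    wait_best, wait_worst = 1, n
    if single_fd:
        read_best, read_worst = 1, n
    else:
        read_best = read_worst = n
    best = trigger + wait_best + read_best
    worst = trigger + wait_worst + read_worst
    return best, worst


def sysctl_syscalls(n):
    """The simple sysctl approach: one call per sensor."""
    return n
```

For 8 sensors, fd_syscalls(8, single_fd=True) gives (10, 24), which is
N+2 and 3N, while sysctl_syscalls(8) gives 8.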

The simple sysctl approach has N calls. Done in a naive way, this takes
longer than the fd approach (fast sensors can not overtake slow ones).

Again: when does it hurt that it takes longer?

For sysctls you go directly to a sensor (benefit of the hierarchic
property), for the single fd approach you need additional code in the
kernel to dispatch to the sensor, and in the multiple fd case you
need to write some kind of filesystem logic to get the hierarchical
benefit. Both fd approaches (which have to be written, additionally to
the kernel API for hooking up sensors) are more complex than using the
existing sysctl logic (which can just be used in the kernel API for
hooking up sensors).

And all this to have the maybe 1% of sensors which are smart handled
in the kernel. This assumes that those sensors need to be handled in
the kernel at all: that they are some kind of device in the PC which
needs a kernel driver, and not an external device which can be
accessed over a serial line or by some other userland driven code, and
which should therefore be handled by the userland part of the
single-system sensors framework and not by the kernel sensors
framework.

> Remember the userland access API ?  That will need to be serviced
> via some kind of interface, most likely a unix domain socket (although
> a shared memory based interface might also work).

Why? We want a userland library to access it, so all tools which query
a sensor need to use this. This library can access the interface
directly (for the fd based approach this means concurrent access needs
to work, for the sysctl based approach with caching it works already).
The admin is supposed to make sure that not too many utilities are run
at the same time (how many is too many differs from system to system,
and typically you only have 1-3 tools running).

> >How do you
> >quantify "a problem"? Can this problem be circumvented by userland code
> >instead (maybe some tunable amount of threads or some other way)?
> 
> Anybody who proposed "a tunable amount of threads" a solution where
> poll(2)/select(2)/kqueue(2) would do just fine, doesn't know what
> he is talking about.

Threads are a userland hammer. They are less complex to get right than
a hierarchic fd approach in the kernel. And as already asked (that's
the important point of the paragraph, which you haven't replied to):
when does the time to get the sensor data become a problem, so that we
would have to look at mitigation solutions? You need to add _a lot_ of
_slow_ sensors to the _kernel_ in a _single_ system before this becomes
a problem.
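
For illustration, the "tunable amount of threads" idea could look
roughly like the sketch below. It is an assumption-laden example: the
blocking sensor read is simulated with a sleep (think of it as one
blocking sysctl call per sensor), and read_sensor and read_all are
hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_sensor(name, delay, value):
    """Hypothetical stand-in for one blocking per-sensor read."""
    time.sleep(delay)   # simulated conversion time of the sensor
    return name, value


def read_all(sensors, max_threads):
    """Read all sensors with at most max_threads blocking reads in flight.

    sensors maps a name to (delay_seconds, value).
    max_threads is the tunable; 1 gives the naive serial behaviour.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [
            pool.submit(read_sensor, name, delay, value)
            for name, (delay, value) in sensors.items()
        ]
        return dict(f.result() for f in futures)
```

With max_threads equal to the number of slow sensors, a run takes about
as long as the slowest sensor instead of the sum of all of them, and no
new kernel code is involved.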

> >I don't say your proposal is bad, currently I still think you are in
> >the premature optimization territory [...]
> 
> No, this is purely architectural, it has nothing to do with "premature
> optimization".

You propose to write more code with more complex logic to get faster
to the sensor data. This is contrary to KISS and looks very much like
premature optimization. It doesn't matter if it is micro-optimization
in an implementation or macro-optimization on the architecture level.
Fact is that you propose a more complex handling in the kernel of what
can be done more simply in userland. Feel free to come up with hard
facts to the contrary.

> >> You are reading waaay too much into Roberts email and overlooking
> >> that a fd based kernel interface can also be MIB based.
> >
> >I don't overlook the MIB based part. I see that we get the MIB part via
> >sysctls for free, and that we have to write a filesystem with all the
> >bells and whistles VFS needs to work, to get this MIB based part via
> >the fd approach.
> 
> Ahh, here we have the misunderstanding!
> 
> Nobody is proposing writing a filesystem, which would be a terribly
> stupid way to do it.

See above.

> To see why a filesystem isn't needed, please study devd(8)'s
> kernel interface.
> 
> We can continue when you have done so.

I've done this. Passing strings down a fd from the kernel is no magic.
It's good for the kernel->userland part, but not for the
userland->kernel querying of only a subset of the sensors. See the
comment above about dispatching in the kernel for the one fd based
approach; using the existing sysctl API is less complex and less work.
Also see the comment above about the 1% of smart sensors, which IMO
should most of the time be handled by the userland part of the
single-system sensors framework and not by the kernel.

Bye,
Alexander.

-- 
I'm in direct contact with many advanced fun CONCEPTS.
http://www.Leidinger.net  Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org     netchild @ FreeBSD.org  : PGP ID = 72077137