From owner-freebsd-arch@FreeBSD.ORG Tue Nov 13 20:51:40 2007
Date: Tue, 13 Nov 2007 21:51:17 +0100
From: Alexander Leidinger <alexander@leidinger.net>
To: "Poul-Henning Kamp"
Cc: rwatson@freebsd.org, cnst@freebsd.org, imp@freebsd.org, arch@freebsd.org
Subject: Re: sensors framework continued (architecture)
Message-ID: <20071113215117.3b9bab8c@deskjail>
In-Reply-To: <3720.1194780644@critter.freebsd.dk>
References: <20071111113717.4803b3ab@deskjail> <3720.1194780644@critter.freebsd.dk>
X-Mailer: Claws Mail 3.0.1 (GTK+ 2.10.14; i686-portbld-freebsd7.0)
List-Id: Discussion related to FreeBSD architecture
Quoting "Poul-Henning Kamp" (Sun, 11 Nov 2007 11:30:44 +0000):

> In message <20071111113717.4803b3ab@deskjail>, Alexander Leidinger writes:
>
> >> You use the select(2), poll(2) or kqueue(2) system call to wait until
> >> one of the several fd's the sensord(8) process will have to service
> >> becomes ready.
> >
> >This sounds like you propose more than one kernel access point for all
> >sensors. Maybe something like /dev/sensors/senX instead of
> >the /dev/sensor_interface I thought initially?
>
> One device node is plenty: /dev/sensors
>
> >the /dev/sensor_interface I thought initially? What about the
> >hierarchic aspect (/dev/sensors/hierlev1/.../hierlevX/senY, ... this is
> >where I came up with my filesystem comment in the previous mail)?
>
> There is no need to waste device nodes and vnodes on that, it can
> be more efficiently encoded inband, just like devd does.
>
> >(the simple sensors)), or you issue a "poll now" ioctl each time you
> >want the data and wait for the return of the select/poll/kqueue. So in
> >the end you do a blocking wait, with slow sensors coming back later
> >than faster ones, and in the monitoring software those get attributed
> >to about the same time slot (if they are all polled at the same time).
>
> That would only happen if you implement things in a truly pointless
> way.

An active monitoring program (one which starts probes on its own, instead of waiting for probes to connect to it and deliver data) starts a probe and then polls for the data. If it starts its probes in parallel, the returned data carries about the same timestamp. If it doesn't probe in parallel, the probes differ by seconds or even minutes, but all of them still carry a timestamp within the current round of probing, which can be described as one current timestamp (the polling round).
For a time-guy like you, the timestamp description above is very inaccurate; for a normal operator who has to monitor the monitoring program (or a manager or the boss of the just-mentioned operator), it fits. The scenario I was talking about (99% of the use cases) doesn't require precision to the second. For the remaining 1% you should use special software and not complicate the normal case for the other 99%.

> >As I don't like the generic poll logic for simple sensors (used in the
> >majority of use cases) in the kernel, let's look at the "poll
> >now"-case: 1 syscall for the "poll now" for each sensor (N calls),
>
> Why couldn't you tell multiple sensors to poll in one syscall ?

You can do that in the one-fd-for-all-sensors case. I forgot to write about this case at this place, sorry.

> >1-N
> >syscalls for waiting, and 1 syscall to read for each sensor (for only
> >one fd for all sensors,
>
> And read all the results in one read(2) operation, if they are ready ?

This is what you can do if all the data is ready at the same time. But what's the point of doing a select/poll/kevent if you wait in the kernel until all data is ready before returning anything? You can do a blocking read for that. See also my next sentence below.

> You sound like an IBM mainframe-guy: "There shall be one record per
> syscall only!" :-)

I was talking about the one-fd-per-sensor part here, and you snipped the part where I wrote about the one-fd-for-all-sensors part.

> You could, best case, poll _all_ the sensors in two syscalls.

Great... in the best case I can poll all the sensors in one syscall with the sysctl approach (nothing prevents me from writing a sysctl_xxx which returns the data from a subtree). It's just that the best case will not happen often for sensors: when you want to measure the latency of a probe, you don't probe all sensors at once and wait for all the data to return at the same time.
And please, don't tell me to do the latency measurement in the kernel; that would contradict your requirement to do as little work as possible in the kernel (which is already stretched too much by your suggestion to let the kernel poll the simple sensors at a configurable time interval). If you get this info back from a smart sensor, fine, you can have (MIB notation) x.y.smart.data.value and x.y.smart.data.latency as separate sensors which you can correlate. But I still think a smart sensor is better attached to the userland sensor part than to the kernel sensor part (where to draw the line between having it in the kernel or not is up to the person writing the access code).

BTW: you still haven't answered my question about examples of real-world sensors which are smart.

> >The simple sysctl approach has N calls.
>
> Which is a terrible waste of syscalls in my mind.

With just one syscall (see above for the sysctl approach) you can not do latency measurement in userland. And latency measurement of simple sensors doesn't belong in the kernel.

> >Again: when does it hurt that it takes longer?
> >
> >For sysctls you go directly to a sensor (benefit of the hierarchic
> >property), for the single fd approach you need additional code in the
> >kernel to dispatch to the sensor,
>
> You mean, code that isn't hampered by the sysctl semantics and which
> can do so in a very efficient way ? Yes, that would be a great
> thing indeed.

I'm still waiting for hard data where you show that the sysctl semantics hurt, that they are not efficient enough, and that the more complex development for the fd approach is necessary. You are good at skipping the questions you don't want to / can't answer. I think a lot of the questions you skipped would show that a sysctl approach instead of writing fd handling code is sufficient to cover 99% of the cases.

> >and in the multiple fd case you
> >need to write some kind of filesystem logic to get the hierarchical
> >benefit.
>
> Only you talk about one devicenode per sensor, please forget that
> red herring.

I wrote about both cases, a single fd and multiple fd's. Now that you have clarified that you talk about one fd, see my comment regarding additional code for the single-fd case in my last mail.

> >> Remember the userland access API ? That will need to be serviced
> >> via some kind of interface, most likely a unix domain socket (although
> >> a shared memory based interface might also work).
> >
> >Why? We want a userland library to access it, so all tools which query
> >a sensor need to use this. This library can access the interface
> >directly [...]
>
> No, then you clearly have not understood what people told you, the
> diagram looks like this:
>
>     N * accessing application
>       |
>       |
>     N * sensor-library
>       |
>       |
>     1 * sensor daemon ---- N * sensor-library - N * userland sensors
>       |
>       |
>     N * kernel sensors

This is what you understood (feel free to explain why you need N sensor libraries; one is enough). The description allows another interpretation:

    N * userland applications (a sensorsd, systat, ...)
              |
      1 * sensors library
        |            |
    N * kernel    N * userland
      sensors       sensors

It also allows this interpretation:

    single-system sensors framework (see note 1)
         |                        |
    1 * kernel sensors     1 * userland sensors
         library                library
         |                        |
    N * kernel sensors     N * userland sensors

Note 1: this can be another lib, it can be one daemon, it can be N applications (whether that makes sense or not). We didn't talk about this part in enough detail to say "the diagram looks like this". What we agree upon is that we want a userland lib to abstract the kernel interface away from an application programmer. This means that programs which want to show data from kernel sensors need to use this lib. You can not depend upon there always being a sensor daemon running. If you are in single user mode and need the data of a sensor, you should be able to get it even without a sensor daemon running.
Whether we extend the kernel sensors lib so that it also understands userland sensors was not discussed at all. Having no lib between sensord and the kernel in your drawing lets me think you haven't understood what people were talking about.

> >You propose to write more code with more complex logic to get faster
> >to the sensor data.
>
> No, I propose to solve the problem, rather than hack up bad code
> using bad interfaces for 20% of the problem.

I asked multiple times that you provide technical facts for the "bad interface" part. So far you have only provided suggestions for changes which are beneficial for an insignificant number of cases. Those changes unnecessarily complicate the code for 99% of the cases.

> >I've done this. Passing strings down a fd from the kernel is no magic.
> >It's good for the kernel->userland part, but not for the
> >userland->kernel querying of only a subset of the sensors.
>
> Here is a straw-man API for the kernel<->userland device:
>
> Kernel sends
>     "S 32 acpi.cpu.0.temperature bla bla bla\n"
>
> This means: I have a sensor which I know as number 32, and it tells me
> it has these properties.
>
> Userland does an ioctl:
>
>     SENSOR_POLL(32)
>
> Kernel sends, when the data is ready,
>     "D 32 34.45\n"
>
> There you are, can it be any simpler ?

We have this already. It's called sysctl. Ok, the syntax is a little bit different, but the syntax you provided is just an example and not a final decision.

> Amongst the points you totally overlook, is the fact that the sensors
> don't need to be a hierarchy in the kernel, as long as they tell
> sensord about their placement in the hierarchy.

How does a sensor know about its placement in the hierarchy if this data is not in the kernel? It has to be in the kernel. And either the sensor needs to know its parent if you want it to return its placement in the hierarchy, or the parent needs to tell the child that it belongs to it.
And it sounds like you want to write some additional code to do this. By using sysctl to access the sensor, you get this for free.

> In fact, if for no other reason, the tremendous overhead for the
> hierarchy in sysctl is reason not to use it.

How much is this tremendous overhead of sysctl, and when does it start to be a bottleneck? As I have already asked those questions without getting an answer from you, I don't expect to get an answer now. As long as we don't get answers, you are talking about premature optimization (yes, you told me already that you don't think you are talking about premature optimization).

Warner, John, Robert, others: either I don't understand Poul's arguments why the fd approach is better / why all these things which can be done in userland need to be done in the kernel / ..., or he doesn't understand my arguments why the fd approach is not better / why those things he proposes to do in the kernel can be done in userland. Could someone please help out and explain either to him or to me the parts we fail to understand? That would be very nice, as it looks like we're running around in circles ATM.

Bye,
Alexander.

--
Only a mediocre person is always at their best.

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org   : PGP ID = 72077137