From: Terry Lambert
Message-Id: <199709071407.HAA07835@usr09.primenet.com>
Subject: Re: IOCTL Commands - Where is my mistake?
To: joerg_wunsch@uriah.heep.sax.de
Date: Sun, 7 Sep 1997 14:07:54 +0000 (GMT)
Cc: freebsd-hackers@FreeBSD.ORG
In-Reply-To: <19970907110903.WE07508@uriah.heep.sax.de> from "J Wunsch" at Sep 7, 97 11:09:03 am

> > In SystemV, it would not have been luck, it would have been the way it
> > should be.  One could argue that BSD's way of encoding three separate
> > arguments into one is not exactly a mark of engineering elegance.
>
> Well, it offers two advantages:
>
> . It's failsafe.  Change the size of the structure, and it will make
>   it a different ioctl command.  You can still support the old one
>   if you want, if your kernel driver declares the old struct as
>   `ofoo_ioctl_t'.  Otherwise, an application will simply get an ENOTTY,
>   as opposed to trashing arbitrary data in the kernel in the assumption
>   the ioctl would be called from a matching userland program.

In fact, I use exactly this property to transparently include system id
and remote pid information in the NFS locking code.  The reason it works
is that the old fcntl() values don't transport the information, but the
new fcntl() values (F_R...) do.
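
The "failsafe" point quoted above, where the argument size is part of the
command word, can be sketched in ordinary user-space C.  This is only an
illustration under stated assumptions: the DEMO_* macros, the bit layout,
the two structures and foo_ioctl_dispatch() are all hypothetical names
invented here, loosely modeled on the BSD scheme rather than copied from
any kernel header.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical command layout: direction bit, argument size, group
 * character and command number packed into one 32-bit word. */
#define DEMO_DIR_IN    0x80000000u      /* argument is copied into the kernel */
#define DEMO_SIZE_MASK 0x1fffu
#define DEMO_IOC(dir, group, num, len)                              \
    ((uint32_t)(dir) | (((uint32_t)(len) & DEMO_SIZE_MASK) << 16) | \
     ((uint32_t)(unsigned char)(group) << 8) |                      \
     (uint32_t)(unsigned char)(num))
#define DEMO_IOW(group, num, type) \
    DEMO_IOC(DEMO_DIR_IN, group, num, sizeof(type))
#define DEMO_SIZE(cmd) (((cmd) >> 16) & DEMO_SIZE_MASK)

/* Hypothetical argument structures: the old one is a prefix of the new. */
struct ofoo_ioctl { int speed; };
struct foo_ioctl  { int speed; int flags; };

/* Same group and number, different sizes, hence different commands. */
#define OFOOSET DEMO_IOW('f', 1, struct ofoo_ioctl)
#define FOOSET  DEMO_IOW('f', 1, struct foo_ioctl)

/* Driver-side dispatch: returns 0 on success, ENOTTY for any command
 * (including any unexpected structure size) it does not recognize. */
int foo_ioctl_dispatch(uint32_t cmd, const void *arg, struct foo_ioctl *state)
{
    switch (cmd) {
    case OFOOSET: {                 /* old binaries: short structure */
        const struct ofoo_ioctl *o = arg;
        state->speed = o->speed;
        state->flags = 0;           /* field the old ABI never carried */
        return 0;
    }
    case FOOSET: {                  /* new binaries: full structure */
        const struct foo_ioctl *n = arg;
        *state = *n;
        return 0;
    }
    default:
        return ENOTTY;
    }
}
```

A caller built against the old structure passes OFOOSET and is still
served; a caller using a structure size the driver has never heard of
produces a command value that matches no case and gets ENOTTY back.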
So in the kernel, I can choose to pull in only the old structure, which
is a subset of the new structure, any time I'm not decoding a new call.
This maintains binary compatibility with old applications without needing
to recompile them for the larger structure size.  You would otherwise have
to recompile, since the data in user space being copied to kernel space
may butt up against an unmapped region, and attempting to copy in more
than the old structure's size could cause the program to segfault.

> . It concentrates the copyin/copyout at a single place, including all
>   the EFAULT handling etc (that older SysV's IMHO didn't even provide
>   for).  When i first saw the BSD approach, i immediately thought:
>   ``Hey, why hasn't it been this way all the time?''  The SysV approach
>   where each driver does a boring copyin/copyout plain sucks. :)
>   (...and is more prone to kernel programmer errors)

There are other issues here as well.  In a kernel threaded or kernel
preemptive environment (realtime or SMP, etc.), you can easily get
screwed.  Putting the copies in up front and out at the end means the
intermediate code is no longer dependent on maintaining the page
mappings for the user process.

This would be an especially serious issue for an async call gate, which
is critical to cooperative scheduling of user space threads on kernel
threads, to ensure that you don't give away quantum as frequently.  This
is actually a necessity, since without a CPU affinity model in the
scheduler, a kernel thread (a normal process is a user thread bound to a
single kernel thread) may be run on any CPU... after all, the CPUs are
symmetric.  Without this, you will end up migrating processes
unnecessarily.  This destroys the value of your L1 cache and your
instruction pipelines, and would have a big negative impact on overall
performance and, in the end, on the number of CPUs you can add before
hitting diminishing returns.
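
The "copy in up front, copy out at the end" arrangement discussed above
can be sketched as a user-space simulation.  Everything named demo_* or
DEMO_* is hypothetical: demo_copyin() and demo_copyout() merely stand in
for the kernel's copy routines, a NULL pointer is used to model a faulting
user address, and the command layout is an assumed one, not a real header.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed command layout: direction bits plus argument size. */
#define DEMO_DIR_IN    0x80000000u      /* copy user -> kernel before the call */
#define DEMO_DIR_OUT   0x40000000u      /* copy kernel -> user after the call  */
#define DEMO_SIZE(cmd) (((cmd) >> 16) & 0x1fffu)
#define DEMO_MAXARG    128u

/* Stand-ins for the kernel copy routines; NULL models a bad user address. */
static int demo_copyin(const void *uaddr, void *kaddr, size_t len)
{
    if (uaddr == NULL)
        return EFAULT;
    memcpy(kaddr, uaddr, len);
    return 0;
}

static int demo_copyout(const void *kaddr, void *uaddr, size_t len)
{
    if (uaddr == NULL)
        return EFAULT;
    memcpy(uaddr, kaddr, len);
    return 0;
}

/* A driver handler only ever sees the kernel-side copy of the argument. */
typedef int (*demo_handler_t)(uint32_t cmd, void *kdata);

/* Generic entry point: the only place that touches user memory. */
int demo_ioctl(uint32_t cmd, void *uarg, demo_handler_t handler)
{
    union { unsigned char bytes[DEMO_MAXARG]; long long align; } kbuf;
    size_t len = DEMO_SIZE(cmd);
    int error;

    if (len > sizeof(kbuf.bytes))
        return ENOTTY;
    if (cmd & DEMO_DIR_IN) {
        error = demo_copyin(uarg, kbuf.bytes, len);
        if (error != 0)
            return error;               /* EFAULT handled once, here */
    } else {
        memset(kbuf.bytes, 0, len);
    }
    error = handler(cmd, kbuf.bytes);   /* no user mappings needed in here */
    if (error == 0 && (cmd & DEMO_DIR_OUT))
        error = demo_copyout(kbuf.bytes, uarg, len);
    return error;
}

/* Example handler: doubles an int argument in place. */
int demo_double(uint32_t cmd, void *kdata)
{
    int v;
    (void)cmd;
    memcpy(&v, kdata, sizeof(v));
    v *= 2;
    memcpy(kdata, &v, sizeof(v));
    return 0;
}
```

Because the handler works only on the kernel-side buffer, nothing between
the two copies depends on the caller's page mappings staying put, which is
the property the paragraph above relies on.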
Actually, SVR4 and Solaris kernel threading have this migration problem
now, which is why you won't see an unmodified version of either running
on Sequent-type boxes (ie: 10's of processors).  Think of it as "not
being like SVR4"... most BSD people find that palatable enough that they
won't even adopt a good technology if it passed through SVR4 and was
thus impugned by association.  8-) 8-).

Linux had a big problem, in that it prevalidated source and target
ranges, especially on ioctl's that took arguments in a structure and
returned arguments in the same area.  This was a win, in that it saved a
validation, and increased concurrency (for some operations).  But
overall, it's a loss, since with kernel preemption coming on line, the
mapping may have changed between the time the call started and when it
completed.  I don't know if they still do this, or if they thrash the
page table on each wakeup, or if they are simply susceptible to race
condition based hacks at this time (I haven't looked lately).

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.