From: Terry Lambert
Date: Fri, 20 Sep 2002 13:51:47 -0700
To: Jon Mini
Cc: Daniel Eischen, Bill Huey, freebsd-arch@FreeBSD.ORG
Subject: Re: New Linux threading model

Jon Mini wrote:
> It's not scontext(2), but setcontext(2) -- of Solaris fame. Currently,
> we have {get,set,swap}context(3), but being in userland causes some
> interesting race conditions. Really, these functions need to be
> atomic from the process's perspective, and since they have to call
> sigprocmask(2) anyways, the best solution is to just move them into
> the kernel. This is how Solaris does it, among others.

Well, it's *a* solution, anyway. ;^).

The reason Solaris does it, though, is because it can't know about
the existent register frames when it comes to a push, and so it has
to put an explicit, rather than implicit, stall barrier in there to
make sure.
Otherwise, you would need to unwind the context switches in the
reverse order they were originally made. The Keppel paper has
details:

    http://citeseer.nj.nec.com/keppel91register.html
    Register Windows and User-Space Threads on the SPARC
    David Keppel
    Department of Computer Science and Engineering
    University of Washington

> Under KSE, we needn't consult the kernel for thread context swaps,
> because we can enter a critical section and avoid the race conditions
> endemic with setcontext(2). Also, we don't modify the process signal
> mask when we swap thread contexts, so we don't need to call
> sigprocmask(2).

Which kind of begs the question of why it needs to be there, or be
called, which is what I was saying. I think there are legitimate
reasons for having it, so it's not avoidable, like the Linux paper
implies, but I don't agree with the reasons the Linux paper gives for
why it's needed for N:M threads. Your argument here invalidates a
couple of theirs, from the paper, but not all of them.

> > I think the "the Linux scheduler is O(1), so we don't have to
> > care" argument is invalid; O(1) means that for N entries, there
> > need to be N elements traversed 1 time. What this really means
> > is that, as the number of schedulable entities goes up, the time
> > required goes up linearly, as opposed to going up exponentially;
> > or, better, to *not* going up in the first place.
>
> Terry? You must have misspoken here. O(N) is linear, O(1) is constant.

See Rik's posting; my N in this case is not the N in N:M, it's what
Rik's calling 'n'. I've upcased it to make it visually distinct in my
text; sorry if that confused things.

Over the set of all processes, it *is* a linear algorithm. Scheduling
the next thing to run is not as interesting as scheduling the thing
you are descheduling now so that it's run *again*. The distance that
needs consideration is the distance between the times that it's
scheduled.
If you think about this in the context of my microbenchmarking
comments, this should be more clear.

> > One exception is the use of
> > "futex" wakeup in order to improve thread joins: FreeBSD should
> > look closely at this.
>
> "Futexes" are not new. We had this at Be, but we called them Benaphores.

I didn't mean looking closely at it as a new technology, I meant
looking closely at it because the current FreeBSD recursion-able
mutex implementation is really too heavyweight for the problem at
hand. The "futex" (or "benaphore" or whatever) implementation differs
in that it has significantly lower overhead, with the cost being that
you can't just regrab a lock and expect it to be magically counted
up and down.

If you've ever programmed timer code in the Windows 95/98/NT/XP/2000
kernels, the timers basically run on whatever kernel thread is
available to run on, rather than a specific thread (kernel threads
only provide context). This basically means that you have to build
non-reentrant semaphores on top of the kernel services that are
already there, or you can grab a semaphore in a normal operation,
have a timer fire, and, even though it's technically a separate
context in theory, in practice you end up being allowed to grab a
semaphore that is already grabbed by the kernel context that the
timer is "borrowing" to run itself.

Matt Day ran into this with the soft updates syncer in our port of
the Heidemann stacking VFS code to Windows 95 (different soft updates
implementation than Kirk's code; it predates Kirk's work by a couple
of years).

The upshot is that things you think are protected aren't really
protected, under certain conditions that, while uncommon, are still
possible.

My personal preference is for the tradeoff that Linux made here,
where they ate the code refactoring overhead implied by failure to
permit recursive acquisition.

-- Terry
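
The recursive versus non-recursive tradeoff discussed above can be
illustrated with a minimal sketch in C11. Every name in it
(simple_lock, recursive_lock, the integer context ids) is hypothetical
and invented for illustration; none of it is taken from FreeBSD,
Linux, BeOS, or Windows, and it only tries the lock once rather than
sleeping the way a real futex or benaphore would. It shows only why a
lock that counts re-acquisition by "owner" lets code running on a
borrowed context walk into a region that is already held, while a
plain flag refuses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; not code from any real kernel. */

/* Non-recursive lock: a single atomic flag, no owner tracking. */
struct simple_lock {
    atomic_flag held;
};

/* One attempt; fails if the lock is held, even by the caller itself. */
bool simple_trylock(struct simple_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void simple_unlock(struct simple_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Recursive lock: the same flag plus an owner id and a depth count. */
struct recursive_lock {
    atomic_flag held;
    atomic_int owner;   /* context id of the holder; 0 means free */
    int depth;          /* only modified by the current owner */
};

/* `self` must be a nonzero id for the calling context. */
bool recursive_trylock(struct recursive_lock *l, int self)
{
    if (atomic_load(&l->owner) == self) {
        l->depth++;     /* re-acquisition is "magically counted up" */
        return true;
    }
    if (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        return false;   /* held by some other context */
    atomic_store(&l->owner, self);
    l->depth = 1;
    return true;
}

void recursive_unlock(struct recursive_lock *l, int self)
{
    assert(atomic_load(&l->owner) == self && l->depth > 0);
    if (--l->depth == 0) {
        atomic_store(&l->owner, 0);
        atomic_flag_clear_explicit(&l->held, memory_order_release);
    }
}
```

If a timer callback runs on borrowed context id 1 while normal code on
context 1 already holds a recursive_lock, recursive_trylock(&r, 1)
succeeds a second time and the callback enters the protected region.
With simple_lock the second simple_trylock() fails, so callers have to
be restructured never to re-enter, which is the refactoring cost
mentioned above.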