From owner-freebsd-hackers@FreeBSD.ORG  Wed Jan  7 23:53:47 2004
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7CB9816A4D0;
	Wed, 7 Jan 2004 23:53:47 -0800 (PST)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 11DAD43D2F;
	Wed, 7 Jan 2004 23:53:44 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	i087rgtV068200; Wed, 7 Jan 2004 23:53:42 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i087rdFj068197;
	Wed, 7 Jan 2004 23:53:39 -0800 (PST)
	(envelope-from dillon)
Date: Wed, 7 Jan 2004 23:53:39 -0800 (PST)
From: Matthew Dillon
Message-Id: <200401080753.i087rdFj068197@apollo.backplane.com>
To: Miguel Mendez
References: <200401062000.i06K0hSI012184@dyson.jdyson.com>
	<200401072317.i07NHaM9065411@apollo.backplane.com>
	<3FFD01CE.5070301@energyhq.es.eu.org>
cc: freebsd-chat@freebsd.org
cc: freebsd-hackers@freebsd.org
cc: Brett Glass
cc: dyson@iquest.net
cc: Munden Randall J
cc: jsd@jdyson.com
Subject: Re: Where is FreeBSD going?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
X-List-Received-Date: Thu, 08 Jan 2004 07:53:47 -0000

:>    See?  I didn't mention DragonFly even once!  Ooops, I didn't mention
:>    DFly twice.  oops!  Well, I didn't mention it more then twice anyway.
:
:Makes me wonder if some of the solutions proposed by DragonFly could be
:ported to FreeBSD, but I doubt it will be done, since it's more or less
:admitting that the current solution is wrong.
:
:Yes, I mentioned DragonFly (how dare he!). Feel free to flame, I've
:become extremely efficient at adding people to /etc/postfix/access :-P
:
:Cheers,
:--
:	Miguel Mendez

    I think the correct approach to thinking about these abstractions is to
look at the code design simplifications rather than just looking at
performance, and then decide whether FreeBSD would benefit from the type of
API simplification that these algorithms make possible.

    The best example of this that I have, and probably the *easiest*
subsystem to port to FreeBSD (John could probably do it in a day), would be
DFly's IPI messaging code.  I think it would even wind up being extremely
useful in a number of existing FreeBSD subsystems, such as the slab
allocator.

    I use the IPI messaging abstraction sort of like a 'remote procedure
call' interface... a way to execute a procedure on some other cpu rather
than the current cpu.  This abstraction allows me to execute operations on
data structures which are 'owned' by another cpu on that target cpu itself,
which means that instead of getting a mutex, operating on the data
structure, and releasing the mutex, I simply send an asynch (don't wait for
it to complete on the source cpu) IPI message to the target cpu.

    By running the particular function, such as a scheduling request, in the
target cpu's context, you suddenly find yourself in a situation where *NONE*
of the related scheduler functions, and there are over a dozen of them, need
to mess with mutexes.  Not one.  All they need to do to protect their turf
is enter a critical section for a short period of time.
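    Roughly, the pattern looks like this (a minimal sketch; the helper names
-- MAXCPU, mycpuid, crit_enter/crit_exit, send_ipi_async -- are made-up
stand-ins for the real primitives, not the literal DragonFly API):

/*
 * Minimal sketch only.  Each cpu owns its own run queue and nothing else
 * ever touches it directly; remote cpus forward requests as asynch IPIs.
 */
#define MAXCPU  16

struct thread {
    struct thread   *td_next;
    /* ... */
};

struct percpu_runq {
    struct thread   *head;      /* only the owning cpu touches this */
};

static struct percpu_runq runq[MAXCPU];

/* Assumed primitives (placeholders for whatever the kernel provides). */
extern int  mycpuid(void);
extern void crit_enter(void);   /* blocks local preemption only */
extern void crit_exit(void);
extern void send_ipi_async(int cpu, void (*func)(void *), void *arg);

/*
 * The handler always executes on the cpu that owns the run queue, so a
 * short critical section is the only protection it needs -- no mutex,
 * no mutex ownership hand-off, no BGL.
 */
static void
schedule_thread_ipi(void *arg)
{
    struct thread *td = arg;
    struct percpu_runq *rq = &runq[mycpuid()];

    crit_enter();
    td->td_next = rq->head;
    rq->head = td;
    crit_exit();
}

/* The caller may be running on any cpu. */
void
schedule_thread(struct thread *td, int target_cpu)
{
    if (target_cpu == mycpuid()) {
        schedule_thread_ipi(td);    /* already the owner; run it locally */
    } else {
        /* Asynchronous: fire and forget, nothing blocks on this cpu. */
        send_ipi_async(target_cpu, schedule_thread_ipi, td);
    }
}

    Because the only code that ever touches runq[N] runs on cpu N, a
critical section -- which merely prevents local preemption -- is all the
protection the handler needs.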
    The algorithm simplification is significant... you don't have to worry
about crossing a thread boundary, and you can remain in the critical section
through the actual switch code, which removes a huge number of special cases
from it.  You don't have to worry about mutexes blocking, you don't have to
worry about futzing the owner of any mutexes, you don't have to worry about
the BGL, you don't have to worry about stale caches between cpus, and the
code works equally well in a UP environment as it does in an SMP
environment... cache pollution is minimized... the list goes on and on.

    So looking at these abstractions just from a performance standpoint
misses some of the biggest reasons why you might want to use them.
Algorithmic simplification and maintainability are very important.
Performance is important, but not relevant if the resulting optimization
cannot be maintained.

    In any case, I use IPIs to do all sorts of things.  Not all have worked
out... my token passing code, which I tried to use as a replacement for
lockmgr interlocks, is pretty awful and I consider it a conceptual failure.
But our scheduler, slab allocator, and messaging code, and a number of other
mechanisms, benefit from huge simplifications through their use of IPI
messaging.  Imagine... the messaging code is able to implement its entire
API, including queueing and dequeueing messages on ports, without using a
single mutex and (for all intents and purposes) without lock-related
blocking.  The code is utterly simple yet works between cpus, between
mainline code and interrupts with preemption capabilities, and vice versa.
There are virtually no special cases.  Same with the slab code, except when
it needs to allocate a new zone from kernel_map, and same with the
scheduler.

					-Matt
					Matthew Dillon
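    The message port mechanism can be sketched the same way, using the same
made-up primitives as the sketch earlier in this mail (again, hypothetical
names and structures, not the actual DragonFly lwkt code): each port is
owned by a single cpu, every enqueue is either done directly by that cpu or
forwarded to it as an asynchronous IPI, and the dequeue side only ever runs
there, so plain pointer manipulation inside a critical section is enough.

struct msgport;

struct msg {
    struct msg      *m_next;
    struct msgport  *m_port;    /* destination port, set by the sender */
};

struct msgport {
    int              mp_cpuid;  /* owning cpu */
    struct msg      *mp_head;   /* only the owning cpu touches these   */
    struct msg     **mp_tailp;  /* must start out as &mp_head          */
};

/* Assumed primitives, as in the earlier sketch. */
extern int  mycpuid(void);
extern void crit_enter(void);
extern void crit_exit(void);
extern void send_ipi_async(int cpu, void (*func)(void *), void *arg);

/*
 * Always runs on the owning cpu: called directly by local senders, or out
 * of the IPI handler for remote senders.  The IPI delivery is assumed to
 * provide the necessary ordering of the sender's stores.
 */
static void
msgport_put_local(void *arg)
{
    struct msg *msg = arg;
    struct msgport *port = msg->m_port;

    crit_enter();
    msg->m_next = NULL;
    *port->mp_tailp = msg;      /* plain pointer ops, no lock */
    port->mp_tailp = &msg->m_next;
    crit_exit();
}

void
msgport_put(struct msgport *port, struct msg *msg)
{
    msg->m_port = port;
    if (port->mp_cpuid == mycpuid())
        msgport_put_local(msg);
    else
        send_ipi_async(port->mp_cpuid, msgport_put_local, msg);
}

/* Dequeue side: only ever called from the owning cpu. */
struct msg *
msgport_get(struct msgport *port)
{
    struct msg *msg;

    crit_enter();
    msg = port->mp_head;
    if (msg != NULL) {
        port->mp_head = msg->m_next;
        if (port->mp_head == NULL)
            port->mp_tailp = &port->mp_head;
    }
    crit_exit();
    return (msg);
}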