Date:      Mon, 1 Sep 2014 13:57:12 -0700
From:      Jordan Hubbard <jkh@mail.turbofuzz.com>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        Alfred Perlstein <alfred@freebsd.org>, Gleb Smirnoff <glebius@FreeBSD.org>, freebsd-arch@freebsd.org
Subject:   Re: script(2) [was: [CFT/review] new sendfile(2)]
Message-ID:  <A18F2D00-8B24-4886-BB0A-C50A88FBAFB2@mail.turbofuzz.com>
In-Reply-To: <2770.1409522711@critter.freebsd.dk>
References:  <20140529102054.GX50679@FreeBSD.org> <20140729232404.GF43962@funkthat.com> <20140831165022.GE7693@FreeBSD.org> <540382E2.3040004@freebsd.org> <2770.1409522711@critter.freebsd.dk>


On Aug 31, 2014, at 3:05 PM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:

> Can I inject an old idea whose time may finally have arrived ?
> [ ... ]
> Imagine we instead define a byte-code-engine which interprets a
> string of commands, sort of like the pcap filtering engine already
> does.  The corresponding syscall would be "follow_the_script(2)"

Having seen this pattern used for several kernel-related things in a few
of my former lives, I think this idea has a lot of merit, though I’d be
careful not to conceptualize it purely (or only) as an “engine for
off-loading work to in order to avoid the kernel/userland boundary cost”
since I think the concept has a much broader application than that.  It
can also obviously be used for match filters (for the packet capture
example already given) or security policies (firewalling, sandboxing)
that are in the kernel simply because that’s the most logical place to
put them, and that means that the “script” may be a full-on complex task
or a really short little script fragment (scriptlet?) which potentially
needs access to a lot more of the kernel than the file primitives.  If
it’s a firewall related task, obviously it wants to be able to interpose
itself into the networking path.  If it’s a sandbox rule script, it’s
going to need to be able to gate access to a wide variety of kernel
services (not unlike all the checks that phk added for jails).  Perhaps
that’s what phk meant and I’m just reading his original message too
narrowly.

That’s also why I think the rubber will most meet the road in figuring
out just how many “bytecode primitives” to surface, a far more
bike-sheddy topic than the actual higher-level description format,
though we also have plenty of empirical evidence to suggest that the MAC
hook mechanism in TrustedBSD already pretty much describes all of the
logical places to place the hooks and therefore also suggests what the
full enumeration of bytecode primitives might look like.  If TrustedBSD
added a hook point, consider creating a corresponding primitive which
can act on the corresponding subject/target at that point and boom,
there’s your trail of breadcrumbs to follow.
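
To make the "one primitive per hook point" idea concrete, here is a
minimal sketch of such an interpreter.  It is written in Python purely
for brevity (a kernel-side engine would be C), and everything in it is
an assumption made for illustration: the opcode set, the hook names
"socket_connect" and "vnode_open", and the default verdict are
hypothetical and do not correspond to real TrustedBSD MAC entry points.

```python
from enum import IntEnum


class Op(IntEnum):
    # Hypothetical opcodes, invented for this sketch.
    MATCH_HOOK = 1  # run the next instruction only if the hook name matches
    ALLOW = 2       # stop and permit the operation
    DENY = 3        # stop and refuse the operation


DEFAULT_VERDICT = "allow"  # assumption: fall through means "no opinion"


def run(program, hook):
    """Interpret a list of (opcode, operand) pairs for one hook invocation."""
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == Op.MATCH_HOOK:
            # On a mismatch, skip the single instruction this match guards.
            pc += 1 if arg == hook else 2
        elif op == Op.ALLOW:
            return "allow"
        elif op == Op.DENY:
            return "deny"
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return DEFAULT_VERDICT


# Hypothetical policy: refuse outbound connects, explicitly permit opens.
policy = [
    (Op.MATCH_HOOK, "socket_connect"), (Op.DENY, None),
    (Op.MATCH_HOOK, "vnode_open"), (Op.ALLOW, None),
]

print(run(policy, "socket_connect"))
print(run(policy, "vnode_open"))
```

The program counter only ever moves forward in this sketch, so every
program terminates; classic BPF imposes the same restriction by
permitting only forward jumps.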

I would also add a corresponding DFA engine for acting on paths, since I
think that’s a necessary sub-component of the bytecode engine.  Unix is
path oriented.  Allow all of the relevant primitives which act on files
to have a DFA for matching the ones it applies to and you’ve really got
something pretty powerful.
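
As a rough illustration of path matching with a DFA, the sketch below
builds a transition table that accepts any path beginning with a given
prefix and then walks it one character at a time.  It is in Python for
brevity only, the function names are hypothetical, and a prefix matcher
is the simplest possible case; a real engine would presumably compile
full regular expressions down to a table of the same shape.

```python
def compile_prefix(prefix):
    """Build a DFA accepting every string that starts with `prefix`.

    The DFA is a list of per-state transition dicts plus the number of
    the single accepting state.
    """
    table = [{ch: i + 1} for i, ch in enumerate(prefix)]
    accept = len(prefix)
    return table, accept


def dfa_match(dfa, path):
    """Walk the DFA over `path`, one character per step."""
    table, accept = dfa
    state = 0
    for ch in path:
        if state == accept:
            return True   # accepting state absorbs the rest of the input
        state = table[state].get(ch)
        if state is None:
            return False  # no transition for this character: dead state
    return state == accept


tmp_only = compile_prefix("/tmp/")
print(dfa_match(tmp_only, "/tmp/scratch.txt"))
print(dfa_match(tmp_only, "/etc/passwd"))
```

Matching cost is linear in the length of the path and independent of
how complicated the original pattern was, which is the property that
makes a precompiled table attractive inside a kernel.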

When we implemented application sandboxing in OS X and iOS, we chose to
use Scheme as the implementation language (see /usr/share/sandbox on any
OS X system for a good selection of examples) and a “sandbox compiler”
process to turn that (and the regex DFAs) into bytecode, but we could
have honestly chosen almost any scripting language so I really don’t
think this discussion needs to get too hung up on the selection of a
higher-level language.  You want Lua?  Sure.  Just make it a “rule” that
the kernel itself doesn’t have to know beans about Lua and some userland
agent or library will turn the Lua code into the appropriate bytecode,
and now you’ve got the ability to write your bytecode in Lua.  When Lua
is no longer in vogue and has been replaced by some other new hotness,
that library/agent can be written too without having to change a line of
kernel code.  Yay for proper abstraction layers and not stuffing
interpreters where they don’t belong anyway!
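
The split described above can be sketched as a small userland compiler
whose only output is a flat byte string.  This is a hedged
illustration, not the OS X sandbox format: the rule syntax, the opcode
and hook numbering, and the two-byte instruction encoding are all
invented here, and Python stands in for whatever front-end language is
in fashion.

```python
import struct

# Hypothetical encoding: one opcode byte followed by one hook-id byte.
OPCODES = {"allow": 1, "deny": 2}
HOOKS = {"vnode_open": 1, "socket_connect": 2}


def compile_rules(source):
    """Translate lines such as 'deny socket_connect' into bytecode."""
    out = bytearray()
    for lineno, line in enumerate(source.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        try:
            verb, hook = line.split()
            out += struct.pack("BB", OPCODES[verb], HOOKS[hook])
        except (ValueError, KeyError):
            raise SyntaxError(f"line {lineno}: cannot compile {line!r}") from None
    return bytes(out)


rules = """
deny socket_connect
allow vnode_open   # explicit, for documentation
"""
print(compile_rules(rules).hex())
```

Only the byte string would ever cross into the kernel, so replacing
this front end with a Lua or Scheme one changes nothing on the other
side of the boundary.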

That said, I’ll also point out that we already have a bytecode “engine”
in the kernel and a corresponding higher-level language which compiles
into it.  That language is called D and the bytecode interpreter is the
DTrace support code.  But all of you already knew that.  The fact that
Sun only chose to use it for instrumentation and debugging may be
coloring everyone’s thinking insofar as what its theoretical limits as a
more general purpose mechanism are, I don’t know.  We can only speculate
as to how much farther Sun might have taken it if they had survived as a
company (if each dtrace “worker” were a kernel thread, for example, they
could have added looping primitives and other features which assumed a
longer lifetime for given units of work).

It’s an interesting topic of discussion, that’s for sure.  I had a lot
of fun with the sandboxing stuff at Apple.  It would be interesting to
see where FreeBSD could go with an even more general purpose mechanism.

- Jordan



