Date:      Thu, 15 Jul 1999 09:04:19 -0600
From:      Nate Williams <nate@mt.sri.com>
To:        Robert Watson <robert+freebsd@cyrus.watson.org>
Cc:        Nate Williams <nate@mt.sri.com>, freebsd-security@FreeBSD.ORG
Subject:   Re: how to keep track of root users?
Message-ID:  <199907151504.JAA02317@mt.sri.com>
In-Reply-To: <Pine.BSF.3.96.990712042257.8908B-100000@fledge.watson.org>
References:  <199907091711.LAA07208@mt.sri.com> <Pine.BSF.3.96.990712042257.8908B-100000@fledge.watson.org>

[ I'm supposed to be on vacation ;( ]

> > My thinking is that we 'pre-allocate' an AUDIT_RECORD_FAILED record, and
> > use it to inform the system that a record was unable to be generated.
> > Therefore, you have an idea that something is missing, but you don't
> > slow down the system or cause deadlock.
> 
> Sounds great.  And the existence of such a record in the record stream
> would cause an appropriate IDS module to start flailing.

You got it.
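
A minimal user-space sketch of the pre-allocated failure record idea follows.
Only the name AUDIT_RECORD_FAILED comes from the thread; the struct layout,
audit_record_get(), audit_record_put() and the simulate_failure flag are
assumptions made up for illustration, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record types; only AUDIT_RECORD_FAILED is named in the thread. */
enum audit_type { AUDIT_RECORD_SYSCALL, AUDIT_RECORD_FAILED };

struct audit_record {
    enum audit_type type;
    int             syscall_num;
    int             in_use;
};

/* Reserved up front so that reporting a failure never has to allocate. */
static struct audit_record failed_record = { AUDIT_RECORD_FAILED, -1, 0 };

/*
 * Try to allocate a normal record.  If that fails, hand back the reserved
 * record instead of sleeping or blocking the caller.  'simulate_failure'
 * stands in for a real out-of-memory condition.
 */
struct audit_record *
audit_record_get(int syscall_num, int simulate_failure)
{
    struct audit_record *ar = simulate_failure ? NULL : malloc(sizeof(*ar));

    if (ar == NULL) {
        failed_record.syscall_num = syscall_num;
        failed_record.in_use = 1;
        return &failed_record;
    }
    ar->type = AUDIT_RECORD_SYSCALL;
    ar->syscall_num = syscall_num;
    ar->in_use = 1;
    return ar;
}

/* Release a record; the reserved one is only marked free, never freed. */
void
audit_record_put(struct audit_record *ar)
{
    assert(ar != NULL);
    if (ar == &failed_record) {
        ar->in_use = 0;
        return;
    }
    free(ar);
}
```

A single static record is not safe with several CPUs generating records in
parallel; a real implementation would presumably need one per CPU or a lock
around it.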

> > Ahh, I understand now.  You are worried about one or the other of
> > namei/audit copyin being redundant.  I misunderstood both you and
> > Garrett.  Would it be possible to copy the string from the namei buffer,
> > thus avoiding the issue of modifying namei?
> 
> I haven't looked closely at the namei implementation -- at some point the
> entire string is dumped into a KTRACE record by namei()

Assuming we use KTRACE, which I thought we decided wasn't up to the task
for many reasons.  Given that it's not up to it, let's not rely on it
for anything.

> I assume, and
> that might be an appropriate place to deal with this.  I don't have
> bundles of source code in front of me, but I seem to recall that namei()
> acts on a name lookup descriptor of some kind, perhaps allocated on the
> stack for the process?

It acts on a vnode, but the information may be gotten out of it.  Then
again, it may not be gotten out of it. :(

> > > I did something like this to add speculative process execution to
> > > FreeBSD/i386 a few months ago (that is, generating disk prefetch hints
> > > based on speculatively executing a sandboxed process copy), and it proved
> > > quite straightforward.  However, I believe the architecture-dependent code
> > > is what sits directly below the syscall code: we should perhaps insert
> > > another architecture-independent layer that wraps the syscall, where
> > > things like this can be placed.
> > 
> > However, in the 'generic' code, it may not be obvious why the error
> > occurred, and this makes it more difficult to generate an audit record
> > 'atomically' since the creation of the record happens in a completely
> > different code-base from the 'end' of the record.  We'd need to design
> > some sort of event model in the audit record generation code, as well as
> > pass in information in each sub-record to identify which record the
> > sub-record belongs to.
> 
> My thought was to arrange the code as follows.  Currently, the FreeBSD
> kernel code does something like the following:
> 
>                    kern_exec.c:execve()
>                    +------------------+
> archspecific:trap.c|                  |
> -------------------+                  +---------
> 
> Again, without source in front of me I'm not sure I have this exactly
> right but... I am suggesting breaking out some of the syscall switch code
> from occurring in the architecture-specific section.  I.e., 
> 
>                          kern_exec.c:execve()
>                          +-------------------
>       kern/kern_syscall.c|
>       +------------------+
> archsp|
> ------+

The issue I have with this is that there is *NO* specific information
you can gather at this layer (kern_syscall) that isn't already gathered
by KTRACE, unless you speculatively guess at what's needed, thus causing
you to end up 're-implementing' much of the code that is already in
kern_exec.  You only have the arguments to the syscall, but not any of
the 'extended' information needed by an IDS setup.  To get that
information, you need to perform the same kinds of calls that execve is
already doing.  Why not just stick the IDS code into execve, thus
avoiding re-inventing the wheel?

> And the middle syscall layer would accept a struct proc and information on
> the syscall request.  Its responsibility would be to create an audit
> record, tag it with the syscall number, pid, credentials, and any other
> data consistent across syscalls that needs to be in the audit record, then
> call the appropriate syscall code through the switch/sysvec array.

We could do that, but what would it gain us?

> On
> return from the syscall, it would tag the audit record with the return
> code, error condition, a timestamp, and commit the audit record.

But, we'd still need to instrument execve to get information such as the
stat information from the inode (ownership, permissions, mount
information, etc...).  This information only exists inside of execve()
(unless we decided to do another stat on the file in the code, which
seems absolutely silly since the information already exists.)

> It would then be the responsibility of the syscall code itself to
> submit a list of arguments as appropriate, as well as any more
> detailed subject/object information, etc.
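
For reference, the middle syscall layer described in the quoted text can be
sketched in user-space C roughly as below.  Everything here (the simplified
struct proc, the audit_record layout, the toy sysent table and
audited_syscall()) is a hypothetical stand-in, not the actual FreeBSD code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Simplified stand-in, not the kernel's struct proc. */
struct proc { int pid; int uid; };

/* Hypothetical record holding the data common to every syscall. */
struct audit_record {
    int    syscall_num;
    int    pid;
    int    uid;
    int    retval;
    time_t stamp;
    int    committed;
};

typedef int (*sy_call_t)(struct proc *, void *, struct audit_record *);

/* Two toy syscalls; a real one would add its own arguments to 'ar'. */
static int
sys_ok(struct proc *p, void *args, struct audit_record *ar)
{
    (void)p; (void)args; (void)ar;
    return 0;
}

static int
sys_fail(struct proc *p, void *args, struct audit_record *ar)
{
    (void)p; (void)args; (void)ar;
    return 2;               /* an errno-style failure code */
}

static sy_call_t sysent[] = { sys_ok, sys_fail };
#define NSYSENT (sizeof(sysent) / sizeof(sysent[0]))

/*
 * Architecture-independent wrapper: tag the record before the call,
 * dispatch through the table, then tag the result and commit.
 */
int
audited_syscall(struct proc *p, int code, void *args, struct audit_record *ar)
{
    assert(p != NULL && ar != NULL);

    ar->syscall_num = code;
    ar->pid = p->pid;
    ar->uid = p->uid;
    ar->committed = 0;

    if (code < 0 || (size_t)code >= NSYSENT)
        ar->retval = -1;    /* unknown syscall number */
    else
        ar->retval = sysent[code](p, args, ar);

    ar->stamp = time(NULL);
    ar->committed = 1;
    return ar->retval;
}
```

Note that the wrapper only sees what is common to all syscalls; anything
specific to one call still has to be filled in by that call's own code.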

Again, since we're already instrumenting the syscall code, let's do it
all there, and provide some 'generic' stubs that each syscall can call on
its own, rather than do the re-direction in the kernel.

Again, I think we're in agreement on *what* needs to be done, but not on
the specifics of how it needs to be done.

Your way adds a level of indirection that isn't obvious.  My way makes
it obvious what is being done *PLUS* allows us to create a record in one
fell swoop, instead of in pieces.

Ex:

execve()
{
....
#if _POSIX_AUD
   audit_rec ac;

   ac = new ac(p);
   ac.syscallNum = KERN_EXECVE;

#endif
....
#if _POSIX_AUD
   /* Totally bogus code */
   ac.perms = vnode.perms;
   ac.owner = vnode.uid;
   ac.group = vnode.gid;
#endif

Hopefully you get the point.  This entire record is created from start
to finish inside the syscall, thus making it very obvious what
information is gathered at the code level, rather than trying to run
around the different parts of the kernel to figure out what is going on.
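
The same idea as a compilable user-space sketch in plain C.  The names
toy_execve(), audit_begin(), audit_commit(), struct file_attr and the
TOY_EXECVE tag are all invented for illustration; in the kernel the
attributes would come from whatever execve has already fetched for its own
permission checks, not from a separate structure like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_EXECVE 1    /* illustrative tag, not a real syscall number */

/* Stand-in for the file attributes execve already has in hand. */
struct file_attr { unsigned mode; unsigned uid; unsigned gid; };

/* Hypothetical audit record, built start to finish inside the syscall. */
struct audit_rec {
    int      syscall_num;
    int      pid;
    unsigned perms, owner, group;
    int      error;
    int      complete;
};

static void
audit_begin(struct audit_rec *ac, int pid, int num)
{
    memset(ac, 0, sizeof(*ac));
    ac->pid = pid;
    ac->syscall_num = num;
}

static void
audit_commit(struct audit_rec *ac, int error)
{
    ac->error = error;
    ac->complete = 1;
}

/* Toy execve: every exit path goes through the single commit at 'out'. */
int
toy_execve(int pid, const struct file_attr *attr, struct audit_rec *ac)
{
    int error = 0;

    assert(ac != NULL);
    audit_begin(ac, pid, TOY_EXECVE);

    if (attr == NULL) {         /* name lookup failed */
        error = 2;
        goto out;
    }

    /* Reuse the attributes already fetched; no second stat is needed. */
    ac->perms = attr->mode;
    ac->owner = attr->uid;
    ac->group = attr->gid;

    if ((attr->mode & 0111) == 0)   /* no execute bit set anywhere */
        error = 13;
out:
    audit_commit(ac, error);
    return error;
}
```

Because the record is opened and committed in the same function, a failed
exec still produces a complete record carrying the error.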
   
> > > Similarly, auditing signal delivery would
> > > need to happen the same way: currently signal delivery lives in
> > > architecture-dependent-land, and we'd want the auditing wrapper to sit
> > > somewhere independent of architecture, I suspect.
> > 
> > Are signals required for IDS?  (Showing my ignorance here...)
> 
> A useful IDS module might consist of:
> 
>    Watch for a process that receives data from a network socket or local
>    IPC pipe and sigsegv's within two (or n) syscalls of receiving the
>    data. 
> 
> Or more specifically:
> 
>    Watch for a process started by inetd that receives data from a network
>    socket or local IPC pipe and sigsegv's within n syscalls of receiving
>    the data.
> 
> Etc, etc.
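
The first rule quoted above could be checked over a stream of audit events
along these lines.  This is only a sketch: the event kinds, struct
audit_event and segv_after_recv() are hypothetical, and it assumes the
receive itself is not counted among the n syscalls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced audit event. */
enum ev_kind { EV_SYSCALL, EV_RECV, EV_SIGSEGV };

struct audit_event {
    int          pid;
    enum ev_kind kind;
};

/*
 * Return 1 if process 'pid' took a SIGSEGV within 'n' syscalls after
 * receiving data, 0 otherwise.  Events from other processes are ignored.
 */
int
segv_after_recv(const struct audit_event *ev, size_t count, int pid, int n)
{
    int armed = 0;      /* data was received recently */
    int since = 0;      /* syscalls seen since the last receive */
    size_t i;

    assert(ev != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (ev[i].pid != pid)
            continue;
        switch (ev[i].kind) {
        case EV_RECV:
            armed = 1;
            since = 0;
            break;
        case EV_SYSCALL:
            if (armed && ++since > n)
                armed = 0;      /* window has passed */
            break;
        case EV_SIGSEGV:
            if (armed)
                return 1;
            break;
        }
    }
    return 0;
}
```

The more specific rule (only processes started by inetd) would just add a
parent check before the scan.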

Gotcha.

> > > As Garrett mentions, there will still be a context record from somewhere
> > > that could be extended to carry an active audit record for the activities
> > > of that context.  Presumably that is the place to put it?
> > 
> > How is this record 'identified' from the other records being generated
> > in parallel by the other CPUs?  (In other words, what identifies this
> > process from other processes in the above code?  We're not passing the
> > proc structure around....)
> 
> Well, presumably proc p is passed into the syscall, and could be passed
> elsewhere.

My issue was that 'p' wasn't passed into any of your code above, so I
was concerned as to how it was going to be done.

> Largely.  The nice thing about adding the audit record creation/committal
> outside of this syscall code though is that we get at least introductory
> auditing on all syscalls without hassle: we know the syscall number,
> information on the process, and the return value.

This is essentially what KTRACE is doing today, and it's not adequate.
There's no need to add 'Yet Another' auditing system on top of it,
because it would buy us very little.

So let's take the time to do it Right (tm), and make it obvious what
we're doing.

> We indeed both violently agree that argument auditing, etc, must happen
> inside the syscall.  But my temptation is to push a little of the
> general-purpose work out of the syscall and into the handler.

See above.  I think by pushing it outside of the syscall we end up
obfuscating (a little bit) what's going on.  Making this easy to
understand is of prime importance to me, because the chances of it being
accepted and maintained are much greater.  If kernel writers think they
don't have to do anything to make new syscalls work correctly in the new
IDS setup, they won't.

> > > See above: simple stuff in kernel may be the optimum approach, and I
> > > suspect a little bit of simple goes a long way.
> > 
> > Agreed, although a mechanism similar to BPF may allow for more 'complex'
> > filtering mechanisms and still be quite efficient at the kernel.
> 
> My concern is not that a language and programming model couldn't be
> devised, but that seeking through arguments to syscalls and managing types
> may be fairly costly: with packets in bpf you just look at offsets and
> values, and no looping.  Clearly this is something we'll need to
> experiment with quite a bit: having a flexible query front end in the
> user-land audit daemon could allow us to experiment with various backends
> with various degrees of complexity.

Agreed.  This is why I don't think this is a 1.0 feature. :)
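
To make the 'offsets and values, no looping' point concrete, here is a
sketch of the simplest possible filter over a flattened audit record.  It
is not BPF and not a proposed design; struct audit_insn,
audit_filter_match() and the idea of a record as an array of 32-bit words
are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One filter step: the word at 'offset' must equal 'value'. */
struct audit_insn {
    size_t   offset;
    uint32_t value;
};

/*
 * Run a straight-line filter program over a record laid out as an array
 * of 32-bit words.  The program has no jumps, so each instruction is
 * evaluated at most once and the cost is bounded by 'ninsn'.
 * Returns 1 if every step matches, 0 otherwise.
 */
int
audit_filter_match(const struct audit_insn *prog, size_t ninsn,
    const uint32_t *rec, size_t nwords)
{
    size_t i;

    assert(prog != NULL || ninsn == 0);
    assert(rec != NULL || nwords == 0);

    for (i = 0; i < ninsn; i++) {
        if (prog[i].offset >= nwords)
            return 0;           /* out of range: reject, never read past end */
        if (rec[prog[i].offset] != prog[i].value)
            return 0;
    }
    return 1;
}
```

This works only for fixed-position fields.  Variable-length syscall
arguments such as path names are exactly where the cost concern raised
above comes in.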


Nate

