From owner-freebsd-net@FreeBSD.ORG  Thu Aug 22 10:16:26 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id 9CC40362
 for <freebsd-net@freebsd.org>; Thu, 22 Aug 2013 10:16:26 +0000 (UTC)
 (envelope-from rmind@netbsd.org)
Received: from mail.netbsd.org (mail.NetBSD.org [IPv6:2001:4f8:3:7::25])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 78BDB2FA6
 for <freebsd-net@freebsd.org>; Thu, 22 Aug 2013 10:16:26 +0000 (UTC)
Received: from ws (localhost [IPv6:::1])
 by mail.netbsd.org (Postfix) with SMTP id 3837E14A21D;
 Thu, 22 Aug 2013 10:16:22 +0000 (UTC)
Date: Thu, 22 Aug 2013 11:16:02 +0100
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: tech-net@netbsd.org, freebsd-net@freebsd.org, guy@alum.mit.edu
Subject: Re: BPF_MISC+BPF_COP and BPF_COPX (summary and patch)
In-Reply-To: <20130804191310.2FFBB14A152@mail.netbsd.org>
References: <20130804191310.2FFBB14A152@mail.netbsd.org>
X-Mailer: mail(1)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <20130822101623.3837E14A21D@mail.netbsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
X-List-Received-Date: Thu, 22 Aug 2013 10:16:26 -0000

Hi,

OK, to summarise what has been discussed:

- Problem

There is a need to perform more complex operations from a BPF program.
Currently, there is no practical way to do that from the byte-code.
Such functionality is useful for packet filters and other components
which could integrate with BPF.  For example, while most of the packet
inspection logic can stay in the byte-code, operations such as looking up
an IP address in some container, or walking the IPv6 headers and returning
some offsets, have to be done externally.  The first existing user of such
a capability would be NPF in NetBSD.
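
To illustrate the kind of operation meant by "walking the IPv6 headers",
here is a minimal userland sketch.  It is not taken from NPF or from the
patch: the name ip6_upper_offset() and the flat-buffer interface (instead
of an mbuf chain) are assumptions made for the example.  It skips the
common extension headers and returns the offset of the upper-layer header.

```c
#include <stddef.h>
#include <stdint.h>

#define IP6_HDRLEN	40	/* fixed IPv6 header length */

/*
 * Hypothetical helper: walk the IPv6 extension headers in a flat buffer.
 * Returns the offset of the upper-layer header and stores its protocol
 * number in *proto; returns 0 if the packet is malformed or truncated.
 * Non-first fragments are not treated specially in this sketch.
 */
static uint32_t
ip6_upper_offset(const uint8_t *pkt, size_t len, uint8_t *proto)
{
	uint8_t nxt;
	size_t off, hlen;

	if (len < IP6_HDRLEN || (pkt[0] >> 4) != 6)
		return 0;

	nxt = pkt[6];		/* Next Header field of the fixed header */
	off = IP6_HDRLEN;

	for (;;) {
		switch (nxt) {
		case 0:		/* hop-by-hop options */
		case 43:	/* routing header */
		case 60:	/* destination options */
			if (off + 2 > len)
				return 0;
			/* Length is in 8-octet units, excluding the first 8. */
			hlen = ((size_t)pkt[off + 1] + 1) * 8;
			break;
		case 44:	/* fragment header: always 8 octets */
			hlen = 8;
			break;
		case 51:	/* AH: length in 4-octet units, minus 2 */
			if (off + 2 > len)
				return 0;
			hlen = ((size_t)pkt[off + 1] + 2) * 4;
			break;
		default:	/* anything else is treated as the upper layer */
			*proto = nxt;
			return (uint32_t)off;
		}
		if (off + hlen > len)
			return 0;
		nxt = pkt[off];	/* first octet of each header is Next Header */
		off += hlen;
	}
}
```

A coprocessor function wrapping such a helper could return the offset in
the accumulator and leave the protocol number in the memory store, so that
the rest of the inspection stays in the byte-code.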

- Overview of the solution

The BPF_COP/BPF_COPX instructions in the misc category (BPF_MISC) would add
the capability to call external functions in a predetermined way: an array
of function pointers is pre-loaded and the functions are called by index.
It can be thought of as a BPF "coprocessor" -- a generic mechanism to offload
more complex packet inspection operations.  Such a generic mechanism provides
much greater flexibility than specialised instructions.  Any component,
proprietary ones included, may implement and use its own custom
coprocessor without adding new instructions to the limited BPF opcode
space.

The original proposal:

> I would like to propose new BPF instructions for the misc category: BPF_COP
> and BPF_COPX.  It would provide a capability of calling an external
> function - think of a BPF "coprocessor".  The argument for BPF_COP is an
> index to a pre-loaded array of function pointers.  BPF_COPX takes the
> function index from the register X rather than a constant.
> 
>       BPF_STMT(BPF_MISC+BPF_COP, 0), /* A <- funcs[0](...) */
> 
>       typedef uint32_t(*bpf_copfunc_t)(struct mbuf *pkt,
>           uint32_t A, uint32_t *M);
> 
>       int bpf_set_cop(bpf_ctx_t *c, bpf_copfunc_t funcs[], size_t n);
> 
> The arguments passed to a called function would be the packet, accumulator
> and the memory store.  The return value would be stored in the accumulator
> and the register X would be reset to 0.  Note that the function may also
> change the memory store.  If the function index is out of range, then the
> register X would be set to 0xffffffff.
> 
> Note that bpf_filter(9) would need to take some context structure (which
> is preferable in general).

One change to the original proposal: if the function index is out of range,
then instead of setting the register X to 0xffffffff, we are going to abort
the program and return zero (BPF_RET 0).
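
The dispatch semantics described above can be sketched in userland with a
toy interpreter.  This is only an illustration, not the code from the
patch: the opcode names, struct insn, struct ctx, run() and
cop_add_byte0() are all made up for the example, and the packet is a flat
buffer rather than an mbuf.  It shows the call by index, the result
landing in A, the reset of X, and the abort with a zero return when the
index is out of range.

```c
#include <stddef.h>
#include <stdint.h>

#define MEMWORDS	16	/* size of the scratch memory store */

/* Hypothetical opcodes for this sketch only; not the real BPF encoding. */
enum { OP_LD_IMM, OP_LDX_IMM, OP_COP, OP_COPX, OP_RET_A };

struct insn {
	int		op;
	uint32_t	k;
};

/* Userland stand-in for the coprocessor function type. */
typedef uint32_t (*copfunc_t)(const uint8_t *pkt, uint32_t A, uint32_t *M);

/* Per-caller context holding the pre-loaded function table. */
struct ctx {
	const copfunc_t	*funcs;
	size_t		nfuncs;
};

/* Example coprocessor function: saves A in M[0], returns A + first byte. */
static uint32_t
cop_add_byte0(const uint8_t *pkt, uint32_t A, uint32_t *M)
{
	M[0] = A;
	return A + pkt[0];
}

static uint32_t
run(const struct ctx *c, const struct insn *pc, size_t n, const uint8_t *pkt)
{
	uint32_t A = 0, X = 0, M[MEMWORDS] = { 0 };
	uint32_t idx;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (pc[i].op) {
		case OP_LD_IMM:
			A = pc[i].k;
			break;
		case OP_LDX_IMM:
			X = pc[i].k;
			break;
		case OP_COP:	/* index is the constant k */
		case OP_COPX:	/* index is taken from register X */
			idx = (pc[i].op == OP_COP) ? pc[i].k : X;
			if (idx >= c->nfuncs)
				return 0;	/* out of range: abort, return 0 */
			A = c->funcs[idx](pkt, A, M);
			X = 0;	/* X is reset, as in the original proposal */
			break;
		case OP_RET_A:
			return A;
		default:
			return 0;	/* unknown opcode: reject */
		}
	}
	return 0;	/* fell off the end of the program */
}
```

With a one-entry table, a program that loads 35 into A and then issues
OP_COP 0 on a packet whose first byte is 7 returns 42, while OP_COPX with
X set to 5 returns 0 because the index is out of range.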

- Illustrative example (kernel code)

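        /* Coprocessor function: gets the packet, A and the memory store. */
        static uint32_t
        my_inspection(const struct mbuf *pkt, uint32_t A, uint32_t *M)
        {
                ...
        }

        bpf_ctx_t *bc;
        u_int ret;

        /* Function table: BPF_COP with index k calls my_cop[k]. */
        bpf_copfunc_t my_cop[] = {
                my_inspection,
                ...
        };

        bc = bpf_create();
        bpf_set_cop(bc, my_cop, __arraycount(my_cop));
        /* A buffer length of 0 means that the packet is an mbuf chain. */
        ret = bpf_filter(bc, bcode, (const unsigned char *)m, pktlen, 0);
        bpf_destroy(bc);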

- Security implications

There is no default coprocessor and this functionality is not targeted at
/dev/bpf.  The user would be a *kernel* subsystem, which would have to set
the coprocessor using bpf_set_cop(9) and call bpf_filter(9) on a packet
itself.  There is no way to set the coprocessor from userland.
Each BPF user (caller) in the kernel would have its own independent
context (state) and therefore different users would not affect each other.
The functions are predetermined and cannot change during the life-cycle.
Also, the coprocessor cannot change the flow of the program.  It can
inspect the packet in a read-only manner and return some numeric values.
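
The isolation between callers and the fixed function set can be pictured
with a small userland sketch.  The set-once check below is only one way
such a rule could be enforced and is an assumption of the example; the
patch may handle it differently.  struct ctx, ctx_set_cop(), cop_zero()
and cop_one() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userland stand-in for the coprocessor function type. */
typedef uint32_t (*copfunc_t)(const uint8_t *pkt, uint32_t A, uint32_t *M);

/* Each caller owns one context; nothing here is shared between callers. */
struct ctx {
	const copfunc_t	*funcs;
	size_t		nfuncs;
};

/* Two trivial example functions that ignore their arguments. */
static uint32_t
cop_zero(const uint8_t *pkt, uint32_t A, uint32_t *M)
{
	(void)pkt; (void)A; (void)M;
	return 0;
}

static uint32_t
cop_one(const uint8_t *pkt, uint32_t A, uint32_t *M)
{
	(void)pkt; (void)A; (void)M;
	return 1;
}

/*
 * Attach a function table to a context.  In this sketch the table can be
 * set only once, so the functions cannot change afterwards.
 */
static int
ctx_set_cop(struct ctx *c, const copfunc_t *funcs, size_t n)
{
	if (c->funcs != NULL)
		return EBUSY;	/* already set: refuse to replace it */
	if (funcs == NULL || n == 0)
		return EINVAL;
	c->funcs = funcs;
	c->nfuncs = n;
	return 0;
}
```

Two subsystems would each create their own context and table; a second
ctx_set_cop() call on the same context fails with EBUSY, and neither
context can see or alter the other's functions.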

- Patch

http://www.netbsd.org/~rmind/bpf_cop.diff

Thanks.

-- 
Mindaugas