From: Luigi Rizzo
To: freebsd-net@freebsd.org
Date: Thu, 1 Oct 2015 12:52:53 +0200
Subject: RFC: revising netmap ring initialization

I would like people's suggestions on the following topic.

Right now, on entering netmap mode on a NIC we do an ifconfig down,
flush the tx and rx queues, and replace the rx buffers with the
netmap ones. Similarly, on exit, we down the interface, flush the
queues and restore the mbufs/skbufs.

The annoying side effect is that the link goes down, and sometimes
it takes a long time for autonegotiation and/or spanning tree to
restore connectivity.

I was thinking of a different way, as follows (I omit the locking
requirements):

1. always keep the interface active and the mbufs associated to the
   NIC rings as allocated by the network stack, whether the
   interface is in normal or netmap mode;

2. when entering netmap mode, just record the new operating mode (in
   turn redirecting the interrupt handlers to use the netmap routines
   rather than the usual *txeof(), *rxeof()), and set the ring
   indexes according to the state of the ring (i.e. pending tx mbufs
   are reported as unavailable);

3. the *_rxsync() routine in each driver will track which slots are
   still using mbufs (initially, all of them). When an incoming
   packet is in an mbuf, *_rxsync() will copy the payload into the
   netmap buffer, and mark the slot as a standard netmap buffer for
   the future. After one round through the ring, all buffers will be
   standard netmap ones and there is no copy anymore.

4. on the tx side things are even simpler: all it takes is to do an
   m_freem() of completed tx mbufs and then mark the slots as
   available. Here again, after one round there is no overhead
   anymore.

5. when switching out of netmap mode things are a bit trickier,
   because we cannot immediately release the netmap buffers that are
   still being processed, so we should add special code in the
   standard *txeof()/*rxeof() to report when the netmap buffers are
   not in use anymore.

The code changes in the standard driver should be relatively simple,
but the annoying thing is that we cannot free the netmap buffers on
detach. For the tx side we could just loop briefly until the tx queue
has been drained, which should happen quickly. For the rx side,
however, we cannot tell when we will receive incoming traffic, and
reclaiming a buffer that is already in the rx engine may require a
ring reset.

So, any comments on the above, especially on the last issue?

cheers
luigi
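
To make the rx-side bookkeeping of step 3 a bit more concrete, here is
a minimal user-space sketch in C. It is not driver code and does not
use the real netmap or mbuf data structures: struct rx_slot,
struct rx_ring, ring_enter_netmap(), nic_receive() and
sketch_rxsync_one() are hypothetical names made up for this example,
and plain byte arrays stand in for the mbuf and the netmap buffer.
Locking is omitted, as in the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4    /* tiny ring, for illustration only */
#define BUF_SIZE   64

/* Who owns the buffer currently attached to an rx slot. */
enum slot_owner { SLOT_MBUF, SLOT_NETMAP };

/* Hypothetical per-slot state; NOT the real netmap or driver layout. */
struct rx_slot {
    enum slot_owner owner;
    uint8_t  mbuf_data[BUF_SIZE];   /* stands in for the stack's mbuf */
    uint8_t  netmap_buf[BUF_SIZE];  /* stands in for the netmap buffer */
    uint16_t len;                   /* length of the last received packet */
};

struct rx_ring {
    struct rx_slot slot[RING_SLOTS];
    unsigned next;      /* next slot to be examined by the rxsync sketch */
    unsigned copies;    /* number of payload copies performed so far */
};

/* On entering netmap mode every slot still holds a stack-owned mbuf. */
static void ring_enter_netmap(struct rx_ring *r)
{
    memset(r, 0, sizeof(*r));
    for (unsigned i = 0; i < RING_SLOTS; i++)
        r->slot[i].owner = SLOT_MBUF;
}

/* Simulate the NIC writing a packet into whatever buffer slot i holds. */
static void nic_receive(struct rx_ring *r, unsigned i,
                        const void *pkt, uint16_t len)
{
    assert(i < RING_SLOTS && len <= BUF_SIZE);
    struct rx_slot *s = &r->slot[i];
    memcpy(s->owner == SLOT_MBUF ? s->mbuf_data : s->netmap_buf, pkt, len);
    s->len = len;
}

/* Process one slot: copy out of the mbuf once, then convert the slot. */
static void sketch_rxsync_one(struct rx_ring *r)
{
    struct rx_slot *s = &r->slot[r->next];

    if (s->owner == SLOT_MBUF) {
        memcpy(s->netmap_buf, s->mbuf_data, s->len);
        /* A real driver would free the mbuf and reprogram the NIC
         * descriptor with the netmap buffer address at this point. */
        s->owner = SLOT_NETMAP;
        r->copies++;
    }
    r->next = (r->next + 1) % RING_SLOTS;
}
```

Calling nic_receive() followed by sketch_rxsync_one() for two full
rounds over the ring performs exactly RING_SLOTS copies, all of them
in the first round; from the second round on the simulated NIC writes
straight into the netmap buffers.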