From: Avinash Sridharan
To: Vincenzo Maffione
Cc: Luigi Rizzo, "freebsd-net@freebsd.org"
Date: Tue, 6 Jan 2015 08:33:19 -0800
Subject: Re: netmap over virtio giving packets with extra 12 bytes
List-Id: Networking and TCP/IP with FreeBSD

By the way, Vincenzo, your assumptions about my system setup:

"From what I can guess you are dealing with a QEMU-KVM guest that uses
virtio-net device(s) and runs netmap over that device(s). Then, you connect
the guest to the host (gentoo) network stack using a standard linux bridge:
a TAP device is used by QEMU to forward guest traffic from/to the host
network stack."

are correct.

On Tue, Jan 6, 2015 at 8:31 AM, Avinash Sridharan <
avinash.sridharan@gmail.com> wrote:

> Hi Vincenzo,
> Thanks for the explanation. From your explanation it seems like the
> netmap in "native" mode over virtio-net should be giving some indication of
> how many extra bytes have been added by the virtio-net driver (or for that
> matter any other driver that provides this type of rx-descriptor).
> Otherwise, the application will have to store knowledge about the specifics
> of the underlying devices which doesn't seem that clean. (I think Adrian
> was referring to the same issue)
>
> That said, how do we handle TX in this case? Since the underlying driver
> (netmap + virtio-net) expects an extra 12 bytes of header that the
> application should know when to add. Or is this optional?
>
> On Tue, Jan 6, 2015 at 8:17 AM, Vincenzo Maffione
> wrote:
>
>> Hello,
>>
>> From what I can guess you are dealing with a QEMU-KVM guest that
>> uses virtio-net device(s) and runs netmap over that device(s).
>> Then, you connect the guest to the host (gentoo) network stack using a
>> standard linux bridge: a TAP device is used by QEMU to forward guest
>> traffic from/to the host network stack.
>>
>> Is that correct?
>>
>> Following Luigi's explanations, the virtio-net header is part of the
>> virtio standard, and its purpose is to carry offloading info
>> (checksum, TSO) across the guests and host kernels.
>> For instance, your
>> guest kernel can offload the TCP checksum to the virtio-net device,
>> which in turn uses the virtio-net header (that requires TAP driver
>> support) to postpone the checksum to the host kernel. If packets
>> arrive at a physical NIC that supports checksum offloading (e.g. an
>> r8169 NIC attached to the same bridge to which the TAP is attached),
>> you have effectively offloaded the checksum computation from the guest
>> kernel straight to the physical NIC in the physical host.
>>
>> If you see the virtio-net header with "pkt-gen -f rx", it means that
>> you are using netmap in "native" mode, that is, you use the specific
>> virtio netmap adapter to send/receive packets from the (virtual) NIC.
>> If you used netmap over virtio-net in "emulated" mode you wouldn't see
>> the virtio-net header, because netmap would be using the standard
>> driver (slow) datapath under the hood: in the rx datapath, the driver
>> converts the virtio-net header into skbuffs/mbufs metadata, so you
>> don't see it.
>>
>> I don't remember having tried to make QEMU use a TAP with no
>> virtio-net-header extension, but I see that it is possible to disable
>> it by invoking qemu from the command line:
>>
>> $ x86_64-softmmu/qemu-system-x86_64 --help | grep tap
>>
>> -net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n]
>>     use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag
>> -netdev [user|tap|bridge|vde|netmap|vhost-user|socket|hubport],id=str[,option][,option][,...]
>>
>> where you see that you can specify "vnet_hdr=off" when declaring the
>> qemu "backend" associated to the virtio-net guest device.
>> Never tried, but it should work. In the worst case you can recompile
>> the tap driver without IFF_VNET_HDR extension, so that QEMU does not
>> find it.
>>
>> Cheers,
>> Vincenzo
>>
>> 2015-01-05 13:19 GMT+01:00 Luigi Rizzo:
>> > What you see is a virtio issue.
>> >
>> > virtio prepends a 10 or 12-byte "virtio header"
>> > to all packets, which is used to define what sort
>> > of NIC accelerations (checksum, tso etc.) are
>> > expected on the link.
>> >
>> > I do not remember if there is a way in qemu-kvm to
>> > remove the header. Maybe Vincenzo (in Cc) remembers.
>> >
>> > cheers
>> > luigi
>> >
>> > On Mon, Jan 5, 2015 at 6:54 AM, Avinash Sridharan
>> > wrote:
>> >>
>> >> I am using netmap with the click modular router, running the click-modular
>> >> router in user space. A while back I was using this combination with the
>> >> e1000 device driver, with a slightly older netmap code-base.
>> >>
>> >> Recently I updated my netmap code base and am trying to use the
>> >> click-modular router with netmap over a virtio-net device driver (over KVM).
>> >> With this combination, though I was able to receive packets I was unable to
>> >> interpret any packets coming from the FromDevice element.
>> >>
>> >> To debug this issue (and to negate any changes I made to the click-modular
>> >> router), I ran the pkt-gen application with the "dump payload" option:
>> >>
>> >> sudo ~/pkt-gen -i eth1 -f rx -X
>> >>
>> >> This showed that packets are being received correctly from the
>> >> netmap-enabled interface, but there are an extra "12" bytes prepended to the
>> >> packet.
>> >>
>> >> 381.088570 main_thread [1446] 1 pps (1 pkts in 1001088 usec)
>> >> ring 0x7f133bca6000 cur 1 [buf 516 flags 0x0000 len 72]
>> >>  0: 00 00 00 00 00 00 00 00 00 00 01 00 01 80 c2 00 ................ << extra 12 bytes
>> >> 16: 00 00 40 16 7e 5b 50 f0 00 26 42 42 03 00 00 00 ..@.~[P..&BB....
>> >> 32: 00 00 80 00 40 16 7e 5b 50 f0 00 00 00 00 80 00 ....@.~[P.......
>> >> 48: 40 16 7e 5b 50 f0 80 01 00 00 14 00 02 00 00 00 @.~[P...........
>> >> 64: 00 00 00 00 bc 9b f6 74
>> >>
>> >> As we can see, the above is an STP BPDU, and there are 12 leading bytes in
>> >> the payload.
>> >>
>> >> The extra leading bytes screw up the packet interpretation.
>> >>
>> >> So is this an artifact of the virtio-net driver or has something
>> >> changed in the netmap device driver?
>> >>
>> >> Thanks,
>> >>
>> >> Avinash
>> >
>> > --
>> > -----------------------------------------+-------------------------------
>> > Prof. Luigi RIZZO, rizzo@iet.unipi.it . Dip. di Ing. dell'Informazione
>> > http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
>> > TEL +39-050-2211611 . via Diotisalvi 2
>> > Mobile +39-338-6809875 . 56122 PISA (Italy)
>> > -----------------------------------------+-------------------------------
>>
>> --
>> Vincenzo Maffione
>
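
For reference, the 12 leading bytes in the pkt-gen dump quoted in this thread
can be decoded and skipped in user space. What follows is only a minimal
Python sketch, not netmap or Click code. It assumes the 12-byte virtio-net
header layout from the virtio specification (flags, gso_type, hdr_len,
gso_size, csum_start, csum_offset, num_buffers; the 10-byte variant Luigi
mentions lacks num_buffers), and it assumes little-endian fields, as on an
x86 guest. The helper names split_vnet_hdr, add_vnet_hdr and mac are
hypothetical. Whether a netmap application has to add the header on TX is
the open question in this thread; the sketch only shows that an all-zero
header asks for no offloads.

```python
import struct

# Assumed layout of the 12-byte virtio-net header (virtio spec, variant
# that includes num_buffers). Little-endian is an assumption that holds
# on x86 guests.
VNET_HDR = struct.Struct("<BBHHHHH")
FIELDS = ("flags", "gso_type", "hdr_len", "gso_size",
          "csum_start", "csum_offset", "num_buffers")


def split_vnet_hdr(buf):
    """Return (header fields as a dict, Ethernet frame) for one rx buffer."""
    if len(buf) < VNET_HDR.size:
        raise ValueError("buffer shorter than the virtio-net header")
    hdr = dict(zip(FIELDS, VNET_HDR.unpack_from(buf)))
    return hdr, bytes(buf[VNET_HDR.size:])


def add_vnet_hdr(frame):
    """Prepend an all-zero header: no checksum or segmentation offload."""
    return bytes(VNET_HDR.size) + bytes(frame)


def mac(b):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{x:02x}" for x in b)


# The 72-byte buffer from the pkt-gen dump quoted in this thread.
DUMP = bytes.fromhex(
    "00 00 00 00 00 00 00 00 00 00 01 00 01 80 c2 00 "
    "00 00 40 16 7e 5b 50 f0 00 26 42 42 03 00 00 00 "
    "00 00 80 00 40 16 7e 5b 50 f0 00 00 00 00 80 00 "
    "40 16 7e 5b 50 f0 80 01 00 00 14 00 02 00 00 00 "
    "00 00 00 00 bc 9b f6 74"
)

hdr, frame = split_vnet_hdr(DUMP)
print(hdr)
print("dst", mac(frame[0:6]), "src", mac(frame[6:12]), "len", len(frame))
```

With the header skipped, the frame starts at the STP multicast destination
address 01:80:c2:00:00:00, which matches the BPDU described in the original
report.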