From: Takuya ASADA
Date: Tue, 31 May 2011 20:07:41 +0900
To: soc-status@freebsd.org
Cc: "Robert N. M. Watson", George Neville-Neil, Kazuya Goda
Subject: Weekly status report (27th May)

Sorry for the delay in this weekly status report.

* Overview

Here is the progress of the project so far:

- Implement a set-affinity ioctl on BPF
  Experimental code implemented, working
- Implement affinity support in bpf_tap/bpf_mtap/bpf_mtap2
  Experimental code implemented, working
- Implement a sample application
  Quick hack for tcpdump/libpcap, working
- Implement a multiqueue tap driver
  Experimental code implemented, not yet tested
- Implement an interface to deliver queue information from the network device driver
  Partially implemented in igb(4), not yet tested
- Reduce lock granularity in bpf_tap/bpf_mtap/bpf_mtap2
  Not yet
- Implement test cases
  Not yet
- Update the man page, write a description of the sample code
  Not yet

* Detail

On an Ethernet card, bpf_mtap is called while RX/TX is being performed.
If the card supports multiqueue, every packet passing through bpf_mtap
belongs to an RX queue id or a TX queue id.  To handle this, I defined
new members in the mbuf pkthdr.

In the if_start function of igb(4) I added the following lines:

	m->m_pkthdr.rxqid = (uint32_t)-1;
	m->m_pkthdr.txqid = [tx queue id];

and in the receive function:

	m->m_pkthdr.rxqid = [rx queue id];
	m->m_pkthdr.txqid = (uint32_t)-1;

Then I defined the following members in the bpf descriptor:

	d->bd_qmask.qm_enabled
	d->bd_qmask.qm_rxq_mask[]
	d->bd_qmask.qm_txq_mask[]

Since the sizes of qm_rxq_mask[] and qm_txq_mask[] may differ between
cards, we need to pass the number of queues from the driver to bpf and
allocate the arrays with that size.  I added these members to struct
ifnet:

	d->bd_bif->bif_ifp->if_rxq_num
	d->bd_bif->bif_ifp->if_txq_num

Now we can filter out unwanted packets in bpf_mtap like this:

	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
		if (d->bd_qmask.qm_enabled) {
			if (m->m_pkthdr.rxqid != (uint32_t)-1 &&
			    !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
				continue;
			if (m->m_pkthdr.txqid != (uint32_t)-1 &&
			    !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
				continue;
		}
		/* ... existing bpf_mtap processing for this descriptor ... */
	}

d->bd_qmask.qm_enabled should be FALSE by default to keep compatibility
with existing applications.
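As a rough sketch of the allocation step (the helper name and call site
here are illustrative only, not the actual patch; I assume the masks are
dynamically allocated flag arrays and reuse the M_BPF malloc type that
bpf.c already defines), the arrays could be sized from the new ifnet
fields like this:

	/*
	 * Illustrative sketch: allocate the per-descriptor mask arrays
	 * from the queue counts the driver exported via struct ifnet.
	 * Everything starts masked out and disabled, so existing
	 * applications see no behavior change.
	 */
	static void
	bpf_qmask_alloc(struct bpf_d *d, struct ifnet *ifp)
	{
		d->bd_qmask.qm_rxq_mask = malloc(ifp->if_rxq_num *
		    sizeof(*d->bd_qmask.qm_rxq_mask), M_BPF, M_WAITOK | M_ZERO);
		d->bd_qmask.qm_txq_mask = malloc(ifp->if_txq_num *
		    sizeof(*d->bd_qmask.qm_txq_mask), M_BPF, M_WAITOK | M_ZERO);
		d->bd_qmask.qm_enabled = FALSE;
	}

A matching free(9) with M_BPF would of course be needed when the
descriptor is detached from the interface.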
And here are the ioctls to set/get the queue masks (a rough userland
sketch showing how they could be used is appended at the end of this
mail):

	#define BIOCENAQMASK	_IO('B', 137)
		Sets d->bd_qmask.qm_enabled = TRUE
	#define BIOCDISQMASK	_IO('B', 138)
		Sets d->bd_qmask.qm_enabled = FALSE
	#define BIOCRXQLEN	_IOR('B', 133, int)
		Returns ifp->if_rxq_num
	#define BIOCTXQLEN	_IOR('B', 134, int)
		Returns ifp->if_txq_num
	#define BIOCSTRXQMASK	_IOWR('B', 139, uint32_t)
		Sets d->bd_qmask.qm_rxq_mask[*addr] = TRUE
		/* XXX: we should also have an ioctl to set rxq_mask[*addr] = FALSE */
	#define BIOCGTRXQMASK	_IOR('B', 140, uint32_t)
		Returns d->bd_qmask.qm_rxq_mask[*addr]
	#define BIOCSTTXQMASK	_IOWR('B', 141, uint32_t)
		Sets d->bd_qmask.qm_txq_mask[*addr] = TRUE
		/* XXX: we should also have an ioctl to set txq_mask[*addr] = FALSE */
	#define BIOCGTTXQMASK	_IOR('B', 142, uint32_t)
		Returns d->bd_qmask.qm_txq_mask[*addr]

However, a packet that comes through bpf_tap has no mbuf, so we cannot
classify it by queue id.  For these packets I added
d->bd_qmask.qm_other_mask and BIOCSTOTHERMASK/BIOCGTOTHERMASK.  If
d->bd_qmask.qm_enabled is set but d->bd_qmask.qm_other_mask is not, all
packets coming through bpf_tap are ignored for that descriptor.

If we only care about the CPU affinity of a packet and a thread (= bpf
descriptor), checking PCPU_GET(cpuid) is enough.  But if we want to take
care of queue affinity, we probably need structures like the ones
described above.

* Argument

I discussed this project with some Japanese BSD hackers; they questioned
the plan and suggested two things:

- Wouldn't it be possible to filter by queue id in the BPF filter
  language, by extending the language?
- Do we really need to expose queue information and threads to user
  applications?  Most BPF applications probably have to merge the
  per-thread packet streams in the end.  For example, a sniffer such as
  tcpdump or wireshark needs to print packets to the screen, and before
  doing so it has to merge the per-queue streams into a single stream.
  If so, wouldn't it be better to merge the streams in the kernel rather
  than in userland?

I'm not really sure about all the use cases of BPF; maybe there are use
cases that can benefit from multithreaded BPF?

syuu
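Appended: a rough sketch of how the ioctls above might be driven from
userland.  This is not the actual tcpdump/libpcap hack; it assumes the
BIOC* definitions above end up in <net/bpf.h>, uses igb0 as an example
interface, and leaves out the separate set-affinity ioctl mentioned in
the overview.

	#include <sys/types.h>
	#include <sys/ioctl.h>
	#include <net/if.h>
	#include <net/bpf.h>
	#include <err.h>
	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int
	main(void)
	{
		struct ifreq ifr;
		uint32_t qid = 0;	/* the RX queue this descriptor wants */
		int fd, rxq_num;

		/* One descriptor per thread; each thread would run this. */
		if ((fd = open("/dev/bpf", O_RDONLY)) == -1)
			err(1, "open");

		memset(&ifr, 0, sizeof(ifr));
		strlcpy(ifr.ifr_name, "igb0", sizeof(ifr.ifr_name));
		if (ioctl(fd, BIOCSETIF, &ifr) == -1)
			err(1, "BIOCSETIF");

		/* How many RX queues does the interface report? */
		if (ioctl(fd, BIOCRXQLEN, &rxq_num) == -1)
			err(1, "BIOCRXQLEN");
		printf("igb0: %d RX queues\n", rxq_num);

		/* Enable queue filtering and accept RX queue 0 only. */
		if (ioctl(fd, BIOCENAQMASK) == -1)
			err(1, "BIOCENAQMASK");
		if (ioctl(fd, BIOCSTRXQMASK, &qid) == -1)
			err(1, "BIOCSTRXQMASK");

		/* ... the usual read(2) loop on fd follows ... */
		close(fd);
		return (0);
	}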