From owner-svn-src-all@FreeBSD.ORG Mon Jun 30 14:56:05 2014
Sender: rizzo.unipi@gmail.com
References: <201406300434.s5U4YxGY043870@svn.freebsd.org>
Date: Mon, 30 Jun 2014 16:56:01 +0200
Subject: Re: svn commit: r268028 - head/sys/dev/e1000
From: Luigi Rizzo
To: Adrian Chadd
Cc: "svn-src-head@freebsd.org", "svn-src-all@freebsd.org", "src-committers@freebsd.org"

I am getting the following error while compiling with gcc:

cc1: warnings being treated as errors
/usr/home/luigi/FreeBSD/head/sys/dev/e1000/if_igb.c: In function 'igb_attach':
/usr/home/luigi/FreeBSD/head/sys/dev/e1000/if_igb.c:2461: warning: 'cpu_id' may be used uninitialized in this function
/usr/home/luigi/FreeBSD/head/sys/dev/e1000/if_igb.c:2461: note: 'cpu_id' was declared here

(The code is correct: cpu_id is only used after it has been set, but apparently gcc cannot prove that.)

For compatibility I'd throw in an explicit initialization; however, it might be worthwhile to rewrite the code with a single if (adapter->num_queues > 1) { ... } block within the for(), and to use only one #ifdef RSS -- the other places only differ in the string passed to diagnostics or in comments.

cheers
luigi
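To make the suggestion above concrete, here is a rough sketch of how the binding part of the per-queue loop in igb_allocate_msix() could look: cpu_id gets an explicit initializer, the CPU binding and its diagnostic sit in one if (adapter->num_queues > 1) block, and #ifdef RSS only covers what really differs. This is an untested sketch against the r268028 code quoted below; the bind_desc variable is introduced here purely for the printf, and the taskqueue pinning further down would still need its own #ifdef because the pinned and unpinned start functions differ.

int cpu_id = 0;                 /* function scope; explicit init quiets gcc */
const char *bind_desc = "cpu";  /* illustration-only label for the printf */

        /* ... inside the for() over queues, after the eims setup ... */
        if (adapter->num_queues > 1) {
#ifdef RSS
                /*
                 * Queue i is RSS bucket i; pin the interrupt to the
                 * CPU that the RSS code assigned to that bucket.
                 */
                cpu_id = rss_getcpu(i % rss_getnumbuckets());
                bind_desc = "RSS bucket";
#else
                /* Default round-robin CPU binding, as before. */
                if (igb_last_bind_cpu < 0)
                        igb_last_bind_cpu = CPU_FIRST();
                cpu_id = igb_last_bind_cpu;
#endif
                bus_bind_intr(dev, que->res, cpu_id);
                device_printf(dev, "Bound queue %d to %s %d\n",
                    i, bind_desc, cpu_id);
        }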
On Mon, Jun 30, 2014 at 6:39 AM, Adrian Chadd wrote:
> Reviewed by:  jfv, gnn
>
> On 29 June 2014 21:34, Adrian Chadd wrote:
> > Author: adrian
> > Date: Mon Jun 30 04:34:59 2014
> > New Revision: 268028
> > URL: http://svnweb.freebsd.org/changeset/base/268028
> >
> > Log:
> >   Add initial RSS awareness to the igb(4) driver.
> >
> >   The igb(4) hardware is capable of RSS hashing RX packets and doing RSS
> >   queue selection for up to 8 queues.  (I believe some hardware is limited
> >   to 4 queues, but I haven't tested on that.)
> >
> >   However, even if multi-queue is enabled for igb(4), the RX path doesn't use
> >   the RSS flowid from the received descriptor.  It just uses the MSIX queue id.
> >
> >   This patch does a handful of things if RSS is enabled:
> >
> >   * Instead of using a random key at boot, fetch the RSS key from the RSS code
> >     and program that in to the RSS redirection table.
> >
> >     That whole chunk of code should be double checked for endian correctness.
> >
> >   * Use the RSS queue mapping to CPU ID to figure out where to thread pin
> >     the RX swi thread and the taskqueue threads for each queue.
> >
> >   * The software queue is now really an "RSS bucket".
> >
> >   * When programming the RSS indirection table, use the RSS code to
> >     figure out which RSS bucket each slot in the indirection table maps
> >     to.
> >
> >   * When transmitting, use the flowid RSS mapping if the mbuf has
> >     an RSS aware hash.  The existing method wasn't guaranteed to align
> >     correctly with the destination RSS bucket (and thus CPU ID.)
> >
> >   This code warns if the number of RSS buckets isn't the same as the
> >   automatically configured number of hardware queues.  The administrator
> >   will have to tweak one of them for better performance.
> >
> >   There's currently no way to re-balance the RSS indirection table after
> >   startup.  I'll worry about that later.
> >
> >   Additionally, it may be worthwhile to always use the full 32 bit flowid if
> >   multi-queue is enabled.  It'll make things like lagg(4) behave better with
> >   respect to traffic distribution.
> >
> > Modified:
> >   head/sys/dev/e1000/if_igb.c
> >
> > Modified: head/sys/dev/e1000/if_igb.c
> > ==============================================================================
> > --- head/sys/dev/e1000/if_igb.c    Mon Jun 30 04:26:29 2014    (r268027)
> > +++ head/sys/dev/e1000/if_igb.c    Mon Jun 30 04:34:59 2014    (r268028)
> > @@ -35,6 +35,7 @@
> >
> >  #include "opt_inet.h"
> >  #include "opt_inet6.h"
> > +#include "opt_rss.h"
> >
> >  #ifdef HAVE_KERNEL_OPTION_HEADERS
> >  #include "opt_device_polling.h"
> > @@ -84,6 +85,9 @@
> >  #include
> >  #include
> >  #include
> > +#ifdef RSS
> > +#include
> > +#endif
> >
> >  #include
> >  #include
> > @@ -967,12 +971,33 @@ igb_mq_start(struct ifnet *ifp, struct m
> >         struct igb_queue        *que;
> >         struct tx_ring          *txr;
> >         int                     i, err = 0;
> > +#ifdef RSS
> > +       uint32_t                bucket_id;
> > +#endif
> >
> >         /* Which queue to use */
> > -       if ((m->m_flags & M_FLOWID) != 0)
> > -               i = m->m_pkthdr.flowid % adapter->num_queues;
> > -       else
> > +       /*
> > +        * When doing RSS, map it to the same outbound queue
> > +        * as the incoming flow would be mapped to.
> > +        *
> > +        * If everything is setup correctly, it should be the
> > +        * same bucket that the current CPU we're on is.
> > +        */
> > +       if ((m->m_flags & M_FLOWID) != 0) {
> > +#ifdef RSS
> > +               if (rss_hash2bucket(m->m_pkthdr.flowid,
> > +                   M_HASHTYPE_GET(m), &bucket_id) == 0) {
> > +                       /* XXX TODO: spit out something if bucket_id > num_queues? */
> > +                       i = bucket_id % adapter->num_queues;
> > +               } else {
> > +#endif
> > +                       i = m->m_pkthdr.flowid % adapter->num_queues;
> > +#ifdef RSS
> > +               }
> > +#endif
> > +       } else {
> >                 i = curcpu % adapter->num_queues;
> > +       }
> >         txr = &adapter->tx_rings[i];
> >         que = &adapter->queues[i];
> >
> > @@ -2433,11 +2458,34 @@ igb_allocate_msix(struct adapter *adapte
> >         device_t dev = adapter->dev;
> >         struct igb_queue *que = adapter->queues;
> >         int error, rid, vector = 0;
> > +       int cpu_id;
> >
> >         /* Be sure to start with all interrupts disabled */
> >         E1000_WRITE_REG(&adapter->hw, E1000_IMC, ~0);
> >         E1000_WRITE_FLUSH(&adapter->hw);
> >
> > +#ifdef RSS
> > +       /*
> > +        * If we're doing RSS, the number of queues needs to
> > +        * match the number of RSS buckets that are configured.
> > +        *
> > +        * + If there's more queues than RSS buckets, we'll end
> > +        *   up with queues that get no traffic.
> > +        *
> > +        * + If there's more RSS buckets than queues, we'll end
> > +        *   up having multiple RSS buckets map to the same queue,
> > +        *   so there'll be some contention.
> > +        */
> > +       if (adapter->num_queues != rss_getnumbuckets()) {
> > +               device_printf(dev,
> > +                   "%s: number of queues (%d) != number of RSS buckets (%d)"
> > +                   "; performance will be impacted.\n",
> > +                   __func__,
> > +                   adapter->num_queues,
> > +                   rss_getnumbuckets());
> > +       }
> > +#endif
> > +
> >         for (int i = 0; i < adapter->num_queues; i++, vector++, que++) {
> >                 rid = vector +1;
> >                 que->res = bus_alloc_resource_any(dev,
> > @@ -2464,19 +2512,42 @@ igb_allocate_msix(struct adapter *adapte
> >                         que->eims = E1000_EICR_TX_QUEUE0 << i;
> >                 else
> >                         que->eims = 1 << vector;
> > +
> > +#ifdef RSS
> >                 /*
> > -               ** Bind the msix vector, and thus the
> > -               ** rings to the corresponding cpu.
> > -               */
> > +                * The queue ID is used as the RSS layer bucket ID.
> > +                * We look up the queue ID -> RSS CPU ID and select
> > +                * that.
> > +                */
> > +               cpu_id = rss_getcpu(i % rss_getnumbuckets());
> > +#else
> > +               /*
> > +                * Bind the msix vector, and thus the
> > +                * rings to the corresponding cpu.
> > +                *
> > +                * This just happens to match the default RSS round-robin
> > +                * bucket -> queue -> CPU allocation.
> > +                */
> >                 if (adapter->num_queues > 1) {
> >                         if (igb_last_bind_cpu < 0)
> >                                 igb_last_bind_cpu = CPU_FIRST();
> > -                       bus_bind_intr(dev, que->res, igb_last_bind_cpu);
> > +                       cpu_id = igb_last_bind_cpu;
> > +               }
> > +#endif
> > +
> > +               if (adapter->num_queues > 1) {
> > +                       bus_bind_intr(dev, que->res, cpu_id);
> > +#ifdef RSS
> > +                       device_printf(dev,
> > +                           "Bound queue %d to RSS bucket %d\n",
> > +                           i, cpu_id);
> > +#else
> >                         device_printf(dev,
> >                             "Bound queue %d to cpu %d\n",
> > -                           i,igb_last_bind_cpu);
> > -                       igb_last_bind_cpu = CPU_NEXT(igb_last_bind_cpu);
> > +                           i, cpu_id);
> > +#endif
> >                 }
> > +
> >  #ifndef IGB_LEGACY_TX
> >                 TASK_INIT(&que->txr->txq_task, 0, igb_deferred_mq_start,
> >                     que->txr);
> > @@ -2485,8 +2556,34 @@ igb_allocate_msix(struct adapter *adapte
> >                 TASK_INIT(&que->que_task, 0, igb_handle_que, que);
> >                 que->tq = taskqueue_create("igb_que", M_NOWAIT,
> >                     taskqueue_thread_enqueue, &que->tq);
> > -               taskqueue_start_threads(&que->tq, 1, PI_NET, "%s que",
> > -                   device_get_nameunit(adapter->dev));
> > +               if (adapter->num_queues > 1) {
> > +                       /*
> > +                        * Only pin the taskqueue thread to a CPU if
> > +                        * RSS is in use.
> > +                        *
> > +                        * This again just happens to match the default RSS
> > +                        * round-robin bucket -> queue -> CPU allocation.
> > +                        */
> > +#ifdef RSS
> > +                       taskqueue_start_threads_pinned(&que->tq, 1, PI_NET,
> > +                           cpu_id,
> > +                           "%s que (bucket %d)",
> > +                           device_get_nameunit(adapter->dev),
> > +                           cpu_id);
> > +#else
> > +                       taskqueue_start_threads(&que->tq, 1, PI_NET,
> > +                           "%s que (qid %d)",
> > +                           device_get_nameunit(adapter->dev),
> > +                           cpu_id);
> > +#endif
> > +               } else {
> > +                       taskqueue_start_threads(&que->tq, 1, PI_NET, "%s que",
> > +                           device_get_nameunit(adapter->dev));
> > +               }
> > +
> > +               /* Finally update the last bound CPU id */
> > +               if (adapter->num_queues > 1)
> > +                       igb_last_bind_cpu = CPU_NEXT(igb_last_bind_cpu);
> >         }
> >
> >         /* And Link */
> > @@ -2763,6 +2860,13 @@ igb_setup_msix(struct adapter *adapter)
> >         /* Figure out a reasonable auto config value */
> >         queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus;
> >
> > +#ifdef RSS
> > +       /* If we're doing RSS, clamp at the number of RSS buckets */
> > +       if (queues > rss_getnumbuckets())
> > +               queues = rss_getnumbuckets();
> > +#endif
> > +
> > +
> >         /* Manual override */
> >         if (igb_num_queues != 0)
> >                 queues = igb_num_queues;
> > @@ -4455,6 +4559,99 @@ fail:
> >         return (ENOBUFS);
> >  }
> >
> > +/*
> > + * Initialise the RSS mapping for NICs that support multiple transmit/
> > + * receive rings.
> > + */
> > +static void
> > +igb_initialise_rss_mapping(struct adapter *adapter)
> > +{
> > +       struct e1000_hw *hw = &adapter->hw;
> > +       int i;
> > +       int queue_id;
> > +
> > +       u32 rss_key[10], mrqc, shift = 0;
> > +       union igb_reta {
> > +               u32 dword;
> > +               u8  bytes[4];
> > +       } reta;
> > +
> > +       /* XXX? */
> > +       if (adapter->hw.mac.type == e1000_82575)
> > +               shift = 6;
> > +
> > +       /*
> > +        * The redirection table controls which destination
> > +        * queue each bucket redirects traffic to.
> > +        * Each DWORD represents four queues, with the LSB
> > +        * being the first queue in the DWORD.
> > +        *
> > +        * This just allocates buckets to queues using round-robin
> > +        * allocation.
> > +        *
> > +        * NOTE: It Just Happens to line up with the default
> > +        * RSS allocation method.
> > +        */
> > +
> > +       /* Warning FM follows */
> > +       for (i = 0; i < 128; i++) {
> > +#ifdef RSS
> > +               queue_id = rss_get_indirection_to_bucket(i);
> > +               /*
> > +                * If we have more queues than buckets, we'll
> > +                * end up mapping buckets to a subset of the
> > +                * queues.
> > +                *
> > +                * If we have more buckets than queues, we'll
> > +                * end up instead assigning multiple buckets
> > +                * to queues.
> > +                *
> > +                * Both are suboptimal, but we need to handle
> > +                * the case so we don't go out of bounds
> > +                * indexing arrays and such.
> > +                */
> > +               queue_id = queue_id % adapter->num_queues;
> > +#else
> > +               queue_id = (i % adapter->num_queues);
> > +#endif
> > +               reta.bytes[i & 3] = queue_id << shift;
> > +
> > +               if ((i & 3) == 3)
> > +                       E1000_WRITE_REG(hw,
> > +                           E1000_RETA(i >> 2), reta.dword);
> > +       }
> > +
> > +
> > +       /* Now fill in hash table */
> > +
> > +       /* XXX This means RSS enable + 8 queues for my igb (82580.) */
> > +       mrqc = E1000_MRQC_ENABLE_RSS_4Q;
> > +
> > +#ifdef RSS
> > +       /* XXX ew typecasting */
> > +       rss_getkey((uint8_t *) &rss_key);
> > +#else
> > +       arc4rand(&rss_key, sizeof(rss_key), 0);
> > +#endif
> > +       for (i = 0; i < 10; i++)
> > +               E1000_WRITE_REG_ARRAY(hw,
> > +                   E1000_RSSRK(0), i, rss_key[i]);
> > +
> > +       /*
> > +        * Configure the RSS fields to hash upon.
> > +        */
> > +       mrqc |= (E1000_MRQC_RSS_FIELD_IPV4 |
> > +           E1000_MRQC_RSS_FIELD_IPV4_TCP);
> > +       mrqc |= (E1000_MRQC_RSS_FIELD_IPV6 |
> > +           E1000_MRQC_RSS_FIELD_IPV6_TCP);
> > +       mrqc |=( E1000_MRQC_RSS_FIELD_IPV4_UDP |
> > +           E1000_MRQC_RSS_FIELD_IPV6_UDP);
> > +       mrqc |=( E1000_MRQC_RSS_FIELD_IPV6_UDP_EX |
> > +           E1000_MRQC_RSS_FIELD_IPV6_TCP_EX);
> > +
> > +       E1000_WRITE_REG(hw, E1000_MRQC, mrqc);
> > +}
> > +
> >  /*********************************************************************
> >   *
> >   *  Enable receive unit.
> > @@ -4538,39 +4735,9 @@ igb_initialize_receive_units(struct adap
> >          */
> >         rxcsum = E1000_READ_REG(hw, E1000_RXCSUM);
> >         if (adapter->num_queues >1) {
> > -               u32 random[10], mrqc, shift = 0;
> > -               union igb_reta {
> > -                       u32 dword;
> > -                       u8  bytes[4];
> > -               } reta;
> > -
> > -               arc4rand(&random, sizeof(random), 0);
> > -               if (adapter->hw.mac.type == e1000_82575)
> > -                       shift = 6;
> > -               /* Warning FM follows */
> > -               for (int i = 0; i < 128; i++) {
> > -                       reta.bytes[i & 3] =
> > -                           (i % adapter->num_queues) << shift;
> > -                       if ((i & 3) == 3)
> > -                               E1000_WRITE_REG(hw,
> > -                                   E1000_RETA(i >> 2), reta.dword);
> > -               }
> > -               /* Now fill in hash table */
> > -               mrqc = E1000_MRQC_ENABLE_RSS_4Q;
> > -               for (int i = 0; i < 10; i++)
> > -                       E1000_WRITE_REG_ARRAY(hw,
> > -                           E1000_RSSRK(0), i, random[i]);
> > -
> > -               mrqc |= (E1000_MRQC_RSS_FIELD_IPV4 |
> > -                   E1000_MRQC_RSS_FIELD_IPV4_TCP);
> > -               mrqc |= (E1000_MRQC_RSS_FIELD_IPV6 |
> > -                   E1000_MRQC_RSS_FIELD_IPV6_TCP);
> > -               mrqc |=( E1000_MRQC_RSS_FIELD_IPV4_UDP |
> > -                   E1000_MRQC_RSS_FIELD_IPV6_UDP);
> > -               mrqc |=( E1000_MRQC_RSS_FIELD_IPV6_UDP_EX |
> > -                   E1000_MRQC_RSS_FIELD_IPV6_TCP_EX);
> >
> > -               E1000_WRITE_REG(hw, E1000_MRQC, mrqc);
> > +               /* rss setup */
> > +               igb_initialise_rss_mapping(adapter);
> >
> >                 /*
> >                 ** NOTE: Receive Full-Packet Checksum Offload
> > @@ -4831,7 +4998,7 @@ igb_rxeof(struct igb_queue *que, int cou
> >         for (i = rxr->next_to_check; count != 0;) {
> >                 struct mbuf        *sendmp, *mh, *mp;
> >                 struct igb_rx_buf  *rxbuf;
> > -               u16                hlen, plen, hdr, vtag;
> > +               u16                hlen, plen, hdr, vtag, pkt_info;
> >                 bool               eop = FALSE;
> >
> >                 cur = &rxr->rx_base[i];
> > @@ -4853,6 +5020,7 @@
> >                 else
> >                         vtag = le16toh(cur->wb.upper.vlan);
> >                 hdr = le16toh(cur->wb.lower.lo_dword.hs_rss.hdr_info);
> > +               pkt_info = le16toh(cur->wb.lower.lo_dword.hs_rss.pkt_info);
> >                 eop = ((staterr & E1000_RXD_STAT_EOP) == E1000_RXD_STAT_EOP);
> >
> >                 /* Make sure all segments of a bad packet are discarded */
> > @@ -4952,7 +5120,43 @@
> >                         rxr->fmp->m_pkthdr.ether_vtag = vtag;
> >                         rxr->fmp->m_flags |= M_VLANTAG;
> >                 }
> > -#ifndef IGB_LEGACY_TX
> > +#ifdef RSS
> > +               /* XXX set flowtype once this works right */
> > +               rxr->fmp->m_pkthdr.flowid =
> > +                   le32toh(cur->wb.lower.hi_dword.rss);
> > +               rxr->fmp->m_flags |= M_FLOWID;
> > +               switch (pkt_info & E1000_RXDADV_RSSTYPE_MASK) {
> > +               case E1000_RXDADV_RSSTYPE_IPV4_TCP:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV4);
> > +                       break;
> > +               case E1000_RXDADV_RSSTYPE_IPV4:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV4);
> > +                       break;
> > +               case E1000_RXDADV_RSSTYPE_IPV6_TCP:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV6);
> > +                       break;
> > +               case E1000_RXDADV_RSSTYPE_IPV6_EX:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV6_EX);
> > +                       break;
> > +               case E1000_RXDADV_RSSTYPE_IPV6:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_IPV6);
> > +                       break;
> > +               case E1000_RXDADV_RSSTYPE_IPV6_TCP_EX:
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_RSS_TCP_IPV6_EX);
> > +                       break;
> > +
> > +               /* XXX no UDP support in RSS just yet */
> > +#ifdef notyet
> > +               case E1000_RXDADV_RSSTYPE_IPV4_UDP:
> > +               case E1000_RXDADV_RSSTYPE_IPV6_UDP:
> > +               case E1000_RXDADV_RSSTYPE_IPV6_UDP_EX:
> > +#endif
> > +
> > +               default:
> > +                       /* XXX fallthrough */
> > +                       M_HASHTYPE_SET(rxr->fmp, M_HASHTYPE_NONE);
> > +               }
> > +#elif !defined(IGB_LEGACY_TX)
> >                 rxr->fmp->m_pkthdr.flowid = que->msix;
> >                 rxr->fmp->m_flags |= M_FLOWID;
> >  #endif
> >
> >

--
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, rizzo@iet.unipi.it   . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/         . Universita` di Pisa
 TEL      +39-050-2211611                . via Diotisalvi 2
 Mobile   +39-338-6809875                . 56122 PISA (Italy)
-----------------------------------------+-------------------------------