From owner-svn-src-all@FreeBSD.ORG Tue Jun 19 07:34:15 2012
Message-Id: <201206190734.q5J7YErt028615@svn.freebsd.org>
From: Navdeep Parhar
Date: Tue, 19 Jun 2012 07:34:14 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
X-SVN-Group: head
Subject: svn commit: r237263 - in head: sbin/ifconfig sys/amd64/conf sys/conf sys/contrib/rdma sys/contrib/rdma/krping sys/dev/cxgb sys/dev/cxgb/common sys/dev/cxgb/sys sys/dev/cxgb/ulp/iw_cxgb sys/dev/cxgb...

Author: np
Date: Tue Jun 19 07:34:13 2012
New Revision: 237263
URL: http://svn.freebsd.org/changeset/base/237263

Log:
  - Updated TOE support in the kernel.

  - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
    These are available as t3_tom and t4_tom modules that augment cxgb(4)
    and cxgbe(4) respectively.
    The cxgb/cxgbe drivers continue to work as usual with or without these
    extra features.

  - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
    works and will follow soon.

  Build-tested with make universe.

  30s overview
  ============
  What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
  capabilities of an interface:
  # ifconfig -m | grep TOE

  Enable/disable TCP offload on an interface (just like any other ifnet
  capability):
  # ifconfig cxgbe0 toe
  # ifconfig cxgbe0 -toe

  Which connections are offloaded? Look for toe4 and/or toe6 in the
  output of netstat and sockstat:
  # netstat -np tcp | grep toe
  # sockstat -46c | grep toe

  Reviewed by:	bz, gnn
  Sponsored by:	Chelsio communications.
  MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)

Added:
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_ib_intfc.h   (contents, props changed)
  head/sys/dev/cxgbe/tom/
  head/sys/dev/cxgbe/tom/t4_connect.c   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_cpl_io.c   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_listen.c   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_tom.c   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_tom.h   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_tom_l2t.c   (contents, props changed)
  head/sys/dev/cxgbe/tom/t4_tom_l2t.h   (contents, props changed)
  head/sys/modules/cxgbe/tom/
  head/sys/modules/cxgbe/tom/Makefile   (contents, props changed)
  head/sys/modules/toecore/
  head/sys/modules/toecore/Makefile   (contents, props changed)
  head/sys/netinet/toecore.c   (contents, props changed)
  head/sys/netinet/toecore.h   (contents, props changed)

Deleted:
  head/sys/dev/cxgb/cxgb_offload.c
  head/sys/dev/cxgb/t3cdev.h
  head/sys/dev/cxgb/ulp/toecore/
  head/sys/dev/cxgb/ulp/tom/cxgb_cpl_socket.c
  head/sys/dev/cxgb/ulp/tom/cxgb_ddp.c
  head/sys/dev/cxgb/ulp/tom/cxgb_defs.h
  head/sys/dev/cxgb/ulp/tom/cxgb_t3_ddp.h
  head/sys/dev/cxgb/ulp/tom/cxgb_tcp.h
  head/sys/dev/cxgb/ulp/tom/cxgb_tcp_offload.c
  head/sys/dev/cxgb/ulp/tom/cxgb_tcp_offload.h
  head/sys/dev/cxgb/ulp/tom/cxgb_tom_sysctl.c
  head/sys/modules/cxgb/toecore/
  head/sys/netinet/toedev.h

Modified:
  head/sbin/ifconfig/ifconfig.c
  head/sys/amd64/conf/GENERIC
  head/sys/conf/NOTES
  head/sys/conf/files
  head/sys/conf/options
  head/sys/contrib/rdma/krping/krping.c
  head/sys/contrib/rdma/krping/krping.h
  head/sys/contrib/rdma/krping/krping_dev.c
  head/sys/contrib/rdma/rdma_addr.c
  head/sys/contrib/rdma/rdma_cache.c
  head/sys/dev/cxgb/common/cxgb_ctl_defs.h
  head/sys/dev/cxgb/cxgb_adapter.h
  head/sys/dev/cxgb/cxgb_main.c
  head/sys/dev/cxgb/cxgb_offload.h
  head/sys/dev/cxgb/cxgb_osdep.h
  head/sys/dev/cxgb/cxgb_sge.c
  head/sys/dev/cxgb/sys/mvec.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cm.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cm.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_cq.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_dbg.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_ev.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_hal.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_hal.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_mem.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_provider.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_provider.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_qp.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_resource.c
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_user.h
  head/sys/dev/cxgb/ulp/iw_cxgb/iw_cxgb_wr.h
  head/sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c
  head/sys/dev/cxgb/ulp/tom/cxgb_l2t.c
  head/sys/dev/cxgb/ulp/tom/cxgb_l2t.h
  head/sys/dev/cxgb/ulp/tom/cxgb_listen.c
  head/sys/dev/cxgb/ulp/tom/cxgb_toepcb.h
  head/sys/dev/cxgb/ulp/tom/cxgb_tom.c
  head/sys/dev/cxgb/ulp/tom/cxgb_tom.h
  head/sys/dev/cxgbe/adapter.h
  head/sys/dev/cxgbe/common/t4_hw.c
  head/sys/dev/cxgbe/offload.h
  head/sys/dev/cxgbe/t4_l2t.c
  head/sys/dev/cxgbe/t4_l2t.h
  head/sys/dev/cxgbe/t4_main.c
  head/sys/dev/cxgbe/t4_sge.c
  head/sys/i386/conf/GENERIC
  head/sys/i386/conf/XEN
  head/sys/modules/Makefile
  head/sys/modules/cxgb/Makefile
  head/sys/modules/cxgb/cxgb/Makefile
  head/sys/modules/cxgb/iw_cxgb/Makefile
  head/sys/modules/cxgb/tom/Makefile
  head/sys/modules/cxgbe/Makefile
  head/sys/modules/rdma/krping/Makefile
  head/sys/net/if_var.h
  head/sys/net/if_vlan.c
  head/sys/netinet/if_ether.c
  head/sys/netinet/if_ether.h
  head/sys/netinet/in.c
  head/sys/netinet/tcp_input.c
  head/sys/netinet/tcp_offload.c
  head/sys/netinet/tcp_offload.h
  head/sys/netinet/tcp_output.c
  head/sys/netinet/tcp_subr.c
  head/sys/netinet/tcp_syncache.c
  head/sys/netinet/tcp_syncache.h
  head/sys/netinet/tcp_timer.c
  head/sys/netinet/tcp_usrreq.c
  head/sys/netinet/tcp_var.h
  head/sys/ofed/drivers/infiniband/core/cma.c
  head/sys/ofed/drivers/infiniband/core/iwcm.c
  head/sys/ofed/include/linux/net.h
  head/sys/ofed/include/net/netevent.h
  head/sys/ofed/include/rdma/iw_cm.h
  head/usr.bin/netstat/inet.c
  head/usr.bin/sockstat/sockstat.c

Modified: head/sbin/ifconfig/ifconfig.c
==============================================================================
--- head/sbin/ifconfig/ifconfig.c	Tue Jun 19 06:52:21 2012	(r237262)
+++ head/sbin/ifconfig/ifconfig.c	Tue Jun 19 07:34:13 2012	(r237263)
@@ -916,7 +916,7 @@ unsetifdescr(const char *val, int value,
 #define IFCAPBITS \
 "\020\1RXCSUM\2TXCSUM\3NETCONS\4VLAN_MTU\5VLAN_HWTAGGING\6JUMBO_MTU\7POLLING" \
 "\10VLAN_HWCSUM\11TSO4\12TSO6\13LRO\14WOL_UCAST\15WOL_MCAST\16WOL_MAGIC" \
-"\21VLAN_HWFILTER\23VLAN_HWTSO\24LINKSTATE\25NETMAP" \
+"\17TOE4\20TOE6\21VLAN_HWFILTER\23VLAN_HWTSO\24LINKSTATE\25NETMAP" \
 "\26RXCSUM_IPV6\27TXCSUM_IPV6"
 
 /*
@@ -1212,6 +1212,8 @@ static struct cmd basic_cmds[] = {
 	DEF_CMD("-tso4", -IFCAP_TSO4, setifcap),
 	DEF_CMD("tso", IFCAP_TSO, setifcap),
 	DEF_CMD("-tso", -IFCAP_TSO, setifcap),
+	DEF_CMD("toe", IFCAP_TOE, setifcap),
+	DEF_CMD("-toe", -IFCAP_TOE, setifcap),
 	DEF_CMD("lro", IFCAP_LRO, setifcap),
 	DEF_CMD("-lro", -IFCAP_LRO, setifcap),
 	DEF_CMD("wol", IFCAP_WOL, setifcap),

Modified: head/sys/amd64/conf/GENERIC
==============================================================================
--- head/sys/amd64/conf/GENERIC	Tue Jun
19 06:52:21 2012	(r237262)
+++ head/sys/amd64/conf/GENERIC	Tue Jun 19 07:34:13 2012	(r237263)
@@ -28,6 +28,7 @@ options 	SCHED_ULE		# ULE scheduler
 options 	PREEMPTION		# Enable kernel thread preemption
 options 	INET			# InterNETworking
 options 	INET6			# IPv6 communications protocols
+options 	TCP_OFFLOAD		# TCP offload
 options 	SCTP			# Stream Control Transmission Protocol
 options 	FFS			# Berkeley Fast Filesystem
 options 	SOFTUPDATES		# Enable FFS soft updates support

Modified: head/sys/conf/NOTES
==============================================================================
--- head/sys/conf/NOTES	Tue Jun 19 06:52:21 2012	(r237262)
+++ head/sys/conf/NOTES	Tue Jun 19 07:34:13 2012	(r237263)
@@ -545,6 +545,8 @@ options 	INET6			#IPv6 communications pr
 
 options 	ROUTETABLES=2		# max 16. 1 is back compatible.
 
+options 	TCP_OFFLOAD		# TCP offload support.
+
 # In order to enable IPSEC you MUST also add device crypto to
 # your kernel configuration
 options 	IPSEC			#IP security (requires device crypto)

Modified: head/sys/conf/files
==============================================================================
--- head/sys/conf/files	Tue Jun 19 06:52:21 2012	(r237262)
+++ head/sys/conf/files	Tue Jun 19 07:34:13 2012	(r237263)
@@ -1038,8 +1038,6 @@ dev/cs/if_cs_isa.c		optional cs isa
 dev/cs/if_cs_pccard.c		optional cs pccard
 dev/cxgb/cxgb_main.c		optional cxgb pci \
 	compile-with "${NORMAL_C} -I$S/dev/cxgb"
-dev/cxgb/cxgb_offload.c		optional cxgb pci \
-	compile-with "${NORMAL_C} -I$S/dev/cxgb"
 dev/cxgb/cxgb_sge.c		optional cxgb pci \
 	compile-with "${NORMAL_C} -I$S/dev/cxgb"
 dev/cxgb/common/cxgb_mc5.c	optional cxgb pci \
@@ -3037,7 +3035,7 @@ netinet/tcp_hostcache.c		optional inet |
 netinet/tcp_input.c		optional inet | inet6
 netinet/tcp_lro.c		optional inet | inet6
 netinet/tcp_output.c		optional inet | inet6
-netinet/tcp_offload.c		optional inet | inet6
+netinet/tcp_offload.c		optional tcp_offload inet | tcp_offload inet6
 netinet/tcp_reass.c		optional inet | inet6
 netinet/tcp_sack.c		optional inet | inet6
netinet/tcp_subr.c optional inet | inet6 Modified: head/sys/conf/options ============================================================================== --- head/sys/conf/options Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/conf/options Tue Jun 19 07:34:13 2012 (r237263) @@ -434,7 +434,7 @@ RADIX_MPATH opt_mpath.h ROUTETABLES opt_route.h SLIP_IFF_OPTS opt_slip.h TCPDEBUG -TCP_OFFLOAD_DISABLE opt_inet.h #Disable code to dispatch tcp offloading +TCP_OFFLOAD opt_inet.h # Enable code to dispatch TCP offloading TCP_SIGNATURE opt_inet.h VLAN_ARRAY opt_vlan.h XBONEHACK Modified: head/sys/contrib/rdma/krping/krping.c ============================================================================== --- head/sys/contrib/rdma/krping/krping.c Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/contrib/rdma/krping/krping.c Tue Jun 19 07:34:13 2012 (r237263) @@ -41,7 +41,6 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include #include #include #include @@ -53,11 +52,13 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include -#include +#include +#include #include "getopt.h" #include "krping.h" @@ -83,6 +84,7 @@ static const struct krping_option krping {"bw", OPT_NOPARAM, 'B'}, {"tx-depth", OPT_INT, 't'}, {"poll", OPT_NOPARAM, 'P'}, + {"memlimit", OPT_INT, 'm'}, {NULL, 0, 0} }; @@ -254,10 +256,14 @@ static void krping_cq_event_handler(stru ib_req_notify_cq(cb->cq, IB_CQ_NEXT_COMP); while ((ret = ib_poll_cq(cb->cq, 1, &wc)) == 1) { if (wc.status) { - if (wc.status != IB_WC_WR_FLUSH_ERR) - log(LOG_ERR, "cq completion failed status %d\n", + if (wc.status == IB_WC_WR_FLUSH_ERR) { + DEBUG_LOG("cq flushed\n"); + continue; + } else { + log(LOG_CRIT, "cq completion failed status %d\n", wc.status); - goto error; + goto error; + } } switch (wc.opcode) { @@ -432,8 +438,17 @@ static int krping_setup_buffers(struct k } } - cb->rdma_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, 0, -1UL, - PAGE_SIZE, 0); + /* RNIC adapters have a limit upto which it can register 
physical memory + * If DMA-MR memory mode is set then normally driver registers maximum + * supported memory. After that if contigmalloc allocates memory beyond the + * specified RNIC limit then Krping may not work. + */ + if (cb->use_dmamr && cb->memlimit) + cb->rdma_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, 0, cb->memlimit, + PAGE_SIZE, 0); + else + cb->rdma_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, 0, -1UL, + PAGE_SIZE, 0); if (!cb->rdma_buf) { log(LOG_ERR, "rdma_buf malloc failed\n"); @@ -458,8 +473,12 @@ static int krping_setup_buffers(struct k } if (!cb->server || cb->wlat || cb->rlat || cb->bw) { - cb->start_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, - 0, -1UL, PAGE_SIZE, 0); + if (cb->use_dmamr && cb->memlimit) + cb->start_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, + 0, cb->memlimit, PAGE_SIZE, 0); + else + cb->start_buf = contigmalloc(cb->size, M_DEVBUF, M_WAITOK, + 0, -1UL, PAGE_SIZE, 0); if (!cb->start_buf) { log(LOG_ERR, "start_buf malloc failed\n"); ret = ENOMEM; @@ -1636,6 +1655,8 @@ int krping_doit(char *cmd) cb->state = IDLE; cb->size = 64; cb->txdepth = RPING_SQ_DEPTH; + cb->use_dmamr = 1; + cb->memlimit = 0; mtx_init(&cb->lock, "krping mtx", NULL, MTX_DUPOK|MTX_DEF); while ((op = krping_getopt("krping", &cmd, krping_opts, NULL, &optarg, @@ -1713,6 +1734,15 @@ int krping_doit(char *cmd) case 'd': debug++; break; + case 'm': + cb->memlimit = optint; + if (cb->memlimit < 1) { + log(LOG_ERR, "Invalid memory limit %ju\n", + cb->memlimit); + ret = EINVAL; + } else + DEBUG_LOG(PFX "memory limit %d\n", (int)optint); + break; default: log(LOG_ERR, "unknown opt %s\n", optarg); ret = EINVAL; Modified: head/sys/contrib/rdma/krping/krping.h ============================================================================== --- head/sys/contrib/rdma/krping/krping.h Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/contrib/rdma/krping/krping.h Tue Jun 19 07:34:13 2012 (r237263) @@ -1,7 +1,7 @@ /* * $FreeBSD$ */ -#include +#include #include /* 
@@ -92,6 +92,8 @@ struct krping_cb { int count; /* ping count */ int size; /* ping data size */ int validate; /* validate ping data */ + uint64_t memlimit; /* limit of the physical memory that + can be registered with dma_mr mode */ /* CM stuff */ struct rdma_cm_id *cm_id; /* connection on client side,*/ Modified: head/sys/contrib/rdma/krping/krping_dev.c ============================================================================== --- head/sys/contrib/rdma/krping/krping_dev.c Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/contrib/rdma/krping/krping_dev.c Tue Jun 19 07:34:13 2012 (r237263) @@ -14,7 +14,6 @@ __FBSDID("$FreeBSD$"); #include -#include #include /* uprintf */ #include #include /* defines used in kernel.h */ @@ -51,6 +50,9 @@ typedef struct s_krping { /* vars */ static struct cdev *krping_dev; +#undef MODULE_VERSION +#include + static int krping_loader(struct module *m, int what, void *arg) { @@ -175,6 +177,4 @@ krping_write(struct cdev *dev, struct ui return(err); } -MODULE_DEPEND(krping, rdma_core, 1, 1, 1); -MODULE_DEPEND(krping, rdma_cma, 1, 1, 1); DEV_MODULE(krping,krping_loader,NULL); Modified: head/sys/contrib/rdma/rdma_addr.c ============================================================================== --- head/sys/contrib/rdma/rdma_addr.c Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/contrib/rdma/rdma_addr.c Tue Jun 19 07:34:13 2012 (r237263) @@ -117,7 +117,8 @@ int rdma_copy_addr(struct rdma_dev_addr const unsigned char *dst_dev_addr) { dev_addr->dev_type = RDMA_NODE_RNIC; - memcpy(dev_addr->src_dev_addr, IF_LLADDR(dev), MAX_ADDR_LEN); + memset(dev_addr->src_dev_addr, 0, MAX_ADDR_LEN); + memcpy(dev_addr->src_dev_addr, IF_LLADDR(dev), dev->if_addrlen); memcpy(dev_addr->broadcast, dev->if_broadcastaddr, MAX_ADDR_LEN); if (dst_dev_addr) memcpy(dev_addr->dst_dev_addr, dst_dev_addr, MAX_ADDR_LEN); @@ -207,7 +208,7 @@ static int addr_resolve_remote(struct so goto put; } ret = arpresolve(iproute.ro_rt->rt_ifp, iproute.ro_rt, NULL, - 
rt_key(iproute.ro_rt), dmac, &lle); + (struct sockaddr *)dst_in, dmac, &lle); if (ret) { goto put; } Modified: head/sys/contrib/rdma/rdma_cache.c ============================================================================== --- head/sys/contrib/rdma/rdma_cache.c Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/contrib/rdma/rdma_cache.c Tue Jun 19 07:34:13 2012 (r237263) @@ -132,7 +132,7 @@ int ib_find_cached_gid(struct ib_device for (p = 0; p <= end_port(device) - start_port(device); ++p) { cache = device->cache.gid_cache[p]; for (i = 0; i < cache->table_len; ++i) { - if (!memcmp(gid, &cache->table[i], 6)) { /* XXX */ + if (!memcmp(gid, &cache->table[i], sizeof *gid)) { *port_num = p + start_port(device); if (index) *index = i; Modified: head/sys/dev/cxgb/common/cxgb_ctl_defs.h ============================================================================== --- head/sys/dev/cxgb/common/cxgb_ctl_defs.h Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/dev/cxgb/common/cxgb_ctl_defs.h Tue Jun 19 07:34:13 2012 (r237263) @@ -60,14 +60,12 @@ struct mtutab { const unsigned short *mtus; /* the MTU table values */ }; -struct net_device; - /* - * Structure used to request the adapter net_device owning a given MAC address. + * Structure used to request the ifnet that owns a given MAC address. 
*/ struct iff_mac { - struct net_device *dev; /* the net_device */ - const unsigned char *mac_addr; /* MAC address to lookup */ + struct ifnet *dev; + const unsigned char *mac_addr; u16 vlan_tag; }; @@ -85,7 +83,7 @@ struct ddp_params { struct adap_ports { unsigned int nports; /* number of ports on this adapter */ - struct net_device *lldevs[MAX_NPORTS]; + struct ifnet *lldevs[MAX_NPORTS]; }; /* Modified: head/sys/dev/cxgb/cxgb_adapter.h ============================================================================== --- head/sys/dev/cxgb/cxgb_adapter.h Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/dev/cxgb/cxgb_adapter.h Tue Jun 19 07:34:13 2012 (r237263) @@ -57,7 +57,6 @@ $FreeBSD$ #include #include -#include #include struct adapter; @@ -130,6 +129,7 @@ enum { CXGB_OFLD_INIT = (1 << 7), TP_PARITY_INIT = (1 << 8), CXGB_BUSY = (1 << 9), + TOM_INIT_DONE = (1 << 10), /* port flags */ DOOMED = (1 << 0), @@ -179,7 +179,6 @@ struct sge_rspq { uint32_t async_notif; uint32_t cntxt_id; uint32_t offload_pkts; - uint32_t offload_bundles; uint32_t pure_rsps; uint32_t unhandled_irqs; uint32_t starved; @@ -291,6 +290,7 @@ struct sge_qset { uint32_t txq_stopped; /* which Tx queues are stopped */ uint64_t port_stats[SGE_PSTAT_MAX]; struct port_info *port; + struct adapter *adap; int idx; /* qset # */ int qs_flags; int coalescing; @@ -307,10 +307,13 @@ struct sge { struct filter_info; +typedef int (*cpl_handler_t)(struct sge_qset *, struct rsp_desc *, + struct mbuf *); + struct adapter { + SLIST_ENTRY(adapter) link; device_t dev; int flags; - TAILQ_ENTRY(adapter) adapter_entry; /* PCI register resources */ int regs_rid; @@ -376,11 +379,16 @@ struct adapter { struct port_info port[MAX_NPORTS]; device_t portdev[MAX_NPORTS]; - struct t3cdev tdev; +#ifdef TCP_OFFLOAD + void *tom_softc; + void *iwarp_softc; +#endif char fw_version[64]; char port_types[MAX_NPORTS + 1]; uint32_t open_device_map; - uint32_t registered_device_map; +#ifdef TCP_OFFLOAD + int offload_map; +#endif struct mtx 
lock; driver_intr_t *cxgb_intr; int msi_count; @@ -392,6 +400,11 @@ struct adapter { char elmerlockbuf[ADAPTER_LOCK_NAME_LEN]; int timestamp; + +#ifdef TCP_OFFLOAD +#define NUM_CPL_HANDLERS 0xa7 + cpl_handler_t cpl_handler[NUM_CPL_HANDLERS] __aligned(CACHE_LINE_SIZE); +#endif }; struct t3_rx_mode { @@ -502,10 +515,12 @@ void t3_os_link_changed(adapter_t *adapt int speed, int duplex, int fc, int mac_was_reset); void t3_os_phymod_changed(struct adapter *adap, int port_id); void t3_sge_err_intr_handler(adapter_t *adapter); -int t3_offload_tx(struct t3cdev *, struct mbuf *); +#ifdef TCP_OFFLOAD +int t3_offload_tx(struct adapter *, struct mbuf *); +#endif void t3_os_set_hw_addr(adapter_t *adapter, int port_idx, u8 hw_addr[]); int t3_mgmt_tx(adapter_t *adap, struct mbuf *m); - +int t3_register_cpl_handler(struct adapter *, int, cpl_handler_t); int t3_sge_alloc(struct adapter *); int t3_sge_free(struct adapter *); @@ -556,15 +571,9 @@ txq_to_qset(struct sge_txq *q, int qidx) return container_of(q, struct sge_qset, txq[qidx]); } -static __inline struct adapter * -tdev2adap(struct t3cdev *d) -{ - return container_of(d, struct adapter, tdev); -} - #undef container_of -#define OFFLOAD_DEVMAP_BIT 15 +#define OFFLOAD_DEVMAP_BIT (1 << MAX_NPORTS) static inline int offload_running(adapter_t *adapter) { return isset(&adapter->open_device_map, OFFLOAD_DEVMAP_BIT); @@ -573,4 +582,5 @@ static inline int offload_running(adapte void cxgb_tx_watchdog(void *arg); int cxgb_transmit(struct ifnet *ifp, struct mbuf *m); void cxgb_qflush(struct ifnet *ifp); +void t3_iterate(void (*)(struct adapter *, void *), void *); #endif Modified: head/sys/dev/cxgb/cxgb_main.c ============================================================================== --- head/sys/dev/cxgb/cxgb_main.c Tue Jun 19 06:52:21 2012 (r237262) +++ head/sys/dev/cxgb/cxgb_main.c Tue Jun 19 07:34:13 2012 (r237263) @@ -30,6 +30,8 @@ POSSIBILITY OF SUCH DAMAGE. 
#include __FBSDID("$FreeBSD$"); +#include "opt_inet.h" + #include #include #include @@ -107,6 +109,9 @@ static inline void mk_set_tcb_field(stru unsigned int, u64, u64); static inline void set_tcb_field_ulp(struct cpl_set_tcb_field *, unsigned int, unsigned int, u64, u64); +#ifdef TCP_OFFLOAD +static int cpl_not_handled(struct sge_qset *, struct rsp_desc *, struct mbuf *); +#endif /* Attachment glue for the PCI controller end of the device. Each port of * the device is attached separately, as defined later. @@ -119,10 +124,11 @@ static __inline void reg_block_dump(stru unsigned int end); static void cxgb_get_regs(adapter_t *sc, struct ch_ifconf_regs *regs, uint8_t *buf); static int cxgb_get_regs_len(void); -static int offload_open(struct port_info *pi); static void touch_bars(device_t dev); -static int offload_close(struct t3cdev *tdev); static void cxgb_update_mac_settings(struct port_info *p); +#ifdef TCP_OFFLOAD +static int toe_capability(struct port_info *, int); +#endif static device_method_t cxgb_controller_methods[] = { DEVMETHOD(device_probe, cxgb_controller_probe), @@ -138,8 +144,11 @@ static driver_t cxgb_controller_driver = sizeof(struct adapter) }; +static int cxgbc_mod_event(module_t, int, void *); static devclass_t cxgb_controller_devclass; -DRIVER_MODULE(cxgbc, pci, cxgb_controller_driver, cxgb_controller_devclass, 0, 0); +DRIVER_MODULE(cxgbc, pci, cxgb_controller_driver, cxgb_controller_devclass, + cxgbc_mod_event, 0); +MODULE_VERSION(cxgbc, 1); /* * Attachment glue for the ports. 
Attachment is done directly to the @@ -177,6 +186,14 @@ static struct cdevsw cxgb_cdevsw = { static devclass_t cxgb_port_devclass; DRIVER_MODULE(cxgb, cxgbc, cxgb_port_driver, cxgb_port_devclass, 0, 0); +MODULE_VERSION(cxgb, 1); + +static struct mtx t3_list_lock; +static SLIST_HEAD(, adapter) t3_list; +#ifdef TCP_OFFLOAD +static struct mtx t3_uld_list_lock; +static SLIST_HEAD(, uld_info) t3_uld_list; +#endif /* * The driver uses the best interrupt scheme available on a platform in the @@ -195,15 +212,6 @@ SYSCTL_INT(_hw_cxgb, OID_AUTO, msi_allow "MSI-X, MSI, INTx selector"); /* - * The driver enables offload as a default. - * To disable it, use ofld_disable = 1. - */ -static int ofld_disable = 0; -TUNABLE_INT("hw.cxgb.ofld_disable", &ofld_disable); -SYSCTL_INT(_hw_cxgb, OID_AUTO, ofld_disable, CTLFLAG_RDTUN, &ofld_disable, 0, - "disable ULP offload"); - -/* * The driver uses an auto-queue algorithm by default. * To disable it and force a single queue-set per port, use multiq = 0 */ @@ -445,6 +453,25 @@ cxgb_controller_attach(device_t dev) sc->msi_count = 0; ai = cxgb_get_adapter_info(dev); + snprintf(sc->lockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb controller lock %d", + device_get_unit(dev)); + ADAPTER_LOCK_INIT(sc, sc->lockbuf); + + snprintf(sc->reglockbuf, ADAPTER_LOCK_NAME_LEN, "SGE reg lock %d", + device_get_unit(dev)); + snprintf(sc->mdiolockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb mdio lock %d", + device_get_unit(dev)); + snprintf(sc->elmerlockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb elmer lock %d", + device_get_unit(dev)); + + MTX_INIT(&sc->sge.reg_lock, sc->reglockbuf, NULL, MTX_SPIN); + MTX_INIT(&sc->mdio_lock, sc->mdiolockbuf, NULL, MTX_DEF); + MTX_INIT(&sc->elmer_lock, sc->elmerlockbuf, NULL, MTX_DEF); + + mtx_lock(&t3_list_lock); + SLIST_INSERT_HEAD(&t3_list, sc, link); + mtx_unlock(&t3_list_lock); + /* find the PCIe link width and set max read request to 4KB*/ if (pci_find_cap(dev, PCIY_EXPRESS, ®) == 0) { uint16_t lnk; @@ -471,24 +498,10 @@ 
cxgb_controller_attach(device_t dev) if ((sc->regs_res = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &sc->regs_rid, RF_ACTIVE)) == NULL) { device_printf(dev, "Cannot allocate BAR region 0\n"); - return (ENXIO); + error = ENXIO; + goto out; } - snprintf(sc->lockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb controller lock %d", - device_get_unit(dev)); - ADAPTER_LOCK_INIT(sc, sc->lockbuf); - - snprintf(sc->reglockbuf, ADAPTER_LOCK_NAME_LEN, "SGE reg lock %d", - device_get_unit(dev)); - snprintf(sc->mdiolockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb mdio lock %d", - device_get_unit(dev)); - snprintf(sc->elmerlockbuf, ADAPTER_LOCK_NAME_LEN, "cxgb elmer lock %d", - device_get_unit(dev)); - - MTX_INIT(&sc->sge.reg_lock, sc->reglockbuf, NULL, MTX_SPIN); - MTX_INIT(&sc->mdio_lock, sc->mdiolockbuf, NULL, MTX_DEF); - MTX_INIT(&sc->elmer_lock, sc->elmerlockbuf, NULL, MTX_DEF); - sc->bt = rman_get_bustag(sc->regs_res); sc->bh = rman_get_bushandle(sc->regs_res); sc->mmio_len = rman_get_size(sc->regs_res); @@ -604,7 +617,7 @@ cxgb_controller_attach(device_t dev) } else { sc->flags |= TPS_UPTODATE; } - + /* * Create a child device for each MAC. The ethernet attachment * will be done in these children. 
@@ -636,12 +649,7 @@ cxgb_controller_attach(device_t dev) t3_sge_init_adapter(sc); t3_led_ready(sc); - - cxgb_offload_init(); - if (is_offload(sc)) { - setbit(&sc->registered_device_map, OFFLOAD_DEVMAP_BIT); - cxgb_adapter_ofld(sc); - } + error = t3_get_fw_version(sc, &vers); if (error) goto out; @@ -662,6 +670,11 @@ cxgb_controller_attach(device_t dev) device_printf(sc->dev, "Firmware Version %s\n", &sc->fw_version[0]); callout_reset(&sc->cxgb_tick_ch, hz, cxgb_tick, sc); t3_add_attach_sysctls(sc); + +#ifdef TCP_OFFLOAD + for (i = 0; i < NUM_CPL_HANDLERS; i++) + sc->cpl_handler[i] = cpl_not_handled; +#endif out: if (error) cxgb_free(sc); @@ -775,20 +788,9 @@ cxgb_free(struct adapter *sc) sc->tq = NULL; } - if (is_offload(sc)) { - clrbit(&sc->registered_device_map, OFFLOAD_DEVMAP_BIT); - cxgb_adapter_unofld(sc); - } - -#ifdef notyet - if (sc->flags & CXGB_OFLD_INIT) - cxgb_offload_deactivate(sc); -#endif free(sc->filters, M_DEVBUF); t3_sge_free(sc); - cxgb_offload_exit(); - if (sc->udbs_res != NULL) bus_release_resource(sc->dev, SYS_RES_MEMORY, sc->udbs_rid, sc->udbs_res); @@ -800,6 +802,9 @@ cxgb_free(struct adapter *sc) MTX_DESTROY(&sc->mdio_lock); MTX_DESTROY(&sc->sge.reg_lock); MTX_DESTROY(&sc->elmer_lock); + mtx_lock(&t3_list_lock); + SLIST_REMOVE(&t3_list, sc, adapter, link); + mtx_unlock(&t3_list_lock); ADAPTER_LOCK_DEINIT(sc); } @@ -1017,6 +1022,10 @@ cxgb_port_attach(device_t dev) ifp->if_qflush = cxgb_qflush; ifp->if_capabilities = CXGB_CAP; +#ifdef TCP_OFFLOAD + if (is_offload(sc)) + ifp->if_capabilities |= IFCAP_TOE4; +#endif ifp->if_capenable = CXGB_CAP_ENABLE; ifp->if_hwassist = CSUM_TCP | CSUM_UDP | CSUM_IP | CSUM_TSO; @@ -1420,65 +1429,6 @@ setup_rss(adapter_t *adap) cpus, rspq_map); } - -/* - * Sends an mbuf to an offload queue driver - * after dealing with any active network taps. 
- */ -static inline int -offload_tx(struct t3cdev *tdev, struct mbuf *m) -{ - int ret; - - ret = t3_offload_tx(tdev, m); - return (ret); -} - -static int -write_smt_entry(struct adapter *adapter, int idx) -{ - struct port_info *pi = &adapter->port[idx]; - struct cpl_smt_write_req *req; - struct mbuf *m; - - if ((m = m_gethdr(M_NOWAIT, MT_DATA)) == NULL) - return (ENOMEM); - - req = mtod(m, struct cpl_smt_write_req *); - m->m_pkthdr.len = m->m_len = sizeof(struct cpl_smt_write_req); - - req->wr.wrh_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); - OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SMT_WRITE_REQ, idx)); - req->mtu_idx = NMTUS - 1; /* should be 0 but there's a T3 bug */ - req->iff = idx; - memset(req->src_mac1, 0, sizeof(req->src_mac1)); - memcpy(req->src_mac0, pi->hw_addr, ETHER_ADDR_LEN); - - m_set_priority(m, 1); - - offload_tx(&adapter->tdev, m); - - return (0); -} - -static int -init_smt(struct adapter *adapter) -{ - int i; - - for_each_port(adapter, i) - write_smt_entry(adapter, i); - return 0; -} - -static void -init_port_mtus(adapter_t *adapter) -{ - unsigned int mtus = ETHERMTU | (ETHERMTU << 16); - - t3_write_reg(adapter, A_TP_MTU_PORT_TABLE, mtus); -} - static void send_pktsched_cmd(struct adapter *adap, int sched, int qidx, int lo, int hi, int port) @@ -1705,45 +1655,6 @@ cxgb_down(struct adapter *sc) t3_intr_disable(sc); } -static int -offload_open(struct port_info *pi) -{ - struct adapter *sc = pi->adapter; - struct t3cdev *tdev = &sc->tdev; - - setbit(&sc->open_device_map, OFFLOAD_DEVMAP_BIT); - - t3_tp_set_offload_mode(sc, 1); - tdev->lldev = pi->ifp; - init_port_mtus(sc); - t3_load_mtus(sc, sc->params.mtus, sc->params.a_wnd, sc->params.b_wnd, - sc->params.rev == 0 ? 
sc->port[0].ifp->if_mtu : 0xffff); - init_smt(sc); - cxgb_add_clients(tdev); - - return (0); -} - -static int -offload_close(struct t3cdev *tdev) -{ - struct adapter *adapter = tdev2adap(tdev); - - if (!isset(&adapter->open_device_map, OFFLOAD_DEVMAP_BIT)) - return (0); - - /* Call back all registered clients */ - cxgb_remove_clients(tdev); - - tdev->lldev = NULL; - cxgb_set_dummy_ops(tdev); - t3_tp_set_offload_mode(adapter, 0); - - clrbit(&adapter->open_device_map, OFFLOAD_DEVMAP_BIT); - - return (0); -} - /* * if_init for cxgb ports. */ @@ -1793,15 +1704,9 @@ cxgb_init_locked(struct port_info *p) ADAPTER_UNLOCK(sc); } - if (sc->open_device_map == 0) { - if ((rc = cxgb_up(sc)) != 0) + if (sc->open_device_map == 0 && ((rc = cxgb_up(sc)) != 0)) goto done; - if (is_offload(sc) && !ofld_disable && offload_open(p)) - log(LOG_WARNING, - "Could not initialize offload capabilities\n"); - } - PORT_LOCK(p); if (isset(&sc->open_device_map, p->port_id) && (ifp->if_drv_flags & IFF_DRV_RUNNING)) { @@ -1929,7 +1834,6 @@ cxgb_uninit_synchronized(struct port_inf DELAY(100 * 1000); t3_mac_disable(&pi->mac, MAC_DIRECTION_RX); - pi->phy.ops->power_down(&pi->phy, 1); PORT_UNLOCK(pi); @@ -1937,9 +1841,6 @@ cxgb_uninit_synchronized(struct port_inf pi->link_config.link_ok = 0; t3_os_link_changed(sc, pi->port_id, 0, 0, 0, 0, 0); - if ((sc->open_device_map & PORT_MASK) == 0) - offload_close(&sc->tdev); - if (sc->open_device_map == 0) cxgb_down(pi->adapter); @@ -2081,6 +1982,15 @@ fail: /* Safe to do this even if cxgb_up not called yet */ cxgb_set_lro(p, ifp->if_capenable & IFCAP_LRO); } +#ifdef TCP_OFFLOAD + if (mask & IFCAP_TOE4) { + int enable = (ifp->if_capenable ^ mask) & IFCAP_TOE4; + + error = toe_capability(p, enable); + if (error == 0) + ifp->if_capenable ^= mask; + } +#endif if (mask & IFCAP_VLAN_HWTAGGING) { ifp->if_capenable ^= IFCAP_VLAN_HWTAGGING; if (ifp->if_drv_flags & IFF_DRV_RUNNING) { @@ -3362,3 +3272,235 @@ set_tcb_field_ulp(struct cpl_set_tcb_fie txpkt->len = 
htonl(V_ULPTX_NFLITS(sizeof(*req) / 8)); mk_set_tcb_field(req, tid, word, mask, val); } + +void +t3_iterate(void (*func)(struct adapter *, void *), void *arg) +{ + struct adapter *sc; + + mtx_lock(&t3_list_lock); + SLIST_FOREACH(sc, &t3_list, link) { + /* + * func should not make any assumptions about what state sc is + * in - the only guarantee is that sc->sc_lock is a valid lock. + */ + func(sc, arg); + } + mtx_unlock(&t3_list_lock); +} + +#ifdef TCP_OFFLOAD +static int +toe_capability(struct port_info *pi, int enable) +{ + int rc; + struct adapter *sc = pi->adapter; + + ADAPTER_LOCK_ASSERT_OWNED(sc); + + if (!is_offload(sc)) + return (ENODEV); + + if (enable) { + if (!(sc->flags & FULL_INIT_DONE)) { + log(LOG_WARNING, + "You must enable a cxgb interface first\n"); + return (EAGAIN); + } + + if (isset(&sc->offload_map, pi->port_id)) + return (0); + + if (!(sc->flags & TOM_INIT_DONE)) { + rc = t3_activate_uld(sc, ULD_TOM); + if (rc == EAGAIN) { + log(LOG_WARNING, + "You must kldload t3_tom.ko before trying " + "to enable TOE on a cxgb interface.\n"); + } + if (rc != 0) + return (rc); + KASSERT(sc->tom_softc != NULL, + ("%s: TOM activated but softc NULL", __func__)); + KASSERT(sc->flags & TOM_INIT_DONE, + ("%s: TOM activated but flag not set", __func__)); + } + + setbit(&sc->offload_map, pi->port_id); + + /* + * XXX: Temporary code to allow iWARP to be enabled when TOE is + * enabled on any port. Need to figure out how to enable, + * disable, load, and unload iWARP cleanly. + */ + if (!isset(&sc->offload_map, MAX_NPORTS) && + t3_activate_uld(sc, ULD_IWARP) == 0) + setbit(&sc->offload_map, MAX_NPORTS); + } else { + if (!isset(&sc->offload_map, pi->port_id)) + return (0); + + KASSERT(sc->flags & TOM_INIT_DONE, + ("%s: TOM never initialized?", __func__)); + clrbit(&sc->offload_map, pi->port_id); + } + + return (0); +} + +/* + * Add an upper layer driver to the global list. 
+ */ +int +t3_register_uld(struct uld_info *ui) +{ + int rc = 0; + struct uld_info *u; + + mtx_lock(&t3_uld_list_lock); + SLIST_FOREACH(u, &t3_uld_list, link) { + if (u->uld_id == ui->uld_id) { + rc = EEXIST; + goto done; + } + } + + SLIST_INSERT_HEAD(&t3_uld_list, ui, link); + ui->refcount = 0; +done: + mtx_unlock(&t3_uld_list_lock); + return (rc); +} + +int +t3_unregister_uld(struct uld_info *ui) +{ + int rc = EINVAL; + struct uld_info *u; + + mtx_lock(&t3_uld_list_lock); + + SLIST_FOREACH(u, &t3_uld_list, link) { + if (u == ui) { + if (ui->refcount > 0) { + rc = EBUSY; + goto done; + } + + SLIST_REMOVE(&t3_uld_list, ui, uld_info, link); + rc = 0; + goto done; + } + } +done: + mtx_unlock(&t3_uld_list_lock); + return (rc); +} + +int +t3_activate_uld(struct adapter *sc, int id) +{ + int rc = EAGAIN; + struct uld_info *ui; + + mtx_lock(&t3_uld_list_lock); + + SLIST_FOREACH(ui, &t3_uld_list, link) { + if (ui->uld_id == id) { + rc = ui->activate(sc); + if (rc == 0) + ui->refcount++; + goto done; + } + } +done: + mtx_unlock(&t3_uld_list_lock); + + return (rc); +} + +int +t3_deactivate_uld(struct adapter *sc, int id) +{ + int rc = EINVAL; + struct uld_info *ui; + + mtx_lock(&t3_uld_list_lock); + + SLIST_FOREACH(ui, &t3_uld_list, link) { + if (ui->uld_id == id) { + rc = ui->deactivate(sc); + if (rc == 0) + ui->refcount--; + goto done; + } + } +done: + mtx_unlock(&t3_uld_list_lock); + + return (rc); +} + +static int +cpl_not_handled(struct sge_qset *qs __unused, struct rsp_desc *r __unused, + struct mbuf *m) +{ + m_freem(m); + return (EDOOFUS); +} + +int +t3_register_cpl_handler(struct adapter *sc, int opcode, cpl_handler_t h) +{ + uintptr_t *loc, new; + + if (opcode >= NUM_CPL_HANDLERS) + return (EINVAL); + + new = h ? 
(uintptr_t)h : (uintptr_t)cpl_not_handled; + loc = (uintptr_t *) &sc->cpl_handler[opcode]; + atomic_store_rel_ptr(loc, new); + + return (0); +} +#endif + +static int +cxgbc_mod_event(module_t mod, int cmd, void *arg) +{ + int rc = 0; + + switch (cmd) { + case MOD_LOAD: + mtx_init(&t3_list_lock, "T3 adapters", 0, MTX_DEF); + SLIST_INIT(&t3_list); +#ifdef TCP_OFFLOAD + mtx_init(&t3_uld_list_lock, "T3 ULDs", 0, MTX_DEF); + SLIST_INIT(&t3_uld_list); +#endif + break; + + case MOD_UNLOAD: +#ifdef TCP_OFFLOAD + mtx_lock(&t3_uld_list_lock); + if (!SLIST_EMPTY(&t3_uld_list)) { + rc = EBUSY; + mtx_unlock(&t3_uld_list_lock); + break; *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
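
Reader's note on the CPL handler table in the cxgb_main.c hunk above: every slot starts out pointing at a default "not handled" function, and t3_register_cpl_handler() swaps a slot with a release store, so the receive path never has to test for NULL. The following is only a user-space sketch of that pattern, not driver code. All demo_* names are hypothetical, C11 atomics stand in for the kernel's atomic_store_rel_ptr(), ENOSYS stands in for the FreeBSD-only EDOOFUS, and the negative-opcode check is an addition of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Same table size as NUM_CPL_HANDLERS in the diff; everything else is illustrative. */
#define DEMO_NUM_HANDLERS 0xa7

struct demo_msg {
	int opcode;
	int payload;
};

typedef int (*demo_handler_t)(struct demo_msg *);

struct demo_adapter {
	_Atomic(demo_handler_t) handler[DEMO_NUM_HANDLERS];
};

/* Default handler: reports that nobody claimed this opcode. */
int
demo_not_handled(struct demo_msg *m)
{
	(void)m;
	return (ENOSYS);
}

/* Example handler used to exercise the table. */
int
demo_double_payload(struct demo_msg *m)
{
	m->payload *= 2;
	return (0);
}

/* Point every slot at the default handler so dispatch never sees NULL. */
void
demo_adapter_init(struct demo_adapter *sc)
{
	demo_handler_t def = demo_not_handled;

	assert(sc != NULL);
	for (size_t i = 0; i < DEMO_NUM_HANDLERS; i++)
		atomic_init(&sc->handler[i], def);
}

/* Install h for opcode; passing NULL restores the default handler. */
int
demo_register_handler(struct demo_adapter *sc, int opcode, demo_handler_t h)
{
	demo_handler_t new_h = (h != NULL) ? h : demo_not_handled;

	if (opcode < 0 || opcode >= DEMO_NUM_HANDLERS)
		return (EINVAL);
	atomic_store_explicit(&sc->handler[opcode], new_h,
	    memory_order_release);
	return (0);
}

/* Look up the handler for the message's opcode and call it. */
int
demo_dispatch(struct demo_adapter *sc, struct demo_msg *m)
{
	demo_handler_t h;

	if (m->opcode < 0 || m->opcode >= DEMO_NUM_HANDLERS)
		return (EINVAL);
	h = atomic_load_explicit(&sc->handler[m->opcode],
	    memory_order_acquire);
	return (h(m));
}
```

A caller would run demo_adapter_init() once, register handlers as upper-layer modules load, and register NULL on unload to fall back to the default. Filling the table with a default keeps the hot path to one load and one indirect call.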