From owner-svn-src-stable@freebsd.org Fri Dec 11 18:28:21 2015 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C500CA049D7; Fri, 11 Dec 2015 18:28:21 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8CAD51528; Fri, 11 Dec 2015 18:28:21 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id tBBISK1T057210; Fri, 11 Dec 2015 18:28:20 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id tBBISKXJ057205; Fri, 11 Dec 2015 18:28:20 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201512111828.tBBISKXJ057205@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Fri, 11 Dec 2015 18:28:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org Subject: svn commit: r292115 - in stable/9/sys: modules/mlxen ofed/drivers/net/mlx4 ofed/include/linux/mlx4 X-SVN-Group: stable-9 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Dec 2015 18:28:21 -0000 Author: hselasky Date: Fri Dec 11 18:28:20 2015 New Revision: 292115 URL: https://svnweb.freebsd.org/changeset/base/292115 Log: MFC r283612, r290710, r291694, r291699 and r291793: - Add SIOCGI2C ioctl support to the driver. Would work only on ConnectX-3 with fresh firmware. The low level code is based on code provided by Mellanox. - Fix print formatting compile warnings for Sparc64 and PowerPC platforms. - Updated the mlx4 and mlxen drivers to the latest version, v2.1.6: - Added support for dumping the SFP EEPROM content to dmesg. - Fixed handling of network interface capability IOCTLs. - Fixed race when loading and unloading the mlxen driver by applying appropriate locking. - Removed two unused C-files. - Convert the mlxen driver to use the BUSDMA(9) APIs instead of vtophys() when loading mbufs for transmission and reception. While at it all pointer arithmetic and cast qualifier issues were fixed, mostly related to transmission and reception. - Fix i386 build WITH_OFED=YES. Remove some redundant KASSERTs. Sponsored by: Mellanox Technologies Differential Revision: https://reviews.freebsd.org/D4283 Differential Revision: https://reviews.freebsd.org/D4284 Deleted: stable/9/sys/ofed/drivers/net/mlx4/en_ethtool.c stable/9/sys/ofed/drivers/net/mlx4/en_selftest.c Modified: stable/9/sys/modules/mlxen/Makefile stable/9/sys/ofed/drivers/net/mlx4/en_main.c stable/9/sys/ofed/drivers/net/mlx4/en_netdev.c stable/9/sys/ofed/drivers/net/mlx4/en_port.c stable/9/sys/ofed/drivers/net/mlx4/en_rx.c stable/9/sys/ofed/drivers/net/mlx4/en_tx.c stable/9/sys/ofed/drivers/net/mlx4/main.c stable/9/sys/ofed/drivers/net/mlx4/mlx4.h stable/9/sys/ofed/drivers/net/mlx4/mlx4_en.h stable/9/sys/ofed/drivers/net/mlx4/mlx4_stats.h stable/9/sys/ofed/drivers/net/mlx4/port.c stable/9/sys/ofed/include/linux/mlx4/cq.h stable/9/sys/ofed/include/linux/mlx4/device.h stable/9/sys/ofed/include/linux/mlx4/doorbell.h stable/9/sys/ofed/include/linux/mlx4/qp.h Directory Properties: stable/9/sys/ (props changed) stable/9/sys/modules/ (props changed) Modified: stable/9/sys/modules/mlxen/Makefile ============================================================================== --- stable/9/sys/modules/mlxen/Makefile Fri Dec 11 16:51:04 2015 (r292114) +++ stable/9/sys/modules/mlxen/Makefile Fri Dec 11 18:28:20 2015 (r292115) @@ -26,4 +26,4 @@ opt_inet6.h: .include -CFLAGS+= -Wno-cast-qual -Wno-pointer-arith ${GCC_MS_EXTENSIONS} +CFLAGS+= ${GCC_MS_EXTENSIONS} Modified: stable/9/sys/ofed/drivers/net/mlx4/en_main.c ============================================================================== --- stable/9/sys/ofed/drivers/net/mlx4/en_main.c Fri Dec 11 16:51:04 2015 (r292114) +++ stable/9/sys/ofed/drivers/net/mlx4/en_main.c Fri Dec 11 18:28:20 2015 (r292115) @@ -42,14 +42,7 @@ #include "mlx4_en.h" -MODULE_AUTHOR("Liran Liss, Yevgeny Petrilin"); -MODULE_DESCRIPTION("Mellanox ConnectX HCA Ethernet driver"); -MODULE_LICENSE("Dual BSD/GPL"); -MODULE_VERSION(DRV_VERSION " ("DRV_RELDATE")"); - -static const char mlx4_en_version[] = - DRV_NAME ": Mellanox ConnectX HCA Ethernet driver v" - DRV_VERSION " (" DRV_RELDATE ")\n"; +/* Mellanox ConnectX HCA Ethernet driver */ #define MLX4_EN_PARM_INT(X, def_val, desc) \ static unsigned int X = def_val;\ @@ -174,8 +167,6 @@ static void *mlx4_en_add(struct mlx4_dev int i; int err; - printk_once(KERN_INFO "%s", mlx4_en_version); - mdev = kzalloc(sizeof *mdev, GFP_KERNEL); if (!mdev) { dev_err(&dev->pdev->dev, "Device struct alloc failed, " Modified: stable/9/sys/ofed/drivers/net/mlx4/en_netdev.c ============================================================================== --- stable/9/sys/ofed/drivers/net/mlx4/en_netdev.c Fri Dec 11 16:51:04 2015 (r292114) +++ stable/9/sys/ofed/drivers/net/mlx4/en_netdev.c Fri Dec 11 18:28:20 2015 (r292115) @@ -659,8 +659,10 @@ static void mlx4_en_cache_mclist(struct continue; /* Make sure the list didn't grow. */ tmp = kzalloc(sizeof(struct mlx4_en_mc_list), GFP_ATOMIC); - if (tmp == NULL) + if (tmp == NULL) { + en_err(priv, "Failed to allocate multicast list\n"); break; + } memcpy(tmp->addr, LLADDR((struct sockaddr_dl *)ifma->ifma_addr), ETH_ALEN); list_add_tail(&tmp->list, &priv->mc_list); @@ -971,12 +973,12 @@ static void mlx4_en_do_set_rx_mode(struc if (!mlx4_en_QUERY_PORT(mdev, priv->port)) { if (priv->port_state.link_state) { priv->last_link_state = MLX4_DEV_EVENT_PORT_UP; - /* Important note: the following call for if_link_state_change - * is needed for interface up scenario (start port, link state - * change) */ /* update netif baudrate */ priv->dev->if_baudrate = IF_Mbps(priv->port_state.link_speed); + /* Important note: the following call for if_link_state_change + * is needed for interface up scenario (start port, link state + * change) */ if_link_state_change(priv->dev, LINK_STATE_UP); en_dbg(HW, priv, "Link Up\n"); } @@ -1196,8 +1198,8 @@ static void mlx4_en_linkstate(struct wor /* update netif baudrate */ priv->dev->if_baudrate = 0; - /* make sure the port is up before notifying the OS. - * This is tricky since we get here on INIT_PORT and + /* make sure the port is up before notifying the OS. + * This is tricky since we get here on INIT_PORT and * in such case we can't tell the OS the port is up. * To solve this there is a call to if_link_state_change * in set_rx_mode. @@ -1246,7 +1248,6 @@ int mlx4_en_start_port(struct net_device PAGE_SIZE); priv->rx_alloc_order = get_order(priv->rx_alloc_size); priv->rx_buf_size = roundup_pow_of_two(priv->rx_mb_size); - priv->log_rx_info = ROUNDUP_LOG2(sizeof(struct mlx4_en_rx_buf)); en_dbg(DRV, priv, "Rx buf size:%d\n", priv->rx_mb_size); /* Configure rx cq's and rings */ @@ -1575,6 +1576,7 @@ static void mlx4_en_clear_stats(struct n priv->tx_ring[i]->bytes = 0; priv->tx_ring[i]->packets = 0; priv->tx_ring[i]->tx_csum = 0; + priv->tx_ring[i]->oversized_packets = 0; } for (i = 0; i < priv->rx_ring_num; i++) { priv->rx_ring[i]->bytes = 0; @@ -1644,8 +1646,6 @@ void mlx4_en_free_resources(struct mlx4_ if (priv->sysctl) sysctl_ctx_free(&priv->stat_ctx); - - } int mlx4_en_alloc_resources(struct mlx4_en_priv *priv) @@ -1730,8 +1730,11 @@ void mlx4_en_destroy_netdev(struct net_d EVENTHANDLER_DEREGISTER(vlan_unconfig, priv->vlan_detach); /* Unregister device - this will close the port if it was up */ - if (priv->registered) + if (priv->registered) { + mutex_lock(&mdev->state_lock); ether_ifdetach(dev); + mutex_unlock(&mdev->state_lock); + } if (priv->allocated) mlx4_free_hwq_res(mdev->dev, &priv->res, MLX4_EN_PAGE_SIZE); @@ -1809,13 +1812,6 @@ static int mlx4_en_calc_media(struct mlx active = IFM_ETHER; if (priv->last_link_state == MLX4_DEV_EVENT_PORT_DOWN) return (active); - /* - * [ShaharK] mlx4_en_QUERY_PORT sleeps and cannot be called under a - * non-sleepable lock. - * I moved it to the periodic mlx4_en_do_get_stats. - if (mlx4_en_QUERY_PORT(priv->mdev, priv->port)) - return (active); - */ active |= IFM_FDX; trans_type = priv->port_state.transciver; /* XXX I don't know all of the transceiver values. */ @@ -1948,12 +1944,57 @@ static int mlx4_en_ioctl(struct ifnet *d case SIOCSIFCAP: mutex_lock(&mdev->state_lock); mask = ifr->ifr_reqcap ^ dev->if_capenable; - if (mask & IFCAP_HWCSUM) - dev->if_capenable ^= IFCAP_HWCSUM; - if (mask & IFCAP_TSO4) + if (mask & IFCAP_TXCSUM) { + dev->if_capenable ^= IFCAP_TXCSUM; + dev->if_hwassist ^= (CSUM_TCP | CSUM_UDP | CSUM_IP); + + if (IFCAP_TSO4 & dev->if_capenable && + !(IFCAP_TXCSUM & dev->if_capenable)) { + dev->if_capenable &= ~IFCAP_TSO4; + dev->if_hwassist &= ~CSUM_TSO; + if_printf(dev, + "tso4 disabled due to -txcsum.\n"); + } + } + if (mask & IFCAP_TXCSUM_IPV6) { + dev->if_capenable ^= IFCAP_TXCSUM_IPV6; + dev->if_hwassist ^= (CSUM_UDP_IPV6 | CSUM_TCP_IPV6); + + if (IFCAP_TSO6 & dev->if_capenable && + !(IFCAP_TXCSUM_IPV6 & dev->if_capenable)) { + dev->if_capenable &= ~IFCAP_TSO6; + dev->if_hwassist &= ~CSUM_TSO; + if_printf(dev, + "tso6 disabled due to -txcsum6.\n"); + } + } + if (mask & IFCAP_RXCSUM) + dev->if_capenable ^= IFCAP_RXCSUM; + if (mask & IFCAP_RXCSUM_IPV6) + dev->if_capenable ^= IFCAP_RXCSUM_IPV6; + + if (mask & IFCAP_TSO4) { + if (!(IFCAP_TSO4 & dev->if_capenable) && + !(IFCAP_TXCSUM & dev->if_capenable)) { + if_printf(dev, "enable txcsum first.\n"); + error = EAGAIN; + goto out; + } dev->if_capenable ^= IFCAP_TSO4; - if (mask & IFCAP_TSO6) + } + if (mask & IFCAP_TSO6) { + if (!(IFCAP_TSO6 & dev->if_capenable) && + !(IFCAP_TXCSUM_IPV6 & dev->if_capenable)) { + if_printf(dev, "enable txcsum6 first.\n"); + error = EAGAIN; + goto out; + } dev->if_capenable ^= IFCAP_TSO6; + } + if (dev->if_capenable & (IFCAP_TSO4 | IFCAP_TSO6)) + dev->if_hwassist |= CSUM_TSO; + else + dev->if_hwassist &= ~CSUM_TSO; if (mask & IFCAP_LRO) dev->if_capenable ^= IFCAP_LRO; if (mask & IFCAP_VLAN_HWTAGGING) @@ -1964,9 +2005,35 @@ static int mlx4_en_ioctl(struct ifnet *d dev->if_capenable ^= IFCAP_WOL_MAGIC; if (dev->if_drv_flags & IFF_DRV_RUNNING) mlx4_en_start_port(dev); +out: mutex_unlock(&mdev->state_lock); VLAN_CAPABILITIES(dev); break; +#if __FreeBSD_version >= 1100036 + case SIOCGI2C: { + struct ifi2creq i2c; + + error = copyin(ifr->ifr_data, &i2c, sizeof(i2c)); + if (error) + break; + if (i2c.len > sizeof(i2c.data)) { + error = EINVAL; + break; + } + /* + * Note that we ignore i2c.addr here. The driver hardcodes + * the address to 0x50, while standard expects it to be 0xA0. + */ + error = mlx4_get_module_info(mdev->dev, priv->port, + i2c.offset, i2c.len, i2c.data); + if (error < 0) { + error = -error; + break; + } + error = copyout(&i2c, ifr->ifr_data, sizeof(i2c)); + break; + } +#endif default: error = ether_ioctl(dev, command, data); break; @@ -2026,8 +2093,6 @@ int mlx4_en_init_netdev(struct mlx4_en_d priv->port = port; priv->port_up = false; priv->flags = prof->flags; - priv->ctrl_flags = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE | - MLX4_WQE_CTRL_SOLICITED); priv->num_tx_rings_p_up = mdev->profile.num_tx_rings_p_up; priv->tx_ring_num = prof->tx_ring_num; @@ -2043,7 +2108,7 @@ int mlx4_en_init_netdev(struct mlx4_en_d err = -ENOMEM; goto out; } - + priv->rx_ring_num = prof->rx_ring_num; priv->cqe_factor = (mdev->dev->caps.cqe_size == 64) ? 1 : 0; priv->mac_index = -1; @@ -2066,7 +2131,6 @@ int mlx4_en_init_netdev(struct mlx4_en_d for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i) INIT_HLIST_HEAD(&priv->mac_hash[i]); - /* Query for default mac and max mtu */ priv->max_mtu = mdev->dev->caps.eth_mtu_cap[priv->port]; priv->mac = mdev->dev->caps.def_mac[priv->port]; @@ -2082,8 +2146,6 @@ int mlx4_en_init_netdev(struct mlx4_en_d goto out; } - - priv->stride = roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) + DS_SIZE); @@ -2105,7 +2167,7 @@ int mlx4_en_init_netdev(struct mlx4_en_d /* * Set driver features */ - dev->if_capabilities |= IFCAP_RXCSUM | IFCAP_TXCSUM; + dev->if_capabilities |= IFCAP_HWCSUM | IFCAP_HWCSUM_IPV6; dev->if_capabilities |= IFCAP_VLAN_MTU | IFCAP_VLAN_HWTAGGING; dev->if_capabilities |= IFCAP_VLAN_HWCSUM | IFCAP_VLAN_HWFILTER; dev->if_capabilities |= IFCAP_LINKSTATE | IFCAP_JUMBO_MTU; @@ -2115,9 +2177,9 @@ int mlx4_en_init_netdev(struct mlx4_en_d dev->if_capabilities |= IFCAP_TSO4 | IFCAP_TSO6 | IFCAP_VLAN_HWTSO; /* set TSO limits so that we don't have to drop TX packets */ - dev->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN); - dev->if_hw_tsomaxsegcount = 16; - dev->if_hw_tsomaxsegsize = 65536; /* XXX can do up to 4GByte */ + dev->if_hw_tsomax = MLX4_EN_TX_MAX_PAYLOAD_SIZE - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN) /* hdr */; + dev->if_hw_tsomaxsegcount = MLX4_EN_TX_MAX_MBUF_FRAGS - 1 /* hdr */; + dev->if_hw_tsomaxsegsize = MLX4_EN_TX_MAX_MBUF_SIZE; dev->if_capenable = dev->if_capabilities; @@ -2126,6 +2188,8 @@ int mlx4_en_init_netdev(struct mlx4_en_d dev->if_hwassist |= CSUM_TSO; if (dev->if_capenable & IFCAP_TXCSUM) dev->if_hwassist |= (CSUM_TCP | CSUM_UDP | CSUM_IP); + if (dev->if_capenable & IFCAP_TXCSUM_IPV6) + dev->if_hwassist |= (CSUM_UDP_IPV6 | CSUM_TCP_IPV6); /* Register for VLAN events */ @@ -2188,8 +2252,6 @@ int mlx4_en_init_netdev(struct mlx4_en_d if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS) queue_delayed_work(mdev->workqueue, &priv->service_task, SERVICE_TASK_DELAY); - - return 0; out: @@ -2271,6 +2333,162 @@ static int mlx4_en_set_tx_ring_size(SYSC return (error); } +static int mlx4_en_get_module_info(struct net_device *dev, + struct ethtool_modinfo *modinfo) +{ + struct mlx4_en_priv *priv = netdev_priv(dev); + struct mlx4_en_dev *mdev = priv->mdev; + int ret; + u8 data[4]; + + /* Read first 2 bytes to get Module & REV ID */ + ret = mlx4_get_module_info(mdev->dev, priv->port, + 0/*offset*/, 2/*size*/, data); + + if (ret < 2) { + en_err(priv, "Failed to read eeprom module first two bytes, error: 0x%x\n", -ret); + return -EIO; + } + + switch (data[0] /* identifier */) { + case MLX4_MODULE_ID_QSFP: + modinfo->type = ETH_MODULE_SFF_8436; + modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN; + break; + case MLX4_MODULE_ID_QSFP_PLUS: + if (data[1] >= 0x3) { /* revision id */ + modinfo->type = ETH_MODULE_SFF_8636; + modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN; + } else { + modinfo->type = ETH_MODULE_SFF_8436; + modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN; + } + break; + case MLX4_MODULE_ID_QSFP28: + modinfo->type = ETH_MODULE_SFF_8636; + modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN; + break; + case MLX4_MODULE_ID_SFP: + modinfo->type = ETH_MODULE_SFF_8472; + modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN; + break; + default: + en_err(priv, "mlx4_en_get_module_info : Not recognized cable type\n"); + return -EINVAL; + } + + return 0; +} + +static int mlx4_en_get_module_eeprom(struct net_device *dev, + struct ethtool_eeprom *ee, + u8 *data) +{ + struct mlx4_en_priv *priv = netdev_priv(dev); + struct mlx4_en_dev *mdev = priv->mdev; + int offset = ee->offset; + int i = 0, ret; + + if (ee->len == 0) + return -EINVAL; + + memset(data, 0, ee->len); + + while (i < ee->len) { + en_dbg(DRV, priv, + "mlx4_get_module_info i(%d) offset(%d) len(%d)\n", + i, offset, ee->len - i); + + ret = mlx4_get_module_info(mdev->dev, priv->port, + offset, ee->len - i, data + i); + + if (!ret) /* Done reading */ + return 0; + + if (ret < 0) { + en_err(priv, + "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", + i, offset, ee->len - i, ret); + return -1; + } + + i += ret; + offset += ret; + } + return 0; +} + +static void mlx4_en_print_eeprom(u8 *data, __u32 len) +{ + int i; + int j = 0; + int row = 0; + const int NUM_OF_BYTES = 16; + + printf("\nOffset\t\tValues\n"); + printf("------\t\t------\n"); + while(row < len){ + printf("0x%04x\t\t",row); + for(i=0; i < NUM_OF_BYTES; i++){ + printf("%02x ", data[j]); + row++; + j++; + } + printf("\n"); + } +} + +/* Read cable EEPROM module information by first inspecting the first + * two bytes to get the length and then read the rest of the information. + * The information is printed to dmesg. */ +static int mlx4_en_read_eeprom(SYSCTL_HANDLER_ARGS) +{ + + u8* data; + int error; + int result = 0; + struct mlx4_en_priv *priv; + struct net_device *dev; + struct ethtool_modinfo modinfo; + struct ethtool_eeprom ee; + + error = sysctl_handle_int(oidp, &result, 0, req); + if (error || !req->newptr) + return (error); + + if (result == 1) { + priv = arg1; + dev = priv->dev; + data = kmalloc(PAGE_SIZE, GFP_KERNEL); + + error = mlx4_en_get_module_info(dev, &modinfo); + if (error) { + en_err(priv, + "mlx4_en_get_module_info returned with error - FAILED (0x%x)\n", + -error); + goto out; + } + + ee.len = modinfo.eeprom_len; + ee.offset = 0; + + error = mlx4_en_get_module_eeprom(dev, &ee, data); + if (error) { + en_err(priv, + "mlx4_en_get_module_eeprom returned with error - FAILED (0x%x)\n", + -error); + /* Continue printing partial information in case of an error */ + } + + /* EEPROM information will be printed in dmesg */ + mlx4_en_print_eeprom(data, ee.len); +out: + kfree(data); + } + /* Return zero to prevent sysctl failure. */ + return (0); +} + static int mlx4_en_set_tx_ppp(SYSCTL_HANDLER_ARGS) { struct mlx4_en_priv *priv; @@ -2396,7 +2614,7 @@ static void mlx4_en_sysctl_conf(struct m /* Add coalescer configuration. */ coal = SYSCTL_ADD_NODE(ctx, node_list, OID_AUTO, "coalesce", CTLFLAG_RD, NULL, "Interrupt coalesce configuration"); - coal_list = SYSCTL_CHILDREN(node); + coal_list = SYSCTL_CHILDREN(coal); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "pkt_rate_low", CTLFLAG_RW, &priv->pkt_rate_low, 0, "Packets per-second for minimum delay"); @@ -2415,11 +2633,14 @@ static void mlx4_en_sysctl_conf(struct m SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "adaptive_rx_coal", CTLFLAG_RW, &priv->adaptive_rx_coal, 0, "Enable adaptive rx coalescing"); + /* EEPROM support */ + SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "eeprom_info", + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, + mlx4_en_read_eeprom, "I", "EEPROM information"); } static void mlx4_en_sysctl_stat(struct mlx4_en_priv *priv) { - struct net_device *dev; struct sysctl_ctx_list *ctx; struct sysctl_oid *node; struct sysctl_oid_list *node_list; @@ -2430,8 +2651,6 @@ static void mlx4_en_sysctl_stat(struct m char namebuf[128]; int i; - dev = priv->dev; - ctx = &priv->stat_ctx; sysctl_ctx_init(ctx); node = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(priv->sysctl), OID_AUTO, @@ -2459,6 +2678,8 @@ static void mlx4_en_sysctl_stat(struct m &priv->port_stats.wake_queue, "Queue resumed after full"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_timeout", CTLFLAG_RD, &priv->port_stats.tx_timeout, "Transmit timeouts"); + SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_oversized_packets", CTLFLAG_RD, + &priv->port_stats.oversized_packets, "TX oversized packets, m_defrag failed"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "rx_alloc_failed", CTLFLAG_RD, &priv->port_stats.rx_alloc_failed, "RX failed to allocate mbuf"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "rx_chksum_good", CTLFLAG_RD, @@ -2542,7 +2763,7 @@ struct mlx4_en_pkt_stats { SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_packets", CTLFLAG_RD, &priv->pkstats.tx_packets, "TX packets"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_bytes", CTLFLAG_RD, - &priv->pkstats.tx_packets, "TX Bytes"); + &priv->pkstats.tx_bytes, "TX Bytes"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_multicast_packets", CTLFLAG_RD, &priv->pkstats.tx_multicast_packets, "TX Multicast Packets"); SYSCTL_ADD_ULONG(ctx, node_list, OID_AUTO, "tx_broadcast_packets", CTLFLAG_RD, @@ -2583,8 +2804,8 @@ struct mlx4_en_pkt_stats { CTLFLAG_RD, &tx_ring->packets, "TX packets"); SYSCTL_ADD_ULONG(ctx, ring_list, OID_AUTO, "bytes", CTLFLAG_RD, &tx_ring->bytes, "TX bytes"); - } + for (i = 0; i < priv->rx_ring_num; i++) { rx_ring = priv->rx_ring[i]; snprintf(namebuf, sizeof(namebuf), "rx_ring%d", i); Modified: stable/9/sys/ofed/drivers/net/mlx4/en_port.c ============================================================================== --- stable/9/sys/ofed/drivers/net/mlx4/en_port.c Fri Dec 11 16:51:04 2015 (r292114) +++ stable/9/sys/ofed/drivers/net/mlx4/en_port.c Fri Dec 11 18:28:20 2015 (r292115) @@ -194,6 +194,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_e priv->port_stats.tx_chksum_offload += priv->tx_ring[i]->tx_csum; priv->port_stats.queue_stopped += priv->tx_ring[i]->queue_stopped; priv->port_stats.wake_queue += priv->tx_ring[i]->wake_queue; + priv->port_stats.oversized_packets += priv->tx_ring[i]->oversized_packets; } /* RX Statistics */ priv->pkstats.rx_packets = be64_to_cpu(mlx4_en_stats->RTOT_prio_0) + Modified: stable/9/sys/ofed/drivers/net/mlx4/en_rx.c ============================================================================== --- stable/9/sys/ofed/drivers/net/mlx4/en_rx.c Fri Dec 11 16:51:04 2015 (r292114) +++ stable/9/sys/ofed/drivers/net/mlx4/en_rx.c Fri Dec 11 18:28:20 2015 (r292115) @@ -49,106 +49,135 @@ static void mlx4_en_init_rx_desc(struct struct mlx4_en_rx_ring *ring, int index) { - struct mlx4_en_rx_desc *rx_desc = ring->buf + ring->stride * index; + struct mlx4_en_rx_desc *rx_desc = (struct mlx4_en_rx_desc *) + (ring->buf + (ring->stride * index)); int possible_frags; int i; - /* Set size and memtype fields */ - for (i = 0; i < priv->num_frags; i++) { - rx_desc->data[i].byte_count = - cpu_to_be32(priv->frag_info[i].frag_size); - rx_desc->data[i].lkey = cpu_to_be32(priv->mdev->mr.key); - } + rx_desc->data[0].byte_count = cpu_to_be32(priv->rx_mb_size); + rx_desc->data[0].lkey = cpu_to_be32(priv->mdev->mr.key); - /* If the number of used fragments does not fill up the ring stride, - * * remaining (unused) fragments must be padded with null address/size - * * and a special memory key */ + /* + * If the number of used fragments does not fill up the ring + * stride, remaining (unused) fragments must be padded with + * null address/size and a special memory key: + */ possible_frags = (ring->stride - sizeof(struct mlx4_en_rx_desc)) / DS_SIZE; - for (i = priv->num_frags; i < possible_frags; i++) { + for (i = 1; i < possible_frags; i++) { rx_desc->data[i].byte_count = 0; rx_desc->data[i].lkey = cpu_to_be32(MLX4_EN_MEMTYPE_PAD); rx_desc->data[i].addr = 0; } - } -static int mlx4_en_alloc_buf(struct mlx4_en_priv *priv, - struct mlx4_en_rx_desc *rx_desc, - struct mbuf **mb_list, - int i) +static int +mlx4_en_alloc_buf(struct mlx4_en_rx_ring *ring, + __be64 *pdma, struct mlx4_en_rx_mbuf *mb_list) { - struct mlx4_en_dev *mdev = priv->mdev; - struct mlx4_en_frag_info *frag_info = &priv->frag_info[i]; + bus_dma_segment_t segs[1]; + bus_dmamap_t map; struct mbuf *mb; - dma_addr_t dma; + int nsegs; + int err; - if (i == 0) - mb = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, frag_info->frag_size); - else - mb = m_getjcl(M_NOWAIT, MT_DATA, 0, frag_info->frag_size); - if (mb == NULL) { - priv->port_stats.rx_alloc_failed++; - return -ENOMEM; + /* try to allocate a new spare mbuf */ + if (unlikely(ring->spare.mbuf == NULL)) { + mb = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, ring->rx_mb_size); + if (unlikely(mb == NULL)) + return (-ENOMEM); + /* setup correct length */ + mb->m_len = ring->rx_mb_size; + + /* load spare mbuf into BUSDMA */ + err = -bus_dmamap_load_mbuf_sg(ring->dma_tag, ring->spare.dma_map, + mb, segs, &nsegs, BUS_DMA_NOWAIT); + if (unlikely(err != 0)) { + m_freem(mb); + return (err); + } + + /* store spare info */ + ring->spare.mbuf = mb; + ring->spare.paddr_be = cpu_to_be64(segs[0].ds_addr); + + bus_dmamap_sync(ring->dma_tag, ring->spare.dma_map, + BUS_DMASYNC_PREREAD); } - dma = pci_map_single(mdev->pdev, mb->m_data, frag_info->frag_size, - PCI_DMA_FROMDEVICE); - rx_desc->data[i].addr = cpu_to_be64(dma); - mb_list[i] = mb; - return 0; -} + /* synchronize and unload the current mbuf, if any */ + if (likely(mb_list->mbuf != NULL)) { + bus_dmamap_sync(ring->dma_tag, mb_list->dma_map, + BUS_DMASYNC_POSTREAD); + bus_dmamap_unload(ring->dma_tag, mb_list->dma_map); + } -static int mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv, - struct mlx4_en_rx_ring *ring, int index) -{ - struct mlx4_en_rx_desc *rx_desc = ring->buf + (index * ring->stride); - struct mbuf **mb_list = ring->rx_info + (index << priv->log_rx_info); - int i; + mb = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, ring->rx_mb_size); + if (unlikely(mb == NULL)) + goto use_spare; + + /* setup correct length */ + mb->m_len = ring->rx_mb_size; + + err = -bus_dmamap_load_mbuf_sg(ring->dma_tag, mb_list->dma_map, + mb, segs, &nsegs, BUS_DMA_NOWAIT); + if (unlikely(err != 0)) { + m_freem(mb); + goto use_spare; + } + + *pdma = cpu_to_be64(segs[0].ds_addr); + mb_list->mbuf = mb; + + bus_dmamap_sync(ring->dma_tag, mb_list->dma_map, BUS_DMASYNC_PREREAD); + return (0); - for (i = 0; i < priv->num_frags; i++) - if (mlx4_en_alloc_buf(priv, rx_desc, mb_list, i)) - goto err; +use_spare: + /* swap DMA maps */ + map = mb_list->dma_map; + mb_list->dma_map = ring->spare.dma_map; + ring->spare.dma_map = map; - return 0; + /* swap MBUFs */ + mb_list->mbuf = ring->spare.mbuf; + ring->spare.mbuf = NULL; -err: - while (i--) - m_free(mb_list[i]); - return -ENOMEM; + /* store physical address */ + *pdma = ring->spare.paddr_be; + return (0); } -static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) +static void +mlx4_en_free_buf(struct mlx4_en_rx_ring *ring, struct mlx4_en_rx_mbuf *mb_list) { - *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); + bus_dmamap_t map = mb_list->dma_map; + bus_dmamap_sync(ring->dma_tag, map, BUS_DMASYNC_POSTREAD); + bus_dmamap_unload(ring->dma_tag, map); + m_freem(mb_list->mbuf); + mb_list->mbuf = NULL; /* safety clearing */ } -static void mlx4_en_free_rx_desc(struct mlx4_en_priv *priv, - struct mlx4_en_rx_ring *ring, - int index) +static int +mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv, + struct mlx4_en_rx_ring *ring, int index) { - struct mlx4_en_frag_info *frag_info; - struct mlx4_en_dev *mdev = priv->mdev; - struct mbuf **mb_list; - struct mlx4_en_rx_desc *rx_desc = ring->buf + (index << ring->log_stride); - dma_addr_t dma; - int nr; - - mb_list = ring->rx_info + (index << priv->log_rx_info); - for (nr = 0; nr < priv->num_frags; nr++) { - en_dbg(DRV, priv, "Freeing fragment:%d\n", nr); - frag_info = &priv->frag_info[nr]; - dma = be64_to_cpu(rx_desc->data[nr].addr); - -#if BITS_PER_LONG == 64 - en_dbg(DRV, priv, "Unmaping buffer at dma:0x%lx\n", (u64) dma); -#elif BITS_PER_LONG == 32 - en_dbg(DRV, priv, "Unmaping buffer at dma:0x%llx\n", (u64) dma); -#endif - pci_unmap_single(mdev->pdev, dma, frag_info->frag_size, - PCI_DMA_FROMDEVICE); - m_free(mb_list[nr]); + struct mlx4_en_rx_desc *rx_desc = (struct mlx4_en_rx_desc *) + (ring->buf + (index * ring->stride)); + struct mlx4_en_rx_mbuf *mb_list = ring->mbuf + index; + + mb_list->mbuf = NULL; + + if (mlx4_en_alloc_buf(ring, &rx_desc->data[0].addr, mb_list)) { + priv->port_stats.rx_alloc_failed++; + return (-ENOMEM); } + return (0); +} + +static inline void +mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring) +{ + *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff); } static int mlx4_en_fill_rx_buffers(struct mlx4_en_priv *priv) @@ -191,7 +220,8 @@ reduce_rings: while (ring->actual_size > new_size) { ring->actual_size--; ring->prod--; - mlx4_en_free_rx_desc(priv, ring, ring->actual_size); + mlx4_en_free_buf(ring, + ring->mbuf + ring->actual_size); } } @@ -211,100 +241,106 @@ static void mlx4_en_free_rx_buf(struct m while (ring->cons != ring->prod) { index = ring->cons & ring->size_mask; en_dbg(DRV, priv, "Processing descriptor:%d\n", index); - mlx4_en_free_rx_desc(priv, ring, index); + mlx4_en_free_buf(ring, ring->mbuf + index); ++ring->cons; } } -#if MLX4_EN_MAX_RX_FRAGS == 3 -static int frag_sizes[] = { - FRAG_SZ0, - FRAG_SZ1, - FRAG_SZ2, -}; -#elif MLX4_EN_MAX_RX_FRAGS == 2 -static int frag_sizes[] = { - FRAG_SZ0, - FRAG_SZ1, -}; -#else -#error "Unknown MAX_RX_FRAGS" -#endif - void mlx4_en_calc_rx_buf(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); int eff_mtu = dev->if_mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN; - int buf_size = 0; - int i, frag; - for (i = 0, frag = 0; buf_size < eff_mtu; frag++, i++) { - /* - * Allocate small to large but only as much as is needed for - * the tail. - */ - while (i > 0 && eff_mtu - buf_size <= frag_sizes[i - 1]) - i--; - priv->frag_info[frag].frag_size = frag_sizes[i]; - priv->frag_info[frag].frag_prefix_size = buf_size; - buf_size += priv->frag_info[frag].frag_size; - } + if (eff_mtu > MJUM16BYTES) { + en_err(priv, "MTU(%d) is too big\n", (int)dev->if_mtu); + eff_mtu = MJUM16BYTES; + } else if (eff_mtu > MJUM9BYTES) { + eff_mtu = MJUM16BYTES; + } else if (eff_mtu > MJUMPAGESIZE) { + eff_mtu = MJUM9BYTES; + } else if (eff_mtu > MCLBYTES) { + eff_mtu = MJUMPAGESIZE; + } else { + eff_mtu = MCLBYTES; + } - priv->num_frags = frag; priv->rx_mb_size = eff_mtu; - priv->log_rx_info = - ROUNDUP_LOG2(priv->num_frags * sizeof(struct mbuf *)); - en_dbg(DRV, priv, "Rx buffer scatter-list (effective-mtu:%d " - "num_frags:%d):\n", eff_mtu, priv->num_frags); - for (i = 0; i < priv->num_frags; i++) { - en_dbg(DRV, priv, " frag:%d - size:%d prefix:%d\n", i, - priv->frag_info[i].frag_size, - priv->frag_info[i].frag_prefix_size); - } + en_dbg(DRV, priv, "Effective RX MTU: %d bytes\n", eff_mtu); } - int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv, struct mlx4_en_rx_ring **pring, u32 size, int node) { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_en_rx_ring *ring; - int err = -ENOMEM; + int err; int tmp; + uint32_t x; ring = kzalloc(sizeof(struct mlx4_en_rx_ring), GFP_KERNEL); if (!ring) { en_err(priv, "Failed to allocate RX ring structure\n"); return -ENOMEM; } - + + /* Create DMA descriptor TAG */ + if ((err = -bus_dma_tag_create( + bus_get_dma_tag(mdev->pdev->dev.bsddev), + 1, /* any alignment */ + 0, /* no boundary */ + BUS_SPACE_MAXADDR, /* lowaddr */ + BUS_SPACE_MAXADDR, /* highaddr */ + NULL, NULL, /* filter, filterarg */ + MJUM16BYTES, /* maxsize */ + 1, /* nsegments */ + MJUM16BYTES, /* maxsegsize */ + 0, /* flags */ + NULL, NULL, /* lockfunc, lockfuncarg */ + &ring->dma_tag))) { + en_err(priv, "Failed to create DMA tag\n"); + goto err_ring; + } + ring->prod = 0; ring->cons = 0; ring->size = size; ring->size_mask = size - 1; - ring->stride = roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) + - DS_SIZE * MLX4_EN_MAX_RX_FRAGS); + ring->stride = roundup_pow_of_two( + sizeof(struct mlx4_en_rx_desc) + DS_SIZE); ring->log_stride = ffs(ring->stride) - 1; ring->buf_size = ring->size * ring->stride + TXBB_SIZE; - tmp = size * roundup_pow_of_two(MLX4_EN_MAX_RX_FRAGS * - sizeof(struct mbuf *)); + tmp = size * sizeof(struct mlx4_en_rx_mbuf); - ring->rx_info = kmalloc(tmp, GFP_KERNEL); - if (!ring->rx_info) { + ring->mbuf = kzalloc(tmp, GFP_KERNEL); + if (ring->mbuf == NULL) { err = -ENOMEM; - goto err_ring; + goto err_dma_tag; } - en_dbg(DRV, priv, "Allocated rx_info ring at addr:%p size:%d\n", - ring->rx_info, tmp); + err = -bus_dmamap_create(ring->dma_tag, 0, &ring->spare.dma_map); + if (err != 0) + goto err_info; + + for (x = 0; x != size; x++) { + err = -bus_dmamap_create(ring->dma_tag, 0, + &ring->mbuf[x].dma_map); + if (err != 0) { + while (x--) + bus_dmamap_destroy(ring->dma_tag, + ring->mbuf[x].dma_map); + goto err_info; + } + } + en_dbg(DRV, priv, "Allocated MBUF ring at addr:%p size:%d\n", + ring->mbuf, tmp); err = mlx4_alloc_hwq_res(mdev->dev, &ring->wqres, ring->buf_size, 2 * PAGE_SIZE); if (err) - goto err_info; + goto err_dma_map; err = mlx4_en_map_buffer(&ring->wqres.buf); if (err) { @@ -317,23 +353,29 @@ int mlx4_en_create_rx_ring(struct mlx4_e err_hwq: mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size); +err_dma_map: + for (x = 0; x != size; x++) { + bus_dmamap_destroy(ring->dma_tag, + ring->mbuf[x].dma_map); + } + bus_dmamap_destroy(ring->dma_tag, ring->spare.dma_map); err_info: - vfree(ring->rx_info); + vfree(ring->mbuf); +err_dma_tag: + bus_dma_tag_destroy(ring->dma_tag); err_ring: kfree(ring); - - return err; + return (err); } - int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv) { struct mlx4_en_rx_ring *ring; int i; int ring_ind; int err; - int stride = roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) + - DS_SIZE * priv->num_frags); + int stride = roundup_pow_of_two( + sizeof(struct mlx4_en_rx_desc) + DS_SIZE); for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) { ring = priv->rx_ring[ring_ind]; @@ -409,10 +451,22 @@ void mlx4_en_destroy_rx_ring(struct mlx4 { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_en_rx_ring *ring = *pring; + uint32_t x; mlx4_en_unmap_buffer(&ring->wqres.buf); mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE); - vfree(ring->rx_info); + for (x = 0; x != size; x++) + bus_dmamap_destroy(ring->dma_tag, ring->mbuf[x].dma_map); + /* free spare mbuf, if any */ + if (ring->spare.mbuf != NULL) { + bus_dmamap_sync(ring->dma_tag, ring->spare.dma_map, + BUS_DMASYNC_POSTREAD); + bus_dmamap_unload(ring->dma_tag, ring->spare.dma_map); + m_freem(ring->spare.mbuf); + } + bus_dmamap_destroy(ring->dma_tag, ring->spare.dma_map); + vfree(ring->mbuf); + bus_dma_tag_destroy(ring->dma_tag); kfree(ring); *pring = NULL; #ifdef CONFIG_RFS_ACCEL @@ -420,7 +474,6 @@ void mlx4_en_destroy_rx_ring(struct mlx4 #endif } - void mlx4_en_deactivate_rx_ring(struct mlx4_en_priv *priv, struct mlx4_en_rx_ring *ring) { @@ -469,69 +522,27 @@ static inline int invalid_cqe(struct mlx return 0; } - -/* Unmap a completed descriptor and free unused pages */ -static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv, - struct mlx4_en_rx_desc *rx_desc, - struct mbuf **mb_list, - int length) +static struct mbuf * +mlx4_en_rx_mb(struct mlx4_en_priv *priv, struct mlx4_en_rx_ring *ring, + struct mlx4_en_rx_desc *rx_desc, struct mlx4_en_rx_mbuf *mb_list, + int length) { - struct mlx4_en_dev *mdev = priv->mdev; - struct mlx4_en_frag_info *frag_info; *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***