Date: Thu, 26 Jan 2017 13:08:06 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 216493] [Hyper-V] Mellanox ConnectX-3 VF driver can't work when FreeBSD runs on Hyper-V 2016
Message-ID: <bug-216493-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216493

    Bug ID: 216493
   Summary: [Hyper-V] Mellanox ConnectX-3 VF driver can't work when FreeBSD runs on Hyper-V 2016
   Product: Base System
   Version: CURRENT
  Hardware: Any
        OS: Any
    Status: New
  Severity: Affects Only Me
  Priority: ---
 Component: kern
  Assignee: freebsd-bugs@FreeBSD.org
  Reporter: decui@microsoft.com

Windows Server 2016 (Hyper-V 2016) can support PCIe pass-through and NIC SR-IOV for non-Windows virtual machines (VMs) such as Linux and FreeBSD VMs.

A few months ago, we enabled PCIe pass-through for FreeBSD VMs running on Hyper-V and successfully assigned a Mellanox ConnectX-3 PF device to the VM, and the device worked fine in the VM.

Now we have added code to support NIC SR-IOV (which is based on PCIe pass-through) in the Hyper-V hv_netvsc driver, but it turned out the VF driver failed to load, so I ported two patches from Linux:
https://reviews.freebsd.org/D8867
https://reviews.freebsd.org/D8868

(Note: I only tested the PF/VF drivers in a FreeBSD VM running on Hyper-V. I didn’t test them with the patches on a bare metal FreeBSD machine (it’s not so easy to install such a FreeBSD machine in our lab for now), so it would be really helpful and important if people could review the patches and help test them on bare metal.)

With the 2 patches, the VF driver worked in my limited test.

BTW, this link (https://community.mellanox.com/docs/DOC-2242) shows how to enable a Mellanox ConnectX-3 VF for a Windows VM running on Hyper-V 2012 R2. What I did for the FreeBSD VM on Hyper-V 2016 is pretty similar.

Next, I did more testing and identified 4 issues we need to address:

1. When the VF is hot removed, I see the error below, but it looks nonfatal, because when the VF is hot added later, it still works.

mlx4_core0: Failed to free mtt range at:20769 order:0
mlx4_core0: detached

2. The VF works fine when the VM has <=12 virtual CPUs, but if the VM has >=13 vCPUs, the VF driver fails to load:

mlx4_core0: <mlx4_core> at device 2.0 on pci1
mlx4_core: Initializing mlx4_core: Mellanox ConnectX VPI driver v2.1.6
vmbus0: allocated type 3 (0xfe0800000-0xfe0ffffff) for rid 18 of mlx4_core0
mlx4_core0: Lazy allocation of 0x800000 bytes rid 0x18 type 3 at 0xfe0800000
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode.
mlx4_core0: attempting to allocate 20 MSI-X vectors (52 supported)
mlx4_core0: using IRQs 256-275 for MSI-X
mlx4_core0: Failed to allocate mtts for 1024 pages(order 10)
mlx4_core0: Failed to initialize event queue table (err=-12), aborting.

3. The VF can't ping another VM's VF on the same host, and can't ping the PF on the same host either. On the same host, Windows VM <-> Windows VM and Windows VM <-> Linux VM are both OK. Only FreeBSD VM <-> Windows/Linux VM doesn't work. I suspect something is wrong or missing in the mlx4 VF driver in FreeBSD.

4. I got the log below when Live Migration didn’t work. It seems the VF’s detach method couldn’t finish successfully.

Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: Failed to free mtt range at:5937 order:0
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode CLOSE_PORT (0xa)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000002
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000003
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000004
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000005
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000006
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: Fail to detach network rule. registration id = 0x9000000000007
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode SET_MCAST_FLTR (0x48)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: Failed to free icm of qp:2279
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: Failed to release qp range base:2279 cnt:1
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode 2RST_QP (0x21)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode HW2SW_CQ (0x17)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: HW2SW_CQ failed (-35) for CQN 0000b5
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: Failed freeing cq:181

More info about issue 4:
In the case of Live Migration, it looks like the host just rescinds the VF by force without sending the PCI_EJECT message to the VM. It looks like the current Mellanox VF driver in FreeBSD can’t handle this case (i.e. the VF device disappears suddenly) and always hangs due to command timeouts, because at that time the host denies the VM’s access to the VF.

BTW, the VF driver in a Linux VM doesn’t hang and it looks like Live Migration can work, but the driver also prints out these scary messages:

Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Internal error detected on the communication channel
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device is going to be reset
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: VF reset is not needed
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device was reset successfully
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_en 99bb:00:02.0: Internal error detected, restarting device
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: command 0x5 failed: fw status = 0x1
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF down: enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: Data path switched from VF: enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF unregistering: enP39355p0s2
Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to close slave function
Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Detected virtual function - running in slave mode
Jan 26 02:40:37 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: recovering from previously mis-behaved VM
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Communication channel is offline.
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: PF is not responsive, skipping initialization
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to initialize slave
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=99bb:00:02.0, err=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one was ended, ret=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_remove_one: interface is down

I think at least we need to port this patch, “net/mlx4_core: Enable device recovery flow with SRIOV” (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=55ad359225b2232b9b8f04a0dfa169bd3a7d86d2), from Linux to FreeBSD.
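
For readers triaging the issue 4 log, the spacing between failed commands can be checked mechanically. The following is only a rough Python sketch and not part of the driver or the patches: the helper names (`failed_commands`, `gaps_seconds`) are made up for illustration, the sample lines are a subset copied from the FreeBSD log in issue 4, and the year 2017 is an assumption because syslog timestamps carry no year. In this sample most failures are about 70 seconds apart, which would be consistent with each command waiting for a timeout before failing; the actual timeout value used by the driver is not confirmed here.

```python
import re
from datetime import datetime

# Subset of the issue 4 log (FreeBSD VM); only the "failed execution" lines.
LOG = """\
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode CLOSE_PORT (0xa)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode FREE_RES (0xf01)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST commandopcode QP_FLOW_STEERING_DETACH (0x66)
"""

# Matches the syslog timestamp and the opcode name of a failed VHCR_POST command.
# "commandopcode" is spelled exactly as it appears in the log.
FAILED = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*"
    r"failed execution of VHCR_POST commandopcode (?P<op>\w+) \(0x[0-9a-f]+\)"
)


def failed_commands(text, year=2017):
    """Return (timestamp, opcode) pairs; the year is assumed, not in the log."""
    events = []
    for line in text.splitlines():
        m = FAILED.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["op"]))
    return events


def gaps_seconds(events):
    """Seconds elapsed between each failed command and the next one."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(events, events[1:])]


if __name__ == "__main__":
    events = failed_commands(LOG)
    for (ts, op), gap in zip(events[1:], gaps_seconds(events)):
        print(f"{ts:%H:%M:%S} {op}: {gap} s after previous failure")
```

Feeding the full kernel log to `failed_commands` instead of the sample would show whether the same interval holds for every command in the detach path.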