Date: Wed, 11 Jan 2017 19:54:02 -0800
From: <soralx@cydem.org>
To: <freebsd-virtualization@freebsd.org>
Subject: Re: Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
Message-ID: <20170111195402.785f27c6@mscad14>
In-Reply-To: <93196ea2-5439-49ff-54fd-7b7273bdec85@freebsd.org>
References: <20170110003332.7cf8ba15@mscad14> <0de7e0fe-5680-b1be-bd57-6bf446c2fd38@talk2dom.com> <0c927784-3e3f-7946-fba9-c25001f4156c@talk2dom.com> <20170110180117.7f246b5a@mscad14> <20170111014544.70670784@mscad14> <93196ea2-5439-49ff-54fd-7b7273bdec85@freebsd.org>
I played around a bit more with the nVidia card in a FreeBSD guest.

First, the `nvidia-smi -q` output diff [0] is interesting ('-' lines are from the host, '+' lines are from the guest). It suggests that the card may be in some incompletely initialized state: notice the "Unknown Error" instead of a real UUID, and the P8 power state. Could it be that the driver doesn't put the card's BIOS in the right state? The command was run in both host and guest without Xorg loaded.

Second, I was able to start Xorg by disabling rendering acceleration (Option "NoAccel"). Now nVidia's Xorg module no longer fails to allocate DMA (I guess it does not try?), but oddly, it reserves 48 GB (!?) of virtual memory instead. Sadly, there is still no display for some reason. Relevant dmesg bits are below [1]. Of particular interest is the line "nvidia-modeset: Allocated GPU:0 () @ PCI:0000:00:00.0" -- the PCI address is obviously incorrect. Xorg log bits [2] show that X is up, but the monitor stays in sleep mode.

With more options [3], I get this: [4]. Edit: actually, after a host reboot it behaved the same as with just "NoAccel", maybe.

Clearly the driver is able to talk to the card: e.g., it attaches, responds to `nvidia-smi` (with the exception of the UUID), and reads EDID from the monitor. But some channel of communication is missing or not working right. Any ideas how to go about finding out which one?
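In case a plain text diff like [0] gets noisy, here is a minimal sketch of comparing two `nvidia-smi -q` dumps field by field. It is only an illustration: the function names `parse_smi` and `diff_fields` are made up, the sample text is an abridged excerpt using values from [0], and the indentation in the sample is assumed (nesting was not preserved in the listing).

```python
def parse_smi(text):
    """Flatten `nvidia-smi -q` style output into {"Section/Key": value}."""
    fields = {}
    stack = []  # (indent, section name) for each enclosing section
    for line in text.splitlines():
        if not line.strip() or line.startswith("="):
            continue  # skip blank lines and the "====NVSMI LOG====" banner
        indent = len(line) - len(line.lstrip())
        # Leave any sections that are at the same or a deeper indent level.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        key, sep, value = line.partition(" : ")
        if sep:
            path = "/".join([name for _, name in stack] + [key.strip()])
            fields[path] = value.strip()
        else:
            name = line.strip()
            # Assumption: the per-GPU header carries the PCI address, which
            # always differs between host and guest, so collapse it to "GPU".
            if name.startswith("GPU ") and ":" in name:
                name = "GPU"
            stack.append((indent, name))
    return fields


def diff_fields(host, guest):
    """Return (key, host value, guest value) for every field that differs."""
    keys = sorted(set(host) | set(guest))
    return [(k, host.get(k), guest.get(k))
            for k in keys if host.get(k) != guest.get(k)]


# Abridged sample input; values taken from [0], indentation assumed.
HOST = """\
GPU 0000:01:00.0
    GPU UUID : GPU-f6c71b8e-f6c8-5a42-260d-1164720bf4f2
    PCI
        GPU Link Info
            PCIe Generation
                Max : 2
                Current : 2
    Performance State : P0
"""

GUEST = """\
GPU 0000:00:04.0
    GPU UUID : Unknown Error
    PCI
        GPU Link Info
            PCIe Generation
                Max : 2
                Current : 1
    Performance State : P8
"""

for key, host_value, guest_value in diff_fields(parse_smi(HOST), parse_smi(GUEST)):
    print(f"{key}: host={host_value!r} guest={guest_value!r}")
```

With real dumps, one would read the two saved files and pass their contents to `parse_smi` instead of the inline samples.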
[0]
 ==============NVSMI LOG==============
-Timestamp : Wed Jan 11 19:40:54 2017
+Timestamp : Wed Jan 11 11:08:40 2017
 Driver Version : 367.44
 Attached GPUs : 1
-GPU 0000:01:00.0
+GPU 0000:00:04.0
  Product Name : Quadro 2000
  Product Brand : Quadro
  Display Mode : Enabled
@@ -17,11 +17,11 @@
  Current : N/A
  Pending : N/A
  Serial Number : N/A
- GPU UUID : GPU-f6c71b8e-f6c8-5a42-260d-1164720bf4f2
+ GPU UUID : Unknown Error
  Minor Number : 0
  VBIOS Version : 70.06.0D.00.02
  MultiGPU Board : No
- Board ID : 0x100
+ Board ID : 0x4
  GPU Part Number : N/A
  Inforom Version
  Image Version : N/A
@@ -34,16 +34,16 @@
  GPU Virtualization Mode
  Virtualization mode : None
  PCI
- Bus : 0x01
- Device : 0x00
+ Bus : 0x00
+ Device : 0x04
  Domain : 0x0000
  Device Id : 0x0DD810DE
- Bus Id : 0000:01:00.0
+ Bus Id : 0000:00:04.0
  Sub System Id : 0x084A10DE
  GPU Link Info
  PCIe Generation
  Max : 2
- Current : 2
+ Current : 1
  Link Width
  Max : 16x
  Current : 16x
@@ -54,7 +54,7 @@
  Tx Throughput : N/A
  Rx Throughput : N/A
  Fan Speed : 30 %
- Performance State : P0
+ Performance State : P8
  Clocks Throttle Reasons : N/A
  FB Memory Usage
  Total : 963 MiB
@@ -113,7 +113,7 @@
  Double Bit ECC : N/A
  Pending : N/A
  Temperature
- GPU Current Temp : 38 C
+ GPU Current Temp : 35 C
  GPU Shutdown Temp : N/A
  GPU Slowdown Temp : N/A
  Power Readings
@@ -125,10 +125,10 @@
  Min Power Limit : N/A
  Max Power Limit : N/A
  Clocks
- Graphics : 625 MHz
- SM : 1251 MHz
- Memory : 1304 MHz
- Video : 540 MHz
+ Graphics : 405 MHz
+ SM : 810 MHz
+ Memory : 324 MHz
+ Video : 405 MHz
  Applications Clocks
  Graphics : N/A
  Memory : N/A

[1]
nvidia0: <Quadro 2000> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: attempting to allocate 1 MSI vectors (1 supported)
msi: routing MSI IRQ 269 to local APIC 3 vector 51
vgapci0: using IRQ 269 for MSI
vgapci0: child nvidia0 requested pci_enable_io
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 367.44 Wed Aug 17 22:05:09 PDT 2016
acquiring duplicate lock of same type: "os.lock_sx"
 1st os.lock_sx @ nvidia_os.c:599
 2nd os.lock_sx @ nvidia_os.c:599
stack backtrace:
#0 0xffffffff80aa6780 at witness_debugger+0x70
#1 0xffffffff80aa6683 at witness_checkorder+0xde3
#2 0xffffffff80a4fac2 at _sx_xlock+0x72
#3 0xffffffff82a515c2 at os_acquire_mutex+0x32
#4 0xffffffff82a21068 at _nv016673rm+0x18
nvidia-modeset: Allocated GPU:0 () @ PCI:0000:00:00.0
nvidia-modeset: WARNING: GPU:0: Lost display notification; continuing.
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000000
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000001
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000002
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000003
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000004
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000005
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000006
NVRM: Xid (PCI:0000:00:04): 16, Head 00000000 Count 00000007

When rebooting, I get this:

nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000857d:0:0:0x00000040

[2]
[ 659.406] (--) PCI:*(0:0:4:0) 10de:0dd8:10de:084a rev 161, Mem @ 0xc2000000/33554432, 0x3400000000/134217728, 0x3408000000/67108864, I/O @ 0x00002080/128, BIOS @ 0x????????/65536
[ 659.407] (II) LoadModule: "glx"
[ 659.412] (II) Loading /usr/local/lib/xorg/modules/extensions/libglx.so
[ 659.594] (II) Module glx: vendor="NVIDIA Corporation"
[ 659.594] compiled for 4.0.2, module version = 1.0.0
[ 659.594] Module class: X.Org Server Extension
[ 659.594] (II) NVIDIA GLX Module 367.44 Wed Aug 17 22:01:17 PDT 2016
[ 659.595] (II) LoadModule: "nvidia"
[ 659.595] (II) Loading /usr/local/lib/xorg/modules/drivers/nvidia_drv.so
[ 659.603] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 659.603] compiled for 4.0.2, module version = 1.0.0
[ 659.603] Module class: X.Org Video Driver
[ 659.604] (II) NVIDIA dlloader X Driver 367.44 Wed Aug 17 21:41:06 PDT 2016
[ 659.604] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 659.604] (--) Using syscons driver with X support (version 2.0)
[ 659.604] (--) using VT number 9
[...]
[ 659.609] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[ 659.609] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[ 659.609] (==) NVIDIA(0): RGB weight 888
[ 659.609] (==) NVIDIA(0): Default visual is TrueColor
[ 659.609] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[ 659.610] (**) NVIDIA(0): Option "NoPowerConnectorCheck" "yes"
[ 659.610] (**) NVIDIA(0): Option "ThermalConfigurationCheck" "no"
[ 659.610] (**) NVIDIA(0): Option "Accel" "no"
[ 659.610] (**) NVIDIA(0): Disabling 2D acceleration
[ 668.054] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:0:4:0
[ 668.054] (--) NVIDIA(0): CRT-0
[ 668.054] (--) NVIDIA(0): DFP-0 (boot)
[...]
[ 668.125] (II) NVIDIA(GPU-0): Skipping Power Connector Check.
[ 668.125] (II) NVIDIA(GPU-0): Skipping Thermal Configuration Check.
[ 668.125] (II) NVIDIA(GPU-0): Acceleration disabled.
[ 668.125] (II) NVIDIA(0): NVIDIA GPU Quadro 2000 (GF106GL) at PCI:0:4:0 (GPU-0)
[ 668.125] (--) NVIDIA(0): Memory: 1048576 kBytes
[ 668.125] (--) NVIDIA(0): VideoBIOS: 70.06.0d.00.02
[ 668.125] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[ 668.125] (**) NVIDIA(0): Using HorizSync/VertRefresh ranges from the EDID for display
[ 668.125] (**) NVIDIA(0): device DELL 2007FP (DFP-0) (Using EDID frequencies has
[ 668.125] (**) NVIDIA(0): been enabled on all display devices.)
[ 668.127] (==) NVIDIA(0):
[ 668.127] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[ 668.127] (==) NVIDIA(0): will be used as the requested mode.
[ 668.127] (==) NVIDIA(0):
[ 668.128] (II) NVIDIA(0): Validated MetaModes:
[ 668.128] (II) NVIDIA(0): "DFP-0:nvidia-auto-select"
[ 668.128] (II) NVIDIA(0): Virtual screen size determined to be 1600 x 1200
[ 668.139] (--) NVIDIA(0): DPI set to (99, 98); computed from "UseEdidDpi" X config
[ 668.139] (--) NVIDIA(0): option
[ 668.139] (--) Depth 24 pixmap format is 32 bpp
[ 668.144] (II) NVIDIA: Reserving 49152.00 MB of virtual memory for indirect memory
[ 668.144] (II) NVIDIA: access.
[ 668.370] (II) NVIDIA(0): Setting mode "DFP-0:nvidia-auto-select"
[ 675.574] (==) NVIDIA(0): Disabling shared memory pixmaps
[ 675.574] (==) NVIDIA(0): Backing store enabled
[ 675.574] (==) NVIDIA(0): Silken mouse enabled
[ 675.575] (**) NVIDIA(0): DPMS enabled
[ 675.575] (II) Loading sub module "dri2"
[ 675.575] (II) LoadModule: "dri2"
[ 675.575] (II) Module "dri2" already built-in
[ 675.575] (II) NVIDIA(0): [DRI2] Setup complete
[ 675.575] (II) NVIDIA(0): [DRI2] VDPAU driver: nvidia
[ 675.576] (--) RandR disabled
[ 675.577] (EE) Failed to initialize GLX extension (Compatible NVIDIA X driver not found)
[ 675.577] (II) Loading sub module "shadow"
[ 675.577] (II) LoadModule: "shadow"
[ 675.577] (II) Loading /usr/local/lib/xorg/modules/libshadow.so
[ 675.577] (II) Module shadow: vendor="X.Org Foundation"
[ 675.577] compiled for 1.17.4, module version = 1.1.0
[ 675.577] ABI class: X.Org ANSI C Emulation, version 0.4
[ 675.638] (II) config/devd: probing input devices...
[ 675.638] (II) config/devd: adding input device (null) (/dev/kbdmux)
[ 675.638] (II) LoadModule: "kbd"
[ 675.639] (II) Loading /usr/local/lib/xorg/modules/input/kbd_drv.so
[ 675.640] (II) Module kbd: vendor="X.Org Foundation"
[ 675.640] compiled for 1.17.4, module version = 1.8.1
[ 675.640] Module class: X.Org XInput Driver
[ 675.640] ABI class: X.Org XInput driver, version 21.0
[ 675.640] (II) Using input driver 'kbd' for 'kbdmux'
[ 675.640] (**) kbdmux: always reports core events
[ 675.640] (**) kbdmux: always reports core events
[ 675.640] (**) Option "Protocol" "standard"
[ 675.640] (**) Option "XkbRules" "base"
[ 675.640] (**) Option "XkbModel" "pc105"
[ 675.640] (**) Option "XkbLayout" "us"
[ 675.640] (**) Option "config_info" "devd:kbdmux"
[ 675.640] (II) XINPUT: Adding extended input device "kbdmux" (type: KEYBOARD, id 6)
[ 675.640] (II) config/devd: kbdmux is enabled, ignoring device atkbd0
[ 675.640] (II) config/devd: adding input device (null) (/dev/sysmouse)
[ 675.640] (II) LoadModule: "mouse"
[ 675.640] (II) Loading /usr/local/lib/xorg/modules/input/mouse_drv.so
[ 675.642] (II) Module mouse: vendor="X.Org Foundation"
[ 675.642] compiled for 1.17.4, module version = 1.9.1
[ 675.642] Module class: X.Org XInput Driver
[ 675.642] ABI class: X.Org XInput driver, version 21.0
[ 675.642] (II) Using input driver 'mouse' for 'sysmouse'
[ 675.642] (**) sysmouse: always reports core events
[ 675.642] (**) Option "Device" "/dev/sysmouse"
[ 675.642] (==) sysmouse: Protocol: "Auto"
[ 675.642] (**) sysmouse: always reports core events
[ 675.642] (==) sysmouse: Emulate3Buttons, Emulate3Timeout: 50
[ 675.642] (**) sysmouse: ZAxisMapping: buttons 4 and 5
[ 675.642] (**) sysmouse: Buttons: 5
[ 675.642] (**) Option "config_info" "devd:sysmouse"
[ 675.642] (II) XINPUT: Adding extended input device "sysmouse" (type: MOUSE, id 7)
[ 675.642] (**) sysmouse: (accel) keeping acceleration scheme 1
[ 675.642] (**) sysmouse: (accel) acceleration profile 0
[ 675.642] (**) sysmouse: (accel) acceleration factor: 2.000
[ 675.642] (**) sysmouse: (accel) acceleration threshold: 4
[ 675.642] (II) sysmouse: SetupAuto: hw.iftype is 4, hw.model is 0
[ 675.642] (II) sysmouse: SetupAuto: protocol is SysMouse

[3]
Option "Accel" "no"
Option "RenderAccel" "no"
Option "UBB" "no"
Option "NoFlip" "yes"
Option "Overlay" "no"
Option "MultisampleCompatibility" "off"
Option "NoPowerConnectorCheck" "yes"
Option "ThermalConfigurationCheck" "no"
Option "TripleBuffer" "off"
Option "ModeDebug" "yes"
Option "IndirectMemoryAccess" "no"
Option "UseSysmemPixmapAccel" "no"
#Option "UseDPLib" "off"

[4]
Jan 11 11:34:49 fbsd12tst kernel: nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed
Jan 11 11:34:49 fbsd12tst kernel: nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8243997b
stack pointer = 0x28:0xfffffe007aec8ab8
frame pointer = 0x28:0xfffff80006a58808
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 3
current process = 631 (Xorg)
[ thread pid 631 tid 100062 ]
Stopped at _nv002035kms+0x3b: movq 0x20(%rax,%rdx,8),%rax
db>

--
[SorAlx] ridin' VN2000 Classic LT