Date:      Wed, 27 Mar 2019 11:48:21 -0400
From:      Nick Wolff <darkfiberiru@gmail.com>
To:        Robert Crowston <crowston@protonmail.com>
Cc:        "freebsd-virtualization@freebsd.org" <freebsd-virtualization@freebsd.org>, mmacy@freebsd.org
Subject:   Re: GPU passthrough: mixed success on Linux, not yet on Windows
Message-ID:  <CACxAneB0C09zDHwemLEVO9W1Mc==L6XBc0vAXkeTqK96_AQ45g@mail.gmail.com>
In-Reply-To: <H0Gbov17YtZC1-Ao1YkjZ-nuOqPv4LPggc_mni3cS8WWOjlSLBAfOGGPf4aZEpOBiC5PAUGg6fkgeutcLrdbmXNO5QfaxFtK_ANn-Nrklws=@protonmail.com>
References:  <H0Gbov17YtZC1-Ao1YkjZ-nuOqPv4LPggc_mni3cS8WWOjlSLBAfOGGPf4aZEpOBiC5PAUGg6fkgeutcLrdbmXNO5QfaxFtK_ANn-Nrklws=@protonmail.com>

Hi Robert,

So for problems 2/3 you may want to look at the ppt_pci_reset(device_t dev)
function in ppt.c. I'm not sure why it isn't already handling both of your
problems. Matt Macy may have some ideas.

Great work, by the way.

Thanks,

Nick Wolff

On Sun, Mar 17, 2019 at 12:23 PM Robert Crowston via freebsd-virtualization
<freebsd-virtualization@freebsd.org> wrote:

> Hi folks, this is my first post to the group. Apologies for length.
>
> I've been experimenting with GPU passthrough on bhyve. For background, the
> host system is FreeBSD 12.0-RELEASE on an AMD Ryzen 1700 CPU @ 3.8 GHz, 32
> GB of ECC RAM, with two nVidia GPUs. I'm working with a Linux Debian 9
> guest and a Windows Server 2019 (desktop experience installed) guest. I
> also have a USB controller passed through for Bluetooth and a keyboard.
>
> With some unpleasant hacks I have succeeded in starting X on the Linux
> guest, passing-through an nVidia GT 710 under the nouveau driver. I can run
> the "mate" desktop and glxgears, both of which are smooth at 4K. The Unigine
> Heaven benchmark runs at an embarrassing 0.1 fps, and 2160p x264 video in
> VLC runs at about 5 fps. Neither appears to be CPU-bound in the host or the
> guest.
>
> The hack I had to make: I found that many instructions to access
> memory-mapped PCI BARs are not being executed on the CPU in guest mode but
> are being passed back for emulation in the hypervisor. This causes an
> assertion to fail inside passthru_write() in pci_passthru.c
> ["pi->pi_bar[baridx].type == PCIBAR_IO"] because it does not expect to
> perform memory-mapped IO for the guest. Examining the to-be-emulated
> instructions in vmexit_inst_emul() {e.g., movl (%rdi), %eax}, they look
> benign to me, and I have no explanation for why the CPU refused to execute
> them in guest mode.
>
> As an amateur work-around, I removed the assertion and instead I obtain
> the desired offset into the guest's BAR, calculate what that guest address
> translates to in the host's address space, open(2) /dev/mem, mmap(2) over
> to that address, and perform the write directly. I do a similar trick in
> passthru_read(). Ugly, slow, but functional.
>
> This code path is accessed continuously whether or not X is running, with
> an increase in activity when running anything GPU-heavy. Always to bar 1,
> and mostly around the same offsets. I added some logging of this event. It
> runs at about 100 lines per second while playing video. An excerpt is:
> ...
> Unexpected out-of-vm passthrough write #492036 to bar 1 at offset 41100.
> Unexpected out-of-vm passthrough write #492037 to bar 1 at offset 41100.
> Unexpected out-of-vm passthrough read #276162 to bar 1 at offset 561280.
> Unexpected out-of-vm passthrough write #492038 to bar 1 at offset 38028.
> Unexpected out-of-vm passthrough write #492039 to bar 1 at offset 38028.
> Unexpected out-of-vm passthrough read #276163 to bar 1 at offset 561184.
> Unexpected out-of-vm passthrough read #276164 to bar 1 at offset 561184.
> Unexpected out-of-vm passthrough read #276165 to bar 1 at offset 561184.
> Unexpected out-of-vm passthrough read #276166 to bar 1 at offset 561184.
> ...
>
> So my question here is,
> 1. How do I diagnose why the instructions are not being executed in guest
> mode?
>
> Some other problems:
>
> 2. Once the virtual machine is shut down, the passed-through GPU doesn't
> get turned off. Whatever message was on the screen in the final throes of
> Linux's shutdown stays there. Maybe there is a specific detach command
> which bhyve or nouveau hasn't yet implemented? Alternatively, maybe I could
> exploit some power management feature to reset the card when bhyve exits.
>
> 3. It is not possible to reboot the guest and then start X again without
> an intervening host reboot. The text console works fine. Xorg.0.log has a
> message like
>     (EE) [drm] Failed to open DRM device for pci:0000:00:06.0: -19
>     (EE) open /dev/dri/card0: No such file or directory
> dmesg is not very helpful either.[0] I suspect that this is related to
> problem (2).
>
> 4. There is a known bug in the version of the Xorg server that ships with
> Debian 9, where the switch from an animated mouse cursor back to a static
> cursor causes the X server to sit in a busy loop of gradually increasing
> stack depth, if the GPU takes too long to communicate with the driver.[1]
> For me, this consistently happens after I type my password into the Debian
> login dialog box and eventually (~ 120 minutes) locks up the host by eating
> all the swap. A work-around is to replace the guest's animated cursors with
> static cursors. The bug is fixed in newer versions of X, but I haven't
> tested whether their fix works for me yet.
>
> 5. The GPU doesn't come to life until the nouveau driver kicks in. What is
> special about the driver? Why doesn't the UEFI open the GPU and send it
> output before the boot? Any idea if the problem is on the UEFI side or the
> hypervisor side?
>
> 6. On Windows, the way Windows probes multi-BAR devices seems to be
> inconsistent with bhyve's model for storing I/O memory mappings.
> Specifically, I believe Windows assigns the 0xffffffff sentinel to all BARs
> on a device in one shot, then reads them back and assigns the true
> addresses afterwards. However, bhyve sees the multiple 0xffffffff
> assignments to different BARs as a clash and errors out on the second BAR
> probe. I removed most of the mmio_rb_tree error handling in mem.c and this
> is sufficient for Windows to boot, and detect and correctly identify the
> GPU. (A better solution might be to handle the initial 0xffffffff write as
> a special case.) I can then install the official nVidia drivers without
> problem over Remote Desktop. However, the GPU never springs into life: I am
> stuck with a "Windows has stopped this device because it has reported
> problems. (Code 43)" error in the device manager, a blank screen, and not
> much else to go on.
>
> Is it worth me continuing to hack away at these problems---of course I'm
> happy to share anything I come up with---or is there an official solution
> to GPU support in the pipe about to make my efforts redundant :)?
>
> Thanks,
> Robert Crowston.
>
> ---
> Footnotes
>
> [0]  Diff'ing dmesg after successful GPU initialization (+) and after
> failure (-), and cutting out some lines that aren't relevant:
>  nouveau 0000:00:06.0: bios: version 80.28.a6.00.10
> +nouveau 0000:00:06.0: priv: HUB0: 085014 ffffffff (1f70820b)
>  nouveau 0000:00:06.0: fb: 1024 MiB DDR3
> @@ -466,24 +467,17 @@
>  nouveau 0000:00:06.0: DRM: DCB conn 00: 00001031
>  nouveau 0000:00:06.0: DRM: DCB conn 01: 00002161
>  nouveau 0000:00:06.0: DRM: DCB conn 02: 00000200
> -nouveau 0000:00:06.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
> -nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:88/gf119_disp_dmac_init()!
> -nouveau 0000:00:06.0: disp: ch 1 init: c207009b
> -nouveau: DRM:00000000:0000927c: init failed with -16
> -nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
> -nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
> -nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c:54/gf119_disp_dmac_fini()!
> -nouveau 0000:00:06.0: disp: ch 1 fini: c2071088
> +[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> +[drm] Driver supports precise vblank timestamp query.
> +nouveau 0000:00:06.0: DRM: MM: using COPY for buffer copies
> +nouveau 0000:00:06.0: DRM: allocated 1920x1080 fb: 0x60000, bo ffff96fdb39a1800
> +fbcon: nouveaufb (fb0) is primary device
> -nouveau 0000:00:06.0: timeout at /build/linux-UEAD6s/linux-4.9.144/drivers/gpu/drm/nouveau/nvkm/engine/disp/coregf119.c:187/gf119_disp_core_fini()
> -nouveau 0000:00:06.0: disp: core fini: 8d0f0088
> -[TTM] Finalizing pool allocator
> -[TTM] Finalizing DMA pool allocator
> -[TTM] Zone  kernel: Used memory at exit: 0 kiB
> -[TTM] Zone   dma32: Used memory at exit: 0 kiB
> -nouveau: probe of 0000:00:06.0 failed with error -16
> +Console: switching to colour frame buffer device 240x67
> +nouveau 0000:00:06.0: fb0: nouveaufb frame buffer device
> +[drm] Initialized nouveau 1.3.1 20120801 for 0000:00:06.0 on minor 0
>
> [1]
> https://devtalk.nvidia.com/default/topic/1028172/linux/titan-v-ubuntu-16-04lts-and-387-34-driver-crashes-badly/post/5230898/#5230898
