Date:      Fri, 09 Jun 2017 10:22:09 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Anish <akgupt3@gmail.com>
Cc:        "freebsd-virtualization@freebsd.org" <freebsd-virtualization@freebsd.org>
Subject:   Re: PCIe passthrough really that expensive?
Message-ID:  <593A5AB1.7090301@omnilan.de>
In-Reply-To: <CALnRwMRst1d_O_ix-_JaS=tH8=dPtNNkDo9WyzRH1_nBi1N6zA@mail.gmail.com>
References:  <59383F5C.8020801@omnilan.de> <CALnRwMRst1d_O_ix-_JaS=tH8=dPtNNkDo9WyzRH1_nBi1N6zA@mail.gmail.com>

Regarding Anish's message of 08.06.2017 14:35 (localtime):
> Hi Harry,
>>I thought I'd save these expensive VM_Exits by using the passthru path.
> Completely wrong, is it?
> 
> It depends on which processor you are using. For example, APICv was
> introduced in IvyBridge, which enabled a h/w-assisted local APIC rather
> than a s/w-emulated one; bhyve supports it on Intel processors.
> 
> Intel Broadwell introduced PostedInterrupt, which enables interrupts to
> be delivered to the guest directly, bypassing the hypervisor[2], for
> passthrough devices. Interrupts from emulated devices will still go
> through the hypervisor.

That's very interesting, thanks so much!
I wasn't aware that there were post-VT-c improvements; I guess I'll have
to refresh my very basic knowledge urgently.
I'm still using IvyBridge (E3v2) with this "new" machine, but I have
never heard of or thought about APICv!


> You can verify the capability using sysctl hw.vmm.vmx. What processor
> are you using for these performance benchmarks?

hw.vmm.vmx.vpid_alloc_failed: 0
hw.vmm.vmx.posted_interrupt_vector: -1
hw.vmm.vmx.cap.posted_interrupts: 0
hw.vmm.vmx.cap.virtual_interrupt_delivery: 0
hw.vmm.vmx.cap.invpcid: 0
hw.vmm.vmx.cap.monitor_trap: 1
hw.vmm.vmx.cap.unrestricted_guest: 1
hw.vmm.vmx.cap.pause_exit: 1
hw.vmm.vmx.cap.halt_exit: 1
hw.vmm.vmx.initialized: 1
hw.vmm.vmx.cr4_zeros_mask: 18446744073708017664
hw.vmm.vmx.cr4_ones_mask: 8192
hw.vmm.vmx.cr0_zeros_mask: 18446744071025197056
hw.vmm.vmx.cr0_ones_mask: 3
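
For anyone wanting to check the same thing on their own host, here is a
minimal sketch that reads `sysctl hw.vmm.vmx` output and reports the two
interrupt-related capabilities discussed above. The function names
parse_sysctl and apicv_summary are made up for illustration, and the
sample text is only an excerpt of the output pasted above; both
capabilities read 0 there.

```python
# Sketch only: parse_sysctl and apicv_summary are illustrative names,
# not part of any FreeBSD tool or library.

SAMPLE = """\
hw.vmm.vmx.posted_interrupt_vector: -1
hw.vmm.vmx.cap.posted_interrupts: 0
hw.vmm.vmx.cap.virtual_interrupt_delivery: 0
hw.vmm.vmx.cap.unrestricted_guest: 1
hw.vmm.vmx.initialized: 1
"""


def parse_sysctl(text):
    """Turn 'name: value' lines (as printed by sysctl) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            pass  # skip non-numeric values
    return values


def apicv_summary(values):
    """Report whether the two interrupt-related capabilities are nonzero."""
    prefix = "hw.vmm.vmx.cap."
    return {
        "virtual_interrupt_delivery":
            bool(values.get(prefix + "virtual_interrupt_delivery", 0)),
        "posted_interrupts":
            bool(values.get(prefix + "posted_interrupts", 0)),
    }


if __name__ == "__main__":
    for name, present in apicv_summary(parse_sysctl(SAMPLE)).items():
        print(name, "yes" if present else "no")
```

On a live host the text could instead be taken from the output of
`sysctl hw.vmm.vmx`, e.g. piped into the script.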

I simply ran 'time cp' on 8GB files over NFSv4, which are served from the
ZFS cache on the remote side, while watching vmstat locally on both host
and guest.


> Can you run a simple experiment, assign pptdev interrupts to core that's
> not running guest/vcpu? This will reduce #VMEXIT on vcpu which we know
> is expensive.

Interesting approach.  But I have no idea how to assign a specific core
to a PCIe dev.  Is it pptdev specific? The tunables in
device.hints(5) can't be used for that, can they?
It seems pptdev(0) hasn't got a man page yet, but I'll have a look at
the sources; maybe I can find hints before earth has done its job and
presents you with the same nice sunshine I'm enjoying today :-)

-harry


