From owner-freebsd-virtualization@FreeBSD.ORG Wed Mar 25 09:30:15 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CF8DE1AB for ; Wed, 25 Mar 2015 09:30:15 +0000 (UTC) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8A05CE8F for ; Wed, 25 Mar 2015 09:30:14 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1YahdV-0004nH-0h for freebsd-virtualization@freebsd.org; Wed, 25 Mar 2015 10:30:05 +0100 Received: from ip184-189-251-175.sb.sd.cox.net ([184.189.251.175]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 25 Mar 2015 10:30:04 +0100 Received: from madoka by ip184-189-251-175.sb.sd.cox.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 25 Mar 2015 10:30:04 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-virtualization@freebsd.org From: Julian Hsiao Subject: Several bhyve quirks Date: Wed, 25 Mar 2015 02:24:49 -0700 Lines: 69 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: ip184-189-251-175.sb.sd.cox.net User-Agent: Unison/2.1.10 X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Mar 2015 09:30:16 -0000 Hi, I'm running bhyve on 10.1, mostly with OpenBSD (5.7) guests, and I ran into a few strange issues: 1. 
The guest RTC is several hours off every time I start bhyve. The host RTC is set to UTC, and /etc/localtime on both the host and guests is set to US/Pacific (currently PDT). I thought maybe bhyve was setting the RTC to the local time, and indeed changing the TZ environment variable affects the guest's RTC. However, with TZ=UTC the guest is still off by an hour, and to get the correct offset I set TZ='UTC+1'; perhaps something's not handling DST correctly?

Also, one time the offset was mysteriously tens of hours off (i.e. the guest RTC was a day or two ahead), and the condition persisted across multiple host and guest reboots. Unfortunately, the problem went away a few hours later and I have been unable to reproduce it since.

 suggests that I'm on the right track, but it doesn't explain the off-by-one nor the (one time) multi-day offset. As an aside, the commit message implies that this only affects OpenBSD guests, when in fact this probably affects all guests (at least also Linux). Perhaps he meant you cannot configure OpenBSD to assume that the RTC is set to local time instead of UTC.

2. What's the preferred solution for minimizing guest clock drift in bhyve? Based on some Google searches, I run ntpd in the guests and set kern.timecounter.hardware=acpitimer0 instead of the default acpihpet0. acpitimer0 drifts by ~600 ppm while acpihpet0 drifts by ~1500 ppm; why?

3. Even moderate guest disk I/O completely kills guest network performance. For example, whenever security(8) (security(7) in FreeBSD) runs, guest network throughput drops from 150+ Mbps to ~20 Mbps, and jitter from ping jumps from <0.01 ms to 100+ ms. If I try to build something in the guest, then the network becomes almost unusable.

The network performance degradation only affects the guest that's generating the I/O; high I/O on guest B doesn't affect guest A, nor would high I/O on the host.

I'm using both virtio-blk and virtio-net drivers, and the guests' disk images are backed by zvol+geli. Removing geli has no effect.
There are some commits in CURRENT that suggest improved virtio performance, but I'm not comfortable running CURRENT. Is there a workaround I could use for 10.1?

4. virtio-blk always reports the virtual disk as having 512-byte sectors, and so I get I/O errors on OpenBSD guests when the disk image is backed by zvol+geli with a 4K sector size. Curiously, this only seems to affect zvol+geli; with just zvol it seems to work. Also, it works either way on Linux guests.

ATM I changed the zvol / geli sector size to 512 bytes, which probably made #2 worse. I think this bug / feature is addressed by: , but again is there a workaround to force a specific sector size for 10.1?

5. This may be better directed at OpenBSD but I'll ask here anyway: if I enable virtio-rnd, OpenBSD fails to boot with a "couldn't map interrupt" error. The kernel in bsd.rd will boot, but not the installed kernel (or the one built from STABLE; I forgot). Again, Linux seems unaffected, but I couldn't tell if it's actually working.
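For scale, the drift figures quoted in item 2 can be converted to seconds per day. This is only a back-of-the-envelope sketch: drift_per_day is a hypothetical helper name, and the calculation assumes the ppm values are constant over a full day.

```shell
#!/bin/sh
# Convert a clock drift in parts per million (ppm) to seconds of error per day.
# drift_per_day is a hypothetical helper, not part of any FreeBSD tool.
drift_per_day() {
    # 1 ppm is 1 microsecond of error per second; a day has 86400 seconds.
    awk -v ppm="$1" 'BEGIN { printf "%.2f\n", ppm * 86400 / 1000000 }'
}

drift_per_day 600    # acpitimer0 figure from item 2
drift_per_day 1500   # acpihpet0 figure from item 2
```

Under that assumption the two timecounters would be off by roughly 52 and 130 seconds per day respectively if left uncorrected, which is why ntpd in the guest matters here.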
Julian Hsiao From owner-freebsd-virtualization@FreeBSD.ORG Wed Mar 25 15:44:48 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 91BE8A45 for ; Wed, 25 Mar 2015 15:44:48 +0000 (UTC) Received: from iredmail.onthenet.com.au (iredmail.onthenet.com.au [203.13.68.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4C9DF8E0 for ; Wed, 25 Mar 2015 15:44:47 +0000 (UTC) Received: from localhost (iredmail.onthenet.com.au [127.0.0.1]) by iredmail.onthenet.com.au (Postfix) with ESMTP id D2E25280F61 for ; Thu, 26 Mar 2015 01:44:38 +1000 (EST) X-Amavis-Modified: Mail body modified (using disclaimer) - iredmail.onthenet.com.au X-Virus-Scanned: amavisd-new at iredmail.onthenet.com.au Received: from iredmail.onthenet.com.au ([127.0.0.1]) by localhost (iredmail.onthenet.com.au [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zAs27BruUC8O for ; Thu, 26 Mar 2015 01:44:38 +1000 (EST) Received: from Peters-MacBook-Pro.local (c-76-126-65-88.hsd1.ca.comcast.net [76.126.65.88]) by iredmail.onthenet.com.au (Postfix) with ESMTPSA id 974F7280F5C; Thu, 26 Mar 2015 01:44:36 +1000 (EST) Message-ID: <5512D7E3.9060401@freebsd.org> Date: Wed, 25 Mar 2015 08:44:35 -0700 From: Peter Grehan User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Julian Hsiao Subject: Re: Several bhyve quirks References: In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD 
supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Mar 2015 15:44:48 -0000

Hi Julian,

I'll let Neel take care of the time questions.

> 3. Even moderate guest disk I/O completely kills guest network
> performance. For example, whenever security(8) (security(7) in FreeBSD)
> runs, guest network throughput drops from 150+ Mbps to ~20 Mbps, and
> jitter from ping jumps from <0.01 ms to 100+ ms. If I try to build
> something in the guest, then network becomes almost unusable.
>
> The network performance degradation only affects the guest that's
> generating the I/O; high I/O on guest B doesn't affect guest A, nor
> would high I/O on the host.
>
> I'm using both virtio-blk and virio-net drivers, and the guests' disk
> images are backed by zvol+geli. Removing geli has no effect.
>
> There are some commits in CURRENT that suggests improved virtio
> performance, but I'm not comfortable running CURRENT. Is there a
> workaround I could use for 10.1?

In 10.1, virtio-blk I/O is done synchronously in the context of the guest vCPU exit. If it's a single vCPU guest, or the virtio-net interrupt happens to be delivered to that vCPU, performance will suffer.

A workaround is to use ahci-hd for the disk emulation and not virtio-blk. The AHCI emulation does I/O in a dedicated thread and doesn't block the vCPU thread.

> 4. virtio-blk always reports the virtual disk as having 512-byte
> sectors, and so I get I/O errors on OpenBSD guests when the disk image
> is backed by zvol+geli with 4K sector size. Curiously, this only seems
> to affect zvol+geli; with just zvol it seems to work. Also, it works
> either way on Linux guests.
>
> ATM I changed the zvol / geli sector size to 512 bytes, which probably
> made #2 worse. I think this bug / feature is addressed by:
> ,
> but again is there a workaround to force a specific sector size for 10.1?

The only workaround for 10.1 would be to use ahci-hd instead of virtio-blk.
The correct sector size will be reported there. > 5. This may be better directed at OpenBSD but I'll ask here anyway: if I > enable virtio-rnd then OpenBSD would not boot with "couldn't map > interrupt" error. The kernel in bsd.rd will boot, but not the installed > kernel (or the one built from STABLE; I forgot). Again, Linux seems > unaffected, but I couldn't tell if it's actually working. Try using the -W option to bhyve. This will force the bhyve virtio code to advertize (non-standard) MSI interrupt capability which OpenBSD will then use to allocate vectors. later, Peter. From owner-freebsd-virtualization@FreeBSD.ORG Thu Mar 26 07:43:59 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 07E6AFD8 for ; Thu, 26 Mar 2015 07:43:59 +0000 (UTC) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B5C2CCB4 for ; Thu, 26 Mar 2015 07:43:57 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1Yb2S2-0001wA-DN for freebsd-virtualization@freebsd.org; Thu, 26 Mar 2015 08:43:38 +0100 Received: from ip184-189-251-175.sb.sd.cox.net ([184.189.251.175]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 26 Mar 2015 08:43:38 +0100 Received: from madoka by ip184-189-251-175.sb.sd.cox.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 26 Mar 2015 08:43:38 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-virtualization@freebsd.org From: Julian Hsiao Subject: Re: Several bhyve quirks Date: Thu, 26 Mar 2015 00:43:27 -0700 Lines: 35 Message-ID: References: <5512D7E3.9060401@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; 
charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: ip184-189-251-175.sb.sd.cox.net User-Agent: Unison/2.1.10 X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2015 07:43:59 -0000 On 2015-03-25 15:44:35 +0000, Peter Grehan said: > In 10.1, virtio-blk i/o is done sychronously in the context of the > guest vCPU exit. If it's a single vCPU guest, or the virtio-net > interrupt happens to be delivered to that vCPU, performance will suffer. > > A workaround is to use ahci-hd for the disk emulation and not > virtio-blk. The AHCI emulation does i/o in a dedicated thread and > doesn't block the vCPU thread. Thank you for your explanation and tips, Peter. I just tried changing virtio-blk -> ahci-hd and preliminary results are good. And now you've mentioned it, I do recall seeing slightly less performance degradation on guests with 2 vCPUs vs. ones with just one. I've always assumed virtio driver > emulated driver so it didn't occur to me to try ahci-hd. > The only workaround for 10.1 would be to use ahci-hd instead of > virtio-blk. The correct sector size will be reported there. I haven't had a chance to test this; next time I spin up a guest from scratch I'll try it out. > Try using the -W option to bhyve. This will force the bhyve virtio > code to advertize (non-standard) MSI interrupt capability which OpenBSD > will then use to allocate vectors. Unfortunately -W didn't help. This is not critical, however, and I'll ask around in the OpenBSD mailing list. Thanks again for your help. 
Julian Hsiao From owner-freebsd-virtualization@FreeBSD.ORG Thu Mar 26 07:54:14 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 63517167 for ; Thu, 26 Mar 2015 07:54:14 +0000 (UTC) Received: from iredmail.onthenet.com.au (iredmail.onthenet.com.au [203.13.68.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1E942DA2 for ; Thu, 26 Mar 2015 07:54:13 +0000 (UTC) Received: from localhost (iredmail.onthenet.com.au [127.0.0.1]) by iredmail.onthenet.com.au (Postfix) with ESMTP id EF7B7281116 for ; Thu, 26 Mar 2015 17:54:10 +1000 (EST) X-Amavis-Modified: Mail body modified (using disclaimer) - iredmail.onthenet.com.au X-Virus-Scanned: amavisd-new at iredmail.onthenet.com.au Received: from iredmail.onthenet.com.au ([127.0.0.1]) by localhost (iredmail.onthenet.com.au [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id i5v4ycbr4dzo for ; Thu, 26 Mar 2015 17:54:10 +1000 (EST) Received: from Peters-MacBook-Pro.local (c-76-126-65-88.hsd1.ca.comcast.net [76.126.65.88]) by iredmail.onthenet.com.au (Postfix) with ESMTPSA id 0E791280A0E; Thu, 26 Mar 2015 17:54:08 +1000 (EST) Message-ID: <5513BB1E.6020101@freebsd.org> Date: Thu, 26 Mar 2015 00:54:06 -0700 From: Peter Grehan User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Julian Hsiao Subject: Re: Several bhyve quirks References: <5512D7E3.9060401@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of 
various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2015 07:54:14 -0000

Hi Julian,

> Thank you for your explanation and tips, Peter. I just tried changing
> virtio-blk -> ahci-hd and preliminary results are good. And now you've
> mentioned it, I do recall seeing slightly less performance degradation
> on guests with 2 vCPUs vs. ones with just one.

Glad to hear that :)

>> Try using the -W option to bhyve. This will force the bhyve virtio
>> code to advertize (non-standard) MSI interrupt capability which OpenBSD
>> will then use to allocate vectors.
>
> Unfortunately -W didn't help. This is not critical, however, and I'll
> ask around in the OpenBSD mailing list.

I tried this out today with OpenBSD 5.7 and a CURRENT host, and it's actually a bug in the virtio-rnd implementation in bhyve when MSI-x isn't used. The early testing was with FreeBSD and Linux, which both use MSI-x, so the bug wasn't picked up for the MSI/legacy case.

I have a fix for CURRENT and that should make its way into 10-stable shortly.

Thanks for the report!

later,

Peter.
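As a rough sketch of the two suggestions discussed in this exchange (switching the disk slot from virtio-blk to ahci-hd, and adding -W), the script below only prints candidate bhyve command lines; it does not start a guest. The VM name vm0, the tap0 interface, the zvol path, the slot numbers, and the helper name bhyve_cmd are all assumptions made for illustration, and loading the guest kernel is a separate step that is not shown.

```shell
#!/bin/sh
# Print (but do not run) a bhyve command line for a given disk emulation.
# Every name here (vm0, tap0, the zvol path, slot numbers) is hypothetical.
bhyve_cmd() {
    disk_emul="$1"   # "virtio-blk" or "ahci-hd"
    extra="$2"       # optional extra flag, e.g. "-W" to force MSI for virtio
    printf '%s\n' "bhyve -c 2 -m 1G -A -H ${extra:+$extra }-s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap0 -s 3:0,${disk_emul},/dev/zvol/tank/vm0 -l com1,stdio vm0"
}

bhyve_cmd virtio-blk     # original setup: disk I/O handled in the vCPU thread on 10.1
bhyve_cmd ahci-hd -W     # workaround: AHCI disk emulation, virtio forced to MSI
```

Only the emulation name in the disk slot changes between the two lines; the backing zvol stays the same, so the guest sees the same data through a different controller and will likely need its disk device names adjusted.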
From owner-freebsd-virtualization@FreeBSD.ORG Thu Mar 26 15:33:27 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 00357854 for ; Thu, 26 Mar 2015 15:33:26 +0000 (UTC) Received: from phabric-backend.isc.freebsd.org (phabric-backend.isc.freebsd.org [IPv6:2001:4f8:3:ffe0:406a:0:50:2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D0C5FAC1 for ; Thu, 26 Mar 2015 15:33:26 +0000 (UTC) Received: from phabric-backend.isc.freebsd.org (phabric-backend.isc.freebsd.org [127.0.1.5]) by phabric-backend.isc.freebsd.org (8.14.9/8.14.9) with ESMTP id t2QFXQto058566 for ; Thu, 26 Mar 2015 15:33:26 GMT (envelope-from root@phabric-backend.isc.freebsd.org) Received: (from root@localhost) by phabric-backend.isc.freebsd.org (8.14.9/8.14.9/Submit) id t2QFXQKJ058565; Thu, 26 Mar 2015 15:33:26 GMT (envelope-from root) Date: Thu, 26 Mar 2015 15:33:26 +0000 To: freebsd-virtualization@freebsd.org From: "rodrigc (Craig Rodrigues)" Subject: [Differential] [Updated] D1944: PF and VIMAGE fixes Message-ID: <3861d4b0c9d09419b45ad2fab508b00c@localhost.localdomain> X-Priority: 3 Thread-Topic: D1944: PF and VIMAGE fixes X-Herald-Rules: none X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-Cc: X-Phabricator-Cc: X-Phabricator-Cc: In-Reply-To: References: Thread-Index: NDc2NzM0MzY4OTdiYThiNTU1MjY2ZDZmMTJiIFUUJsY= X-Phabricator-Sent-This-Message: Yes X-Mail-Transport-Agent: MetaMTA X-Auto-Response-Suppress: All X-Phabricator-Mail-Tags: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="utf-8" X-BeenThere: freebsd-virtualization@freebsd.org 
X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2015 15:33:27 -0000 rodrigc added a reviewer: kristof. REVISION DETAIL https://reviews.freebsd.org/D1944 To: nvass-gmx.com, gnn, bz, zec, trociny, glebius, rodrigc, kristof Cc: freebsd-virtualization, freebsd-pf, freebsd-net From owner-freebsd-virtualization@FreeBSD.ORG Thu Mar 26 21:24:34 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9B54DCED for ; Thu, 26 Mar 2015 21:24:34 +0000 (UTC) Received: from phabric-backend.isc.freebsd.org (phabric-backend.isc.freebsd.org [IPv6:2001:4f8:3:ffe0:406a:0:50:2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 775A1C60 for ; Thu, 26 Mar 2015 21:24:34 +0000 (UTC) Received: from phabric-backend.isc.freebsd.org (phabric-backend.isc.freebsd.org [127.0.1.5]) by phabric-backend.isc.freebsd.org (8.14.9/8.14.9) with ESMTP id t2QLOY9J054582 for ; Thu, 26 Mar 2015 21:24:34 GMT (envelope-from root@phabric-backend.isc.freebsd.org) Received: (from root@localhost) by phabric-backend.isc.freebsd.org (8.14.9/8.14.9/Submit) id t2QLOYxY054580; Thu, 26 Mar 2015 21:24:34 GMT (envelope-from root) Date: Thu, 26 Mar 2015 21:24:34 +0000 To: freebsd-virtualization@freebsd.org From: "kristof (Kristof Provost)" Subject: [Differential] [Commented On] D1944: PF and VIMAGE fixes Message-ID: <73587668d2894451fd37eac55f03906c@localhost.localdomain> X-Priority: 3 Thread-Topic: D1944: PF and VIMAGE fixes X-Herald-Rules: none X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: 
X-Phabricator-To: X-Phabricator-To: X-Phabricator-To: X-Phabricator-Cc: X-Phabricator-Cc: X-Phabricator-Cc: In-Reply-To: References: Thread-Index: NDc2NzM0MzY4OTdiYThiNTU1MjY2ZDZmMTJiIFUUeRI= X-Phabricator-Sent-This-Message: Yes X-Mail-Transport-Agent: MetaMTA X-Auto-Response-Suppress: All X-Phabricator-Mail-Tags: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="utf-8" X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Mar 2015 21:24:34 -0000 kristof added inline comments. INLINE COMMENTS sys/netpfil/pf/pf_ioctl.c:325 It's not clear to me why this is done here, rather than in pf_unload(). The initialisation is done in pf_load() after all. sys/netpfil/pf/pf_ioctl.c:3725 Don't we still need to do all of this somewhere? REVISION DETAIL https://reviews.freebsd.org/D1944 To: nvass-gmx.com, gnn, bz, zec, trociny, glebius, rodrigc, kristof Cc: freebsd-virtualization, freebsd-pf, freebsd-net From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 09:46:55 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 15AAB494 for ; Fri, 27 Mar 2015 09:46:55 +0000 (UTC) Received: from mail-wg0-x22d.google.com (mail-wg0-x22d.google.com [IPv6:2a00:1450:400c:c00::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9BFD12B1 for ; Fri, 27 Mar 2015 09:46:54 +0000 (UTC) Received: by wgbgs4 with SMTP id gs4so2131293wgb.0 for ; Fri, 27 Mar 2015 02:46:52 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; bh=ZVv2gsQgeuTxCSbqnt0mpkrTM42qlcVoldD15SIMDic=; b=nPXfsiLYmcVUELXOpqpfR1aBFU1LvldboDlwu8Da0+Cneu7MCXu3bCa2cKe0BXfWD3 FZAwkQIW/GzPQ2f+1CTNJNDqJIdeMpwvWGcqdolD1BgqGFu03V1jYRWDU8/sB7z0x3Sp XyXxwTM4ejul4WOBQNsB6GNU/PlDP3Pp7/tYNyuZIPDdqGXIj5zLstp6VXi5GQGStKXO QJE4yrwZqZ8Mg00AROYAGM9HPjxGhf2urFKEb2VKB28gd9OowMKA+UqMIIANSMH1jYDz 7q3pgpQ2tt3qBYsmM69eXON+ySb4auKM5ugNrj438PF9h/ZbyeteyVB4w4mMU0e9tAXZ yQuA== X-Received: by 10.194.24.103 with SMTP id t7mr35518652wjf.15.1427449612639; Fri, 27 Mar 2015 02:46:52 -0700 (PDT) Received: from mavbook.mavhome.dp.ua ([134.249.139.101]) by mx.google.com with ESMTPSA id v8sm6488236wib.0.2015.03.27.02.46.51 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 27 Mar 2015 02:46:51 -0700 (PDT) Sender: Alexander Motin Message-ID: <5515270A.7050408@FreeBSD.org> Date: Fri, 27 Mar 2015 11:46:50 +0200 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Julian Hsiao , freebsd-virtualization@freebsd.org Subject: Bhyve storage improvements (was: Several bhyve quirks) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 09:46:55 -0000 > I've always assumed virtio driver > emulated driver so it didn't occur > to me to try ahci-hd. I've just merged to FreeBSD stable/10 branch set of bhyve changes that should significantly improve situation in the storage area. 
The virtio-blk driver was fixed to work asynchronously and no longer blocks the virtual CPU, which should fix many problems with performance and interactivity. Both the virtio-blk and ahci-hd drivers gained the ability to execute multiple (up to 8) requests at the same time, which should proportionally improve parallel random I/O performance on wide storage. At this point virtio-blk is indeed faster than ahci-hd at high IOPS, and both are faster than before.

On the other hand, the ahci-hd driver now has TRIM support, which allows freeing unused space on the backing ZVOL. Unfortunately, there is no TRIM/UNMAP support in the virtio-blk API to allow the same.

Also, both the virtio-blk and ahci-hd drivers now report the logical and physical block sizes of the underlying storage to the guest, which allows guests to properly align partitions and I/Os for best compatibility and performance.

-- 
Alexander Motin
From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 16:54:44 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 55818E3; Fri, 27 Mar 2015 16:54:44 +0000 (UTC) Received: from webmail2.jnielsen.net (webmail2.jnielsen.net [50.114.224.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "webmail2.jnielsen.net", Issuer "freebsdsolutions.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 324B71CC; Fri, 27 Mar 2015 16:54:43 +0000 (UTC) Received: from [10.10.1.196] (office.betterlinux.com [199.58.199.60]) (authenticated bits=0) by webmail2.jnielsen.net (8.15.1/8.15.1) with ESMTPSA id t2RGlpO8076387 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 27 Mar 2015 10:47:54 -0600 (MDT) (envelope-from lists@jnielsen.net) X-Authentication-Warning: webmail2.jnielsen.net: Host office.betterlinux.com [199.58.199.60] claimed to be [10.10.1.196] 
Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: Bhyve storage improvements (was: Several bhyve quirks) From: John Nielsen In-Reply-To: <5515270A.7050408@FreeBSD.org> Date: Fri, 27 Mar 2015 10:47:50 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> References: <5515270A.7050408@FreeBSD.org> To: Alexander Motin X-Mailer: Apple Mail (2.2070.6) Cc: Julian Hsiao , freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 16:54:44 -0000

On Mar 27, 2015, at 3:46 AM, Alexander Motin wrote:

>> I've always assumed virtio driver > emulated driver so it didn't occur
>> to me to try ahci-hd.
>
> I've just merged to FreeBSD stable/10 branch set of bhyve changes that
> should significantly improve situation in the storage area.
>
> virtio-blk driver was fixed to work asynchronously and not block virtual
> CPU, that should fix many problems with performance and interactivity.
> Both virtio-blk and ahci-hd drivers got ability to execute multiple (up
> to 8) requests same time, that should proportionally improve parallel
> random I/O performance on wide storages. At this point virtio-blk is
> indeed faster then ahci-hd on high IOPS, and they both are faster then
> before.
>
> On the other side ahci-hd driver now got TRIM support to allow freeing
> unused space on backing ZVOL. Unfortunately there is no any TRIM/UNMAP
> support in virtio-blk API to allow the same.
>
> Also both virtio-blk and ahci-hd drivers now report to guest logical and
> physical block sizes of underlying storage, that allow guests properly
> align partitions and I/Os for best compatibility and performance.
Mav, thank you very much for all this great work and for the concise summary. TRIM on AHCI makes it compelling for a lot of use cases despite the probable performance hit.

Does anyone have plans (or know about any) to implement virtio-scsi support in bhyve? That API does support TRIM and should retain most or all of the low-overhead virtio goodness.

JN
From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 17:00:37 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E2BEE300 for ; Fri, 27 Mar 2015 17:00:37 +0000 (UTC) Received: from webmail2.jnielsen.net (webmail2.jnielsen.net [50.114.224.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "webmail2.jnielsen.net", Issuer "freebsdsolutions.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A54A622E for ; Fri, 27 Mar 2015 17:00:37 +0000 (UTC) Received: from [10.10.1.196] (office.betterlinux.com [199.58.199.60]) (authenticated bits=0) by webmail2.jnielsen.net (8.15.1/8.15.1) with ESMTPSA id t2RH0X35086547 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 27 Mar 2015 11:00:36 -0600 (MDT) (envelope-from lists@jnielsen.net) X-Authentication-Warning: webmail2.jnielsen.net: Host office.betterlinux.com [199.58.199.60] claimed to be [10.10.1.196] Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: Bhyve storage improvements (was: Several bhyve quirks) From: John Nielsen In-Reply-To: <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> Date: Fri, 27 Mar 2015 11:00:33 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <5515270A.7050408@FreeBSD.org> <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> To: freebsd-virtualization@freebsd.org X-Mailer: Apple Mail 
(2.2070.6) Cc: Julian Hsiao X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 17:00:38 -0000

On Mar 27, 2015, at 10:47 AM, John Nielsen wrote:

> On Mar 27, 2015, at 3:46 AM, Alexander Motin wrote:
>
>>> I've always assumed virtio driver > emulated driver so it didn't occur
>>> to me to try ahci-hd.
>>
>> I've just merged to FreeBSD stable/10 branch set of bhyve changes that
>> should significantly improve situation in the storage area.
>>
>> virtio-blk driver was fixed to work asynchronously and not block virtual
>> CPU, that should fix many problems with performance and interactivity.
>> Both virtio-blk and ahci-hd drivers got ability to execute multiple (up
>> to 8) requests same time, that should proportionally improve parallel
>> random I/O performance on wide storages. At this point virtio-blk is
>> indeed faster then ahci-hd on high IOPS, and they both are faster then
>> before.
>>
>> On the other side ahci-hd driver now got TRIM support to allow freeing
>> unused space on backing ZVOL. Unfortunately there is no any TRIM/UNMAP
>> support in virtio-blk API to allow the same.
>>
>> Also both virtio-blk and ahci-hd drivers now report to guest logical and
>> physical block sizes of underlying storage, that allow guests properly
>> align partitions and I/Os for best compatibility and performance.
>
> Mav, thank you very much for all this great work and for the concise summary. TRIM on AHCI makes it compelling for a lot of use cases despite the probable performance hit.
>
> Does anyone have plans (or know about any) to implement virtio-scsi support in bhyve? That API does support TRIM and should retain most or all of the low-overhead virtio goodness.
Okay, some belated googling reminded me that this has been listed as an = "open task" in the last couple of FreeBSD quarterly status reports and = discussed at one or more devsummits. I'd still be interested to know if = anyone's actually contemplated or started doing the work though. :) JN From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 17:43:14 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E91B236C for ; Fri, 27 Mar 2015 17:43:13 +0000 (UTC) Received: from mail-wg0-x232.google.com (mail-wg0-x232.google.com [IPv6:2a00:1450:400c:c00::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7866AA53 for ; Fri, 27 Mar 2015 17:43:13 +0000 (UTC) Received: by wgra20 with SMTP id a20so106719753wgr.3 for ; Fri, 27 Mar 2015 10:43:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=cM+kxrYoSZJb4DbLSIGitmSkpjPvS1m/0kX0TCFH9aU=; b=bsJC2SybFjpYfMheekVe3fKMRSETvQAq2Qv51Ig4D2FND2gjopSacii9llqMJPPPeW EIu0cZ8wU74PldDwCODML7nuMfUeklV8kpKeFUMaK3ClEt711kHFzHqJPHgRyfy+9QJl EkwsapDoBEpZGE1NOZGFR0eVQnB8GGx7d8UJWDG/CBzy+luRFZVfbV6/NHeoNAGzfCPI JThiSOV6r5rSlAP19SEi/WEqDuLQVhYrw5niwB+rm4U2WNWAORjE48ltIah8KZ7xjadr wxlENjWzTic1LQVlj/VkKbi93OgE9A5WW+rRCPwmAxPuWm5JIUqEPDfTMHe1YADmIHM+ e07Q== X-Received: by 10.180.20.233 with SMTP id q9mr58535418wie.75.1427478191930; Fri, 27 Mar 2015 10:43:11 -0700 (PDT) Received: from mavbook.mavhome.dp.ua ([134.249.139.101]) by mx.google.com with ESMTPSA id bd1sm4375740wib.13.2015.03.27.10.43.10 
(version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 27 Mar 2015 10:43:11 -0700 (PDT) Sender: Alexander Motin Message-ID: <551596AD.8070202@FreeBSD.org> Date: Fri, 27 Mar 2015 19:43:09 +0200 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: John Nielsen Subject: Re: Bhyve storage improvements References: <5515270A.7050408@FreeBSD.org> <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> In-Reply-To: <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit Cc: freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 17:43:14 -0000

On 27.03.2015 18:47, John Nielsen wrote:
> Does anyone have plans (or know about any) to implement virtio-scsi support in bhyve? That API does support TRIM and should retain most or all of the low-overhead virtio goodness.

I was thinking about that (no real plans yet, just some thoughts), but I haven't found a good motivation or an understanding of the whole possible infrastructure.

I am not sure it is worth emulating the SCSI protocol, in addition to the ATA already done in ahci-hd and the simple block interface in virtio-blk, just to get another block storage with TRIM/UNMAP that is possibly faster than AHCI. The really good SCSI disk emulation in CTL in the kernel takes about 20K lines of code. It is pointless to duplicate it, and just interfacing to it may be complicated for administration. Indeed I've seen virtio-blk being faster than ahci-hd in some tests, but those tests were highly synthetic. I haven't tested it on real workloads, but I have a feeling that the real difference may not be that large.
If somebody wants to check -- more benchmarks are highly welcome! From the theoretical side I'd like to note that both the ATA and SCSI protocols on guests go through additional ATA/SCSI infrastructure (CAM in FreeBSD), which is absent in the case of pure block virtio-blk, so they have some more overhead by definition.

The main potential benefit I see from using virtio-scsi is the possibility to pass through to the client not a block device, but some real SCSI device. It can be some local DVD writer, or remote iSCSI storage. The latter would be especially interesting for large production installations. But the main problem I see here is booting. To make the user-level loader boot the kernel from DVD or iSCSI, bhyve has to implement its own SCSI initiator, like a small second copy of CAM at user level. Booting the kernel from some other local block storage and then attaching to remote iSCSI storage for data can be much easier, but it is not convenient. It is possible not to connect to iSCSI directly from user level, but instead to make the kernel CAM do it, and then make CAM provide both the block layer for booting and the SCSI layer for virtio-scsi, but I am not sure that it is very good from a security point of view to make the host system see virtual disks. Though maybe it could work if CAM could block kernel/GEOM access to them, as is done for ZVOLs now, supporting "geom" and "dev" modes. Though that complicates CAM and the whole infrastructure.

-- Alexander Motin From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 20:37:35 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E679A3E7; Fri, 27 Mar 2015 20:37:35 +0000 (UTC) Received: from webmail2.jnielsen.net (webmail2.jnielsen.net [50.114.224.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "webmail2.jnielsen.net", Issuer "freebsdsolutions.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id C297B1F7; Fri, 27 Mar 2015 20:37:34 +0000 (UTC) Received: from [10.10.1.196] (office.betterlinux.com [199.58.199.60]) (authenticated bits=0) by webmail2.jnielsen.net (8.15.1/8.15.1) with ESMTPSA id t2RKbUWf053463 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 27 Mar 2015 14:37:33 -0600 (MDT) (envelope-from lists@jnielsen.net) X-Authentication-Warning: webmail2.jnielsen.net: Host office.betterlinux.com [199.58.199.60] claimed to be [10.10.1.196] Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: Bhyve storage improvements From: John Nielsen In-Reply-To: <551596AD.8070202@FreeBSD.org> Date: Fri, 27 Mar 2015 14:37:30 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <1F36054F-7F07-4972-870C-65018F3AE5AC@jnielsen.net> References: <5515270A.7050408@FreeBSD.org> <98136D5B-297B-4538-8EF4-EA2872C6640B@jnielsen.net> <551596AD.8070202@FreeBSD.org> To: Alexander Motin X-Mailer: Apple Mail (2.2070.6) Cc: freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 20:37:36 -0000

On Mar 27, 2015, at 11:43 AM, Alexander Motin wrote:

> On 27.03.2015 18:47, John Nielsen wrote:
>> Does anyone have plans (or know about any) to implement virtio-scsi support in bhyve? That API does support TRIM and should retain most or all of the low-overhead virtio goodness.
>
> I was thinking about that (not really a plans yet, just some thoughts),
> but haven't found a good motivation and understanding of whole possible
> infrastructure.
>
> I am not sure it worth to emulate SCSI protocol in addition to already
> done ATA in ahci-hd and simple block in virtio-blk just to get another,
> possibly faster then AHCI, block storage with TRIM/UNMAP. Really good
> SCSI disk emulation in CTL in kernel takes about 20K lines of code. It
> is pointless to duplicate it, and may be complicated for administration
> to just interface to it. Indeed I've seen virtio-blk being faster then
> ahci-hd in some tests, but those tests were highly synthetic. I haven't
> tested it on real workloads, but I have feeling that real difference may
> be not that large. If somebody wants to check -- more benchmarks are
> highly welcome! From the theoretical side I'd like to notice that both
> ATA and SCSI protocols on guests go through additional ATA/SCSI
> infrastructure (CAM in FreeBSD), absent in case pure block virtio-blk,
> so they have some more overhead by definition.

Agreed, more testing is needed to see how big an effect having TRIM remain dependent on AHCI emulation would have on performance.

> Main potential benefit I see from using virtio-scsi is a possibility to
> pass through to client not a block device, but some real SCSI device. It
> can be some local DVD writer, or remote iSCSI storage. The last would be
> especially interesting for large production installations. But the main
> problem I see here is booting.
> To make user-level loader boot the kernel
> from DVD or iSCSI, bhyve has to implement its own SCSI initiator, like
> small second copy of CAM in user-level. Booting kernel from some other
> local block storage and then attaching to remote iSCSI storage for data
> can be much easier, but it is not convenient. It is possible to not
> connect to iSCSI directly from user-level, but to make kernel CAM do it,
> and then make CAM provide both block layer for booting and SCSI layer
> for virtio-scsi, but I am not sure that it is very good from security
> point to make host system to see virtual disks. Though may be it could
> work if CAM could block kernel/GEOM access to them, alike it is done for
> ZVOLs now, supporting "geom" and "dev" modes. Though that complicates
> CAM and the whole infrastructure.

Yes, pass-through of disk devices opens up a number of possibilities. Would it be feasible to just have bhyve broker between a pass(4) device on the host and virtio_scsi(4) in the guest? That would require the guest devices (be they local disks, iSCSI LUNs, etc.) to be connected to the host, but I'm not sure that's a huge concern. The host will always have a high level of access to the guest's data. (Plus, there's nothing preventing a guest from doing its own iSCSI, etc. after it boots.) Using the existing kernel infrastructure (CAM, iSCSI initiator, etc.) would also remove the need to duplicate any of that in userland, wouldn't it?

The user-level loader is necessary for now, but once UEFI support exists in bhyve the external loader can go away. Any workarounds like the ones you've described above would similarly be temporary.

Using Qemu+KVM on Linux as a comparison point, there are examples of both kernel-level and user-level access by the host to guest disks. Local disk images (be they raw or qcow2) are obviously manipulated by the Qemu process from userland. RBD (Ceph/RADOS network block device) is in userland. SRP (SCSI RDMA Protocol) is in kernel.
There are a few ways to do host- and/or kernel-based iSCSI. There is also a userland option if you link Qemu against libiscsi when you build it. If we do ever want userland iSCSI support, libiscsi does claim to be "pure POSIX" and to have been tested on FreeBSD, among others.

JN

From owner-freebsd-virtualization@FreeBSD.ORG Fri Mar 27 23:49:09 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9F47058A for ; Fri, 27 Mar 2015 23:49:09 +0000 (UTC) Received: from mail-wg0-x229.google.com (mail-wg0-x229.google.com [IPv6:2a00:1450:400c:c00::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 336E6A1B for ; Fri, 27 Mar 2015 23:49:09 +0000 (UTC) Received: by wgbgs4 with SMTP id gs4so23868154wgb.0 for ; Fri, 27 Mar 2015 16:49:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=xn9HiHceHuJ6QNBhE2sOxQvUjA7/eDW52HgUpynp3YA=; b=wjbow9pxCIi7t876Yu0J7gwu8w+kTJpIc3SJDCo4MOytN/JV3u8KrbRZdpznsyLMA2 AlwMBzfS3gi8SkOtRUv5hR9jjbFWxmcIuD6uGc8KadbH7RvKd9eVnuEYT6CJ7jB45VOw afw+1cSjq0FhMJN0l546tFMLwT77XTkAjI4OQ61p5B4dpJldhyFKMBWHX3FU/RG5tkKr cklsao0oi9x9T+RjEltY/GN47qhJsF134/SihneBlZxTWizWpUyNL0zL1Ja5ZKIuihND MDlX6DZOFfQhZLeEsxazNqfzvQgOxg5cqEXcIvDbOdzLeU2uAeDqPooulqB4V9wE+/zX fGkA== MIME-Version: 1.0 X-Received: by 10.180.212.37 with SMTP id nh5mr1905791wic.76.1427500146998; Fri, 27 Mar 2015 16:49:06 -0700 (PDT) Received: by 10.27.9.9 with HTTP; Fri, 27 Mar 2015 16:49:06 -0700 (PDT) In-Reply-To: References: Date: Fri, 27 Mar 2015 16:49:06 -0700 Message-ID: Subject: Re:
Several bhyve quirks From: Neel Natu To: Julian Hsiao Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-virtualization@freebsd.org" X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 23:49:09 -0000 Hi Julian, On Wed, Mar 25, 2015 at 2:24 AM, Julian Hsiao wrote: > Hi, > > I'm running bhyve on 10.1, mostly with OpenBSD (5.7) guests, and I ran into > a few strange issues: > > 1. The guest RTC is several hours off every time I start bhyve. The host > RTC is set to UTC, and /etc/localtime on both the host and guests are set to > US/Pacific (currently PDT). I thought maybe bhyve is setting the RTC to the > local time, and indeed changing TZ environment variable affects the guest's > RTC. However, with TZ=UTC the guest is still off by an hour, and to get the > correct offset I set TZ='UTC+1'; perhaps something's not handling DST > correctly? > > Also, one time the offset was mysteriously tens of hours off (i.e. the guest > RTS is a day or two ahead), and the condition persisted across multiple host > and guest reboots. Unfortunately, the problem went away a few hours later > and I was unable to reproduce it since. > The problem is that in 10.1 (and earlier) bhyve defaulted to a 12-hour RTC format but some guests like OpenBSD and Linux assume that it is configured in the 24-hour format. The 12-hour format indicates PM time by setting the most significant bit in the 'hour' byte. Since the guest is not prepared to mask this bit it thinks that the time is 68 hours ahead of the actual time (but only for PM times - everything goes back to normal during AM times). This is fixed in HEAD where the RTC device model defaults to 24-hour time. 
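
The arithmetic behind the 68-hour figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the RTC hour register is BCD-encoded (the encoding under which the number 68 comes out), and all function names are hypothetical rather than taken from bhyve or from any guest kernel.

```python
def to_bcd(n):
    """Pack a two-digit decimal number into one BCD byte (e.g. 13 -> 0x13)."""
    return ((n // 10) << 4) | (n % 10)


def from_bcd(b):
    """Unpack one BCD byte into a decimal number (e.g. 0x13 -> 13)."""
    return (b >> 4) * 10 + (b & 0x0F)


def rtc_hour_12h(hour24):
    """Hour byte as a 12-hour RTC presents it: BCD hour, bit 7 set for PM."""
    hour12 = hour24 % 12 or 12
    pm_flag = 0x80 if hour24 >= 12 else 0x00
    return to_bcd(hour12) | pm_flag


def guest_decode_as_24h(raw):
    """A guest that assumes 24-hour mode: plain BCD decode, PM bit not masked."""
    return from_bcd(raw)


def guest_decode_as_12h(raw):
    """A guest that honours 12-hour mode: mask the PM bit, add 12 if it was set."""
    hour12 = from_bcd(raw & 0x7F)
    return hour12 % 12 + (12 if raw & 0x80 else 0)


# Compare one AM hour and two PM hours (hypothetical sample values).
for hour24 in (9, 13, 23):
    raw = rtc_hour_12h(hour24)
    wrong = guest_decode_as_24h(raw)
    print(f"actual {hour24:02d}h  raw 0x{raw:02x}  naive {wrong}h  "
          f"error {wrong - hour24:+d}h  masked {guest_decode_as_12h(raw):02d}h")
```

Under these assumptions 1 PM is stored as 0x81, which a plain BCD decode reads as hour 81, that is 68 hours past hour 13. The same offset appears for every hour from 1 PM to 11 PM, while 1 AM to 11 AM decode correctly, which matches the PM-only behaviour described above. The two 12 o'clock hours are edge cases and are left out of the comparison.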
> > suggests that I'm on the right track, but it doesn't explain the off-by-one > nor the (one time) multi-day offset. > The one-hour offset is a bug due to my interpretation of the 12-hour format. I am going to fix this in HEAD shortly but here is a patch for 10.1 and earlier: https://people.freebsd.org/~neel/patches/bhyve_openbsd_rtc.patch > As an aside, the commit message implies that this only affects OpenBSD > guest, when in fact this probably affects all guests (at least also Linux). > Perhaps he meant you cannot configure OpenBSD to assume that the RTC is set > to local time instead of UTC. > > 2. What's the preferred solution for minimizing guest clock drift in bhyve? > Based on some Google searches, I run ntpd in the guests and set > kern.timecounter.hardware=acpitimer0 instead of the default acpihpet0. > acpitimer0 drifts by ~600 ppm while acpihpet0 drifts by ~1500 ppm; why? > I don't know but I am running experiments that I hope will provide some insight. best Neel > 3. Even moderate guest disk I/O completely kills guest network performance. > For example, whenever security(8) (security(7) in FreeBSD) runs, guest > network throughput drops from 150+ Mbps to ~20 Mbps, and jitter from ping > jumps from <0.01 ms to 100+ ms. If I try to build something in the guest, > then network becomes almost unusable. > > The network performance degradation only affects the guest that's generating > the I/O; high I/O on guest B doesn't affect guest A, nor would high I/O on > the host. > > I'm using both virtio-blk and virio-net drivers, and the guests' disk images > are backed by zvol+geli. Removing geli has no effect. > > There are some commits in CURRENT that suggests improved virtio performance, > but I'm not comfortable running CURRENT. Is there a workaround I could use > for 10.1? > > 4. virtio-blk always reports the virtual disk as having 512-byte sectors, > and so I get I/O errors on OpenBSD guests when the disk image is backed by > zvol+geli with 4K sector size. 
Curiously, this only seems to affect > zvol+geli; with just zvol it seems to work. Also, it works either way on > Linux guests. > > ATM I changed the zvol / geli sector size to 512 bytes, which probably made > #2 worse. I think this bug / feature is addressed by: > , > but again is there a workaround to force a specific sector size for 10.1? > > 5. This may be better directed at OpenBSD but I'll ask here anyway: if I > enable virtio-rnd then OpenBSD would not boot with "couldn't map interrupt" > error. The kernel in bsd.rd will boot, but not the installed kernel (or the > one built from STABLE; I forgot). Again, Linux seems unaffected, but I > couldn't tell if it's actually working. > > Julian Hsiao > > > _______________________________________________ > freebsd-virtualization@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > To unsubscribe, send any mail to > "freebsd-virtualization-unsubscribe@freebsd.org" From owner-freebsd-virtualization@FreeBSD.ORG Sat Mar 28 05:49:12 2015 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 627A1A1C for ; Sat, 28 Mar 2015 05:49:12 +0000 (UTC) Received: from mail-ig0-x232.google.com (mail-ig0-x232.google.com [IPv6:2607:f8b0:4001:c05::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2711911E for ; Sat, 28 Mar 2015 05:49:12 +0000 (UTC) Received: by igcau2 with SMTP id au2so43211548igc.1 for ; Fri, 27 Mar 2015 22:49:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:content-type; 
bh=47kwJJiiT6azrRvvT59ZpkQGWGUWU6OwphUQRj27i4Y=; b=QWxUS1e0JIuGsO9v2ZAeP07+MKHQlJRf4XLU5RxJY3RJN02fiw/fDkjZpbp2dbwq9d vgbOErwmwIh+IsRIsWr2TgMo9qisT0YcoztbH9kHn3h3xvX0Fh1c3XtYc09nyzEMsRrK kir/e4C6z7eRBYaEBX7NUx+lV+X8/Kur7RUuF/0KfyRqPxzYWqU2yTbYEwBRK3ZE1K/P rewWh28iJecehMr/gHPStsc9ptfg1IW/lqE6/qcqZcpawtR594ggzB81k/0xyEl0U/dC mlT4uF7dYqFI25pN0gQufqdGjxuOjDLkJYBO2NuKUgAMMKHbZgRdpbs6KPCSlqfRBP86 S3og== X-Received: by 10.107.168.146 with SMTP id e18mr10232088ioj.32.1427521751123; Fri, 27 Mar 2015 22:49:11 -0700 (PDT) MIME-Version: 1.0 Sender: jtubnor@gmail.com Received: by 10.36.96.4 with HTTP; Fri, 27 Mar 2015 22:48:50 -0700 (PDT) In-Reply-To: References: From: Jason Tubnor Date: Sat, 28 Mar 2015 16:48:50 +1100 X-Google-Sender-Auth: rfeZ87CHYNaoPwFN5zp8AMq9Cow Message-ID: Subject: Re: Several bhyve quirks To: Neel Natu , "freebsd-virtualization@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Mar 2015 05:49:12 -0000

On 28 March 2015 at 10:49, Neel Natu wrote:
>
> This is fixed in HEAD where the RTC device model defaults to 24-hour time.
>
>>
>> suggests that I'm on the right track, but it doesn't explain the off-by-one
>> nor the (one time) multi-day offset.
>>
>
> The one-hour offset is a bug due to my interpretation of the 12-hour format.
>
> I am going to fix this in HEAD shortly but here is a patch for 10.1 and earlier:
> https://people.freebsd.org/~neel/patches/bhyve_openbsd_rtc.patch
>

Thanks for this, Neel. I was trying to backport your original HEAD patch to 10.1, but there were too many quirks to deal with in other dependent libs. I didn't have the skills to do this, so it is appreciated that you did it :-)

Thanks!