Date:      Mon, 19 Feb 2018 10:05:58 +0000
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        Laurence Pawling <laurence.pawling@globalsign.com>
Cc:        "freebsd-xen@freebsd.org" <freebsd-xen@freebsd.org>, "freebsd-virtualization@freebsd.org" <freebsd-virtualization@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, David King <david.king@globalsign.com>, Vlad Galu <vlad.galu@globalsign.com>
Subject:   Re: multi-vCPU networking issues as client OS under Xen
Message-ID:  <20180219100558.adgb6m5ukdfvxehp@MacBook-Pro-de-Roger.local>
In-Reply-To: <D1AF75E7-49F9-4628-8B26-3ACB64994C97@globalsign.com>
References:  <D1AF75E7-49F9-4628-8B26-3ACB64994C97@globalsign.com>

On Mon, Feb 19, 2018 at 09:58:30AM +0000, Laurence Pawling via freebsd-xen wrote:
> Hi all,
> 
> I’m wondering if anyone here has seen this issue before. I’ve spent the last couple of days troubleshooting it:
> 
> Platform:
> 
> Host: XenServer 7.0 running on 2 x E2660-v4, 256GB RAM
> 
> Server VM: FreeBSD 11 (tested on 11.0-p15 and 11.1-p6), 2GB RAM (also tested with 32GB RAM), 1x50GB HDD, 1 x NIC, 2 or more vCPUs in any combination (2 sockets x 1 core, 1 socket x 2 cores, …)
> 
> Client VM: FreeBSD 11, any configuration of vCPUs, RAM and HDD.
> 
> Behaviour:
> 
> Sporadic interruption of TCP sessions when utilising the above machine as a “server” with “clients” connecting. Looking into the communication with pcap/Wireshark, you see a TCP Dup Ack sent from both ends, followed by the client sending an RST packet, terminating the TCP session. We have also seen evidence of the client sending a Keepalive packet, which is ACK’d by the server before the RST is sent from the client end.
> 
> To recreate:
> 
> On the above VM, perform a vanilla install of nginx:
> 
> pkg install nginx
> 
> service nginx onestart
> 
> Then on a client VM (currently only tested with FreeBSD), run the following (or similar):
> 
> for i in {1..10000}; do if [ $(curl -s -o /dev/null -w "%{http_code}" http://10.2.122.71) != 200 ] ; then echo "error"; fi; done
> 
> When vCPUs=1 on the server, I get no errors; when vCPUs>1, I get errors reported. The frequency of errors *seems* to be proportional to the number of vCPUs, but they are sporadic with no clear periodicity or pattern, so that is just anecdotal. Also, the problem seems by far the most prevalent when communicating between two VMs on the same host, in the same VLAN. Xen still sends packets via the switch rather than bridging internally between the interfaces.

When using >1 vCPUs, can you set hw.xn.num_queues=1 in
/boot/loader.conf and try to reproduce the issue?

I suspect this is related to multiqueue (which is only used
with >1 vCPUs).
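
A rough sketch of one way to set that tunable without leaving duplicate
entries behind. The set_tunable helper is a hypothetical name for
illustration, not a FreeBSD tool; it only rewrites a NAME="VALUE" line in
whichever file it is given:

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE
# Hypothetical helper: drop any existing NAME= line from FILE, then
# append NAME="VALUE". Creates FILE if it does not exist yet.
set_tunable() {
    file=$1
    name=$2
    value=$3
    if [ -f "$file" ]; then
        # grep -v exits non-zero when nothing is left to print; ignore that.
        grep -v "^${name}=" "$file" > "${file}.tmp" || true
        mv "${file}.tmp" "$file"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}

# Intended use on the server VM (as root), followed by a reboot:
#   set_tunable /boot/loader.conf hw.xn.num_queues 1
```

Loader tunables are only read at boot, so the VM needs a reboot before
retesting. Afterwards, "sysctl hw.xn.num_queues" should, as far as I know,
report the value in effect.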

Thanks, Roger.


