From: Harry Schmalzbauer
Organization: OmniLAN
Date: Tue, 21 Nov 2017 14:46:38 +0100
To: Vincenzo Maffione
CC: "freebsd-net@freebsd.org", Giuseppe Lettieri
Subject: Re: netmap/vale periodic deadlock
Message-ID: <5A142E3E.5010002@omnilan.de>
References: <5A0F14CD.3040407@omnilan.de> <5A13F8A8.2020209@omnilan.de>

Regarding Vincenzo Maffione's message of 21.11.2017 14:26 (localtime):
> It may be that yours is not a deadlock but some kind of crash. Enabling
> debugging features would probably help (e.g. to get a stack trace).
> Maybe your lockup/crash happened because you did some reconfiguration
> (ring size, number of rings, etc.) while netmap was active and doing so
> you triggered some hidden bug.

The host was completely untouched when these lockups occurred in the
late test phase. Only guests were configured/utilized.
No previous (short-term stress) test had caused any problem in that
path. It first showed up with real-world (unstressed) tests.

The last-minute change I described was made with the guests powered
down, and the host was rebooted (ppt dev changed in loader.conf).
The host isn't going to be reconfigured in any way.
Let's wait and see if the lockup shows up again (after not limiting the
NICs' rx/tx descriptors and increasing the netmap ring size).

Considering your suspicion that the emulated netmap code in FreeBSD
might be buggy, and the fact that I'm not able to debug it myself, I
guess switching from vale to netgraph is my best bet. It's not much
effort and causes almost no downtime. But it would rule out a future
ptnetmap extension...
I'd prefer to stay with vale, although I'm using emulated mode only...
So at the first occurrence, I'll install the debug kernel and see if
that makes any difference.

Thanks,

-harry