From: Hooman Fazaeli
Date: Wed, 09 Nov 2011 12:21:46 +0330
To: Adrian Chadd
Message-ID: <4EBA3F22.2060204@gmail.com>
Cc: pyunyh@gmail.com, freebsd-net@freebsd.org, Emil Muratov, Jack Vogel, Jason Wolfe
Subject: Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

On 11/8/2011 11:00 PM, Adrian Chadd wrote:
> On 8 November 2011 09:21, Hooman Fazaeli wrote:
>
>> With MSIX enabled, the link task (em_handle_link) does _not_ trigger
>> _start when the link changes state from inactive to active (which it
>> should).
>> If if_snd quickly fills up during a temporary link loss, transmission is
>> stopped forever and the driver never recovers from that state.
>>
>> The last patch should have reduced the frequency of the problem
>> but it assumes every IFQ_ENQUEUE is followed by an if_start, which
>> is not a true assumption.
>
> FWIW, I saw something very similar with the if_arge code port from
> Linux. If the TX queue filled up and wasn't serviced before it hit
> completely full, it was never drained.
>
> It may be worthwhile auditing some of the other NIC drivers to ensure
> this kind of situation isn't occurring. Especially if they came from
> Linux. :-)
>
> That's a great catch, I hope it finally fixes the if_em issues with MSIX. :-)
>
>
> Adrian

Just for the record, I should inform you that igb, ixgb and ixgbe have
the same issue. I have not checked other drivers.

There is also another subtle problem with all these drivers: if transmit
(xxx_xmit) fails because of a temporary memory shortage (i.e., a DMA
failure with ENOMEM), the driver may enter the OACTIVE state and _never_
recover!

The scenario is much the same as before:

- if_start is executed.
- xxx_xmit fails with ENOMEM.
- xxx_start_locked sets OACTIVE. Note that this is different from a low
  TX descriptor condition, which also sets OACTIVE.
- The stack enqueues packets in if_snd but does not call if_start, since
  the driver is OACTIVE.
- The stack enqueues more packets until if_snd fills up and packets
  start to drop.
- Since there is nowhere in the driver's code to retry transmission
  when memory becomes available again (xxx_local_timer is a candidate),
  the driver remains OACTIVE forever, until it is re-initialized.

I am working on patches for em/igb/ixgb/ixgbe to fix these issues and
would be happy to share them with anyone who is interested.

Since these are really severe problems, I hope the gurus apply official
fixes ASAP.
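
To make the ENOMEM scenario concrete, here is a minimal userland sketch
in C. It is a toy model only, not driver code and not the patch
mentioned above: every name in it (toy_if, toy_xmit, toy_start,
toy_output, toy_timer, TOY_QLEN) is hypothetical, and the queue is
reduced to a counter. It assumes the simplest possible recovery, where a
periodic timer clears OACTIVE and retries the start routine.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QLEN 4      /* hypothetical depth of the send queue (if_snd) */

/* Toy stand-in for the interface state; not the real struct ifnet. */
struct toy_if {
    int  queued;        /* packets waiting in the send queue */
    int  dropped;       /* packets dropped because the queue was full */
    int  sent;          /* packets handed to the "hardware" */
    bool oactive;       /* models the OACTIVE flag */
    bool mem_ok;        /* false: transmit fails with ENOMEM */
};

/* Models xxx_xmit: fails with ENOMEM while memory is short. */
static int toy_xmit(struct toy_if *ifp)
{
    if (!ifp->mem_ok)
        return ENOMEM;
    ifp->sent++;
    return 0;
}

/* Models xxx_start_locked: drain the queue, set OACTIVE on failure. */
static void toy_start(struct toy_if *ifp)
{
    assert(ifp != NULL);
    while (ifp->queued > 0) {
        if (toy_xmit(ifp) != 0) {
            ifp->oactive = true;    /* the packet stays queued */
            return;
        }
        ifp->queued--;
    }
}

/* Models the stack: enqueue, and call start only if not OACTIVE. */
static void toy_output(struct toy_if *ifp)
{
    if (ifp->queued >= TOY_QLEN) {
        ifp->dropped++;
        return;
    }
    ifp->queued++;
    if (!ifp->oactive)
        toy_start(ifp);
}

/* Assumed fix: a periodic timer clears OACTIVE and retries transmit. */
static void toy_timer(struct toy_if *ifp)
{
    if (ifp->oactive && ifp->queued > 0) {
        ifp->oactive = false;
        toy_start(ifp);             /* sets OACTIVE again if still failing */
    }
}
```

In this model, calling toy_output() repeatedly while mem_ok is false
sets OACTIVE on the first call, fills the queue, and then only drops
packets. Setting mem_ok back to true changes nothing by itself, because
nothing calls toy_start() again; only toy_timer() gets the queue
draining.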