From: Holger Kipp
Date: Fri, 21 Jun 2002 11:00:57 +0200
To: frank@exit.com
Cc: pjklist@ekahuna.com, stable@FreeBSD.ORG
Subject: Re: Status of fxp / smp problem?

Frank Mayhar wrote:
>
> Philip J. Koenig wrote:
> > Over the past couple/few weeks there were lots of reports of systems
> > which had trouble with the fxp (Intel Pro 10/100 NIC) drivers,
> > particularly on SMP systems.

Not only sym/fxp, but also sym/ata, only sym, or only fxp or even others.

> > Are people still having these problems with 4.6-RELEASE or RELENG4?
> > I've been waiting for an indication this has been fixed, as I've got
> > a couple of boxes here waiting to be installed, that I wanted to
> > update first - but not if that problem was still there, as they're
> > both SMP boxes that use the affected Intel NICs.
>
> I don't think the NIC really matters, as I've seen it even without an
> fxp. I managed to alleviate the problem somewhat by killing the dnetc
> processes.
> It appears that pegging CPUs makes the problem much worse (I'm
> not sure whether pegging one is sufficient or if both should be pegged;
> based on my experience, though, I strongly suspect the former).

There is a workaround available for the sym drivers (I'm not sure if it
is already committed), which checks for stalled IRQs, forcing driver
service if IRQ is set but obviously not cleared for a longer period of
time.

Problem seems to manifest itself especially on systems with:
- shared IRQs AND
- SMP enabled

I don't know enough about IRQ handling, but I'd say this should be
tracked down - might be a hardware problem in some cases, but could
also be some quirk in IRQ handling code, not necessarily a problem
with the specific drivers (apart from timing issues, maybe).

> I've still had no good suggestions as to what to examine. I looked at
> the low-level code, but there were no obvious smoking guns there. A few
> commits in the relevant time period, but none that seemed likely to cause
> interrupt problems.

I have once seen a problem report with a high-end 4-processor system,
but there I couldn't find any shared irqs...

I intend to write up a summary this weekend, so maybe that might help
a bit.

Regards,
Holger

--
Holger Kipp, Dipl.-Math., Systemadministrator | alogis AG
Fon: +49 (0)30 / 43 65 8 - 114                | Berliner Strasse 26
Fax: +49 (0)30 / 43 65 8 - 214                | D-13507 Berlin Tegel
email: holger.kipp@alogis.com                 | http://www.alogis.com
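
For illustration, the stalled-IRQ check mentioned above for the sym driver could look roughly like the following. This is only a sketch of the general idea (notice that an interrupt has stayed asserted for too long and call the service routine by hand), not the actual patch. The struct, the function names and the tick-based timeout are all hypothetical and do not come from the sym sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state; none of these names come from the sym driver. */
struct stall_watch {
    bool     irq_pending;     /* device reports its interrupt as asserted */
    bool     tracking;        /* true while we are timing a pending IRQ */
    uint64_t pending_since;   /* tick at which we first saw it asserted */
    unsigned forced_services; /* how often the watchdog had to step in */
};

/* Stand-in for the driver's normal interrupt handler: service and clear. */
static void service_device(struct stall_watch *w)
{
    w->irq_pending = false;
    w->tracking = false;
}

/*
 * Meant to be called periodically (for example from a timer). If the IRQ
 * has stayed asserted for at least max_ticks without being serviced, run
 * the handler by hand. Returns true when service was forced.
 */
static bool stall_watch_poll(struct stall_watch *w, uint64_t now,
                             uint64_t max_ticks)
{
    if (!w->irq_pending) {
        /* Nothing outstanding, so there is nothing to time. */
        w->tracking = false;
        return false;
    }
    if (!w->tracking) {
        /* First time we see this IRQ pending: start the clock. */
        w->tracking = true;
        w->pending_since = now;
        return false;
    }
    if (now - w->pending_since >= max_ticks) {
        /* Pending for too long: assume the interrupt was lost. */
        service_device(w);
        w->forced_services++;
        return true;
    }
    return false;
}
```

A check like this only covers up a lost or stalled interrupt. It would not explain why the interrupt went missing on SMP systems with shared IRQs in the first place, so the forced_services counter is there mainly as a way to see how often it happens.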