From: borislav nikolov
Date: Tue, 28 Sep 2010 13:33:48 +0300
To: Jurgen Weber
Cc: "freebsd-stable@freebsd.org"
Subject: Re: cpu timer issues

On 28.09.2010, at 10:54, Jurgen Weber wrote:

> Hello List
>
> We have been having issues with some firewall machines of ours using pfSense.
>
> FreeBSD smash01.ish.com.au 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #0: Sun Dec 6 23:20:31 EST 2009 sullrich@FreeBSD_7.2_pfSense_1.2.3_snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.7 i386
>
> MotherBoard: http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBi-LN4.cfm
>
> Originally the systems started out by showing a lot of packet loss, the system time would fall behind, and the value of "#vmstat -i | grep timer" was dropping below 2000. I was led to believe by the guys at pfSense that this is where the value should sit. I would also receive errors in messages that looked like "kernel: calcru: runtime went backwards from 244314 usec to 236341".
>
> We tried a variety of things: disabling USB, turning off Intel SpeedStep in the BIOS, disabling ACPI, etc. All had little to no effect. The only thing that would right it was restarting the box, but over time it would degrade again. I talked to SuperMicro and they said that this is a FreeBSD issue and pretty much washed their hands of it.
>
> After a couple of months of dealing with this and just rebooting the systems regularly, the symptoms slowly but surely disappeared, e.g. the kernel messages went away, the system time was not falling behind and I was experiencing no packet loss, but the "#vmstat -i | grep timer" value would continue to decrease over time.
> Eventually I think, when it finally got to 0, the machine restarted (I am only guessing here).
>
> After this restart it worked again for a couple of hours and then it restarted again.
>
> After the second time the system has not missed a beat; it has been fine and the "#vmstat -i | grep timer" value remained near the 2000 mark... We set up some zabbix monitoring to watch it. As mentioned, it was fine for about a month. Until today. Today the value has dropped to 0, but the system has not restarted, and over the last couple of hours the value has increased to 47.
>
> This machine is mission critical. We have two in a failover scenario (using pfSense's CARP features) and it seems unfortunate that we have an issue with two brand new SuperMicro boxes that affects both machines. While at the moment everything seems fine, I want to ensure that I have no further issues. Does anyone have any suggestions?
>
> Lastly, I have double-checked both of the below:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME
> We disabled EIST.
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#COMPUTER-CLOCK-SKEW
>
> # dmesg | grep Timecounter
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Timecounters tick every 1.000 msec
> # sysctl kern.timecounter.hardware
> kern.timecounter.hardware: i8254
>
> Only have one timer to choose from.
>
> Thanks
>
> Jurgen

Hello,

vmstat -i calculates the interrupt rate as interrupt count / uptime, and the interrupt count is a 32-bit integer.
With high values of kern.hz it will overflow in a few days (with kern.hz=4000 it will happen every 12 days or so).
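
The overflow interval can be checked with a little arithmetic. Below is a minimal Python sketch, assuming the counter is an unsigned 32-bit integer and that the timer fires at a constant rate. The function names are hypothetical and this only models the "count / uptime" calculation; it is not how vmstat itself is implemented.

```python
# Assumption: the interrupt counter is an unsigned 32-bit integer.
COUNTER_BITS = 32


def seconds_until_wrap(rate_hz, bits=COUNTER_BITS):
    """Seconds until a counter incremented rate_hz times per second wraps."""
    return (2 ** bits) / rate_hz


def reported_rate(rate_hz, uptime_s, bits=COUNTER_BITS):
    """Average rate computed as (wrapped count) / uptime."""
    true_count = int(rate_hz * uptime_s)
    wrapped_count = true_count % (2 ** bits)  # what a wrapped counter holds
    return wrapped_count / uptime_s


if __name__ == "__main__":
    for hz in (1000, 2000, 4000):
        days = seconds_until_wrap(hz) / 86400
        print(f"{hz} interrupts/s: counter wraps after {days:.1f} days")

    # One hour after the first wrap at 2000 interrupts/s the computed
    # average is far below the true rate.
    uptime = seconds_until_wrap(2000) + 3600
    print(f"reported rate one hour after wrap: {reported_rate(2000, uptime):.1f}")
```

For 4000 interrupts per second this gives roughly 12.4 days, in line with the figure above. At about 2000 interrupts per second the wrap would come after roughly 25 days, after which the computed average falls to near zero and climbs back only slowly.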
If the counter has overflowed on your machine, use systat -vmstat 1 to get an accurate interrupt rate.

This is just an FYI: I was confused by this once and it scared me a bit, and I started changing counters until I noticed it.

P.S. Please forgive my poor English.