From owner-freebsd-stable@FreeBSD.ORG Tue Dec 18 14:17:43 2007
Date: Tue, 18 Dec 2007 06:17:42 -0800
From: David G Lawrence <dg@dglawrence.com>
To: Bruce Evans
Cc: freebsd-net@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: Packet loss every 30.999 seconds
Message-ID: <20071218141742.GS25053@tnn.dglawrence.com>
In-Reply-To: <20071218233644.U756@besplex.bde.org>

> >Right, it's a non-optimal loop when N is very large, and that's a fairly
> >well understood problem.  I think what DG was getting at, though, is
> >that this massive flush happens every time the syncer runs, which
> >doesn't seem correct.  Sure, maybe you just rsynced 100,000 files 20
> >seconds ago, so the upcoming flush is going to be expensive.  But the
> >next flush 30 seconds after that shouldn't be just as expensive, yet it
> >appears to be so.
>
> I'm sure it doesn't cause many bogus flushes.  iostat shows zero writes
> caused by calling this incessantly using "while :; do sync; done".

   I didn't say it caused any bogus disk I/O. My original problem (after
a day or two of uptime) was an occasional large scheduling delay for a
process that needed to process VoIP frames in real-time. It was
happening every 31 seconds and was causing voice frames to be dropped
because the added latency pushed the frames outside of the jitter
window.
   I wrote a program that measures the scheduling delay by sleeping for
one tick and then comparing the time-of-day offset against what was
expected. This revealed that every 31 seconds, the process was seeing a
17ms delay in scheduling. Further investigation found that 1) the
syncer was the process that was running every 31 seconds and causing
the delay (it was the only process in the system with that timing
interval), and 2) lowering kern.maxvnodes to something lowish (5000)
would mostly mitigate the problem. The patch to limit the number of
vnodes processed in the loop before sleeping was then developed, and it
completely resolved the problem.
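   A minimal sketch of that measurement technique -- sleep for one
tick, then compare the observed elapsed time against what was requested
-- looks like the following. This is not the original program; the 1 ms
sleep (one tick at HZ=1000) and the 10 ms report threshold are
illustrative choices.

#include <sys/time.h>

#include <stdio.h>
#include <unistd.h>

/* Elapsed time from *a to *b, in milliseconds. */
static double
tv_diff_ms(const struct timeval *a, const struct timeval *b)
{

	return ((b->tv_sec - a->tv_sec) * 1000.0 +
	    (b->tv_usec - a->tv_usec) / 1000.0);
}

int
main(void)
{
	struct timeval before, after;
	const double slept_ms = 1.0;	/* one tick at HZ=1000 */
	double delay;

	for (;;) {			/* run until interrupted */
		gettimeofday(&before, NULL);
		usleep((useconds_t)(slept_ms * 1000.0));
		gettimeofday(&after, NULL);
		delay = tv_diff_ms(&before, &after) - slept_ms;
		if (delay > 10.0)	/* abnormally late wakeup */
			printf("scheduling delay: %.1f ms\n", delay);
	}
}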
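   For reference, the vnode loop in question (in ffs_sync() in
ufs/ffs/ffs_vfsops.c, as of 6.x) has roughly the shape below, shown
with the 500-vnode sleep folded in at the bottom. This is a simplified
fragment rather than the actual patch: the interlock and vget()/vput()
handling are elided, and synced_cnt is a hypothetical counter added to
illustrate the sort of instrumentation suggested below. To exercise it,
first populate the vnode cache, e.g. with something like
"tar cf /dev/null /usr/src".

	/*
	 * vp, mvp, ip, waitfor, error, and allerror are as in the
	 * real ffs_sync(); locking is elided for clarity.
	 */
	int loop_cnt = 0, synced_cnt = 0;

	MNT_VNODE_FOREACH(vp, mp, mvp) {
		ip = VTOI(vp);
		/*
		 * On an otherwise idle system, nearly every vnode
		 * should be skipped right here.
		 */
		if (vp->v_type == VNON || ((ip->i_flag &
		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
		    vp->v_bufobj.bo_dirty.bv_cnt == 0))
			continue;
		synced_cnt++;		/* how many vnodes get this far? */
		if ((error = ffs_syncvnode(vp, waitfor)) != 0)
			allerror = error;
		/*
		 * Sleep briefly every 500 vnodes so a long pass over
		 * the vnode list can no longer stall other processes.
		 */
		if (++loop_cnt >= 500) {
			loop_cnt = 0;
			tsleep(&loop_cnt, PPAUSE, "ffssync", 1);
		}
	}
	printf("ffs_sync: %d vnodes reached the sync path\n", synced_cnt);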
   Since the wait that I added is at the bottom of the loop and the
limit is 500 vnodes, this tells me that every 31 seconds a whole lot of
vnodes are being "synced", when there shouldn't have been any (this
wasn't apparent to me at the time, and when I later realized it, I had
no time to investigate further). My tests and analysis have all been on
an otherwise quiet system (no disk I/O), so the bottom of the ffs_sync
vnode loop should not have been reached at all, let alone tens of
thousands of times every 31 seconds. All of the machines were
uniprocessor, running FreeBSD 6 or later. I don't know if this problem
is present in 5.2; I didn't see ffs_syncvnode in your call graph, so it
probably is not.
   Anyway, someone needs to instrument the vnode loop in ffs_sync and
figure out what is going on. As you've pointed out, it is first
necessary to read a lot of files (I use tar to /dev/null and make sure
it reads at least 100K files) in order to get the vnodes allocated. As
I mentioned previously, I suspect that either ip->i_flag is not getting
completely cleared in ffs_syncvnode or its children, or the
v_bufobj.bo_dirty.bv_cnt accounting is broken.

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.