From: Bruce Evans
Date: Wed, 20 Jan 2010 06:19:50 +1100 (EST)
To: Attilio Rao
Cc: FreeBSD Arch, Scott Long, Ed Maste
Subject: Re: [PATCH] Statclock aliasing by LAPIC
Message-ID: <20100120055636.U68115@delplex.bde.org>
In-Reply-To: <3bbf2fe11001190941s37f62c48tb91be0061b658b2c@mail.gmail.com>
List-Id: Discussion related to FreeBSD architecture

On Tue, 19 Jan 2010, Attilio Rao wrote:

> 2010/1/19 Scott Long :
>> On Jan 19, 2010, at 10:27 AM, Attilio Rao wrote:
>>>
>>> 2010/1/19 John Baldwin :
>>>> My feeling, btw, is that the real solution is to not use a sampling
>>>> clock for per-process stats, but to just use the cycle counter and
>>>> keep separate user, system, and interrupt cycle counts (like the
>>>> rux_runtime we have now).  This makes calcru() trivial and
>>>> eliminates many of the weird "going backwards", etc. problems.  The
>>>> only issue with this approach is that not all platforms have a
>>>> cheap cycle counter (many embedded platforms lack one I think), so
>>>> you would almost need to support both modes of operation and maybe
>>>> have an #define in to choose between the two modes.
>>>
>>> Generally that would be a good idea, but the problem is not only for
>>> the architectures not supporting it, but also for architectures that
>>> do (eg. TSC de-synchronization in some SMP environment).
>>>
>>
>> For process stats, TSC desync isn't a big problem.  As a process
>> migrates from one CPU to the other, its stats from the old cpu will
>> be recorded, then stats will be started on the new cpu.  The only
>> problem here is with normalizing the different TSC's to a common
>> reference.  Maybe that can be done when computing cp_times?  This is
>> definitely a case where 'perfect' is the enemy of 'a hell of a lot
>> better than we have now'.
>

Only the frequencies would need normalization, since the TSCs are
per-CPU and they hopefully don't get reset by suspend etc.  Separate
frequencies for separate CPUs are not supported now.
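
A minimal userland sketch of that kind of accounting, under stated
assumptions: every name here (sketch_cpu_freq, sketch_charge(),
sketch_runtime_usec()) is hypothetical and not a FreeBSD interface, the
per-CPU frequencies are made-up constants, and each frequency is assumed
to be at least 1 MHz.  Each interval is measured with one CPU's counter
only, so offsets between counters never matter; only the per-CPU
frequency is needed to convert to a common time unit.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NCPU 2	/* hypothetical CPU count for the example */

/* Hypothetical per-CPU cycle counter frequencies in Hz. */
static const uint64_t sketch_cpu_freq[SKETCH_NCPU] = {
	2000000000ULL,	/* CPU 0: 2 GHz */
	1000000000ULL,	/* CPU 1: 1 GHz */
};

/* Per-thread accounting: raw cycles charged on each CPU. */
struct sketch_runtime {
	uint64_t cycles[SKETCH_NCPU];
};

/*
 * Charge the interval between switch-in and switch-out on one CPU.
 * Both readings come from the same CPU's counter, so any offset
 * between the counters of different CPUs cancels out.
 */
static void
sketch_charge(struct sketch_runtime *rt, unsigned cpu, uint64_t start,
    uint64_t stop)
{
	assert(cpu < SKETCH_NCPU);
	assert(stop >= start);	/* assumes a single counter never goes backwards */
	rt->cycles[cpu] += stop - start;
}

/*
 * Convert the accumulated cycles to microseconds, scaling each CPU's
 * share by that CPU's own frequency.  Dividing by (freq / 1000000)
 * avoids overflowing the multiplication but truncates the result.
 */
static uint64_t
sketch_runtime_usec(const struct sketch_runtime *rt)
{
	uint64_t usec = 0;

	for (unsigned cpu = 0; cpu < SKETCH_NCPU; cpu++)
		usec += rt->cycles[cpu] / (sketch_cpu_freq[cpu] / 1000000);
	return (usec);
}
```

Charging 4000000 cycles on CPU 0 and 3000000 cycles on CPU 1 gives
2000 + 3000 microseconds, whatever the absolute counter values on the
two CPUs happen to be.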

> I wouldn't like to be mistaken, but IIRC in some benchmarks kris@ did
> in the past years we were seeing TSC timers literally going backwards
> after the de-synchronization (even on absolute measurement).

Do you really mean individual TSCs going backwards?  P-state-invariance
(?) should prevent the desync.  If the TSCs actually desync, then TSC
timecounters are sure to break, with timecounters going backwards being
a typical result (certain calculations overflow if time deltas are
unexpectedly large).

Timecounters used to be used for the equivalent of rux_runtime.  There
were/are no checks for timecounters themselves going backwards, but
sanity checks in the use of rux_runtime detected this.  Now TSCs (if
available) are normally used for rux_runtime.  Recalibration of the
TSC's assumed-common frequency is buggy and can easily cause bizarre
user times when the frequency is changed.  Apart from that, rux_runtime
is correct.  It is good enough for scheduling even when incorrect.

Bruce