Date:      Sun, 05 Feb 2012 08:12:42 -0700
From:      Ian Lepore <freebsd@damnhippie.dyndns.org>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Performance of SheevaPlug on 8-stable
Message-ID:  <1328454762.1733.8.camel@revolution.hippie.lan>
In-Reply-To: <20120204234319.GR52468@funkthat.com>
References:  <1327980703.1662.240.camel@revolution.hippie.lan> <F48E21E0-129A-418A-B147-7D5FB01160A8@bsdimp.com> <1328025245.1662.289.camel@revolution.hippie.lan> <5FB4965A-66C9-4C99-8B61-5AC605F9ECC5@bsdimp.com> <1328030999.1662.324.camel@revolution.hippie.lan> <20120204234319.GR52468@funkthat.com>

On Sat, 2012-02-04 at 15:43 -0800, John-Mark Gurney wrote:
> Ian Lepore wrote this message on Tue, Jan 31, 2012 at 10:29 -0700:
> > On Tue, 2012-01-31 at 09:37 -0700, Warner Losh wrote:
> > > On Jan 31, 2012, at 8:54 AM, Ian Lepore wrote:
> > > 
> > > > On Mon, 2012-01-30 at 22:39 -0700, Warner Losh wrote:
> > > >> Hi Ian,
> > > >> 
> > > >> Do you have any data on what 9.0 does?
> > > >> 
> > > >> Warner
> > > > 
> > > > No.  Do you have reason to believe it will be different than 8.x?
> > > > 
> > > > It would be a major effort right now to get anything later than 8.2
> > > > built and running on one of our arm platforms.  Maybe not as hard as the
> > > > 6.2 -> 8.2 conversion was, but we're still carrying a lot of diffs from
> > > > stock FreeBSD that have to be analyzed and merged by hand.  Actually
> > > > before that can even happen I'd have to grab a snapshot of 9.0 and do an
> > > > svn->Hg conversion to even be able to start merging the diffs (and I'm
> > > > hardly an Hg expert, but those in the company who are let me know last
> > > > week that they're just as busy as me, and I'm on my own for this kind of
> > > > work).  It's work I want to do, but I suspect it's going to happen later
> > > > rather than sooner because product deadlines are beginning to loom and
> > > > my ability to spend most of my time working on the OS side of things is
> > > > waning.
> > > > 
> > > > If there are some specific changes you've got in mind that affect this
> > > > problem I might be able to backport and test them faster than I could
> > > > get a full 9.0 or -current build environment working, just point me at
> > > > them.
> > > 
> > > I thought that we'd done a root cause of this and had put a fix into the vm system.  Lemme look...
> > > 
> > > ------------------------------------------------------------------------
> > > r224049 | marcel | 2011-07-14 20:11:26 -0600 (Thu, 14 Jul 2011) | 2 lines
> > > 
> > > In pmap_protect(), don't call vm_page_dirty() if the page is unmanaged.
> > > 
> > > 
> > > ------------------------------------------------------------------------
> > > r221844 | cognet | 2011-05-13 09:54:12 -0600 (Fri, 13 May 2011) | 4 lines
> > > 
> > > In pmap_change_wiring(), use the right argument for pmap_modify_pv().
> > > It only worked because the only consumer calls pmap_change_wiring() to remove
> > > the wiring.
> > > 
> > > ------------------------------------------------------------------------
> > > r212507 | cognet | 2010-09-12 14:46:32 -0600 (Sun, 12 Sep 2010) | 5 lines
> > > 
> > > In pmap_remove_all(), do not decrease pm_stats.wired_count if the mapping was
> > > wired, as it's been done later in pmap_nuke_pv().
> > > 
> > > Submitted by:   Mark Tinguely
> > > 
> > > 
> > > ------------------------------------------------------------------------
> > > r209223 | cognet | 2010-06-15 16:16:02 -0600 (Tue, 15 Jun 2010) | 4 lines
> > > 
> > > Turn off cache if there's more than one kernel mapping, and one is writable.
> > > 
> > > Submitted by:   Mark Tinguely
> > > 
> > > ------------------------------------------------------------------------
> > > r205028 | raj | 2010-03-11 14:16:54 -0700 (Thu, 11 Mar 2010) | 12 lines
> > > 
> > > Fix ARM cache handling yet more.
> > > 
> > > 1) vm_machdep.c: remove the dangling allocations so they do not
> > >    un-necessarily turn off the cache upon consecutive access.
> > > 
> > > 2) busdma_machdep.c: remove the same amount than shadow mapped.
> > > 
> > > Reported by:    Maks Verver
> > > Submitted by:   Mark Tinguely
> > > Reviewed by:    Grzegorz Bernacki
> > > MFC after:      3 days
> > > 
> > > ------------------------------------------------------------------------
> > > r203637 | raj | 2010-02-07 13:48:57 -0700 (Sun, 07 Feb 2010) | 19 lines
> > > 
> > > Improve checking whether an ARM VA has a valid mapping before performing cache
> > > sync.
> > > 
> > > VIPT/PIPT caches need valid VA-PA mapping in PTE for a cache operation to
> > > succeed (unlike VIVT). Prior to this fix pmap was using l2pte_valid() for that
> > > check, but this is not sufficient as the function merely checks if a PTE
> > > exists (there can be existing but _invalid_ entries in the table).
> > > 
> > > A new pmap_has_valid_mapping() routine is introduced to do this job right by
> > > checking proper PTE flags.
> > > 
> > > Among other potential problems this cures coherency issues with L2 caches on
> > > MV-78100.
> > > 
> > > Submitted by:   Grzegorz Bernacki, Piotr Ziecik
> > > Reviewed, tested by:    marcel
> > > Obtained from:  Semihalf
> > > MFC after:      1 week
> > > 
> > > 
> > > Only the last two have MFC, so you can start there and see which of these changes are in...
> > > 
> > > Just thought you might have a reference board that would be easy to test...
> > > 
> > > Warner
> > 
> > I think we may have all those changes incorporated except perhaps
> > r224049; I'll make sure of that.  
> > 
> > r209223 is the change that exposed this situation.  
> > 
> > I'm skeptical that any of the changes you cite (or any change at all in
> > the pmap layer) will fix the problem, because the problem seems to be
> > rooted in the fact that the vfs buffer cache establishes a kva mapping
> > of the buffer pages with the protections set to READ|WRITE|EXEC and
> > leaves that mapping in place as long as the buffer is in the cache, and
> > r209223 says that as long as there are multiple mappings of a page with
> > at least one writable, that page's i-cache and d-cache bits stay off.
> > (The multiple mappings being the one for the buffer cache that includes
> > write access and one or more READ|EXEC mappings made by mmap() when the
> > executable or library is loaded/relocated.)
> > 
> > If my analysis is correct (and I'm fairly sure, if not 100% positive,
> > that it is), then it seems to me that the only fix available is going to
> > be at the vfs layer, and it's going to involve dropping the write access
> > to the pages in the buffer cache once any physical IO and/or uio
> > operations needing write access are completed.  
> > 
> > Even if I could figure out a patchset to fix the problem, it's going to
> > need a lot of input from the vm gurus to answer questions such as what
> > the performance impact will be to non-VIVT platforms that don't need
> > this extra work done.  If the extra work is expensive enough (and I'm
> > not sure I could evaluate that properly) it may need to be conditional
> > on whether the platform needs it.  I'm also vaguely uneasy with all this
> > on a purely philosophical level, since this could end up basically
> > infecting MI code with a platform-specific concept.
> 
> What is an easy way to figure out if a system we have is affected by
> this issue?  I have a GW2348-4 board running FreeBSD 9.0-RC1 w/ some
> minor modifications to get pf to work...
> 
> I think the system is affected by this since userland seems really slow..
> 
> Thanks.
> 

In the original mail thread the author used the following trivial test
program and posted some examples of what good/bad numbers would be:

  int main() { int i = 0; do ++i; while(i > 0); return 0; } 

Based on my experience, a way to test this would be to put that
executable on the system and reboot.  After it boots, run the test a few
times and note the runtime.  Then do "cat test_program >/dev/null" and
run the test again.  If the runtime gets dramatically worse after the
cat, you're seeing the same problem.

I've started on the process of trying to replicate my results in 9.0,
but it's going to take me some time to get our custom mods into 9 to get
a bootable arm box.  (I lost yesterday morning to a failed disk drive,
and decided it's time for a new development/build machine, and lost the
rest of the day to shopping/dreaming/drooling over fast new hardware.)

-- Ian




