Date:      Thu, 16 Jun 2011 01:34:45 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Peter Jeremy <peterjeremy@acm.org>
Cc:        freebsd-sparc64@freebsd.org
Subject:   Re: 'make -j16 universe' gives SIReset
Message-ID:  <20110615233445.GZ7064@alchemy.franken.de>
In-Reply-To: <20110613235144.GA12470@server.vk2pj.dyndns.org>
References:  <20110526234728.GA69750@server.vk2pj.dyndns.org> <20110527120659.GA78000@alchemy.franken.de> <20110601231237.GA5267@server.vk2pj.dyndns.org> <20110608224801.GB35494@alchemy.franken.de> <20110613235144.GA12470@server.vk2pj.dyndns.org>

On Tue, Jun 14, 2011 at 09:51:44AM +1000, Peter Jeremy wrote:
> On 2011-Jun-09 00:48:01 +0200, Marius Strobl <marius@alchemy.franken.de> wrote:
> >This might be due to the excessive use of sched_lock by SCHED_4BSD
> >and the MD code, e.g. more CPUs means fewer TLB contexts per CPU,
> >which in turn means more flushes that are protected by sched_lock.
> 
> I have noticed that systat reports very high trap & fault counts.

That's basically expected; on USIII and later, FreeBSD just flushes
all unlocked TLB entries when the userland mappings need to be
flushed and accepts TLB misses for the kernel ones, instead of
traversing the TLBs and removing just the userland mappings.
OpenSolaris does the same thing, and IIRC there actually isn't
a way to traverse the large TLBs anyway. Given that the TLB
contexts are divided evenly among the cores, this means the more
cores are in the machine, the more flushes and misses occur.
Previously FreeBSD shared the contexts across all CPUs, which
meant TLB shootdown IPIs even for non-shared PMAPs. So the
question is whether there's some point at which that shared
approach actually costs less performance than accepting TLB
misses. This seems unlikely though, and AFAIK the current
approach is actually inspired by Solaris Internals.
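
For illustration (the pool size and names below are just assumptions,
not the actual sys/sparc64 constants), a minimal userland sketch of
how the even split shrinks each CPU's share of contexts as cores are
added:

/*
 * Illustration only: dividing a fixed pool of TLB contexts evenly
 * among the CPUs shrinks each CPU's share as cores are added, so the
 * rollover that forces a full user TLB flush comes around more often.
 * Pool size and names are assumptions, not the real sparc64 code.
 */
#include <stdio.h>

#define	TLB_CTX_POOL	8192	/* assumed total number of contexts */
#define	TLB_CTX_RSVD	1	/* context 0 assumed reserved for the kernel */

int
main(void)
{
	int ncpus;

	for (ncpus = 1; ncpus <= 16; ncpus <<= 1)
		printf("%2d CPUs: %4d contexts per CPU between full flushes\n",
		    ncpus, (TLB_CTX_POOL - TLB_CTX_RSVD) / ncpus);
	return (0);
}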

> 
> I got a "spinlock held too long" panic that should have gone to DDB
> but the system wouldn't respond to anything other than a RSC reset.
> 

You could try whether the patch below sufficiently reduces the lock
coverage to avoid these panics. For stable/8 you'll probably need to
apply the second chunk by hand.
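
Roughly what the patch does: the tlb_ctx counter is per-CPU data, so
a critical section (no preemption, no migration) is enough to keep
the context allocation consistent, and sched_lock only needs to cover
the updates to the shared pmap fields. Schematically (just a sketch
of the pattern, the real code is in the diff below):

	critical_enter();		/* pin to this CPU, no preemption */
	context = PCPU_GET(tlb_ctx);	/* per-CPU data, no lock needed */
	/* ... allocate or roll over the context ... */
	mtx_lock_spin(&sched_lock);	/* shared pmap state keeps the lock */
	pm->pm_context[curcpu] = context;
	mtx_unlock_spin(&sched_lock);
	/* ... program the MMU context registers ... */
	critical_exit();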

Marius

Index: pmap.c
===================================================================
--- pmap.c	(revision 223042)
+++ pmap.c	(working copy)
@@ -2217,11 +2217,10 @@ pmap_activate(struct thread *td)
 	struct pmap *pm;
 	int context;
 
+	critical_enter();
 	vm = td->td_proc->p_vmspace;
 	pm = vmspace_pmap(vm);
 
-	mtx_lock_spin(&sched_lock);
-
 	context = PCPU_GET(tlb_ctx);
 	if (context == PCPU_GET(tlb_ctx_max)) {
 		tlb_flush_user();
@@ -2229,17 +2228,18 @@ pmap_activate(struct thread *td)
 	}
 	PCPU_SET(tlb_ctx, context + 1);
 
+	mtx_lock_spin(&sched_lock);
 	pm->pm_context[curcpu] = context;
 	CPU_OR(&pm->pm_active, PCPU_PTR(cpumask));
 	PCPU_SET(pmap, pm);
+	mtx_unlock_spin(&sched_lock);
 
 	stxa(AA_DMMU_TSB, ASI_DMMU, pm->pm_tsb);
 	stxa(AA_IMMU_TSB, ASI_IMMU, pm->pm_tsb);
 	stxa(AA_DMMU_PCXR, ASI_DMMU, (ldxa(AA_DMMU_PCXR, ASI_DMMU) &
 	    TLB_CXR_PGSZ_MASK) | context);
 	flush(KERNBASE);
-
-	mtx_unlock_spin(&sched_lock);
+	critical_exit();
 }
 
 void


