Date:      Tue, 5 Jul 2011 21:01:26 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Alan Cox <alc@rice.edu>
Cc:        Peter Jeremy <peter.jeremy@alcatel-lucent.com>, "alc@freebsd.org" <alc@freebsd.org>, freebsd-sparc64@freebsd.org
Subject:   Re: 'make -j16 universe' gives SIReset
Message-ID:  <20110705190126.GE14797@alchemy.franken.de>
In-Reply-To: <4E135420.4080201@rice.edu>
References:  <20110629025433.GA48145@server.vk2pj.dyndns.org> <20110629175444.GH14797@alchemy.franken.de> <20110629220010.GA53017@pjdesk.au.alcatel-lucent.com> <20110629223008.GL14797@alchemy.franken.de> <20110630221752.GG65891@pjdesk.au.alcatel-lucent.com> <20110702002325.GS14797@alchemy.franken.de> <4E0F6B8D.8000500@rice.edu> <20110704214158.GX14797@alchemy.franken.de> <20110705160709.GA77843@alchemy.franken.de> <4E135420.4080201@rice.edu>

On Tue, Jul 05, 2011 at 01:12:48PM -0500, Alan Cox wrote:
> On 07/05/2011 11:07, Marius Strobl wrote:
> >On Mon, Jul 04, 2011 at 11:41:58PM +0200, Marius Strobl wrote:
> >>On Sat, Jul 02, 2011 at 02:03:41PM -0500, Alan Cox wrote:
> >>>On 07/01/2011 19:23, Marius Strobl wrote:
> >>>>On Fri, Jul 01, 2011 at 08:17:52AM +1000, Peter Jeremy wrote:
> >>>>>[Moving back on-list]
> >>>>>
> >>>>>On 2011-Jun-30 06:30:08 +0800, Marius Strobl <marius@alchemy.franken.de> wrote:
> >>>>>>On Thu, Jun 30, 2011 at 08:00:10AM +1000, Peter Jeremy wrote:
> >>>>>>>On 2011-Jun-29 19:54:44 +0200, Marius Strobl <marius@alchemy.franken.de> wrote:
> >>>>>>>>On Wed, Jun 29, 2011 at 12:54:33PM +1000, Peter Jeremy wrote:
> >>>>>>>>>My V890 has been running "make -j32 buildworld" in a loop for a
> >>>>>>>>>week now without problems so I think that was the problem.
> >>>>>>>OTOH, a V440 that has been running similar load for a similar period
> >>>>>>>died overnight with:
> >>>>>>>
> >>>>>>>panic: uma_small_alloc: free page still has mappings!
> >>>>>>>VNASSERT failed
> >>>>>>>cpuid = 3
> >>>>>>>0xfffff800079643c0: KDB: enter: panic
> >>>>>...
> >>>>>>>I'm fairly sure that is the same kernel but will double-check and
> >>>>>>>investigate that panic further.
> >>>>>FWIW, that kernel didn't have the latest patchset (adding Zeus support).
> >>>>That shouldn't make a difference; the later version only adds the
> >>>>SPARC64 bits as you already noticed and adjusts the boot loader to
> >>>>compile again. I made no changes to the existing parts apart from
> >>>>fixing a comment. Besides I see no connection between fixing the
> >>>>gross user TLB flushing and the below problem so far.
> >>>>
> >>>>>>Ok, this appears to be an unrelated problem though. Alan, do you
> >>>>>>have an idea what could be causing this?
> >>>>>I managed to get the same panic (though different traceback) on the
> >>>>>V890 after about an hour of pho@'s stress test with INCARNATIONS=150:
> >>>>>
> >>>>>panic: uma_small_alloc: free page still has mappings!
> >>>>>cpuid = 1
> >>>>>KDB: enter: panic
> >>>>>[ thread pid 142 tid 100196 ]
> >>>>>Stopped at      kdb_enter+0x80: ta              %xcc, 1
> >>>>>db>   where
> >>>>>Tracing pid 142 tid 100196 td 0xfffff8a016ace880
> >>>>>panic() at panic+0x20c
> >>>>>uma_small_alloc() at uma_small_alloc+0xe8
> >>>>>keg_alloc_slab() at keg_alloc_slab+0xc8
> >>>>>keg_fetch_slab() at keg_fetch_slab+0x218
> >>>>>zone_fetch_slab() at zone_fetch_slab+0x44
> >>>>>uma_zalloc_arg() at uma_zalloc_arg+0x60c
> >>>>>m_getm2() at m_getm2+0x134
> >>>>>m_uiotombuf() at m_uiotombuf+0x4c
> >>>>>sosend_generic() at sosend_generic+0x420
> >>>>>sosend() at sosend+0x2c
> >>>>>soo_write() at soo_write+0x3c
> >>>>>dofilewrite() at dofilewrite+0x7c
> >>>>>kern_writev() at kern_writev+0x38
> >>>>>write() at write+0x4c
> >>>>>syscallenter() at syscallenter+0x270
> >>>>>syscall() at syscall+0x74
> >>>>>-- syscall (4, FreeBSD ELF64, write) %o7=0x101db4 --
> >>>>>userland() at 0x405936c8
> >>>>>user trace: trap %o7=0x101db4
> >>>>>pc 0x405936c8, sp 0x7fdffffd8a1
> >>>>>pc 0x101f44, sp 0x7fdffffd9a1
> >>>>>pc 0x104604, sp 0x7fdffffda81
> >>>>>pc 0x1046f0, sp 0x7fdffffdb51
> >>>>>pc 0x104994, sp 0x7fdffffdc21
> >>>>>pc 0x104d90, sp 0x7fdffffdd01
> >>>>>pc 0x101610, sp 0x7fdffffde41
> >>>>>pc 0x4020cff4, sp 0x7fdffffdf01
> >>>>>done
> >>>>>db>
> >>>>>
> >>>>>I've got a crashdump on the V440 but discovered that gdb reports
> >>>>>"GDB can't read core files on this machine." so it isn't much use.
> >>>>>Any suggestions on how to debug this?
> >>>>The VM and its interaction with the MD code are beyond me, I hope
> >>>>Alan can chime in here. Reading through the code I see a possible
> >>>>path which could lead to this though; tsb_tte_enter(), which is
> >>>>the only place where TD_PV ever is set and also only in case of
> >>>>managed pages, always calls pmap_cache_enter(), which together
> >>>>with pmap_cache_remove() does the page color handling. In
> >>>>pmap_remove_all() however, pmap_cache_remove() is only called for
> >>>>managed pages, so for unmanaged pages we might miss the removal
> >>>>of the mapping from the color used. I've no idea though if
> >>>>this actually is relevant, i.e. whether the VM ever calls
> >>>>pmap_remove_all() for unmanaged pages.
> >>>In HEAD, it does not.  Other architectures have an assertion forbidding
> >>>pmap_remove_all() calls on unmanaged pages.  (Btw, I'm happy to add this
> >>>assertion to sparc64's pmap if you like.)  In older versions, calling
> >>>pmap_remove_all() on unmanaged pages is expected to be a harmless NOP
> >>>that's just a waste of cycles.
> >>>
> >>>With unmanaged pages, it is expected that pmap_remove() is used to
> >>>destroy mappings before the page is freed.
> >>>
> >>>For years, vm_page_free{,_toq}() has asserted that the page has no
> >>>managed mappings:
> >>>
> >>>         if ((m->flags & PG_UNMANAGED) == 0) {
> >>>                 vm_page_lock_assert(m, MA_OWNED);
> >>>                 KASSERT(!pmap_page_is_mapped(m),
> >>>                     ("vm_page_free_toq: freeing mapped page %p", m));
> >>>         }
> >>>
> >>Okay, then my theories don't hold.
> >>
> >>>As a debugging aid, you might want to add an additional check here on
> >>>colors.
> >>I did that and it turns out to trigger rather quickly:
> >>Trying to mount root from nfs: []...
> >>NFS ROOT: 192.168.1.40:/usr/data/nfsroot/sparc64
> >>dc1: link state changed to UP
> >>panic: vm_page_free_toq: free page 0xfffff80047b8a088 still has mappings!
> >>cpuid = 0
> >>KDB: enter: panic
> >>[ thread pid 1 tid 100001 ]
> >>Stopped at      kdb_enter+0x80: ta              %xcc, 1
> >>db>  bt
> >>Tracing pid 1 tid 100001 td 0xfffff80041094000
> >>panic() at panic+0x20c
> >>vm_page_free_toq() at vm_page_free_toq+0xb4
> >>vm_page_free_zero() at vm_page_free_zero+0x10
> >>pmap_release() at pmap_release+0x170
> >>vmspace_free() at vmspace_free+0x70
> >>vmspace_exec() at vmspace_exec+0x48
> >>exec_new_vmspace() at exec_new_vmspace+0x240
> >>exec_elf64_imgact() at exec_elf64_imgact+0x598
> >>kern_execve() at kern_execve+0x398
> >>execve() at execve+0x34
> >>start_init() at start_init+0x2ec
> >>fork_exit() at fork_exit+0x9c
> >>fork_trampoline() at fork_trampoline+0x8
> >>db>
> >>
> >>Further debugging shows that the page in question is one of the TSB
> >>pages entered by pmap_pinit(). In pmap_release() vm_page_free_zero()
> >>is called on these before pmap_qremove(), so there appears to be a
> >>race in which these pages can get re-used before their mappings are
> >>removed. I suspect that this might be related to your change in
> >>r207648, but nowadays just reverting that one triggers the
> >>assertion in vm_page_free_toq() about the page lock not being held.
> >>Anyway, I'm not sure what the right fix for this is; should
> >>pmap_release() call pmap_qremove() on these pages one-by-one before
> >>calling vm_page_free_zero() or maybe just call pmap_qremove() for
> >>all of them before looping over them and calling vm_page_free_zero()?
> >>
> >Well, given that all uses of pmap_qremove() in the kernel except
> >the one in the sparc64 pmap_release() and two invocations in vfs_bio.c
> >remove the pages before they are freed, unwired, etc., this seems to be
> >a safe thing to do. Does the below patch look correct to you?
> >
> 
> Basically, yes.  However, I would suggest adding the KASSERT in pmap.c 
> as a separate change.  The pmap_qremove() changes should be MFCed to 
> RELENG_8 and RELENG_7, but not the KASSERT change.

Okay, thanks!

> 
> >Index: kern/vfs_bio.c
> >===================================================================
> >--- kern/vfs_bio.c	(revision 223705)
> >+++ kern/vfs_bio.c	(working copy)
> >@@ -1625,6 +1625,7 @@ vfs_vmio_release(struct buf *bp)
> >  	int i;
> >  	vm_page_t m;
> >
> >+	pmap_qremove(trunc_page((vm_offset_t) bp->b_data), bp->b_npages);
> 
> While you're here, please also remove the non-style(9) compliant space 
> after the cast.
> 

Done.

Peter, could you please test again with at least r223795 and the below
patch but no additional change to pmap.c?

Marius

Index: vm/vm_page.c
===================================================================
--- vm/vm_page.c	(revision 223705)
+++ vm/vm_page.c	(working copy)
@@ -1741,6 +1741,10 @@ vm_page_free_toq(vm_page_t m)
 		KASSERT(!pmap_page_is_mapped(m),
 		    ("vm_page_free_toq: freeing mapped page %p", m));
 	}
+#ifdef __sparc64__
+	KASSERT(m->md.colors[0] == 0 && m->md.colors[1] == 0,
+	    ("vm_page_free_toq: free page %p still has mappings!", m));
+#endif
 	PCPU_INC(cnt.v_tfree);
 
 	if (VM_PAGE_IS_FREE(m))
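
For anyone following along, the ordering problem discussed above can be sketched outside the kernel. This is only a toy userland model, not the actual pmap or VM code: struct toy_page and all toy_* functions are hypothetical names invented for illustration, and it assumes that colors[] simply counts the kernel mappings a page has per cache color, which is what the assertion in the patch appears to rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vm_page; only the fields needed here. */
struct toy_page {
	int colors[2];	/* assumed: number of mappings per cache color */
	bool is_free;	/* set once the page is handed back */
};

/* Toy model of entering a kernel mapping with the given color (0 or 1). */
void
toy_qenter(struct toy_page *m, int color)
{
	m->colors[color]++;
}

/* Toy model of removing one kernel mapping with the given color. */
void
toy_qremove(struct toy_page *m, int color)
{
	if (m->colors[color] > 0)
		m->colors[color]--;
}

/*
 * Toy model of freeing a page. It returns false where the patched
 * vm_page_free_toq() would panic, i.e. when the page still has a
 * mapping recorded under either color.
 */
bool
toy_page_free(struct toy_page *m)
{
	if (m->colors[0] != 0 || m->colors[1] != 0)
		return (false);
	m->is_free = true;
	return (true);
}

/* Old order: free the page first, drop its mapping afterwards. */
bool
toy_release_free_first(struct toy_page *m, int color)
{
	bool ok;

	ok = toy_page_free(m);
	toy_qremove(m, color);
	return (ok);
}

/* Proposed order: drop the mapping first, then free the page. */
bool
toy_release_unmap_first(struct toy_page *m, int color)
{
	toy_qremove(m, color);
	return (toy_page_free(m));
}
```

In this model toy_release_free_first() fails the check for a page that still has one mapping, while toy_release_unmap_first() succeeds for the same starting state. That is the same difference the pmap_qremove() reordering is meant to make; the real code of course also has to deal with locking and with multiple pages at once.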


