From owner-freebsd-arm@FreeBSD.ORG Fri Oct 23 15:22:54 2009
Date: Fri, 23 Oct 2009 10:22:47 -0500 (CDT)
From: Mark Tinguely
Message-Id: <200910231522.n9NFMlE3002301@casselton.net>
To: freebsd-arm@freebsd.org, ray@dlink.ua, tinguely@casselton.net
In-Reply-To: <20091023155825.381728f4.ray@dlink.ua>
Subject: Re: [ARM+NFS] panic while copying across NFS
List-Id: Porting FreeBSD to the StrongARM Processor

> Hi Mark!
> With your patch works fine.
>
> # dd if=/swap.file of=/mnt/swap.file bs=1M
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 231.294150 secs (4642322 bytes/sec)
>
> But still slow. Maybe someone know why slow? (Marvell 88F5182 rev A2)

Here is what I think is the complete update to the revisions 181296 and
195779 cache fixes.

 1) vm_machdep.c: remove the dangling allocations so they do not
    unnecessarily turn off the cache in the future. (This is the patch
    that worked for you; items 2 and 3 are two more.)
 2) busdma_machdep.c: unmap the same amount that was shadow mapped.
 3) pmap.c: PVF_REF is used to invalidate the cache and flush the TLB.
    PVF_REF is set by a trap when the page is really used. Kernel pages
    should be assumed to be used immediately. In the ARMv5 pmap, we
    should manage every physical page of RAM.

Without profiling the kernel, it would be tough to say where the
performance issues are originating (device driver, fs code, or machine
level).

Ideas about the machine level code:

I think freeing the memory from the level page table descriptors for
general use should improve things. More usable RAM is always a good
thing. There is some code in trap and other places that looks to see
if the level 1 pde is for this memory space or the shared memory space.
We can keep a few level 1 pdes around for forks. The downside is that a
fork could fail to get the 16K contiguous buffer, which it can on other
archs too. This is a pretty big change.

There are tests/fixes (switch/pmap) for the low vector page that can be
removed with a define statement for high vector kernels. In fact, if we
are not sharing the level 1 pd, this is set only in pmap
initialization. Simple change, "#ifdef LOW_VECTOR", minor savings.

Are we cleaning caches too much?

ARMv6/7 will be a big game changer. Should we put a ton of effort into
ARMv5, put the effort into optimizing, or do both?
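
To make item 2 a bit more concrete, here is a small standalone sketch of the arithmetic as I read it: a buffer that does not start on a page boundary can touch one more page than its size alone suggests, so the unmap has to include the offset into the first page. This is not kernel code. The EX_ names and the 4 KiB page size are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration only: 4 KiB pages. Not kernel symbols. */
#define EX_PAGE_SIZE 4096u
#define EX_PAGE_MASK (EX_PAGE_SIZE - 1u)

/* Round a byte count up to a whole number of pages. */
static size_t
ex_round_page(size_t size)
{
	return (size + EX_PAGE_MASK) & ~(size_t)EX_PAGE_MASK;
}

/*
 * Number of pages touched by a buffer of `size` bytes that starts at
 * `vaddr`. The offset of vaddr inside its first page is added before
 * rounding, mirroring "maxsize + (vaddr & PAGE_MASK)" in the patch.
 */
static size_t
ex_pages_spanned(uintptr_t vaddr, size_t size)
{
	assert(size > 0);
	return ex_round_page(size + (vaddr & EX_PAGE_MASK)) / EX_PAGE_SIZE;
}
```

With these definitions, a 4096-byte buffer at 0x1000 spans one page, while the same buffer at 0x1800 spans two. Rounding the size alone would count only one page in the second case and leave the other page marked as allocated.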

Index: arm/arm/vm_machdep.c
===================================================================
--- arm/arm/vm_machdep.c	(revision 198246)
+++ arm/arm/vm_machdep.c	(working copy)
@@ -169,6 +169,9 @@ sf_buf_free(struct sf_buf *sf)
 	if (sf->ref_count == 0) {
 		TAILQ_INSERT_TAIL(&sf_buf_freelist, sf, free_entry);
 		nsfbufsused--;
+		pmap_kremove(sf->kva);
+		sf->m = NULL;
+		LIST_REMOVE(sf, list_entry);
 		if (sf_buf_alloc_want > 0)
 			wakeup_one(&sf_buf_freelist);
 	}
@@ -449,9 +452,12 @@ arm_unmap_nocache(void *addr, vm_size_t size)
 	size = round_page(size);
 	i = (raddr - arm_nocache_startaddr) / (PAGE_SIZE);
-	for (; size > 0; size -= PAGE_SIZE, i++)
+	for (; size > 0; size -= PAGE_SIZE, i++) {
 		arm_nocache_allocated[i / BITS_PER_INT] &= ~(1 << (i % BITS_PER_INT));
+		pmap_kremove(raddr);
+		raddr += PAGE_SIZE;
+	}
 }
 #ifdef ARM_USE_SMALL_ALLOC
Index: arm/arm/busdma_machdep.c
===================================================================
--- arm/arm/busdma_machdep.c	(revision 198246)
+++ arm/arm/busdma_machdep.c	(working copy)
@@ -649,7 +649,8 @@ bus_dmamem_free(bus_dma_tag_t dmat, void *vaddr, b
 		KASSERT(map->allocbuffer == vaddr,
 		    ("Trying to freeing the wrong DMA buffer"));
 		vaddr = map->origbuffer;
-		arm_unmap_nocache(map->allocbuffer, dmat->maxsize);
+		arm_unmap_nocache(map->allocbuffer,
+		    dmat->maxsize + ((vm_offset_t)vaddr & PAGE_MASK));
 	}
 	if (dmat->maxsize <= PAGE_SIZE &&
 	    dmat->alignment < dmat->maxsize &&
Index: arm/arm/pmap.c
===================================================================
--- arm/arm/pmap.c	(revision 198246)
+++ arm/arm/pmap.c	(working copy)
@@ -1643,7 +1643,7 @@ pmap_enter_pv(struct vm_page *pg, struct pv_entry
 			/* PMAP_ASSERT_LOCKED(pmap_kernel()); */
 			pve->pv_pmap = pmap_kernel();
 			pve->pv_va = pg->md.pv_kva;
-			pve->pv_flags = PVF_WRITE | PVF_UNMAN;
+			pve->pv_flags = PVF_WRITE | PVF_UNMAN | PVF_REF;
 			pg->md.pv_kva = 0;
 			TAILQ_INSERT_HEAD(&pg->md.pv_list, pve, pv_list);
@@ -2870,7 +2870,7 @@ pmap_kenter_internal(vm_offset_t va, vm_offset_t p
 			vm_page_lock_queues();
 			PMAP_LOCK(pmap_kernel());
 			pmap_enter_pv(m, pve, pmap_kernel(), va,
-			    PVF_WRITE | PVF_UNMAN);
+			    PVF_WRITE | PVF_UNMAN | PVF_REF);
 			pmap_fix_cache(m, pmap_kernel(), va);
 			PMAP_UNLOCK(pmap_kernel());
 		} else {
@@ -3538,7 +3538,7 @@ do_l2b_alloc:
 		if (!TAILQ_EMPTY(&m->md.pv_list) ||
 		    m->md.pv_kva) {
 			KASSERT(pve != NULL, ("No pv"));
-			nflags |= PVF_UNMAN;
+			nflags |= PVF_UNMAN | PVF_REF;
 			pmap_enter_pv(m, pve, pmap, va, nflags);
 		} else
 			m->md.pv_kva = va;
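
For anyone who wants to play with the arm_unmap_nocache() change outside the kernel, the following is a rough userland model of the patched loop: each page's bit is cleared in the allocation bitmap and, in the same pass, the page's mapping is removed. Everything prefixed ex_ or EX_ is hypothetical. ex_kremove() only counts calls and stands in for pmap_kremove(), and the region start address, region size, and the page-truncation of the address are assumptions of this sketch, not taken from the kernel source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* All values below are assumptions for the sketch, not kernel constants. */
#define EX_PAGE_SIZE	4096u
#define EX_BITS_PER_INT	(sizeof(unsigned int) * CHAR_BIT)
#define EX_NPAGES	64u

static unsigned int ex_allocated[(EX_NPAGES + EX_BITS_PER_INT - 1) /
    EX_BITS_PER_INT];
static unsigned int ex_unmapped_pages;	/* how often the stub ran */
static const uintptr_t ex_region_start = 0x10000000u;

/* Stand-in for pmap_kremove(): it only counts the pages "unmapped". */
static void
ex_kremove(uintptr_t va)
{
	(void)va;
	ex_unmapped_pages++;
}

/* Mark npages pages, starting at index first_page, as allocated. */
static void
ex_mark_allocated(size_t first_page, size_t npages)
{
	for (size_t i = first_page; i < first_page + npages; i++)
		ex_allocated[i / EX_BITS_PER_INT] |=
		    1u << (i % EX_BITS_PER_INT);
}

/* Return 1 if the page at this index is still marked allocated. */
static int
ex_is_allocated(size_t page)
{
	return (ex_allocated[page / EX_BITS_PER_INT] >>
	    (page % EX_BITS_PER_INT)) & 1u;
}

/* Model of the patched loop: clear the bit and drop the mapping per page. */
static void
ex_unmap_nocache(uintptr_t addr, size_t size)
{
	uintptr_t raddr = addr & ~(uintptr_t)(EX_PAGE_SIZE - 1u);
	size_t i = (raddr - ex_region_start) / EX_PAGE_SIZE;

	/* Round the size up to whole pages, as round_page() would. */
	size = (size + EX_PAGE_SIZE - 1u) & ~(size_t)(EX_PAGE_SIZE - 1u);
	assert(i + size / EX_PAGE_SIZE <= EX_NPAGES);
	for (; size > 0; size -= EX_PAGE_SIZE, i++) {
		ex_allocated[i / EX_BITS_PER_INT] &=
		    ~(1u << (i % EX_BITS_PER_INT));
		ex_kremove(raddr);
		raddr += EX_PAGE_SIZE;
	}
}
```

Marking pages 2 through 4 with ex_mark_allocated(2, 3) and then calling ex_unmap_nocache(ex_region_start + 2 * EX_PAGE_SIZE, 2 * EX_PAGE_SIZE + 1) clears all three bits and calls the stub three times, because the size is rounded up to whole pages. Before the patch only the bitmap was updated, so the stale mappings stayed behind.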