Date: Thu, 1 Jun 2006 01:32:12 +0200
From: "Attilio Rao" <asmrookie@gmail.com>
To: "Bruce Evans" <bde@zeta.org.au>, "Suleiman Souhlal" <ssouhlal@freebsd.org>, freebsd-arch@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: [patch] Adding optimized kernel copying support - Part III
Message-ID: <3bbf2fe10605311632w58c2949buc072e58ac103d7d@mail.gmail.com>
In-Reply-To: <20060601084052.D32549@delplex.bde.org>
References: <3bbf2fe10605311156p7e629283r34d22b368877582d@mail.gmail.com>
 <447DFA0C.20207@FreeBSD.org>
 <3bbf2fe10605311329h7adc1722j9088253515e0265b@mail.gmail.com>
 <20060601084052.D32549@delplex.bde.org>

2006/6/1, Bruce Evans <bde@zeta.org.au>:
>
> >> Does that mean it won't work with SMP and PREEMPTION?
> >
> > Yes, it will work (even if I think it needs more testing), but it
> > might give lower performance on SMP|PREEMPTION due to too much
> > traffic on memory/cache. For this I was planning to use non-temporal
> > instructions
> > (obviously benchmarks would be much appreciated).
>
> Er, isn't its main point to fix some !SMP assumptions made in the old
> copying-through-the-FPU code? (The old code is messy due to its avoidance
> of global changes. It wants to preserve the FPU state on the stack, but
> this doesn't quite work so it does extra things (still mostly locally)
> that only work in the !SMP && (!SMPng even with UP) case. Patching this
> approach to work with SMP || SMPng cases would make it messier.)
>
> The new code wouldn't behave much differently under SMP. It just might
> be a smaller optimization because more memory pressure for SMP causes
> more cache misses for everything and there are no benefits from copying
> through MMX/XMM unless nontemporal writes are used. All (?) CPUs with
> MMX or SSE* can saturate main memory using 32-bit instructions. On
> 32-bit CPUs, the benefits of using MMX/XMM come from being able to
> saturate the L1 cache on some CPUs (mainly Athlons and not P[2-4]),
> and from being able to use nontemporal writes on some CPUs (at least
> AthlonXP via SSE extensions, and all CPUs with SSE2).

I was speaking only about the copying routine itself, not about the
mechanism that preserves the SSE2 environment. It remains untouched in
the SMP case.
However, I have to say you were right when you suggested that I merge
everything into support.s, since it has a more coherent design.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
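
For readers unfamiliar with the nontemporal writes discussed above, here is a minimal userland sketch in C. It is not the code from the patch: the function name copy_nontemporal is hypothetical, and it uses the SSE2 compiler intrinsics from <emmintrin.h> rather than the hand-written assembly a kernel routine in support.s would use. It also ignores the FPU/SSE state preservation that a kernel copy would have to handle. When SSE2 is not available at compile time, it simply falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>	/* SSE2 intrinsics; also provides _mm_sfence() */
#endif

/*
 * Hypothetical illustration only: copy len bytes from src to dst,
 * writing the bulk of the data with nontemporal (cache-bypassing)
 * 16-byte stores when SSE2 is available.
 */
static void *
copy_nontemporal(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

#if defined(__SSE2__)
	/* The streaming store needs a 16-byte aligned destination. */
	while (len > 0 && ((uintptr_t)d & 15) != 0) {
		*d++ = *s++;
		len--;
	}
	assert(len == 0 || ((uintptr_t)d & 15) == 0);

	/* Bulk copy: unaligned load, nontemporal store. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_stream_si128((__m128i *)d, v);
		s += 16;
		d += 16;
		len -= 16;
	}

	/* Order the streaming stores before any later stores. */
	_mm_sfence();
#endif
	/* Remaining tail bytes, or the whole copy without SSE2. */
	memcpy(d, s, len);
	return (dst);
}
```

Because the stores bypass the cache, a loop like this would be expected to help mostly for large copies whose destination is not read again soon. Whether it wins on a given CPU is exactly the kind of thing the benchmarks asked for above would have to show.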