Date: Sat, 30 Jan 2010 18:09:48 +0530
From: "C. Jayachandran" <c.jayachandran@gmail.com>
To: "M. Warner Losh" <imp@bsdimp.com>
Cc: freebsd-mips@freebsd.org, neelnatu@gmail.com
Subject: Re: Code review: groundwork for SMP
Message-ID: <98a59be81001300439o12ec3bf4pc5c03d4f1511297b@mail.gmail.com>
In-Reply-To: <20100129.100052.1013538172663276257.imp@bsdimp.com>
References: <85D9D383-29A3-4F09-A2FE-61E4EA85CE9B@lakerest.net> <eaa228be1001282242q1f78fff2w9804da6cdadb3d1f@mail.gmail.com> <dffe84831001290725g2ca2574ap22b82f2ad38af2d6@mail.gmail.com> <20100129.100052.1013538172663276257.imp@bsdimp.com>
On Fri, Jan 29, 2010 at 10:30 PM, M. Warner Losh <imp@bsdimp.com> wrote:
> Greetings one and all.  Thanks for weighing in on this issue.
>
> In general, I agree with Neel here.  But I also think we need to see
> if we can be flexible and push this down into a per-cpu-type
> decision (which differs slightly from a per-platform type because we
> can have a CPU appearing in multiple platforms, or multiple CPUs
> appearing within one platform).  If we make it a per-cpu-type
> solution, we could have a sys/mips/mips/pcpu_machdep.c which does the
> normal SMP stuff, as well as having sys/mips/xlr/pcpu_machdep.c which
> does something optimized for the XLR.  Chances are good that different
> CPUs will want to have different trade-offs here.  We'd also need some
> way to encode this in an include file, so there's some work to make
> PCPU macro different for different CPUs...

Yes, I think if the PCPU macros can be left to the specific
implementation (even if that involves just an ifdef in pcpu.h), we are
fine. As I wrote earlier, the preferred implementation for XLR would be
to keep the per-CPU data in direct-mapped memory, with pointers to it in
per-CPU scratch registers, and so avoid the TLB overhead.

[...]

> The XLR will have scheduler challenges as well.  It will push the
> design assumptions of ULE beyond the breaking point, I fear.
> Hyperthreading already exists on intel, and ULE copes, a bit, with
> it.  But with the high number of threads each CPU can have, we may
> need something with a little more smarts.  Something that knows it
> might be better to schedule two different processes on two different
> cores, and leave some of the threads idle to reduce TLB pressure, for
> example.

Yes, this is one area we have not looked at much.

> Per CPU scratch registers do not exist on MIPS, in general.  Some CPUs
> have them, and many do not.  CP0 registers are plentiful in more
> modern designs, and some of them may even be useful for our needs.
> However, mfc0 and mtc0 often have pipeline hazards associated with
> them which will trip up the unwary.  When reading the historical
> errata for MIPS CPUs, we often find that this is where we need to do
> the most workarounds.

On XLR there are 8 scratch registers, which can be accessed in a few
cycles without any software-visible hazards. So in our internal port we
keep things like the PCPU pointer and 'curthread' in scratch registers.

> I guess this is a long way to say "I think we should commit Neel's
> patches.  We should work along two fronts: (1) implementing Juli's
> idea of sharing kstack and pcpu data in one TLB and (2) making it so
> that CPUs where this is sub-optimal can swap in their own
> implementation."

Another suggestion I have on Neel's implementation would be to allocate
the (direct-mapped) pages at boot, so that no memory is taken for CPUs
which are not available (XLR can have 4-32 CPUs depending on the chip).

Regards,
JC.
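
To illustrate the two suggestions above (leaving the PCPU macros to the
CPU-specific implementation, and allocating per-CPU areas at boot only
for CPUs that are present), here is a minimal userland sketch. It is not
kernel code and not taken from Neel's patches or the XLR port: every name
in it (pcpu_sketch, scratch_reg, pcpu_table, CPU_XLR_SKETCH,
PCPU_SKETCH_GET, pcpu_sketch_boot_alloc) is hypothetical, the scratch
registers are simulated with an array, and calloc() stands in for a
direct-mapped page allocation.

```c
/*
 * Userland sketch only. All identifiers are hypothetical and are not
 * FreeBSD kernel APIs.
 */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAXCPU_SKETCH 32

struct pcpu_sketch {
	unsigned int	pc_cpuid;	/* id of the CPU owning this area */
	void		*pc_curthread;	/* placeholder for 'curthread' */
};

/*
 * Stand-in for a per-CPU scratch register. On real hardware each CPU
 * would read its own register; here one array slot per CPU fakes that.
 */
static struct pcpu_sketch *scratch_reg[MAXCPU_SKETCH];

/* Generic path: a table of per-CPU areas indexed by CPU id. */
static struct pcpu_sketch *pcpu_table[MAXCPU_SKETCH];

/* The "just an ifdef in pcpu.h" idea: pick the lookup per CPU type. */
#ifdef CPU_XLR_SKETCH
#define PCPU_SKETCH_PTR(cpu)	(scratch_reg[(cpu)])
#else
#define PCPU_SKETCH_PTR(cpu)	(pcpu_table[(cpu)])
#endif
#define PCPU_SKETCH_GET(cpu, member)	(PCPU_SKETCH_PTR(cpu)->pc_##member)

/*
 * Allocate per-CPU areas only for CPUs whose bit is set in cpu_mask.
 * Returns the number of areas allocated, or -1 on allocation failure.
 */
static int
pcpu_sketch_boot_alloc(uint32_t cpu_mask)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < MAXCPU_SKETCH; cpu++) {
		if ((cpu_mask & (UINT32_C(1) << cpu)) == 0)
			continue;	/* CPU absent: take no memory */
		struct pcpu_sketch *pc = calloc(1, sizeof(*pc));
		if (pc == NULL)
			return (-1);
		pc->pc_cpuid = cpu;
		pcpu_table[cpu] = pc;
		scratch_reg[cpu] = pc;	/* simulated scratch register */
		n++;
	}
	return (n);
}

/* Read the CPU id through whichever lookup the macro selected. */
static unsigned int
pcpu_sketch_cpuid(unsigned int cpu)
{
	assert(cpu < MAXCPU_SKETCH && PCPU_SKETCH_PTR(cpu) != NULL);
	return (PCPU_SKETCH_GET(cpu, cpuid));
}
```

With a mask of 0x0F (a hypothetical 4-CPU part), only four areas are
allocated and the slots for the remaining CPUs stay NULL. Building with
-DCPU_XLR_SKETCH switches the accessor to the simulated scratch-register
path without touching any caller.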