Date:      Sun, 29 May 2005 11:36:07 -0600
From:      Scott Long <scottl@samsco.org>
To:        "M. Warner Losh" <imp@bsdimp.com>
Cc:        "Justin T. Gibbs" <gibbs@scsiguy.com>, arch@freebsd.org, nyan@jp.FreeBSD.org
Subject:   Re: [RFC] remove bus_memio.h and bus_pio.h
Message-ID:  <4299FD87.1000505@samsco.org>
In-Reply-To: <20050525.111945.41668351.imp@bsdimp.com>
References:  <20050525.212009.71136852.nyan@jp.FreeBSD.org> <20050525.111945.41668351.imp@bsdimp.com>

M. Warner Losh wrote:

> In message: <20050525.212009.71136852.nyan@jp.FreeBSD.org>
>             Takahashi Yoshihiro <nyan@jp.FreeBSD.org> writes:
> : The bus_memio.h and bus_pio.h headers exist for a micro-optimization
> : that depends on the bus_space implementation on i386 and amd64, so
> : they are meaningless files on the other archs.  I'd like to remove
> : MD parts like this from MI drivers at least.
> : 
> : I think that the performance gain from this method is trivial on
> : recent machines.  If there is no strong objection, I'll
> : remove bus_{mem,p}io.h and related code from all archs.
> : 
> : Comments?
> 
> Short answer:
> 
> 	Great idea.  aac and bfe should be tested after the change to
> 	see if there is any benefit for them.  Other drivers almost
> 	certainly will see no benefit from this.
> 
> Longer, more detailed answer:
> 
> The original idea was to provide a hint to busspace that this driver
> only ever used a certain subset of the available mappings so it should
> assume that subset and aggressively optimize the code.  The assumption
> was that one could know at compile time that one would never use
> certain features.  In an i386 centric world, this made good sense,
> especially since the bus_space_* macros expanded to inb or whatever
> and nothing else (compiler technology innovations may have changed
> this over time).
> 
> You are correct in that other architectures might have more than two
> kinds of address space, or might have other complicating factors.  pc98
> has, as you know, an indirect vector because devices on the
> motherboard and cbus are rarely mapped at contiguous locations due to
> the dual 8-bit bus nature of the internal buses.  In that case it
> makes no sense to do any optimization at all, and these files should
> be empty for such an implementation.
> 
> Alpha, sparc64, powerpc and arm all have much more complex bus space
> implementations due to their greater intra-architectural differences,
> as well as their large difference with i386.  To similarly optimize
> these architectures, one would need additional MD info to know how to
> inline things.  None of them have chosen to support this level of
> optimization.  It is unclear to me how big a win such optimization
> would be, even on the slower CPUs some of these platforms support.
> 
> The lowest end of FreeBSD/i386 these days[*] is likely a Pentium II
> running at 300MHz or a soekris box.  The 4510 box is still only
> 166MHz.  However, the only device it has that is likely to
> benefit from this is sio.  Well, in extreme cases, one could make the
> case for any pci card or pccard, but I think that's too extreme to
> consider.  Since the soekris box has only one free serial port, we
> need only keep up with a ppp connection on that serial port, so I'm
> pretty sure we're OK.
> 
> A number of drivers include only one of these two include files:
> 	ti, bfe, trm, stg, scd, aac, kbd, ie, idt, hfa, gfb, fb, dpt,
> 	cnw, aic, aha, ahb, adv
> and some of the mii phy drivers, plus some other trivial uses.
> 
> The only ones on the list that stand out are bfe and aac (the dpt
> optimization is only for EISA cards, and only for the EISA specific
> portions of the driver).  I do not know how much this optimization
> helps these devices, but they are the only ones that I see might be
> affected.  Simple benchmarks should be easy enough to do on aac and
> bfe.
> 
> Warner
> 
> [*] Yes, I know that slower CPUs are supported, and do still perform
> decently if you have enough memory.  This is an arbitrary cutoff for
> the cost-benefit analysis.

This kind of makes me sad.  I don't see how this was harming anything,
it just wasn't documented so people didn't know how to use it.  If it
didn't apply outside i386 and amd64, fine, just don't implement it for
those platforms.  This optimization might have seemed trivial, but it's
all of the little trivial optimizations that add up to make a nice
system.  I'm guessing that Justin only put effort into this originally
because he did see a benefit; discounting it without doing any testing
of your own is a bit disingenuous.

Scott
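
For readers unfamiliar with the mechanism under discussion, here is a
rough userland sketch of the compile-time hint Warner describes: a
driver declares which address spaces it uses, and the accessor compiles
down to a single access method with no runtime check of the tag.  This
is not the actual i386 header.  Every name here (SKETCH_BUS_PIO,
SKETCH_BUS_MEMIO, sketch_read_1, sketch_inb, the simulated port array)
is hypothetical and stands in for the real macros and for inb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for "this driver included bus_memio.h" and
 * "this driver included bus_pio.h".  This sketch pretends the driver
 * only ever touches memory-mapped registers.
 */
#define SKETCH_BUS_MEMIO 1
/* #define SKETCH_BUS_PIO 1 */

/* No hint given: fall back to the general (runtime-checked) code. */
#if !defined(SKETCH_BUS_MEMIO) && !defined(SKETCH_BUS_PIO)
#define SKETCH_BUS_MEMIO 1
#define SKETCH_BUS_PIO 1
#endif

enum sketch_tag { SKETCH_SPACE_IO = 0, SKETCH_SPACE_MEM = 1 };

/* Simulated I/O port space; a real kernel would execute an inb here. */
#define SKETCH_IO_PORTS 16
static uint8_t sketch_io_ports[SKETCH_IO_PORTS];

static inline uint8_t
sketch_inb(size_t port)
{
	assert(port < SKETCH_IO_PORTS);
	return (sketch_io_ports[port]);
}

/*
 * Read one byte.  With only one hint defined, the preprocessor removes
 * the tag comparison and the unused access path entirely.
 */
static inline uint8_t
sketch_read_1(enum sketch_tag tag, uintptr_t handle, size_t offset)
{
	(void)tag;	/* unused when only one hint is defined */
#if defined(SKETCH_BUS_PIO)
#if defined(SKETCH_BUS_MEMIO)
	if (tag == SKETCH_SPACE_IO)
#endif
		return (sketch_inb((size_t)handle + offset));
#endif
#if defined(SKETCH_BUS_MEMIO)
	return (*(volatile uint8_t *)(handle + offset));
#endif
}
```

Calling sketch_read_1(SKETCH_SPACE_MEM, (uintptr_t)regs, 2) on a small
byte array regs returns regs[2].  Note the cost of the hint: with only
SKETCH_BUS_MEMIO defined the tag is ignored, so the driver must really
never use port I/O.  That is the compile-time assumption the quoted
message refers to.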


