Date:      Mon, 11 Sep 2000 09:51:44 +0100 (BST)
From:      Doug Rabson <dfr@nlsystems.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        John Baldwin <jhb@pike.osd.bsdi.com>, Doug Rabson <dfr@FreeBSD.org>, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/sys ktr.h
Message-ID:  <Pine.BSF.4.21.0009110944090.49106-100000@salmon.nlsystems.com>
In-Reply-To: <Pine.BSF.4.21.0009111415320.450-100000@besplex.bde.org>

On Mon, 11 Sep 2000, Bruce Evans wrote:

> On Mon, 11 Sep 2000, Doug Rabson wrote:
> 
> > On Sun, 10 Sep 2000, John Baldwin wrote:
> > 
> > > Doug Rabson wrote:
> > > > dfr         2000/09/10 07:36:55 PDT
> > > > 
> > > >   Modified files:
> > > >     sys/sys              ktr.h 
> > > >   Log:
> > > >   Use '&' instead of '%' to calculate the new value for ktr_idx.
> > > 
> > > Err, it isn't guaranteed that KTR_ENTRIES will be a power of 2.  If
> > > it is, gcc should be optimizing the mod to a binary AND, though.
> > 
> > As I mentioned to Brian, it can only reduce to a binary AND if the numbers
> > are known to be non-negative.
> 
> gcc reduces divisions and moduluses by a (constant) power of 2 to shifts
> and masks fairly well even for signed operations (except possibly for
> quad operations).

Not this one. With the modulus, the code sequence was something like:

	addl	t1, 0x1, t1
	lda	t0, 1023(t1)
	cmovge	t1, t1, t0
	sra	t0, 0xa, t0
	sll	t0, 0xa, t0
	subl	t1, t0, t1

and with a logical AND:

	lda	t4, 1023(zero)
	addq	t0, 0x1, t0
	and	t0, t4, t0

Also, the compiler managed to mix several other instructions into this
sequence, which probably improved pipelining significantly.
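
To make the difference concrete, here is a minimal C sketch of the two
forms. The names SKETCH_ENTRIES, next_mod and next_and are hypothetical
stand-ins, not the actual ktr.h macros, and it assumes a two's-complement
int with C99 truncating division. With a signed index the two expressions
only agree for non-negative values, which is why the compiler has to emit
the extra fixup for '%':

```c
#include <stdio.h>

/* Hypothetical stand-in for KTR_ENTRIES; must be a power of 2. */
#define SKETCH_ENTRIES 1024

/* Signed modulus: the result takes the sign of the dividend (C99). */
static int next_mod(int idx)
{
	return (idx + 1) % SKETCH_ENTRIES;
}

/* Mask: the result is always in [0, SKETCH_ENTRIES - 1]. */
static int next_and(int idx)
{
	return (idx + 1) & (SKETCH_ENTRIES - 1);
}

/* Print both results so the disagreement for negative input is visible. */
static void show(int idx)
{
	printf("idx=%d  mod=%d  and=%d\n", idx, next_mod(idx), next_and(idx));
}
```

Calling show(1023) gives 0 for both forms, while show(-2) gives -1 for
the modulus and 1023 for the mask.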

> 
> > Possibly changing the declaration of ktr_idx
> > to an unsigned int would be enough but practically speaking, a non-power
> > of 2 value for KTR_ENTRIES would be a disaster for performance.
> 
> I usually use `%' for modulus operations, and then cast the divisor (never
> the dividend) to a suitable unsigned type if this would help when the
> divisor is a power of 2.  Exception: gcc doesn't optimize quads very well,
> so use `&', and maybe cast the dividend to unsigned so that the result is
> unsigned.
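
A rough sketch of the cast-the-divisor idea, assuming a two's-complement
int and a power-of-2 size; CAST_ENTRIES, wrap_cast and wrap_mask are
hypothetical names, not anything in ktr.h. Casting the divisor to unsigned
makes the usual arithmetic conversions convert the dividend too, so the
whole operation is unsigned and matches the mask for every int value:

```c
/* Hypothetical stand-in for KTR_ENTRIES; must be a power of 2. */
#define CAST_ENTRIES 1024

/*
 * '%' with the divisor cast to unsigned.  The dividend is converted to
 * unsigned as well, so the compiler is free to reduce this to a mask.
 */
static unsigned wrap_cast(int x)
{
	return x % (unsigned)CAST_ENTRIES;
}

/* Explicit mask, for comparison. */
static unsigned wrap_mask(int x)
{
	return (unsigned)x & (CAST_ENTRIES - 1u);
}
```

Both functions return 1023 for x = -1, because UINT_MAX + 1 is itself a
multiple of any power of 2 that fits in an unsigned int.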
> 
> The ktr macros are inefficient, so another division wouldn't be much more
> of a disaster for performance.

The KTR_EXTEND ones are very slow but the non-extended ones are a bit more
reasonable.

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 20 8348 3944








