Date:      Mon, 25 Jun 2001 05:33:38 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Peter Wemm <peter@wemm.org>, Mikhail Teterin <mi@aldan.algebra.com>, jlemon@FreeBSD.org, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: kernel size w/ optimized bzero() & patch set (was Re: Inline optimized bzero (was Re: cvs commit: src/sys/netinettcp_subr.c)) 
Message-ID:  <200106251233.f5PCXc306427@earth.backplane.com>
References:   <Pine.BSF.4.21.0106252337370.7918-100000@besplex.bde.org>


:#define	bzero(p, n) ({						\
:	if (__builtin_constant_p(n) && (n) <= X)		\
:		__builtin_memset((p), 0, (n));			\
:	else							\
:		(bzero)((p), (n));				\
:})
:
:for X = 0, 4, 8, 12, 16, 20, 32 and "infinity", the kernel sizes were:
:
:   text	   data	    bss	    dec	    hex	filename
:1962434	 151436	 349824	2463694	 2597ce	kernel.4
:1962442	 151436	 349824	2463702	 2597d6	kernel.8
:1962446	 151436	 349824	2463706	 2597da	kernel.12
:1962466	 151436	 349824	2463726	 2597ee	kernel.0
:1962802	 151436	 349824	2464062	 25993e	kernel.16
:1962866	 151436	 349824	2464126	 25997e	kernel.20
:1963538	 151436	 349824	2464798	 259c1e	kernel.32
:1964098	 151436	 349824	2465358	 259e4e	kernel.infinity
:
:Summary: it's hard for the inline version to be smaller; even when it
:only needs to do one store-immediate operation, the kernel is only 32
:bytes smaller than the one using function calls which have to push
:2 args, do the call, and clean up.  This is presumably due to increased
:register pressure for the inlined versions.
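
    For anyone who wants to repeat the quoted experiment outside the
    kernel, here is a minimal userland sketch of the same threshold macro.
    It assumes GCC or Clang, since statement expressions and
    __builtin_constant_p() are GNU extensions.  The names demo_bzero,
    DEMO_INLINE_MAX (standing in for X) and demo_calls are hypothetical
    and do not come from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef DEMO_INLINE_MAX
#define DEMO_INLINE_MAX	16	/* hypothetical stand-in for X; override with -D */
#endif

static unsigned long demo_calls;	/* counts out-of-line calls */

/* Out-of-line version.  Defined before the macro so its own name is not expanded. */
static void
demo_bzero(void *p, size_t n)
{
	assert(p != NULL);
	demo_calls++;
	memset(p, 0, n);
}

/*
 * Small compile-time-constant sizes go to the compiler builtin; everything
 * else calls the function.  Writing (demo_bzero) in parentheses suppresses
 * function-like macro expansion, so the call reaches the real function.
 * Note that n is evaluated more than once.
 */
#define demo_bzero(p, n) ({						\
	if (__builtin_constant_p(n) && (n) <= DEMO_INLINE_MAX)		\
		__builtin_memset((p), 0, (n));				\
	else								\
		(demo_bzero)((p), (n));					\
})
```

    Building the same source with different -DDEMO_INLINE_MAX values and
    running size(1) on each result is one way to produce a table like the
    one quoted above.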

    Yah.  I adjusted my code so the 6*sizeof(int) and 7*sizeof(int) cases
    did a loop rather than inline stores and the kernel size actually went up
    by 40 bytes.

   text    data     bss     dec     hex filename
1850705  159392  144536 2154633  20e089 kernel		Normal bzero
1850833  159396  144536 2154765  20e10d kernel		Most recent patch
1850873  159396  144536 2154805  20e135 kernel		adjusted cases

    Since the loop will normally be smaller than 7 stores, my guess is
    that register and alignment pressures are to blame for the increased
    size.

    And my earlier assessment was wrong.  800+ bzero()'s, a difference of
    128 bytes in the kernel, results in an average of 0.16 additional
    instruction bytes per bzero, not 1 byte per bzero.  I think this is
    quite acceptable considering the positive effect.
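
    The arithmetic behind that figure, as a small sketch.  The helper name
    sketch_bytes_per_site is hypothetical; the inputs are the text sizes
    of the "Normal bzero" and "Most recent patch" rows above, with 800
    taken as the round lower bound of "800+" call sites.

```c
#include <assert.h>

/*
 * Average growth of the text segment per call site, in bytes.
 */
static double
sketch_bytes_per_site(long text_before, long text_after, long call_sites)
{
	assert(call_sites > 0);
	return ((double)(text_after - text_before) / (double)call_sites);
}
```

    Calling sketch_bytes_per_site(1850705, 1850833, 800) divides the
    128-byte difference by 800 sites.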

    In any case, I think I'll stick to my most recent patch.  It seems to
    have the best overall effect with its ability to do up to 7 inline
    stores (28 bytes) and loop-zero up to 64 bytes with only 0.16 additional
    bytes per bzero on average.  An inline can't get much better than that.
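
    For illustration only, a rough userland sketch of that three-tier
    shape: straight-line stores up to 28 bytes, a word loop up to 64
    bytes, and an ordinary call otherwise.  This is not the patch itself.
    All sketch_* names are hypothetical, memset() stands in for the
    out-of-line bzero(), and the code assumes GCC or Clang, 32-bit words
    and a 4-byte-aligned buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: a 32-bit word type that may alias any object. */
typedef uint32_t __attribute__((__may_alias__)) sketch_word_t;

#define SKETCH_STORE_MAX	(7 * sizeof(sketch_word_t))	/* 28 bytes */
#define SKETCH_LOOP_MAX		64				/* bytes */

/* Straight-line stores for 0..7 words; the cases fall through on purpose. */
static inline void
sketch_zero_stores(sketch_word_t *w, size_t nwords)
{
	assert(nwords <= 7);
	switch (nwords) {
	case 7: w[6] = 0;	/* FALLTHROUGH */
	case 6: w[5] = 0;	/* FALLTHROUGH */
	case 5: w[4] = 0;	/* FALLTHROUGH */
	case 4: w[3] = 0;	/* FALLTHROUGH */
	case 3: w[2] = 0;	/* FALLTHROUGH */
	case 2: w[1] = 0;	/* FALLTHROUGH */
	case 1: w[0] = 0;	/* FALLTHROUGH */
	default: break;
	}
}

/* Word-at-a-time loop for the mid-sized cases. */
static inline void
sketch_zero_loop(sketch_word_t *w, size_t nwords)
{
	while (nwords-- > 0)
		*w++ = 0;
}

/*
 * Dispatch on a compile-time-constant size that is a whole number of words.
 * Anything else goes to memset().  Note that n is evaluated more than once.
 */
#define sketch_bzero(p, n) do {						\
	if (__builtin_constant_p(n) &&					\
	    (n) % sizeof(sketch_word_t) == 0 &&				\
	    (n) <= SKETCH_LOOP_MAX) {					\
		sketch_word_t *sketch_w = (sketch_word_t *)(void *)(p);	\
		if ((n) <= SKETCH_STORE_MAX)				\
			sketch_zero_stores(sketch_w,			\
			    (n) / sizeof(sketch_word_t));		\
		else							\
			sketch_zero_loop(sketch_w,			\
			    (n) / sizeof(sketch_word_t));		\
	} else								\
		memset((p), 0, (n));					\
} while (0)
```

    With a constant size the compiler can discard the branches that do
    not apply, so only one of the three paths should remain at each call
    site when optimization is enabled.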

						-Matt

