From owner-svn-src-head@FreeBSD.ORG Thu Jul 25 16:37:29 2013 Return-Path: Delivered-To: svn-src-head@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 99D0F63B; Thu, 25 Jul 2013 16:37:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 28B972305; Thu, 25 Jul 2013 16:37:27 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id EB777D41E74; Fri, 26 Jul 2013 02:37:15 +1000 (EST) Date: Fri, 26 Jul 2013 02:37:14 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: David Chisnall Subject: Re: svn commit: r253636 - head/sys/vm In-Reply-To: <73FCA347-5EB9-445F-A25C-D06CA137CBEE@FreeBSD.org> Message-ID: <20130726021053.O2628@besplex.bde.org> References: <201307250348.r6P3mbsG049595@svn.freebsd.org> <20130725171038.O841@besplex.bde.org> <51F0DDB0.7080102@bitfrost.no> <73FCA347-5EB9-445F-A25C-D06CA137CBEE@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=YYGEuWhf c=1 sm=1 tr=0 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=B9HLQOAVcRIA:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=AptvzqaXglMA:10 a=RLjKCVHinBivolnNwb0A:9 a=CjuIK1q_8ugA:10 Cc: src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, svn-src-head@FreeBSD.org, Tim Kientzle , Bruce Evans , Hans Petter Selasky X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Jul 2013 16:37:29 -0000 On Thu, 25 Jul 2013, David Chisnall wrote: > On 25 Jul 2013, at 09:11, Hans Petter Selasky wrote: > >> The structure looks like some size, so bzero() might run faster than memset() depending on the compiler settings. Should be profiled before changed! > > They will generate identical code for small structures with known sizes. Both clang and gcc have a simplify libcalls pass that recognises both functions and will elide the call in preference to a small set of inline stores. In the kernel, compilers are prevented from inlining memset() and many other things by -ffreestanding in CFLAGS. This rarely matters, and no one except me cares. In my version, memset() doesn't exist, but bzero() has the micro-optimization of turning itself into __builtin_memset() if the size is small (<= 32). My memcpy() has the micro-optimization of turning itself into __builtin_memcpy() unconditionally. __builtin_memcpy() then turns itself back into extern memcpy() according to the inverse of a similar size check. I think extern memcmp() still doesn't exist in the FreeBSD kernel, so simply using __builtin_memset() would give linkage errors when __builtin_memcmp() turns itself back into memcmp(). Determining whether the size is "small" is difficult. It is very CPU-dependent, and also depends on how efficient the extern function is. Compilers once used a very large limits for inlining, but changed to fairly small limits when they realized that they didn't understand memory. Extern functions are hard to optimize, since the correct optimization depends on the CPU including its cache organization. FreeBSD's x86 bcopy() and bzero() are still optimized for generic CPUs. Generic means approximately the original i386, but since these are important operations, all CPUs run the old i386 code for them not to badly (perhaps only twice as slow as possible), with newer Intel systems doing it better than most. Use of memcpy() in the kernel is the result of previous micro-optimizations. It was supposed to be used only for small fixed-size copies. This could have been done better by making bcopy() inline and calling __builtin_memcpy() in this case. The extern memcpy() should never have been used, but was needed for cases where __builtin_memcpy() turns itself into memcpy(), which happened mainly when compiling with -O0. Other uses of memcpy() were style bugs. No one cared when this optimization was turned into a style bug in all cases by -ffreestanding. > However(), memset is to be preferred in this idiom because the compiler provides better diagnostics in the case of error: > > bzero.c:9:22: warning: 'memset' call operates on objects of type 'struct foo' > while the size is based on a different type 'struct foo *' > [-Wsizeof-pointer-memaccess] > memset(f, 0, sizeof(f)); > ~ ^ > bzero.c:9:22: note: did you mean to dereference the argument to 'sizeof' (and > multiply it by the number of elements)? > memset(f, 0, sizeof(f)); > ^ > > The same line with bzero(f, sizeof(f)) generates no error. This is compiler bug with -ffreestanding. Then memset() is not special. clang allows me to declare my own memset but still prints this warning if the API is not too different: no warning for "void memset(int *, int)", but warning for "int memset(int *, int, int)". The warning seems to be based on the function name, since it is not claimed that the function is standard (even without -ffreestanding). While testing this, I mistyped an &f as &foo, where &foo is a function name. clang doesn't warn about this. clang warned when memset was my home made "int memset(int *, int, int)", because the function pointer isn't compatible with int *. But it also isn't compatible with void *. I think casting NULL to a function pointer must work even if NULL is spelled with a void *, but that is the only case where a void * object pointer can safely be converted to a function pointer. Bruce