Date: Thu, 27 Jun 2013 12:25:38 -0700
From: Steve Kargl
To: enh
Cc: freebsd-numerics@freebsd.org
Subject: Re: sincos?
Message-ID: <20130627192538.GA41760@troutmask.apl.washington.edu>
References: <20130627013502.GA37295@troutmask.apl.washington.edu>
List-Id: "Discussions of high quality implementation of libm functions."

On Thu, Jun 27, 2013 at 09:12:04AM -0700, enh wrote:
> well, that was Intel and the code's not been accepted, but yes --- that's
> another reason for me not to accept their patch!
>
> Intel claimed "The reason for this fix [beside workaround for O0 switch] -
> it helps to remove some sin[f]+cos[f] code duplication (which is the whole
> reason for introduction of such function at all), which results in
> 1.58-1.81x performance gain on intervals |x|<100." i've not seen their
> benchmark code, so i don't know what their distribution of values was, and
> i don't understand why they covered a range as large as +/- 100.
>

The code duplication that is removed is the argument reduction that
sin and cos each perform for |x| > pi/4.  If you have

   void
   sincos(double x, double *s, double *c)
   {
       *s = sin(x);
       *c = cos(x);
   }

then both sin and cos call rem_pio2 (or whatever the function is called)
if |x| > pi/4.  The code in question removes one of the two argument
reduction calls, and so you get a speed improvement of a factor of
1.5 to 2.

As Bruce noted, he would like to see some additional optimizations
for -2*pi < x < 2*pi (may have the range incorrect here) integrated
into sin, cos, sinl, and cosl before we worry about sincos[fl].  I hope
to get to those in August, but coshl, sinhl, and tanhl are on my plate.

-- 
Steve