From owner-svn-src-all@FreeBSD.ORG Mon May 5 22:12:17 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 10837B15; Mon, 5 May 2014 22:12:17 +0000 (UTC) Received: from theravensnest.org (theraven.freebsd.your.org [216.14.102.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cloud.theravensnest.org", Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id C86B3B80; Mon, 5 May 2014 22:12:16 +0000 (UTC) Received: from [192.168.0.7] (cpc14-cmbg15-2-0-cust307.5-4.cable.virginm.net [82.26.1.52]) (authenticated bits=0) by theravensnest.org (8.14.7/8.14.7) with ESMTP id s45MCBr3035447 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Mon, 5 May 2014 22:12:13 GMT (envelope-from theraven@FreeBSD.org) Content-Type: text/plain; charset=koi8-r Mime-Version: 1.0 (Mac OS X Mail 7.2 \(1874\)) Subject: Re: svn commit: r265367 - head/lib/libc/regex From: David Chisnall In-Reply-To: <536807D8.9000005@freebsd.org> Date: Mon, 5 May 2014 23:12:05 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <9349EAA9-F92C-4170-A1C0-2200FD490E5F@FreeBSD.org> References: <201405051641.s45GfFje086423@svn.freebsd.org> <5367CD77.40909@freebsd.org> <5367EB54.1080109@FreeBSD.org> <3C7CFFB7-5C84-4AC1-9A81-C718D184E87B@FreeBSD.org> <7D7A417E-17C3-4001-8E79-0B57636A70E1@gmail.com> <536807D8.9000005@freebsd.org> To: Andrey Chernov X-Mailer: Apple Mail (2.1874) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, Pedro Giffuni , src-committers , Warner Losh X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 May 2014 22:12:17 -0000 On 5 May 2014, at 22:51, Andrey Chernov wrote: > For standard malloc/realloc interface it is up to the caller to check > n*size not overflows. You must trust caller already does such check. Do a search of the CVE database sometime to see how well placed that = trust generally is. Or even look at the code in question, where none of = the realloc() or malloc() calls does overflow checking. > Using calloc() to enforce it instead of caller is semantically wrong, Relying on a standard function to behave according to the standard is = semantically wrong? > and especially strange when the caller is standard C library under = your > control. I don't follow this. If libc can't rely on standards conformance from = itself then other code stands no chance. > It was unclear what type of ckecking you mean initially You mean when I said 'the overflow-checking behaviour of calloc'? I'm = sorry, but I'm not sure how I could have made that clearer. > and confirm my > statement that such code is hard to understand. I disagree. Favouring calloc() over malloc() unless profiling indicates = that calloc() is a bottleneck has been recommended practice for a *very* = long time and I'm honestly surprised to encounter C programmers who have = not come across the advice. =20 > Even if it is for > arithmetic overflow, it is still semantically incorrect, see my other > answer. Your other answer did not say *why* you think it's 'semantically = incorrect'. The standard requires calloc() to do overflow checking and = that is the reason for its use in the overwhelming number of cases. > Main purpose of calloc is to zero memory, not to check its > argument, so its argument checking is side effect. It should be > implemented by the caller (as I already answer) and not by the price = of > zeroing. It is unfortunate that the zeroing and the overflow checking were = conflated in the standard, but that certainly doesn't mean that it is = the only purpose of calloc. =20 If you want to argue that the price of zeroing is too high, then I would = like to see some profiling data to back it up. Between the cost of = performing an allocation and the cost of doing a regex search, I'd be = surprised if the cost of a bzero() were not in the noise. To offset = this, you'd be increasing i-cache usage at every malloc() call site by = wrapping it in an overflow check (if you want the code to be *correct* = as well as fast), which is likely to be a bigger hit. =20 The reason that calloc() does zeroing in the first place rather than = just having malloc() followed by memset() / bzero() is that the memory = that malloc() gets from the kernel is already zero'd, and so the 'price' = for the zeroing is often nothing. David P.S. A quick look at Coverity shows 4 other bugs in this file, one of = which looks like it might actually be serious. =20=