From owner-cvs-src@FreeBSD.ORG Fri Jan 21 14:39:07 2005 Return-Path: Delivered-To: cvs-src@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id ECE8916A4CE; Fri, 21 Jan 2005 14:39:07 +0000 (GMT) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id 40E0843D1D; Fri, 21 Jan 2005 14:39:07 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])j0LEd6A6022793; Sat, 22 Jan 2005 01:39:06 +1100 Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) j0LEd3xH009024; Sat, 22 Jan 2005 01:39:04 +1100 Date: Sat, 22 Jan 2005 01:39:03 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Peter Wemm In-Reply-To: <200501210123.j0L1NP1u009492@repoman.freebsd.org> Message-ID: <20050122003920.U37564@delplex.bde.org> References: <200501210123.j0L1NP1u009492@repoman.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: cvs-src@freebsd.org cc: src-committers@freebsd.org cc: cvs-all@freebsd.org Subject: Re: cvs commit: src/sys/nfs nfs_vfsops.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Jan 2005 14:39:08 -0000 On Fri, 21 Jan 2005, Peter Wemm wrote: > peter 2005-01-21 01:23:25 UTC > > FreeBSD src repository > > Modified files: (Branch: RELENG_4) > sys/nfs nfs_vfsops.c > Log: > Yet another pass on trying to fix the nfs statfs large-fs blocksize scaler. > Casting the result of the 64 bit division to 32 bits (thus discarding the > upper 32 bits) and then looking at the truncated result to try and figure > out if the untruncated result would fit in 32 bits was utterly useless. > > I am still not sure that it is right, but it has a chance of working now. > I'm not at all sure about the sign handling. NFSv3 only reports positive > values here, but correctness of handling the 63/64 bit signs on nfs > volumes is not a problem we'll likely have to deal with for some time. I > think the "most correct" test is for an unsigned division testing for > exceeding LONG_MAX, since we should never end up with a negative number to > compare against LONG_MIN. I sent many mails with better sign handling and many other details, including ones about the panic implemented in this commit. The panic here is just a minor variation of the one in rev.1.133 which was "fixed" in rev.1.135 using bogus casts to annull buggy code. Now the modified casts have no effect so the code is active again. It was reactivated in a slightly different way in -current a month or two ago, but has been backed out there, thus reducing the main bug in this area in -current to wrong values for negative available space (e.g., 2^55-1 instead of -1 for -1 available blocks). > values here, but correctness of handling the 63/64 bit signs on nfs > volumes is not a problem we'll likely have to deal with for some time. I > think the "most correct" test is for an unsigned division testing for > exceeding LONG_MAX, since we should never end up with a negative number to > compare against LONG_MIN. No, negative numbers happen every day for fs_bavail with non-broken servers. The v3 server has been broken in -current but remains unbroken in RELENG_4. % Index: nfs_vfsops.c % =================================================================== % RCS file: /home/ncvs/src/sys/nfs/Attic/nfs_vfsops.c,v % retrieving revision 1.91.2.8 % retrieving revision 1.91.2.9 % diff -u -2 -r1.91.2.8 -r1.91.2.9 % --- nfs_vfsops.c 10 Jul 2004 10:28:08 -0000 1.91.2.8 % +++ nfs_vfsops.c 21 Jan 2005 01:23:25 -0000 1.91.2.9 % @@ -294,16 +294,16 @@ % sbp->f_bsize = bsize; % tquad = fxdr_hyper(&sfp->sf_tbytes); % - if (((long)(tquad / bsize) > LONG_MAX) || % - ((long)(tquad / bsize) < LONG_MIN)) % + if (((quad_t)(tquad / bsize) > LONG_MAX) || % + ((quad_t)(tquad / bsize) < LONG_MIN)) % continue; Casting to quad_t has no effect, since bsize begins at 512 so dividing by it always reduces tquad far below QUAD_MAX. The result of the division alone is always nonnegative so the comparison with LONG_MIN has no effect either. Removing the null code and obfuscatory parentheses gives: if (tquad / bsize > LONG_MAX) continue; This is the same as in rev.1.133 except for it not having excessive parentheses. A panic can occur due to continuing here, but is unlikely. If tquad is nearly 2^64, then there is no value of bsize that can reduce the block count to less than LONG_MAX on machines with 32-bit longs. Then bsize wants to be doubled endlessly, but it first overflows on doubling of 2^30 and then causes a panic for division by 0 on doubling of 2^31. But physical file systems cannot have sizes anywhere near 2^64 bytes yet. My version avoids the panic using explicit bounds checking for bsize. % sbp->f_blocks = tquad / bsize; % tquad = fxdr_hyper(&sfp->sf_fbytes); % - if (((long)(tquad / bsize) > LONG_MAX) || % - ((long)(tquad / bsize) < LONG_MIN)) % + if (((quad_t)(tquad / bsize) > LONG_MAX) || % + ((quad_t)(tquad / bsize) < LONG_MIN)) % continue; As above. % sbp->f_bfree = tquad / bsize; % tquad = fxdr_hyper(&sfp->sf_abytes); % - if (((long)(tquad / bsize) > LONG_MAX) || % - ((long)(tquad / bsize) < LONG_MIN)) % + if (((quad_t)(tquad / bsize) > LONG_MAX) || % + ((quad_t)(tquad / bsize) < LONG_MIN)) % continue; Now the panic can very easily occur. Non-broken servers pass negative available space counts mod 2^64. Then tquad is 2^64-epsilon, so tquad / bsize is initially 2^55-epsilon1 and never smaller than 2^33-epsilon2. So on machines with 32-bit longs, there is no possible value of bsize which can reduce 2^64-epsilon to less that LONG_MAX, and doubling bsize causes a panic as above. Fixing the bugs consists mainly of removing parentheses: if ((quad_t)tquad / bsize > LONG_MAX || (quad_t)tquad / bsize < LONG_MIN) continue; % sbp->f_bavail = tquad / bsize; tquad must also be converted to a signed type before this division too. In my version, tquad is a quad_t and holds the divided value so that the division and its ugly cast doesn't have to be repeated for the range checking, but this is not quite right since tquad really should be an unsigned type except for handling f_bavail. Bruce