From owner-freebsd-arch@FreeBSD.ORG Mon Nov 3 05:03:47 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 37D2716A4CF; Mon, 3 Nov 2003 05:03:47 -0800 (PST) Received: from silver.tanimura.dyndns.org (IP1A0600.kng.mesh.ad.jp [211.13.103.220]) by mx1.FreeBSD.org (Postfix) with ESMTP id D5C1A43FAF; Mon, 3 Nov 2003 05:03:43 -0800 (PST) (envelope-from tanimura@tanimura.dyndns.org) Received: from silver.tanimura.dyndns.org (localhost [127.0.0.1]) with ESMTP id hA3D3TQK038461 ; Mon, 3 Nov 2003 22:03:29 +0900 (JST) Message-Id: <200311031303.hA3D3TQK038461@silver.tanimura.dyndns.org> Date: Mon, 03 Nov 2003 22:03:29 +0900 From: Seigo Tanimura To: current@FreeBSD.org, arch@FreeBSD.org In-Reply-To: <200310221021.h9MALkUM023905@urban> References: <200310220319.h9M3JGI1005225@urban> <200310221021.h9MALkUM023905@urban> User-Agent: Wanderlust/2.10.0 (Venus) SEMI/1.14.4 (Hosorogi) FLIM/1.14.4 (=?ISO-8859-1?Q?Kashiharajing=FE-mae?=) APEL/10.4 MULE XEmacs/21.1 (patch 14) (Cuyahoga Valley) (i386--freebsd) Organization: My Home MIME-Version: 1.0 (generated by SEMI 1.14.4 - "Hosorogi") Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, hits=0.0 required=8.0 tests=none autolearn=no version=2.60 X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on silver.tanimura.dyndns.org cc: Seigo Tanimura Subject: Re: Who should set the priority of a select(2)ing thread being waken up? 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 03 Nov 2003 13:03:47 -0000

[posted to -current as well, because there were no replies in -arch]

On Wed, 22 Oct 2003 19:21:46 +0900, Seigo Tanimura said:

tanimura> In the good old days, only a socket and a pipe were the major file
tanimura> descriptors being select(2)ed. As select(2) was just a socket
tanimura> operation, it was sufficient to set the priority of the select(2)ing
tanimura> process to PSOCK(*1), I suppose.

tanimura> Nowadays, quite a few drivers support select(2) as well, including
tanimura> sound, usb, scsi controllers, and so on. I am not convinced that a
tanimura> process should select(2) those devices at PSOCK as we do for a socket.

tanimura> Suppose that a process select(2)s for a pcm device and a socket at
tanimura> once. If the process is woken up by the pcm driver at PSOCK, another
tanimura> process at a better priority may preempt the first one, which can
tanimura> result in dropping some pcm data.

tanimura> Maybe it would be better if the caller of selwakeup() could determine
tanimura> the priority of a process or a thread. That would let us raise the
tanimura> priority to PRIBIO if pcm data was ready, while the priority would
tanimura> stay at PSOCK if the socket was ready.

tanimura> (*1) I broke that in 5-CURRENT when I modified select(2) and poll(2)
tanimura> to use a condition variable.

tanimura> The attached patch implements selwakeuppri(), which lets you set the
tanimura> priority of a thread being woken up. Also in the patch is a small
tanimura> test code to use selwakeuppri() in pcm(4).

(patch snipped)

The updated patch at:

http://people.freebsd.org/~tanimura/selwakeuppri.diff.gz

converts all selwakeup()s into selwakeuppri()s with appropriate
priorities. Old selwakeup() is left, but I may axe it.
Any objections if I commit the patch in a week? -- Seigo Tanimura From owner-freebsd-arch@FreeBSD.ORG Mon Nov 3 08:30:17 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 052D116A4CF for ; Mon, 3 Nov 2003 08:30:17 -0800 (PST) Received: from mail.speakeasy.net (mail6.speakeasy.net [216.254.0.206]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6D32344003 for ; Mon, 3 Nov 2003 08:30:07 -0800 (PST) (envelope-from jhb@FreeBSD.org) Received: (qmail 25313 invoked from network); 3 Nov 2003 16:30:06 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender )encrypted SMTP for ; 3 Nov 2003 16:30:06 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.9/8.12.9) with ESMTP id hA3GTgce063379; Mon, 3 Nov 2003 11:29:43 -0500 (EST) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.4 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <20031101190722.M10222-100000@mail.chesapeake.net> Date: Mon, 03 Nov 2003 11:29:43 -0500 (EST) From: John Baldwin To: Jeff Roberson X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp) cc: arch@freebsd.org Subject: Re: HEADSUP: New i386 interrupt and SMP code.. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Nov 2003 16:30:17 -0000 On 02-Nov-2003 Jeff Roberson wrote: > > On Thu, 30 Oct 2003, John Baldwin wrote: > >> Coming very soon to a CVS tree near you are some very large changes to >> the i386 interrupt and SMP code. New features include: >> >> - Runtime selection of using the I/O APICs or the AT PICs to route >> interrupts. 
>> - I/O APICs can be used in a UP kernel or on a UP system that >> supplies either an MP Table or ACPI APIC Table. >> - An SMP kernel can run on a UP machine. This means that SMP >> can now be enabled in GENERIC and the SMP kernel config can die. > > The lock prefix is extremely expensive on the P4 systems that I have > measured. It makes lock cmpxchg 150 cycles vs 12. On athlon this is not > such a big deal since it goes to 25 cycles from 12. We should measure the > impact of compiling in the lock prefix on UP P4 systems before making this > the default. > > Otherwise, this all sounds good. Note that one can always compile a custom kernel if one needs it for a specific application, but that this will increase the amount of out-of-box support for i386. This has also been a requested feature for quite a while now. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ From owner-freebsd-arch@FreeBSD.ORG Tue Nov 4 17:37:23 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5718B16A4D0 for ; Tue, 4 Nov 2003 17:37:23 -0800 (PST) Received: from smtp.omnis.com (smtp.omnis.com [216.239.128.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id 65D3043FF3 for ; Tue, 4 Nov 2003 17:37:22 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com (corp-2.ipinc.com [199.245.188.2]) by smtp-relay.omnis.com (Postfix) with ESMTP id B4CB05B6FC for ; Tue, 4 Nov 2003 17:31:28 -0800 (PST) From: Wes Peters Organization: Softweyr.com To: arch@freebsd.org Date: Tue, 4 Nov 2003 17:37:20 -0800 User-Agent: KMail/1.5.2 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200311041737.20467.wes@softweyr.com> Subject: newfs and mount vs. 
half-baked disks
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 05 Nov 2003 01:37:23 -0000

Upon switching to FreeBSD 5.x and disk-based hardware at ${DAYJOB}, we
found a little problem. We have a large data area on our disk that
holds transient data; when the system boots, if this filesystem isn't
clean, we just newfs and mount the clean new filesystem.

The problem came when some wiseacre yanked the power cord in the middle
of newfs'ing this 40GB filesystem. When the system booted, it noted
the filesystem as clean, mounted it, and promptly panicked on the first
write access. Oops.

I emailed Kirk about this state of affairs and he confirmed that newfs
was developed with operator intervention in mind. He suggested
employing one of the unused flags in the filesystem header as a
'consistent' flag, setting it to 'not consistent' at the beginning of
newfs, and then updating to 'is consistent' at the end. The
performance hit in updating all superblock copies at the end is small
but noticeable (< 1s on a rather slow 6GB filesystem).

The attached patch does this, plus a bit more. The fs_state field is
used to signify that the filesystem has been completely written. The
mount vfsop has been modified to require this field to be zero. Newfs
has been modified to initially set this field to a non-zero value until
the last phase of superblock updates, when it is again cleared to zero.

The patch attached also adds testing code to newfs to force it to
abandon the newfs operation in various places, to facilitate testing.
This would obviously be committed in a separate commit, if at all.

Questions:

I'd like to commit the safer newfs and vfs support before 5.2. Anyone
have heartburn with that? If so, would it be acceptable to make the
extra I/O enabled by a command line option? (I.e.
skipping the first sbwrite and calling the second non-recursive, along with NOT muddying the fs_state and fs_clean flags.) Should extra debugging code like this be committed? Code like this would make it much easier to wrap a regression test around newfs, at the cost of introducing non-operational command line arguments into utilities. If anyone has suggestions on how to do this, please share. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Tue Nov 4 17:57:12 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 96F2016A4D7 for ; Tue, 4 Nov 2003 17:57:11 -0800 (PST) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id DA01743FD7 for ; Tue, 4 Nov 2003 17:57:10 -0800 (PST) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.9/8.12.9) id hA51vA23002956; Tue, 4 Nov 2003 19:57:10 -0600 (CST) (envelope-from dan) Date: Tue, 4 Nov 2003 19:57:10 -0600 From: Dan Nelson To: Wes Peters Message-ID: <20031105015709.GC28915@dan.emsphone.com> References: <200311041737.20467.wes@softweyr.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200311041737.20467.wes@softweyr.com> X-OS: FreeBSD 5.1-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.4i cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Nov 2003 01:57:12 -0000 In the last episode (Nov 04), Wes Peters said: > I emailed Kirk about this state of affairs and he confirmed that > newfs was developed with operator intervention in mind. 
He suggested > employing one of the unused flags in the filesystem header as a > 'consistent' flag, setting it to 'not consistent' at the beginning of > newfs, and then updating to 'is consistent' at the end. The > performance hit in updating all superblock copies at the end is small > but noticable (< 1s on a rather slow 6GB filesystem). Would writing a block of zeros to the first (or first n) superblock, newfs'ing, then rewriting the correct data do the same thing without affecting the filesystem itself? I'm thinking about 4.x and cross-OS portability here. -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-arch@FreeBSD.ORG Tue Nov 4 18:28:16 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7550916A4CE for ; Tue, 4 Nov 2003 18:28:16 -0800 (PST) Received: from smtp4.server.rpi.edu (smtp4.server.rpi.edu [128.113.2.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8B60343FB1 for ; Tue, 4 Nov 2003 18:28:15 -0800 (PST) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp4.server.rpi.edu (8.12.10/8.12.9) with ESMTP id hA52SDaE021741; Tue, 4 Nov 2003 21:28:13 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: <200311041737.20467.wes@softweyr.com> References: <200311041737.20467.wes@softweyr.com> Date: Tue, 4 Nov 2003 21:28:12 -0500 To: Wes Peters , arch@freebsd.org From: Garance A Drosihn Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: CanIt (www . canit . ca) Subject: Re: newfs and mount vs. 
half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Nov 2003 02:28:16 -0000 At 5:37 PM -0800 11/4/03, Wes Peters wrote: >Should extra debugging code like this be committed? Code like >this would make it much easier to wrap a regression test around >newfs, at the cost of introducing non-operational command line >arguments into utilities. If anyone has suggestions on how >to do this, please share. For what it's worth, I recently added some non-operational command line arguments into newsyslog. In my case I decided to add only *1* new argument, and then all the special debugging commands are specified as strings to that argument. This means that only one option-letter is used up, and it also makes it much much less likely that someone is going to specify the special debugging/regression options by accident. 
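A minimal sketch of the single-option approach described above. It is not taken from newsyslog or from the newfs patch: one option letter (say -D, a hypothetical choice) takes a comma-separated string, and a small table maps each keyword to a debugging flag. The keyword names, flag values, and the function name parse_debug_arg() are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical debugging switches; names and values are illustrative. */
#define DBG_ABORT_AFTER_SB  0x01u  /* stop after the first superblock write */
#define DBG_ABORT_AFTER_CG  0x02u  /* stop after the cylinder groups */
#define DBG_SKIP_FINAL_SB   0x04u  /* never write the final superblock */

static const struct {
        const char *name;
        unsigned    flag;
} debug_cmds[] = {
        { "abort-after-sb", DBG_ABORT_AFTER_SB },
        { "abort-after-cg", DBG_ABORT_AFTER_CG },
        { "skip-final-sb",  DBG_SKIP_FINAL_SB },
};

/*
 * Parse the argument of the one debugging option, for example
 * "abort-after-sb,skip-final-sb".  On success return 0 and OR the
 * matching bits into *flags.  Return -1, leaving *flags untouched,
 * as soon as a keyword is not found in the table.
 */
static int
parse_debug_arg(const char *arg, unsigned *flags)
{
        const size_t ncmds = sizeof(debug_cmds) / sizeof(debug_cmds[0]);
        unsigned found = 0;

        assert(arg != NULL && flags != NULL);
        while (*arg != '\0') {
                size_t len = strcspn(arg, ",");  /* length of this keyword */
                size_t i;

                for (i = 0; i < ncmds; i++) {
                        if (strlen(debug_cmds[i].name) == len &&
                            strncmp(arg, debug_cmds[i].name, len) == 0)
                                break;
                }
                if (i == ncmds)
                        return (-1);
                found |= debug_cmds[i].flag;
                arg += len;
                if (*arg == ',')
                        arg++;
        }
        *flags |= found;
        return (0);
}
```

A caller would hand optarg for the chosen option letter to parse_debug_arg() and print the usage message on -1. Because the switches are spelled out as words behind a single letter, typing one by accident is unlikely.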
-- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 00:15:31 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BCABD16A4CE for ; Wed, 5 Nov 2003 00:15:31 -0800 (PST) Received: from cirb503493.alcatel.com.au (c211-30-75-229.belrs2.nsw.optusnet.com.au [211.30.75.229]) by mx1.FreeBSD.org (Postfix) with ESMTP id DD0EE43F85 for ; Wed, 5 Nov 2003 00:15:29 -0800 (PST) (envelope-from PeterJeremy@optushome.com.au) Received: from cirb503493.alcatel.com.au (localhost.alcatel.com.au [127.0.0.1])hA58FHf1038050; Wed, 5 Nov 2003 19:15:30 +1100 (EST) (envelope-from jeremyp@cirb503493.alcatel.com.au) Received: (from jeremyp@localhost)hA58FGqx038049; Wed, 5 Nov 2003 19:15:16 +1100 (EST) (envelope-from jeremyp) Date: Wed, 5 Nov 2003 19:15:16 +1100 From: Peter Jeremy To: Wes Peters Message-ID: <20031105081516.GA38016@cirb503493.alcatel.com.au> References: <200311041737.20467.wes@softweyr.com> <20031105015709.GC28915@dan.emsphone.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031105015709.GC28915@dan.emsphone.com> User-Agent: Mutt/1.4.1i cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Nov 2003 08:15:31 -0000 On Tue, Nov 04, 2003 at 07:57:10PM -0600, Dan Nelson wrote: >In the last episode (Nov 04), Wes Peters said: >> I emailed Kirk about this state of affairs and he confirmed that >> newfs was developed with operator intervention in mind. 
He suggested
>> employing one of the unused flags in the filesystem header as a
>> 'consistent' flag, setting it to 'not consistent' at the beginning of
>> newfs, and then updating to 'is consistent' at the end. The
>> performance hit in updating all superblock copies at the end is small
>> but noticeable (< 1s on a rather slow 6GB filesystem).
>
>Would writing a block of zeros to the first (or first n) superblock,
>newfs'ing, then rewriting the correct data do the same thing without
>affecting the filesystem itself? I'm thinking about 4.x and cross-OS
>portability here.

My suggestion would be to write a non-standard magic number to
fs_magic in the primary and first backup superblock (block 32) - I
believe these are the only ones fsck will automatically search. The
"invalid" magic number means that neither mount nor fsck will
recognize the partition. Those two blocks can be re-written at the
end - the additional time should be unnoticeable.

The remaining superblocks would appear valid, but if someone is silly
enough to manually specify an alternate superblock in an incompletely
newfs'd filesystem, they get a neat hole in their foot. (A known
non-standard magic number would also allow fsck to warn that the
filesystem was incompletely newfs'd.)

I'm surprised that this bug hasn't been noticed previously.
Peter From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 03:18:57 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 709DA16A4CE for ; Wed, 5 Nov 2003 03:18:57 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 13F0543FDD for ; Wed, 5 Nov 2003 03:18:56 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id WAA26417; Wed, 5 Nov 2003 22:18:49 +1100 Date: Wed, 5 Nov 2003 22:18:48 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Wes Peters In-Reply-To: <200311041737.20467.wes@softweyr.com> Message-ID: <20031105213950.Y1738@gamplex.bde.org> References: <200311041737.20467.wes@softweyr.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Nov 2003 11:18:57 -0000 On Tue, 4 Nov 2003, Wes Peters wrote: > Upon switching to FreeBSD 5.x and disk-based hardware at ${DAYJOB}, we > found a little problem. We have a large data area on our disk that > holds transient data; when the system boots if this filesystem isn't > clean we just newfs and mount the clean new filesystem. > > The problem came when some wiseacre yanked the powercord in the middle > of newfs'ing this 40GB filesystem. When the system booted, it noted > the filesystem as clean, mounted it, and promptly panic'ed on the first > write access. Oops. > > I emailed Kirk about this state of affairs and he confirmed that newfs > was developed with operator intervention in mind. 
He suggested > employing one of the unused flags in the filesystem header as a > 'consistent' flag, setting it to 'not consistent' at the beginning of > newfs, and then updating to 'is consistent' at the end. The > performance hit in updating all superblock copies at the end is small > but noticable (< 1s on a rather slow 6GB filesystem). There is no need to use a new flag. Just set the magic number to a value different from both FS_UFS1_MAGIC and FS_UFS2_MAGIC, e.g., to 0, until newfs is nearly finished. As an implementation detail, it might be simpler to write 0's to the whole superblock than 0 to one word in it. I think writing special values to all the superblock copies is not needed, since the kernel and utilities give up if they don't find the magic number in the first superblock (not sure of this for fsck_ffs). OTOH, newfs should start by setting the magic number to a non-ffs value in all of the (4) possible superblocks given by SBLOCKSEARCH, since the superblock for the previous file system may have been in a different place. Newfs should also set the unclean flags and any related flags until it is nearly finished, but that alone wouldn't work so well. The kernel would still permit readonly mounts, and fsck would be confused by half-baked file systems. I think permitting readonly mounts without the force flag is a bug (we only require the force flag for r/w mounts of unclean file systems). Reading from a damaged file system may be just as dangerous as writing. > The attached patch does this, plus a bit more. The fs_state field is Nothing was attached :-). I would probably prefer the version that does a bit less. 
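A minimal sketch of the magic-number approach discussed above, written against a tiny in-memory "disk" rather than the real newfs code. Everything named sim_* is hypothetical, the four offsets only stand in for the candidate locations listed by SBLOCKSEARCH, and the two-field structure is not struct fs. The point is the ordering: wipe the magic number at every candidate location first, build the file system, and write the valid superblock last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_MAGIC_VALID 0x19540119u  /* example "good" magic value */
#define SIM_MAGIC_NONE  0u           /* matches no file system */
#define SIM_NLOC        4            /* candidate superblock locations */

struct sim_sb {
        uint32_t magic;
        uint32_t ncg;                /* placeholder payload */
};

static unsigned char sim_disk[256];
static const size_t sim_sbloc[SIM_NLOC] = { 0, 64, 128, 192 };

static void
sim_write_sb(size_t off, const struct sim_sb *sb)
{
        memcpy(sim_disk + off, sb, sizeof(*sb));
}

/* Mount-side check: succeed only if some candidate holds the valid magic. */
static int
sim_mountable(void)
{
        struct sim_sb sb;
        size_t i;

        for (i = 0; i < SIM_NLOC; i++) {
                memcpy(&sb, sim_disk + sim_sbloc[i], sizeof(sb));
                if (sb.magic == SIM_MAGIC_VALID)
                        return (1);
        }
        return (0);
}

/*
 * Simulated newfs.  which selects the location of the new superblock;
 * a non-zero crash_midway stands in for the power cord being pulled.
 */
static int
sim_newfs(size_t which, int crash_midway)
{
        struct sim_sb sb = { SIM_MAGIC_NONE, 0 };
        size_t i;

        assert(which < SIM_NLOC);
        /* Step 1: invalidate every place an old superblock could live. */
        for (i = 0; i < SIM_NLOC; i++)
                sim_write_sb(sim_sbloc[i], &sb);
        /* Step 2: the long-running work would happen here. */
        if (crash_midway)
                return (-1);
        /* Step 3: only now write a superblock that mount will accept. */
        sb.magic = SIM_MAGIC_VALID;
        sb.ncg = 42;
        sim_write_sb(sim_sbloc[which], &sb);
        return (0);
}
```

In this model an interrupted run leaves nothing that passes the magic check, even when the previous file system kept its superblock at a different location than the new one would have used.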
Bruce

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 10:04:46 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from green.bikeshed.org (freefall.freebsd.org [216.136.204.21])
        by hub.freebsd.org (Postfix) with ESMTP
        id 5133816A4CF; Wed, 5 Nov 2003 10:04:46 -0800 (PST)
Received: from green.bikeshed.org (localhost [127.0.0.1])
        by green.bikeshed.org (8.12.10/8.12.9) with ESMTP id hA5I4gcR002374;
        Wed, 5 Nov 2003 13:04:42 -0500 (EST)
        (envelope-from green@green.bikeshed.org)
Received: from localhost (green@localhost)hA5I4gga002370;
        Wed, 5 Nov 2003 13:04:42 -0500 (EST)
Message-Id: <200311051804.hA5I4gga002370@green.bikeshed.org>
X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4
To: arch@FreeBSD.org
From: Brian Fundakowski Feldman
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 05 Nov 2003 13:04:42 -0500
Sender: green@green.bikeshed.org
cc: sam@FreeBSD.org
cc: imp@FreeBSD.org
Subject: __VA_ARGS__izing IEEE80211_DPRINTF[2]()
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 05 Nov 2003 18:04:46 -0000

Would it be a problem to make the following change to src/sys/net80211 so
that the debug messages aren't totally useless for systems that have more
than one card (or confusing on systems that just have one)? Obviously, it
would also involve removing the extra parentheses in each of the callers as
well.

Old:
#define IEEE80211_DPRINTF(X)    if (ieee80211_debug) printf X
#define IEEE80211_DPRINTF2(X)   if (ieee80211_debug>1) printf X

New:
#define IEEE80211_DPRINTF(...) do {                     \
        if (ieee80211_debug)                            \
                if_printf(&ic->ic_ifp, __VA_ARGS__);    \
} while (0)

The only place this wouldn't work is ieee80211_decap(), so I'd change it to
add a local "ic" variable when compiled for debugging.
There's an easy fallback for non-C99 compilers, too; it just wouldn't print
the interface:

static __inline void
IEEE80211_DPRINTF(const char *fmt, ...)
{

        if (ieee80211_debug) {
                va_list ap;

                va_start(ap, fmt);
                (void)vprintf(fmt, ap);
                va_end(ap);
        }
}

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 11:25:30 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from green.bikeshed.org (freefall.freebsd.org [216.136.204.21])
        by hub.freebsd.org (Postfix) with ESMTP
        id AD14416A4CF; Wed, 5 Nov 2003 11:25:30 -0800 (PST)
Received: from green.bikeshed.org (localhost [127.0.0.1])
        by green.bikeshed.org (8.12.10/8.12.9) with ESMTP id hA5JPTcR003096;
        Wed, 5 Nov 2003 14:25:29 -0500 (EST)
        (envelope-from green@green.bikeshed.org)
Received: from localhost (green@localhost)hA5JPT6S003092;
        Wed, 5 Nov 2003 14:25:29 -0500 (EST)
Message-Id: <200311051925.hA5JPT6S003092@green.bikeshed.org>
X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4
To: arch@FreeBSD.org
From: Brian Fundakowski Feldman
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 05 Nov 2003 14:25:29 -0500
Sender: green@green.bikeshed.org
cc: fenner@FreeBSD.org
Subject: bpf/pcap are weird
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 05 Nov 2003 19:25:31 -0000

Okay, this is goofy stuff and breaks a lot of code that otherwise makes
certain assumptions about pcap/bpf that don't work on FreeBSD. Our bpf(4)
doesn't actually care about the non-blocking fd flag, and our pcap(3)
doesn't care at all about BIOCIMMEDIATE. Why do we have BIOCIMMEDIATE?
It seems like it's what SHOULD be implemented with the non-blocking I/O flag with the exception that if using O_NONBLOCK/FIONBIO you could actually query for the status, whereas you can't query for BIOCIMMEDIATE since it's only a SET and not a GET ioctl. What's up with this? Software that knows about pcap(3) but not bpf(4) on FreeBSD can't put the interface in the mode it wants to, and the non-blocking flag is settable and gettable but doesn't do anything. Wouldn't it be better to get rid of at least one of the interfaces, and provide a way to check what mode the bpf descriptor is in, either way? -- Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ <> green@FreeBSD.org \ The Power to Serve! \ Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\ From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 13:59:34 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DD51C16A4D1; Wed, 5 Nov 2003 13:59:34 -0800 (PST) Received: from ebb.errno.com (ebb.errno.com [66.127.85.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9ECC743FCB; Wed, 5 Nov 2003 13:59:33 -0800 (PST) (envelope-from sam@errno.com) Received: from 66.127.85.91 ([66.127.85.91]) (authenticated bits=0) by ebb.errno.com (8.12.9/8.12.9) with ESMTP id hA5LxW0x086838 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Wed, 5 Nov 2003 13:59:33 -0800 (PST) (envelope-from sam@errno.com) From: Sam Leffler Organization: Errno Consulting To: Brian Fundakowski Feldman , arch@freebsd.org Date: Wed, 5 Nov 2003 14:01:27 -0800 User-Agent: KMail/1.5.3 References: <200311051804.hA5I4gga002370@green.bikeshed.org> In-Reply-To: <200311051804.hA5I4gga002370@green.bikeshed.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200311051401.27579.sam@errno.com> Subject: Re: __VA_ARGS__izing IEEE80211_DPRINTF[2]() X-BeenThere: 
freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Nov 2003 21:59:35 -0000 On Wednesday 05 November 2003 10:04 am, Brian Fundakowski Feldman wrote: > Would it be a problem to make the following change to src/sys/net80211 so > that the debug messages aren't totally useless for systems that have more > than one card (or confusing on systems that just have one)? Obviously, it > would also involve removing the extra parentheses in each of the callers as > well. > > Old: > #define IEEE80211_DPRINTF(X) if (ieee80211_debug) printf X > #define IEEE80211_DPRINTF2(X) if (ieee80211_debug>1) printf X > > New: > #define IEEE80211_DPRINTF(...) do { \ > if (ieee80211_debug) \ > if_printf(&ic->ic_ifp, __VA_ARGS__); \ > while (0) > > The only place this wouldn't work is ieee80211_decap(), so I'd change it to > add a local "ic" variable when compiled for debugging. There's an easy > fallback for non-C99 compilers, too; it just wouldn't print the interface: > > static __inline void > IEEE80211_DPRINTF(const char *fmt, ...) > { > > if (ieee80211_debug) { > va_list ap; > > va_start(ap, fmt); > (void)vprintf(fmt, ap); > va_end(ap); > } > } I can't see what your intent is from the above. If the point is to use if_printf everywhere so all the printfs have a device prepended to the message then I'm fine with that. However I think it's a bad idea to depend on local variables existing. If you're going to do it, then add an explicit argument to the macros. If you're trying to deal with debugging systems w/ multiple 802.11 cards then you probably want debugging enabled on a per-if basis which this doesn't address. Regardless, in all this remember that this code is shared with other systems so changes like this shouldn't be done lightly. 
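A minimal userland sketch of the explicit-argument variant suggested above, assuming a C99 compiler. It is not the net80211 code: the sketch_* structures and sketch_if_printf() are stand-ins invented here so the example compiles on its own, and the per-interface ic_debug field is an assumption showing how a per-interface debug flag could hang off the same argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the kernel structures; field names are illustrative. */
struct sketch_ifnet {
        const char *if_xname;           /* interface name, e.g. "wi0" */
};

struct sketch_com {
        struct sketch_ifnet ic_if;
        int                 ic_debug;   /* per-interface debug level */
};

static char sketch_log[256];            /* holds the last message printed */

/* Stand-in for if_printf(): prefix the message with the interface name. */
static int
sketch_if_printf(const struct sketch_ifnet *ifp, const char *fmt, ...)
{
        va_list ap;
        int n, m;

        n = snprintf(sketch_log, sizeof(sketch_log), "%s: ", ifp->if_xname);
        assert(n >= 0 && (size_t)n < sizeof(sketch_log));
        va_start(ap, fmt);
        m = vsnprintf(sketch_log + n, sizeof(sketch_log) - (size_t)n, fmt, ap);
        va_end(ap);
        return (n + m);
}

/*
 * The interface is passed explicitly instead of relying on a local
 * variable named "ic" being in scope.  The do/while (0) wrapper keeps
 * the macro safe inside an unbraced if/else.
 */
#define SKETCH_DPRINTF(ic, ...) do {                                    \
        if ((ic)->ic_debug)                                             \
                sketch_if_printf(&(ic)->ic_if, __VA_ARGS__);            \
} while (0)
```

Callers would then write SKETCH_DPRINTF(ic, "fmt", args) without the doubled parentheses, and a function that has no "ic" of its own passes whatever pointer it does have.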
Sam

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 14:17:19 2003
Message-Id: <200311052217.hA5MHHmx004511@green.bikeshed.org>
To: Sam Leffler
In-Reply-To: Message from Sam Leffler of "Wed, 05 Nov 2003 14:01:27 PST." <200311051401.27579.sam@errno.com>
From: "Brian F. Feldman"
Date: Wed, 05 Nov 2003 17:17:17 -0500
cc: arch@freebsd.org
Subject: Re: __VA_ARGS__izing IEEE80211_DPRINTF[2]()

Sam Leffler wrote:
> On Wednesday 05 November 2003 10:04 am, Brian Fundakowski Feldman wrote:
> > Would it be a problem to make the following change to src/sys/net80211 so
> > that the debug messages aren't totally useless for systems that have more
> > than one card (or confusing on systems that just have one)? Obviously, it
> > would also involve removing the extra parentheses in each of the callers as
> > well.
> >
> > Old:
> > #define IEEE80211_DPRINTF(X) if (ieee80211_debug) printf X
> > #define IEEE80211_DPRINTF2(X) if (ieee80211_debug>1) printf X
> >
> > New:
> > #define IEEE80211_DPRINTF(...) do { \
> >         if (ieee80211_debug) \
> >                 if_printf(&ic->ic_ifp, __VA_ARGS__); \
> > } while (0)
> >
> > The only place this wouldn't work is ieee80211_decap(), so I'd change it to
> > add a local "ic" variable when compiled for debugging. There's an easy
> > fallback for non-C99 compilers, too; it just wouldn't print the interface:
> >
> > static __inline void
> > IEEE80211_DPRINTF(const char *fmt, ...)
> > {
> >
> >         if (ieee80211_debug) {
> >                 va_list ap;
> >
> >                 va_start(ap, fmt);
> >                 (void)vprintf(fmt, ap);
> >                 va_end(ap);
> >         }
> > }
>
> I can't see what your intent is from the above. If the point is to use
> if_printf everywhere so all the printfs have a device prepended to the
> message then I'm fine with that. However I think it's a bad idea to depend
> on local variables existing. If you're going to do it, then add an explicit
> argument to the macros.
>
> If you're trying to deal with debugging systems w/ multiple 802.11 cards then
> you probably want debugging enabled on a per-if basis which this doesn't
> address.
>
> Regardless, in all this remember that this code is shared with other systems
> so changes like this shouldn't be done lightly.

My intent is for the use of if_printf to prepend the interface name, yes.
Per-interface debug flags will be trivial if we convert calls to say
DPRINTF(ic, "fmt",...). Not using the "magic" ic is fine with me -- just seems
a shame because it's pervasive in just about every function you'd ever want
to DPRINTF() from.

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 20:22:43 2003
Message-Id: <200311060422.hA64MfbS016138@green.bikeshed.org>
To: arch@FreeBSD.org
From: Brian Fundakowski Feldman
Date: Wed, 05 Nov 2003 23:22:41 -0500
Subject: my 802.11 work

For anyone who may be interested in any of the things I've mentioned so far
for net80211 and the 802.11 hardware drivers, I've got a running diff with
everything I've found important so far. Missing are changes to further reduce
gratuitous resetting in wi(4) and to reduce that problem at all in ath(4).
Please check out: http://green.homeunix.org/~green/wi-fi.diffs

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 21:04:58 2003
Message-Id: <200311060504.hA654feN034044@beastie.mckusick.com>
To: Peter Wemm
Date: Wed, 05 Nov 2003 21:04:41 -0800
From: Kirk McKusick
Reply-To: Kirk McKusick
cc: Robert Watson
cc: arch@FreeBSD.org
Subject: >0x7fffffff blocksize filesystem reporting

I have gone back and resurrected the changes for the updated statfs
structure that were discussed on arch some months ago. Because they
introduce replacement system calls for the statfs family of system
calls, it is necessary to build and install a new kernel before doing
a `make world'. Failure to do so will result in programs that use
statfs (like df) failing with `bad system call' until a new kernel
is booted. If the new kernel is booted first, it will handle both
the new and old forms of the system call.

If I have not received any objections to this change, I will check
it into -current on the evening of Monday November 10th.
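
To illustrate why the old forms of the call need a conversion step, here is a
minimal userland sketch of narrowing 64-bit counters into long-sized fields.
It is not taken from the patch: the names new_stat, old_stat, clamp_u64,
clamp_s64 and narrow_stat are hypothetical, only two fields are shown, and
the clamping policy is an assumption made for illustration (the cvtstatfs()
in the posted diff assigns the fields directly).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the new layout: 64-bit counters. */
struct new_stat {
	uint64_t f_blocks;	/* total data blocks */
	int64_t  f_bavail;	/* free blocks for non-superuser, may be negative */
};

/* Hypothetical stand-in for the old layout: long-sized counters. */
struct old_stat {
	long f_blocks;
	long f_bavail;
};

/* Clamp an unsigned 64-bit count into the range of a long. */
static long
clamp_u64(uint64_t v)
{
	return (v > (uint64_t)LONG_MAX ? LONG_MAX : (long)v);
}

/* Clamp a signed 64-bit count into the range of a long. */
static long
clamp_s64(int64_t v)
{
	if (v > (int64_t)LONG_MAX)
		return (LONG_MAX);
	if (v < (int64_t)LONG_MIN)
		return (LONG_MIN);
	return ((long)v);
}

/* Fill the old-format structure from the new one, saturating on overflow. */
static void
narrow_stat(const struct new_stat *nsp, struct old_stat *osp)
{
	assert(nsp != NULL && osp != NULL);
	osp->f_blocks = clamp_u64(nsp->f_blocks);
	osp->f_bavail = clamp_s64(nsp->f_bavail);
}
```

A caller would fill a struct new_stat, pass it to narrow_stat() together with
an empty struct old_stat, and hand the result to the old-format consumer.
Saturating is one possible policy; a value too large for the old field could
instead be reported by scaling the block size, which this sketch does not
attempt.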
Kirk McKusick =-=-=-=-=-= Index: sys/sys/mount.h =================================================================== RCS file: /usr/ncvs/src/sys/sys/mount.h,v retrieving revision 1.148 diff -c -r1.148 mount.h *** sys/sys/mount.h 1 Jul 2003 17:40:23 -0000 1.148 --- sys/sys/mount.h 5 Nov 2003 05:10:54 -0000 *************** *** 63,75 **** /* * filesystem statistics */ ! #define MFSNAMELEN 16 /* length of fs type name, including null */ ! #define MNAMELEN (88 - 2 * sizeof(long)) /* size of on/from name bufs */ /* XXX getfsstat.2 is out of date with write and read counter changes here. */ /* XXX statfs.2 is out of date with read counter changes here. */ ! struct statfs { long f_spare2; /* placeholder */ long f_bsize; /* fundamental filesystem block size */ long f_iosize; /* optimal transfer block size */ --- 63,103 ---- /* * filesystem statistics */ + #define MFSNAMELEN 16 /* length of type name including null */ + #define MNAMELEN 80 /* size of on/from name bufs */ + #define STATFS_VERSION 0x20030518 /* current version number */ + struct statfs { + u_int32_t f_version; /* structure version number */ + u_int32_t f_type; /* type of filesystem */ + u_int64_t f_flags; /* copy of mount exported flags */ + u_int64_t f_bsize; /* filesystem fragment size */ + u_int64_t f_iosize; /* optimal transfer block size */ + u_int64_t f_blocks; /* total data blocks in filesystem */ + u_int64_t f_bfree; /* free blocks in filesystem */ + int64_t f_bavail; /* free blocks avail to non-superuser */ + u_int64_t f_files; /* total file nodes in filesystem */ + int64_t f_ffree; /* free nodes avail to non-superuser */ + u_int64_t f_syncwrites; /* count of sync writes since mount */ + u_int64_t f_asyncwrites; /* count of async writes since mount */ + u_int64_t f_syncreads; /* count of sync reads since mount */ + u_int64_t f_asyncreads; /* count of async reads since mount */ + u_int64_t f_spare[10]; /* unused spare */ + u_int32_t f_namemax; /* maximum filename length */ + uid_t f_owner; /* user that 
mounted the filesystem */ + fsid_t f_fsid; /* filesystem id */ + char f_charspare[76]; /* spare string space */ + char f_fstypename[MFSNAMELEN]; /* filesystem type name */ + char f_mntfromname[MNAMELEN]; /* mounted filesystem */ + char f_mntonname[MNAMELEN]; /* directory on which mounted */ + }; ! #ifdef _KERNEL ! #define OMFSNAMELEN 16 /* length of fs type name, including null */ ! #define OMNAMELEN (88 - 2 * sizeof(long)) /* size of on/from name bufs */ /* XXX getfsstat.2 is out of date with write and read counter changes here. */ /* XXX statfs.2 is out of date with read counter changes here. */ ! struct ostatfs { long f_spare2; /* placeholder */ long f_bsize; /* fundamental filesystem block size */ long f_iosize; /* optimal transfer block size */ *************** *** 84,95 **** int f_flags; /* copy of mount exported flags */ long f_syncwrites; /* count of sync writes since mount */ long f_asyncwrites; /* count of async writes since mount */ ! char f_fstypename[MFSNAMELEN]; /* fs type name */ ! char f_mntonname[MNAMELEN]; /* directory on which mounted */ long f_syncreads; /* count of sync reads since mount */ long f_asyncreads; /* count of async reads since mount */ short f_spares1; /* unused spare */ ! char f_mntfromname[MNAMELEN];/* mounted filesystem */ short f_spares2; /* unused spare */ /* * XXX on machines where longs are aligned to 8-byte boundaries, there --- 112,123 ---- int f_flags; /* copy of mount exported flags */ long f_syncwrites; /* count of sync writes since mount */ long f_asyncwrites; /* count of async writes since mount */ ! char f_fstypename[OMFSNAMELEN]; /* fs type name */ ! char f_mntonname[OMNAMELEN]; /* directory on which mounted */ long f_syncreads; /* count of sync reads since mount */ long f_asyncreads; /* count of async reads since mount */ short f_spares1; /* unused spare */ ! 
char f_mntfromname[OMNAMELEN];/* mounted filesystem */ short f_spares2; /* unused spare */ /* * XXX on machines where longs are aligned to 8-byte boundaries, there *************** *** 99,105 **** long f_spare[2]; /* unused spare */ }; - #ifdef _KERNEL #define MMAXOPTIONLEN 65536 /* maximum length of a mount option */ TAILQ_HEAD(vnodelst, vnode); --- 127,132 ---- Index: sys/kern/syscalls.master =================================================================== RCS file: /usr/ncvs/src/sys/kern/syscalls.master,v retrieving revision 1.155 diff -c -r1.155 syscalls.master *** sys/kern/syscalls.master 21 Oct 2003 07:03:27 -0000 1.155 --- sys/kern/syscalls.master 5 Nov 2003 05:10:54 -0000 *************** *** 68,74 **** 15 STD POSIX { int chmod(char *path, int mode); } 16 STD POSIX { int chown(char *path, int uid, int gid); } 17 MSTD BSD { int obreak(char *nsize); } break obreak_args int ! 18 STD BSD { int getfsstat(struct statfs *buf, long bufsize, \ int flags); } 19 COMPAT POSIX { long lseek(int fd, long offset, int whence); } 20 MSTD POSIX { pid_t getpid(void); } --- 68,74 ---- 15 STD POSIX { int chmod(char *path, int mode); } 16 STD POSIX { int chown(char *path, int uid, int gid); } 17 MSTD BSD { int obreak(char *nsize); } break obreak_args int ! 18 COMPAT4 BSD { int getfsstat(struct ostatfs *buf, long bufsize, \ int flags); } 19 COMPAT POSIX { long lseek(int fd, long offset, int whence); } 20 MSTD POSIX { pid_t getpid(void); } *************** *** 252,259 **** 155 MNOIMPL BSD { int nfssvc(int flag, caddr_t argp); } 156 COMPAT BSD { int getdirentries(int fd, char *buf, u_int count, \ long *basep); } ! 157 STD BSD { int statfs(char *path, struct statfs *buf); } ! 
158 STD BSD { int fstatfs(int fd, struct statfs *buf); } 159 UNIMPL NOHIDE nosys 160 UNIMPL NOHIDE nosys 161 STD BSD { int getfh(char *fname, struct fhandle *fhp); } --- 252,259 ---- 155 MNOIMPL BSD { int nfssvc(int flag, caddr_t argp); } 156 COMPAT BSD { int getdirentries(int fd, char *buf, u_int count, \ long *basep); } ! 157 COMPAT4 BSD { int statfs(char *path, struct ostatfs *buf); } ! 158 COMPAT4 BSD { int fstatfs(int fd, struct ostatfs *buf); } 159 UNIMPL NOHIDE nosys 160 UNIMPL NOHIDE nosys 161 STD BSD { int getfh(char *fname, struct fhandle *fhp); } *************** *** 439,445 **** 295 UNIMPL NOHIDE nosys 296 UNIMPL NOHIDE nosys ; XXX 297 is 300 in NetBSD ! 297 STD BSD { int fhstatfs(const struct fhandle *u_fhp, struct statfs *buf); } 298 STD BSD { int fhopen(const struct fhandle *u_fhp, int flags); } 299 STD BSD { int fhstat(const struct fhandle *u_fhp, struct stat *sb); } ; syscall numbers for FreeBSD --- 439,445 ---- 295 UNIMPL NOHIDE nosys 296 UNIMPL NOHIDE nosys ; XXX 297 is 300 in NetBSD ! 297 COMPAT4 BSD { int fhstatfs(const struct fhandle *u_fhp, struct ostatfs *buf); } 298 STD BSD { int fhopen(const struct fhandle *u_fhp, int flags); } 299 STD BSD { int fhstat(const struct fhandle *u_fhp, struct stat *sb); } ; syscall numbers for FreeBSD *************** *** 574,583 **** struct sf_hdtr *hdtr, off_t *sbytes, int flags); } 394 MSTD BSD { int mac_syscall(const char *policy, int call, \ void *arg); } ! 395 UNIMPL NOHIDE nosys ! 396 UNIMPL NOHIDE nosys ! 397 UNIMPL NOHIDE nosys ! 398 UNIMPL NOHIDE nosys 399 UNIMPL NOHIDE nosys 400 MNOSTD BSD { int ksem_close(semid_t id); } 401 MNOSTD BSD { int ksem_post(semid_t id); } --- 574,585 ---- struct sf_hdtr *hdtr, off_t *sbytes, int flags); } 394 MSTD BSD { int mac_syscall(const char *policy, int call, \ void *arg); } ! 395 STD BSD { int getfsstat(struct statfs *buf, long bufsize, \ ! int flags); } ! 396 STD BSD { int statfs(char *path, struct statfs *buf); } ! 
397 STD BSD { int fstatfs(int fd, struct statfs *buf); } ! 398 STD BSD { int fhstatfs(const struct fhandle *u_fhp, \ ! struct statfs *buf); } 399 UNIMPL NOHIDE nosys 400 MNOSTD BSD { int ksem_close(semid_t id); } 401 MNOSTD BSD { int ksem_post(semid_t id); } Index: sys/kern/vfs_syscalls.c =================================================================== RCS file: /usr/ncvs/src/sys/kern/vfs_syscalls.c,v retrieving revision 1.332 diff -c -r1.332 vfs_syscalls.c *** sys/kern/vfs_syscalls.c 19 Oct 2003 20:41:07 -0000 1.332 --- sys/kern/vfs_syscalls.c 5 Nov 2003 05:10:54 -0000 *************** *** 226,236 **** struct statfs *buf; } */ *uap; { ! register struct mount *mp; ! register struct statfs *sp; int error; struct nameidata nd; - struct statfs sb; NDINIT(&nd, LOOKUP, FOLLOW, UIO_USERSPACE, uap->path, td); if ((error = namei(&nd)) != 0) --- 226,235 ---- struct statfs *buf; } */ *uap; { ! struct mount *mp; ! struct statfs *sp, sb; int error; struct nameidata nd; NDINIT(&nd, LOOKUP, FOLLOW, UIO_USERSPACE, uap->path, td); if ((error = namei(&nd)) != 0) *************** *** 244,253 **** if (error) return (error); #endif error = VFS_STATFS(mp, sp, td); if (error) return (error); - sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; --- 243,257 ---- if (error) return (error); #endif + /* + * Set these in case the underlying filesystem fails to do so. + */ + sp->f_version = STATFS_VERSION; + sp->f_namemax = NAME_MAX; + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = VFS_STATFS(mp, sp, td); if (error) return (error); if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; *************** *** 276,284 **** { struct file *fp; struct mount *mp; ! register struct statfs *sp; int error; - struct statfs sb; if ((error = getvnode(td->td_proc->p_fd, uap->fd, &fp)) != 0) return (error); --- 280,287 ---- { struct file *fp; struct mount *mp; ! 
struct statfs *sp, sb; int error; if ((error = getvnode(td->td_proc->p_fd, uap->fd, &fp)) != 0) return (error); *************** *** 292,301 **** return (error); #endif sp = &mp->mnt_stat; error = VFS_STATFS(mp, sp, td); if (error) return (error); - sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; --- 295,309 ---- return (error); #endif sp = &mp->mnt_stat; + /* + * Set these in case the underlying filesystem fails to do so. + */ + sp->f_version = STATFS_VERSION; + sp->f_namemax = NAME_MAX; + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = VFS_STATFS(mp, sp, td); if (error) return (error); if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; *************** *** 323,330 **** int flags; } */ *uap; { ! register struct mount *mp, *nmp; ! register struct statfs *sp; caddr_t sfsp; long count, maxcount, error; --- 331,338 ---- int flags; } */ *uap; { ! struct mount *mp, *nmp; ! struct statfs *sp, sb; caddr_t sfsp; long count, maxcount, error; *************** *** 346,351 **** --- 354,366 ---- if (sfsp && count < maxcount) { sp = &mp->mnt_stat; /* + * Set these in case the underlying filesystem + * fails to do so. + */ + sp->f_version = STATFS_VERSION; + sp->f_namemax = NAME_MAX; + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; + /* * If MNT_NOWAIT or MNT_LAZY is specified, do not * refresh the fsstat cache. MNT_NOWAIT or MNT_LAZY * overrides MNT_WAIT. *************** *** 358,364 **** vfs_unbusy(mp, td); continue; } ! sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; error = copyout(sp, sfsp, sizeof(*sp)); if (error) { vfs_unbusy(mp, td); --- 373,383 ---- vfs_unbusy(mp, td); continue; } ! if (suser(td)) { ! bcopy(sp, &sb, sizeof(sb)); ! sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; ! sp = &sb; ! 
} error = copyout(sp, sfsp, sizeof(*sp)); if (error) { vfs_unbusy(mp, td); *************** *** 379,384 **** --- 398,660 ---- return (0); } + #ifdef COMPAT_FREEBSD4 + /* + * Get old format filesystem statistics. + */ + static void cvtstatfs(struct thread *, struct statfs *, struct ostatfs *); + + #ifndef _SYS_SYSPROTO_H_ + struct freebsd4_statfs_args { + char *path; + struct ostatfs *buf; + }; + #endif + /* ARGSUSED */ + int + freebsd4_statfs(td, uap) + struct thread *td; + struct freebsd4_statfs_args /* { + char *path; + struct ostatfs *buf; + } */ *uap; + { + struct mount *mp; + struct statfs *sp; + struct ostatfs osb; + int error; + struct nameidata nd; + + NDINIT(&nd, LOOKUP, FOLLOW, UIO_USERSPACE, uap->path, td); + if ((error = namei(&nd)) != 0) + return (error); + mp = nd.ni_vp->v_mount; + sp = &mp->mnt_stat; + NDFREE(&nd, NDF_ONLY_PNBUF); + vrele(nd.ni_vp); + #ifdef MAC + error = mac_check_mount_stat(td->td_ucred, mp); + if (error) + return (error); + #endif + error = VFS_STATFS(mp, sp, td); + if (error) + return (error); + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; + cvtstatfs(td, sp, &osb); + return (copyout(&osb, uap->buf, sizeof(osb))); + } + + /* + * Get filesystem statistics. 
+ */ + #ifndef _SYS_SYSPROTO_H_ + struct freebsd4_fstatfs_args { + int fd; + struct ostatfs *buf; + }; + #endif + /* ARGSUSED */ + int + freebsd4_fstatfs(td, uap) + struct thread *td; + struct freebsd4_fstatfs_args /* { + int fd; + struct ostatfs *buf; + } */ *uap; + { + struct file *fp; + struct mount *mp; + struct statfs *sp; + struct ostatfs osb; + int error; + + if ((error = getvnode(td->td_proc->p_fd, uap->fd, &fp)) != 0) + return (error); + mp = fp->f_vnode->v_mount; + fdrop(fp, td); + if (mp == NULL) + return (EBADF); + #ifdef MAC + error = mac_check_mount_stat(td->td_ucred, mp); + if (error) + return (error); + #endif + sp = &mp->mnt_stat; + error = VFS_STATFS(mp, sp, td); + if (error) + return (error); + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; + cvtstatfs(td, sp, &osb); + return (copyout(&osb, uap->buf, sizeof(osb))); + } + + /* + * Get statistics on all filesystems. + */ + #ifndef _SYS_SYSPROTO_H_ + struct freebsd4_getfsstat_args { + struct ostatfs *buf; + long bufsize; + int flags; + }; + #endif + int + freebsd4_getfsstat(td, uap) + struct thread *td; + register struct freebsd4_getfsstat_args /* { + struct ostatfs *buf; + long bufsize; + int flags; + } */ *uap; + { + struct mount *mp, *nmp; + struct statfs *sp; + struct ostatfs osb; + caddr_t sfsp; + long count, maxcount, error; + + maxcount = uap->bufsize / sizeof(struct ostatfs); + sfsp = (caddr_t)uap->buf; + count = 0; + mtx_lock(&mountlist_mtx); + for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) { + #ifdef MAC + if (mac_check_mount_stat(td->td_ucred, mp) != 0) { + nmp = TAILQ_NEXT(mp, mnt_list); + continue; + } + #endif + if (vfs_busy(mp, LK_NOWAIT, &mountlist_mtx, td)) { + nmp = TAILQ_NEXT(mp, mnt_list); + continue; + } + if (sfsp && count < maxcount) { + sp = &mp->mnt_stat; + /* + * If MNT_NOWAIT or MNT_LAZY is specified, do not + * refresh the fsstat cache. MNT_NOWAIT or MNT_LAZY + * overrides MNT_WAIT. 
+ */ + if (((uap->flags & (MNT_LAZY|MNT_NOWAIT)) == 0 || + (uap->flags & MNT_WAIT)) && + (error = VFS_STATFS(mp, sp, td))) { + mtx_lock(&mountlist_mtx); + nmp = TAILQ_NEXT(mp, mnt_list); + vfs_unbusy(mp, td); + continue; + } + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; + cvtstatfs(td, sp, &osb); + error = copyout(&osb, sfsp, sizeof(osb)); + if (error) { + vfs_unbusy(mp, td); + return (error); + } + sfsp += sizeof(osb); + } + count++; + mtx_lock(&mountlist_mtx); + nmp = TAILQ_NEXT(mp, mnt_list); + vfs_unbusy(mp, td); + } + mtx_unlock(&mountlist_mtx); + if (sfsp && count > maxcount) + td->td_retval[0] = maxcount; + else + td->td_retval[0] = count; + return (0); + } + + /* + * Implement fstatfs() for (NFS) file handles. + */ + #ifndef _SYS_SYSPROTO_H_ + struct freebsd4_fhstatfs_args { + struct fhandle *u_fhp; + struct ostatfs *buf; + }; + #endif + int + freebsd4_fhstatfs(td, uap) + struct thread *td; + struct freebsd4_fhstatfs_args /* { + struct fhandle *u_fhp; + struct ostatfs *buf; + } */ *uap; + { + struct statfs *sp; + struct mount *mp; + struct vnode *vp; + struct ostatfs osb; + fhandle_t fh; + int error; + + /* + * Must be super user + */ + error = suser(td); + if (error) + return (error); + + if ((error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t))) != 0) + return (error); + + if ((mp = vfs_getvfs(&fh.fh_fsid)) == NULL) + return (ESTALE); + if ((error = VFS_FHTOVP(mp, &fh.fh_fid, &vp))) + return (error); + mp = vp->v_mount; + sp = &mp->mnt_stat; + vput(vp); + #ifdef MAC + error = mac_check_mount_stat(td->td_ucred, mp); + if (error) + return (error); + #endif + if ((error = VFS_STATFS(mp, sp, td)) != 0) + return (error); + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; + cvtstatfs(td, sp, &osb); + return (copyout(&osb, uap->buf, sizeof(osb))); + } + + /* + * Convert a new format statfs structure to an old format statfs structure. 
+ */ + static void + cvtstatfs(td, nsp, osp) + struct thread *td; + struct statfs *nsp; + struct ostatfs *osp; + { + + bzero(osp, sizeof(*osp)); + osp->f_bsize = nsp->f_bsize; + osp->f_iosize = nsp->f_iosize; + osp->f_blocks = nsp->f_blocks; + osp->f_bfree = nsp->f_bfree; + osp->f_bavail = nsp->f_bavail; + osp->f_files = nsp->f_files; + osp->f_ffree = nsp->f_ffree; + osp->f_owner = nsp->f_owner; + osp->f_type = nsp->f_type; + osp->f_flags = nsp->f_flags; + osp->f_syncwrites = nsp->f_syncwrites; + osp->f_asyncwrites = nsp->f_asyncwrites; + osp->f_syncreads = nsp->f_syncreads; + osp->f_asyncreads = nsp->f_asyncreads; + bcopy(nsp->f_fstypename, osp->f_fstypename, MFSNAMELEN); + bcopy(nsp->f_mntonname, osp->f_mntonname, MNAMELEN); + bcopy(nsp->f_mntfromname, osp->f_mntfromname, MNAMELEN); + if (suser(td)) { + osp->f_fsid.val[0] = osp->f_fsid.val[1] = 0; + } else { + osp->f_fsid = nsp->f_fsid; + } + } + #endif /* COMPAT_FREEBSD4 */ + /* * Change current working directory to a given file descriptor. */ *************** *** 3788,3797 **** struct statfs *buf; } */ *uap; { ! struct statfs *sp; struct mount *mp; struct vnode *vp; - struct statfs sb; fhandle_t fh; int error; --- 4064,4072 ---- struct statfs *buf; } */ *uap; { ! struct statfs *sp, sb; struct mount *mp; struct vnode *vp; fhandle_t fh; int error; *************** *** 3817,3825 **** if (error) return (error); #endif if ((error = VFS_STATFS(mp, sp, td)) != 0) return (error); - sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; --- 4092,4105 ---- if (error) return (error); #endif + /* + * Set these in case the underlying filesystem fails to do so. 
+ */ + sp->f_version = STATFS_VERSION; + sp->f_namemax = NAME_MAX; + sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; if ((error = VFS_STATFS(mp, sp, td)) != 0) return (error); if (suser(td)) { bcopy(sp, &sb, sizeof(sb)); sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0; Index: sys/ufs/ffs/ffs_vfsops.c =================================================================== RCS file: /usr/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v retrieving revision 1.222 diff -c -r1.222 ffs_vfsops.c *** sys/ufs/ffs/ffs_vfsops.c 2 Nov 2003 04:52:53 -0000 1.222 --- sys/ufs/ffs/ffs_vfsops.c 5 Nov 2003 05:10:54 -0000 *************** *** 1075,1080 **** --- 1075,1081 ---- fs = ump->um_fs; if (fs->fs_magic != FS_UFS1_MAGIC && fs->fs_magic != FS_UFS2_MAGIC) panic("ffs_statfs"); + sbp->f_version = STATFS_VERSION; sbp->f_bsize = fs->fs_fsize; sbp->f_iosize = fs->fs_bsize; sbp->f_blocks = fs->fs_dsize; *************** *** 1084,1091 **** --- 1085,1102 ---- dbtofsb(fs, fs->fs_pendingblocks); sbp->f_files = fs->fs_ncg * fs->fs_ipg - ROOTINO; sbp->f_ffree = fs->fs_cstotal.cs_nifree + fs->fs_pendinginodes; + sbp->f_namemax = NAME_MAX; if (sbp != &mp->mnt_stat) { + sbp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK; sbp->f_type = mp->mnt_vfc->vfc_typenum; + sbp->f_syncwrites = mp->mnt_stat.f_syncwrites; + sbp->f_asyncwrites = mp->mnt_stat.f_asyncwrites; + sbp->f_syncreads = mp->mnt_stat.f_syncreads; + sbp->f_asyncreads = mp->mnt_stat.f_asyncreads; + sbp->f_owner = mp->mnt_stat.f_owner; + sbp->f_fsid = mp->mnt_stat.f_fsid; + bcopy((caddr_t)mp->mnt_stat.f_fstypename, + (caddr_t)&sbp->f_fstypename[0], MFSNAMELEN); bcopy((caddr_t)mp->mnt_stat.f_mntonname, (caddr_t)&sbp->f_mntonname[0], MNAMELEN); bcopy((caddr_t)mp->mnt_stat.f_mntfromname, Index: sys/kern/vfs_bio.c =================================================================== RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v retrieving revision 1.420 diff -c -r1.420 vfs_bio.c *** sys/kern/vfs_bio.c 4 Nov 2003 06:30:00 -0000 1.420 --- sys/kern/vfs_bio.c 5 Nov 2003 05:10:54 -0000 
*************** *** 3239,3245 **** (int) m->pindex, (int)(foff >> 32), (int) foff & 0xffffffff, resid, i); if (!vn_isdisk(vp, NULL)) ! printf(" iosize: %ld, lblkno: %jd, flags: 0x%x, npages: %d\n", bp->b_vp->v_mount->mnt_stat.f_iosize, (intmax_t) bp->b_lblkno, bp->b_flags, bp->b_npages); --- 3239,3245 ---- (int) m->pindex, (int)(foff >> 32), (int) foff & 0xffffffff, resid, i); if (!vn_isdisk(vp, NULL)) ! printf(" iosize: %jd, lblkno: %jd, flags: 0x%x, npages: %d\n", bp->b_vp->v_mount->mnt_stat.f_iosize, (intmax_t) bp->b_lblkno, bp->b_flags, bp->b_npages); Index: sys/kern/vfs_cluster.c =================================================================== RCS file: /usr/ncvs/src/sys/kern/vfs_cluster.c,v retrieving revision 1.147 diff -c -r1.147 vfs_cluster.c *** sys/kern/vfs_cluster.c 20 Oct 2003 18:24:38 -0000 1.147 --- sys/kern/vfs_cluster.c 5 Nov 2003 05:10:54 -0000 *************** *** 327,333 **** GIANT_REQUIRED; KASSERT(size == vp->v_mount->mnt_stat.f_iosize, ! ("cluster_rbuild: size %ld != filesize %ld\n", size, vp->v_mount->mnt_stat.f_iosize)); /* --- 327,333 ---- GIANT_REQUIRED; KASSERT(size == vp->v_mount->mnt_stat.f_iosize, ! ("cluster_rbuild: size %ld != filesize %jd\n", size, vp->v_mount->mnt_stat.f_iosize)); /* Index: sys/sys/syscall.h =================================================================== RCS file: /usr/ncvs/src/sys/sys/syscall.h,v retrieving revision 1.142 diff -c -r1.142 syscall.h *** sys/sys/syscall.h 21 Oct 2003 07:03:27 -0000 1.142 --- sys/sys/syscall.h 5 Nov 2003 05:13:28 -0000 *************** *** 3,9 **** * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD: src/sys/kern/syscalls.master,v 1.154 2003/10/20 16:16:03 dwmalone Exp */ #define SYS_syscall 0 --- 3,9 ---- * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD */ #define SYS_syscall 0 *************** *** 24,30 **** #define SYS_chmod 15 #define SYS_chown 16 #define SYS_break 17 ! 
#define SYS_getfsstat 18 /* 19 is old lseek */ #define SYS_getpid 20 #define SYS_mount 21 --- 24,30 ---- #define SYS_chmod 15 #define SYS_chown 16 #define SYS_break 17 ! /* 18 is old getfsstat */ /* 19 is old lseek */ #define SYS_getpid 20 #define SYS_mount 21 *************** *** 156,163 **** /* 150 is old getsockname */ #define SYS_nfssvc 155 /* 156 is old getdirentries */ ! #define SYS_statfs 157 ! #define SYS_fstatfs 158 #define SYS_getfh 161 #define SYS_getdomainname 162 #define SYS_setdomainname 163 --- 156,163 ---- /* 150 is old getsockname */ #define SYS_nfssvc 155 /* 156 is old getdirentries */ ! /* 157 is old statfs */ ! /* 158 is old fstatfs */ #define SYS_getfh 161 #define SYS_getdomainname 162 #define SYS_setdomainname 163 *************** *** 221,227 **** #define SYS_nstat 278 #define SYS_nfstat 279 #define SYS_nlstat 280 ! #define SYS_fhstatfs 297 #define SYS_fhopen 298 #define SYS_fhstat 299 #define SYS_modnext 300 --- 221,227 ---- #define SYS_nstat 278 #define SYS_nfstat 279 #define SYS_nlstat 280 ! /* 297 is old fhstatfs */ #define SYS_fhopen 298 #define SYS_fhstat 299 #define SYS_modnext 300 *************** *** 310,315 **** --- 310,319 ---- #define SYS_uuidgen 392 #define SYS_sendfile 393 #define SYS_mac_syscall 394 + #define SYS_getfsstat 395 + #define SYS_statfs 396 + #define SYS_fstatfs 397 + #define SYS_fhstatfs 398 #define SYS_ksem_close 400 #define SYS_ksem_post 401 #define SYS_ksem_wait 402 Index: sys/sys/syscall.mk =================================================================== RCS file: /usr/ncvs/src/sys/sys/syscall.mk,v retrieving revision 1.97 diff -c -r1.97 syscall.mk *** sys/sys/syscall.mk 21 Oct 2003 07:03:27 -0000 1.97 --- sys/sys/syscall.mk 5 Nov 2003 05:13:28 -0000 *************** *** 1,7 **** # FreeBSD system call names. # DO NOT EDIT-- this file is automatically generated. # $FreeBSD$ ! 
# created from FreeBSD: src/sys/kern/syscalls.master,v 1.154 2003/10/20 16:16:03 dwmalone Exp MIASM = \ syscall.o \ exit.o \ --- 1,7 ---- # FreeBSD system call names. # DO NOT EDIT-- this file is automatically generated. # $FreeBSD$ ! # created from FreeBSD MIASM = \ syscall.o \ exit.o \ *************** *** 19,25 **** chmod.o \ chown.o \ break.o \ - getfsstat.o \ getpid.o \ mount.o \ unmount.o \ --- 19,24 ---- *************** *** 108,115 **** setsid.o \ quotactl.o \ nfssvc.o \ - statfs.o \ - fstatfs.o \ getfh.o \ getdomainname.o \ setdomainname.o \ --- 107,112 ---- *************** *** 173,179 **** nstat.o \ nfstat.o \ nlstat.o \ - fhstatfs.o \ fhopen.o \ fhstat.o \ modnext.o \ --- 170,175 ---- *************** *** 256,261 **** --- 252,261 ---- uuidgen.o \ sendfile.o \ mac_syscall.o \ + getfsstat.o \ + statfs.o \ + fstatfs.o \ + fhstatfs.o \ ksem_close.o \ ksem_post.o \ ksem_wait.o \ Index: sys/sys/sysproto.h =================================================================== RCS file: /usr/ncvs/src/sys/sys/sysproto.h,v retrieving revision 1.138 diff -c -r1.138 sysproto.h *** sys/sys/sysproto.h 21 Oct 2003 07:03:27 -0000 1.138 --- sys/sys/sysproto.h 5 Nov 2003 05:13:29 -0000 *************** *** 3,9 **** * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD: src/sys/kern/syscalls.master,v 1.154 2003/10/20 16:16:03 dwmalone Exp */ #ifndef _SYS_SYSPROTO_H_ --- 3,9 ---- * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! 
* created from FreeBSD */ #ifndef _SYS_SYSPROTO_H_ *************** *** 95,105 **** struct obreak_args { char nsize_l_[PADL_(char *)]; char * nsize; char nsize_r_[PADR_(char *)]; }; - struct getfsstat_args { - char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; - char bufsize_l_[PADL_(long)]; long bufsize; char bufsize_r_[PADR_(long)]; - char flags_l_[PADL_(int)]; int flags; char flags_r_[PADR_(int)]; - }; struct getpid_args { register_t dummy; }; --- 95,100 ---- *************** *** 491,504 **** char flag_l_[PADL_(int)]; int flag; char flag_r_[PADR_(int)]; char argp_l_[PADL_(caddr_t)]; caddr_t argp; char argp_r_[PADR_(caddr_t)]; }; - struct statfs_args { - char path_l_[PADL_(char *)]; char * path; char path_r_[PADR_(char *)]; - char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; - }; - struct fstatfs_args { - char fd_l_[PADL_(int)]; int fd; char fd_r_[PADR_(int)]; - char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; - }; struct getfh_args { char fname_l_[PADL_(char *)]; char * fname; char fname_r_[PADR_(char *)]; char fhp_l_[PADL_(struct fhandle *)]; struct fhandle * fhp; char fhp_r_[PADR_(struct fhandle *)]; --- 486,491 ---- *************** *** 778,787 **** char path_l_[PADL_(char *)]; char * path; char path_r_[PADR_(char *)]; char ub_l_[PADL_(struct nstat *)]; struct nstat * ub; char ub_r_[PADR_(struct nstat *)]; }; - struct fhstatfs_args { - char u_fhp_l_[PADL_(const struct fhandle *)]; const struct fhandle * u_fhp; char u_fhp_r_[PADR_(const struct fhandle *)]; - char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; - }; struct fhopen_args { char u_fhp_l_[PADL_(const struct fhandle *)]; const struct fhandle * u_fhp; char u_fhp_r_[PADR_(const struct fhandle *)]; char flags_l_[PADL_(int)]; int flags; char flags_r_[PADR_(int)]; --- 765,770 ---- *************** *** 1128,1133 **** --- 1111,1133 ---- 
char call_l_[PADL_(int)]; int call; char call_r_[PADR_(int)]; char arg_l_[PADL_(void *)]; void * arg; char arg_r_[PADR_(void *)]; }; + struct getfsstat_args { + char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; + char bufsize_l_[PADL_(long)]; long bufsize; char bufsize_r_[PADR_(long)]; + char flags_l_[PADL_(int)]; int flags; char flags_r_[PADR_(int)]; + }; + struct statfs_args { + char path_l_[PADL_(char *)]; char * path; char path_r_[PADR_(char *)]; + char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; + }; + struct fstatfs_args { + char fd_l_[PADL_(int)]; int fd; char fd_r_[PADR_(int)]; + char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; + }; + struct fhstatfs_args { + char u_fhp_l_[PADL_(const struct fhandle *)]; const struct fhandle * u_fhp; char u_fhp_r_[PADR_(const struct fhandle *)]; + char buf_l_[PADL_(struct statfs *)]; struct statfs * buf; char buf_r_[PADR_(struct statfs *)]; + }; struct ksem_close_args { char id_l_[PADL_(semid_t)]; semid_t id; char id_r_[PADR_(semid_t)]; }; *************** *** 1300,1306 **** int chmod(struct thread *, struct chmod_args *); int chown(struct thread *, struct chown_args *); int obreak(struct thread *, struct obreak_args *); - int getfsstat(struct thread *, struct getfsstat_args *); int getpid(struct thread *, struct getpid_args *); int mount(struct thread *, struct mount_args *); int unmount(struct thread *, struct unmount_args *); --- 1300,1305 ---- *************** *** 1389,1396 **** int setsid(struct thread *, struct setsid_args *); int quotactl(struct thread *, struct quotactl_args *); int nfssvc(struct thread *, struct nfssvc_args *); - int statfs(struct thread *, struct statfs_args *); - int fstatfs(struct thread *, struct fstatfs_args *); int getfh(struct thread *, struct getfh_args *); int getdomainname(struct thread *, struct getdomainname_args *); int setdomainname(struct thread *, 
struct setdomainname_args *); --- 1388,1393 ---- *************** *** 1452,1458 **** int nstat(struct thread *, struct nstat_args *); int nfstat(struct thread *, struct nfstat_args *); int nlstat(struct thread *, struct nlstat_args *); - int fhstatfs(struct thread *, struct fhstatfs_args *); int fhopen(struct thread *, struct fhopen_args *); int fhstat(struct thread *, struct fhstat_args *); int modnext(struct thread *, struct modnext_args *); --- 1449,1454 ---- *************** *** 1536,1541 **** --- 1532,1541 ---- int uuidgen(struct thread *, struct uuidgen_args *); int sendfile(struct thread *, struct sendfile_args *); int mac_syscall(struct thread *, struct mac_syscall_args *); + int getfsstat(struct thread *, struct getfsstat_args *); + int statfs(struct thread *, struct statfs_args *); + int fstatfs(struct thread *, struct fstatfs_args *); + int fhstatfs(struct thread *, struct fhstatfs_args *); int ksem_close(struct thread *, struct ksem_close_args *); int ksem_post(struct thread *, struct ksem_post_args *); int ksem_wait(struct thread *, struct ksem_wait_args *); *************** *** 1748,1753 **** --- 1748,1770 ---- #ifdef COMPAT_FREEBSD4 + struct freebsd4_getfsstat_args { + char buf_l_[PADL_(struct ostatfs *)]; struct ostatfs * buf; char buf_r_[PADR_(struct ostatfs *)]; + char bufsize_l_[PADL_(long)]; long bufsize; char bufsize_r_[PADR_(long)]; + char flags_l_[PADL_(int)]; int flags; char flags_r_[PADR_(int)]; + }; + struct freebsd4_statfs_args { + char path_l_[PADL_(char *)]; char * path; char path_r_[PADR_(char *)]; + char buf_l_[PADL_(struct ostatfs *)]; struct ostatfs * buf; char buf_r_[PADR_(struct ostatfs *)]; + }; + struct freebsd4_fstatfs_args { + char fd_l_[PADL_(int)]; int fd; char fd_r_[PADR_(int)]; + char buf_l_[PADL_(struct ostatfs *)]; struct ostatfs * buf; char buf_r_[PADR_(struct ostatfs *)]; + }; + struct freebsd4_fhstatfs_args { + char u_fhp_l_[PADL_(const struct fhandle *)]; const struct fhandle * u_fhp; char u_fhp_r_[PADR_(const struct 
fhandle *)]; + char buf_l_[PADL_(struct ostatfs *)]; struct ostatfs * buf; char buf_r_[PADR_(struct ostatfs *)]; + }; struct freebsd4_sendfile_args { char fd_l_[PADL_(int)]; int fd; char fd_r_[PADR_(int)]; char s_l_[PADL_(int)]; int s; char s_r_[PADR_(int)]; *************** *** 1765,1770 **** --- 1782,1791 ---- struct freebsd4_sigreturn_args { char sigcntxp_l_[PADL_(const struct ucontext4 *)]; const struct ucontext4 * sigcntxp; char sigcntxp_r_[PADR_(const struct ucontext4 *)]; }; + int freebsd4_getfsstat(struct thread *, struct freebsd4_getfsstat_args *); + int freebsd4_statfs(struct thread *, struct freebsd4_statfs_args *); + int freebsd4_fstatfs(struct thread *, struct freebsd4_fstatfs_args *); + int freebsd4_fhstatfs(struct thread *, struct freebsd4_fhstatfs_args *); int freebsd4_sendfile(struct thread *, struct freebsd4_sendfile_args *); int freebsd4_sigaction(struct thread *, struct freebsd4_sigaction_args *); int freebsd4_sigreturn(struct thread *, struct freebsd4_sigreturn_args *); Index: sys/kern/init_sysent.c =================================================================== RCS file: /usr/ncvs/src/sys/kern/init_sysent.c,v retrieving revision 1.158 diff -c -r1.158 init_sysent.c *** sys/kern/init_sysent.c 21 Oct 2003 07:03:27 -0000 1.158 --- sys/kern/init_sysent.c 5 Nov 2003 05:13:29 -0000 *************** *** 3,9 **** * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD: src/sys/kern/syscalls.master,v 1.154 2003/10/20 16:16:03 dwmalone Exp */ #include "opt_compat.h" --- 3,9 ---- * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD */ #include "opt_compat.h" *************** *** 46,52 **** { AS(chmod_args), (sy_call_t *)chmod }, /* 15 = chmod */ { AS(chown_args), (sy_call_t *)chown }, /* 16 = chown */ { SYF_MPSAFE | AS(obreak_args), (sy_call_t *)obreak }, /* 17 = break */ ! 
{ AS(getfsstat_args), (sy_call_t *)getfsstat }, /* 18 = getfsstat */ { compat(AS(olseek_args),lseek) }, /* 19 = old lseek */ { SYF_MPSAFE | 0, (sy_call_t *)getpid }, /* 20 = getpid */ { AS(mount_args), (sy_call_t *)mount }, /* 21 = mount */ --- 46,52 ---- { AS(chmod_args), (sy_call_t *)chmod }, /* 15 = chmod */ { AS(chown_args), (sy_call_t *)chown }, /* 16 = chown */ { SYF_MPSAFE | AS(obreak_args), (sy_call_t *)obreak }, /* 17 = break */ ! { compat4(AS(freebsd4_getfsstat_args),getfsstat) }, /* 18 = old getfsstat */ { compat(AS(olseek_args),lseek) }, /* 19 = old lseek */ { SYF_MPSAFE | 0, (sy_call_t *)getpid }, /* 20 = getpid */ { AS(mount_args), (sy_call_t *)mount }, /* 21 = mount */ *************** *** 185,192 **** { 0, (sy_call_t *)nosys }, /* 154 = nosys */ { SYF_MPSAFE | AS(nfssvc_args), (sy_call_t *)nosys }, /* 155 = nfssvc */ { compat(AS(ogetdirentries_args),getdirentries) }, /* 156 = old getdirentries */ ! { AS(statfs_args), (sy_call_t *)statfs }, /* 157 = statfs */ ! { AS(fstatfs_args), (sy_call_t *)fstatfs }, /* 158 = fstatfs */ { 0, (sy_call_t *)nosys }, /* 159 = nosys */ { 0, (sy_call_t *)nosys }, /* 160 = nosys */ { AS(getfh_args), (sy_call_t *)getfh }, /* 161 = getfh */ --- 185,192 ---- { 0, (sy_call_t *)nosys }, /* 154 = nosys */ { SYF_MPSAFE | AS(nfssvc_args), (sy_call_t *)nosys }, /* 155 = nfssvc */ { compat(AS(ogetdirentries_args),getdirentries) }, /* 156 = old getdirentries */ ! { compat4(AS(freebsd4_statfs_args),statfs) }, /* 157 = old statfs */ ! { compat4(AS(freebsd4_fstatfs_args),fstatfs) }, /* 158 = old fstatfs */ { 0, (sy_call_t *)nosys }, /* 159 = nosys */ { 0, (sy_call_t *)nosys }, /* 160 = nosys */ { AS(getfh_args), (sy_call_t *)getfh }, /* 161 = getfh */ *************** *** 325,331 **** { 0, (sy_call_t *)nosys }, /* 294 = nosys */ { 0, (sy_call_t *)nosys }, /* 295 = nosys */ { 0, (sy_call_t *)nosys }, /* 296 = nosys */ ! 
{ AS(fhstatfs_args), (sy_call_t *)fhstatfs }, /* 297 = fhstatfs */ { AS(fhopen_args), (sy_call_t *)fhopen }, /* 298 = fhopen */ { AS(fhstat_args), (sy_call_t *)fhstat }, /* 299 = fhstat */ { SYF_MPSAFE | AS(modnext_args), (sy_call_t *)modnext }, /* 300 = modnext */ --- 325,331 ---- { 0, (sy_call_t *)nosys }, /* 294 = nosys */ { 0, (sy_call_t *)nosys }, /* 295 = nosys */ { 0, (sy_call_t *)nosys }, /* 296 = nosys */ ! { compat4(AS(freebsd4_fhstatfs_args),fhstatfs) }, /* 297 = old fhstatfs */ { AS(fhopen_args), (sy_call_t *)fhopen }, /* 298 = fhopen */ { AS(fhstat_args), (sy_call_t *)fhstat }, /* 299 = fhstat */ { SYF_MPSAFE | AS(modnext_args), (sy_call_t *)modnext }, /* 300 = modnext */ *************** *** 423,432 **** { AS(uuidgen_args), (sy_call_t *)uuidgen }, /* 392 = uuidgen */ { SYF_MPSAFE | AS(sendfile_args), (sy_call_t *)sendfile }, /* 393 = sendfile */ { SYF_MPSAFE | AS(mac_syscall_args), (sy_call_t *)mac_syscall }, /* 394 = mac_syscall */ ! { 0, (sy_call_t *)nosys }, /* 395 = nosys */ ! { 0, (sy_call_t *)nosys }, /* 396 = nosys */ ! { 0, (sy_call_t *)nosys }, /* 397 = nosys */ ! { 0, (sy_call_t *)nosys }, /* 398 = nosys */ { 0, (sy_call_t *)nosys }, /* 399 = nosys */ { SYF_MPSAFE | AS(ksem_close_args), (sy_call_t *)lkmressys }, /* 400 = ksem_close */ { SYF_MPSAFE | AS(ksem_post_args), (sy_call_t *)lkmressys }, /* 401 = ksem_post */ --- 423,432 ---- { AS(uuidgen_args), (sy_call_t *)uuidgen }, /* 392 = uuidgen */ { SYF_MPSAFE | AS(sendfile_args), (sy_call_t *)sendfile }, /* 393 = sendfile */ { SYF_MPSAFE | AS(mac_syscall_args), (sy_call_t *)mac_syscall }, /* 394 = mac_syscall */ ! { AS(getfsstat_args), (sy_call_t *)getfsstat }, /* 395 = getfsstat */ ! { AS(statfs_args), (sy_call_t *)statfs }, /* 396 = statfs */ ! { AS(fstatfs_args), (sy_call_t *)fstatfs }, /* 397 = fstatfs */ ! 
{ AS(fhstatfs_args), (sy_call_t *)fhstatfs }, /* 398 = fhstatfs */ { 0, (sy_call_t *)nosys }, /* 399 = nosys */ { SYF_MPSAFE | AS(ksem_close_args), (sy_call_t *)lkmressys }, /* 400 = ksem_close */ { SYF_MPSAFE | AS(ksem_post_args), (sy_call_t *)lkmressys }, /* 401 = ksem_post */ Index: sys/kern/syscalls.c =================================================================== RCS file: /usr/ncvs/src/sys/kern/syscalls.c,v retrieving revision 1.144 diff -c -r1.144 syscalls.c *** sys/kern/syscalls.c 21 Oct 2003 07:03:27 -0000 1.144 --- sys/kern/syscalls.c 5 Nov 2003 05:13:28 -0000 *************** *** 3,9 **** * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD: src/sys/kern/syscalls.master,v 1.154 2003/10/20 16:16:03 dwmalone Exp */ const char *syscallnames[] = { --- 3,9 ---- * * DO NOT EDIT-- this file is automatically generated. * $FreeBSD$ ! * created from FreeBSD */ const char *syscallnames[] = { *************** *** 25,31 **** "chmod", /* 15 = chmod */ "chown", /* 16 = chown */ "break", /* 17 = break */ ! "getfsstat", /* 18 = getfsstat */ "old.lseek", /* 19 = old lseek */ "getpid", /* 20 = getpid */ "mount", /* 21 = mount */ --- 25,31 ---- "chmod", /* 15 = chmod */ "chown", /* 16 = chown */ "break", /* 17 = break */ ! "old.getfsstat", /* 18 = old getfsstat */ "old.lseek", /* 19 = old lseek */ "getpid", /* 20 = getpid */ "mount", /* 21 = mount */ *************** *** 164,171 **** "#154", /* 154 = nosys */ "nfssvc", /* 155 = nfssvc */ "old.getdirentries", /* 156 = old getdirentries */ ! "statfs", /* 157 = statfs */ ! "fstatfs", /* 158 = fstatfs */ "#159", /* 159 = nosys */ "#160", /* 160 = nosys */ "getfh", /* 161 = getfh */ --- 164,171 ---- "#154", /* 154 = nosys */ "nfssvc", /* 155 = nfssvc */ "old.getdirentries", /* 156 = old getdirentries */ ! "old.statfs", /* 157 = old statfs */ ! 
"old.fstatfs", /* 158 = old fstatfs */ "#159", /* 159 = nosys */ "#160", /* 160 = nosys */ "getfh", /* 161 = getfh */ *************** *** 304,310 **** "#294", /* 294 = nosys */ "#295", /* 295 = nosys */ "#296", /* 296 = nosys */ ! "fhstatfs", /* 297 = fhstatfs */ "fhopen", /* 298 = fhopen */ "fhstat", /* 299 = fhstat */ "modnext", /* 300 = modnext */ --- 304,310 ---- "#294", /* 294 = nosys */ "#295", /* 295 = nosys */ "#296", /* 296 = nosys */ ! "old.fhstatfs", /* 297 = old fhstatfs */ "fhopen", /* 298 = fhopen */ "fhstat", /* 299 = fhstat */ "modnext", /* 300 = modnext */ *************** *** 402,411 **** "uuidgen", /* 392 = uuidgen */ "sendfile", /* 393 = sendfile */ "mac_syscall", /* 394 = mac_syscall */ ! "#395", /* 395 = nosys */ ! "#396", /* 396 = nosys */ ! "#397", /* 397 = nosys */ ! "#398", /* 398 = nosys */ "#399", /* 399 = nosys */ "ksem_close", /* 400 = ksem_close */ "ksem_post", /* 401 = ksem_post */ --- 402,411 ---- "uuidgen", /* 392 = uuidgen */ "sendfile", /* 393 = sendfile */ "mac_syscall", /* 394 = mac_syscall */ ! "getfsstat", /* 395 = getfsstat */ ! "statfs", /* 396 = statfs */ ! "fstatfs", /* 397 = fstatfs */ ! "fhstatfs", /* 398 = fhstatfs */ "#399", /* 399 = nosys */ "ksem_close", /* 400 = ksem_close */ "ksem_post", /* 401 = ksem_post */ Index: bin/df/df.c =================================================================== RCS file: /usr/ncvs/src/bin/df/df.c,v retrieving revision 1.51 diff -c -r1.51 df.c *** bin/df/df.c 13 Sep 2003 20:46:58 -0000 1.51 --- bin/df/df.c 5 Nov 2003 19:22:11 -0000 *************** *** 120,128 **** static unit_t unitp [] = { NONE, KILO, MEGA, GIGA, TERA, PETA }; static char *getmntpt(const char *); ! static size_t longwidth(long); static char *makenetvfslist(void); ! 
static void prthuman(const struct statfs *, size_t); static void prthumanval(double); static void prtstat(struct statfs *, struct maxwidths *); static size_t regetmntinfo(struct statfs **, long, const char **); --- 120,128 ---- static unit_t unitp [] = { NONE, KILO, MEGA, GIGA, TERA, PETA }; static char *getmntpt(const char *); ! static size_t int64width(int64_t); static char *makenetvfslist(void); ! static void prthuman(const struct statfs *, int64_t); static void prthumanval(double); static void prtstat(struct statfs *, struct maxwidths *); static size_t regetmntinfo(struct statfs **, long, const char **); *************** *** 371,377 **** } static void ! prthuman(const struct statfs *sfsp, size_t used) { prthumanval((double)sfsp->f_blocks * (double)sfsp->f_bsize); --- 371,377 ---- } static void ! prthuman(const struct statfs *sfsp, int64_t used) { prthumanval((double)sfsp->f_blocks * (double)sfsp->f_bsize); *************** *** 408,417 **** static void prtstat(struct statfs *sfsp, struct maxwidths *mwp) { ! static long blocksize; static int headerlen, timesthrough = 0; static const char *header; ! size_t used, availblks, inodes; if (++timesthrough == 1) { mwp->mntfrom = max(mwp->mntfrom, strlen("Filesystem")); --- 408,417 ---- static void prtstat(struct statfs *sfsp, struct maxwidths *mwp) { ! static u_long blocksize; static int headerlen, timesthrough = 0; static const char *header; ! int64_t used, availblks, inodes; if (++timesthrough == 1) { mwp->mntfrom = max(mwp->mntfrom, strlen("Filesystem")); *************** *** 445,463 **** if (hflag) { prthuman(sfsp, used); } else { ! (void)printf(" %*ld %*ld %*ld", ! (u_int)mwp->total, fsbtoblk(sfsp->f_blocks, sfsp->f_bsize, blocksize), ! (u_int)mwp->used, fsbtoblk(used, sfsp->f_bsize, blocksize), ! (u_int)mwp->avail, fsbtoblk(sfsp->f_bavail, sfsp->f_bsize, ! blocksize)); } (void)printf(" %5.0f%%", availblks == 0 ? 
100.0 : (double)used / (double)availblks * 100.0); if (iflag) { inodes = sfsp->f_files; used = inodes - sfsp->f_ffree; ! (void)printf(" %*lu %*lu %4.0f%% ", ! (u_int)mwp->iused, (u_long)used, (u_int)mwp->ifree, sfsp->f_ffree, inodes == 0 ? 100.0 : (double)used / (double)inodes * 100.0); } else --- 445,465 ---- if (hflag) { prthuman(sfsp, used); } else { ! (void)printf(" %*qd %*qd %*qd", ! (u_int)mwp->total, ! fsbtoblk(sfsp->f_blocks, sfsp->f_bsize, blocksize), ! (u_int)mwp->used, ! fsbtoblk(used, sfsp->f_bsize, blocksize), ! (u_int)mwp->avail, ! fsbtoblk(sfsp->f_bavail, sfsp->f_bsize, blocksize)); } (void)printf(" %5.0f%%", availblks == 0 ? 100.0 : (double)used / (double)availblks * 100.0); if (iflag) { inodes = sfsp->f_files; used = inodes - sfsp->f_ffree; ! (void)printf(" %*qd %*qd %4.0f%% ", ! (u_int)mwp->iused, used, (u_int)mwp->ifree, sfsp->f_ffree, inodes == 0 ? 100.0 : (double)used / (double)inodes * 100.0); } else *************** *** 472,498 **** static void update_maxwidths(struct maxwidths *mwp, const struct statfs *sfsp) { ! static long blocksize = 0; int dummy; if (blocksize == 0) getbsize(&dummy, &blocksize); mwp->mntfrom = max(mwp->mntfrom, strlen(sfsp->f_mntfromname)); ! mwp->total = max(mwp->total, longwidth(fsbtoblk(sfsp->f_blocks, sfsp->f_bsize, blocksize))); ! mwp->used = max(mwp->used, longwidth(fsbtoblk(sfsp->f_blocks - ! sfsp->f_bfree, sfsp->f_bsize, blocksize))); ! mwp->avail = max(mwp->avail, longwidth(fsbtoblk(sfsp->f_bavail, ! sfsp->f_bsize, blocksize))); ! mwp->iused = max(mwp->iused, longwidth(sfsp->f_files - sfsp->f_ffree)); ! mwp->ifree = max(mwp->ifree, longwidth(sfsp->f_ffree)); } /* Return the width in characters of the specified long. */ static size_t ! longwidth(long val) { size_t len; --- 474,500 ---- static void update_maxwidths(struct maxwidths *mwp, const struct statfs *sfsp) { ! 
static u_long blocksize = 0; int dummy; if (blocksize == 0) getbsize(&dummy, &blocksize); mwp->mntfrom = max(mwp->mntfrom, strlen(sfsp->f_mntfromname)); ! mwp->total = max(mwp->total, int64width( ! fsbtoblk((int64_t)sfsp->f_blocks, sfsp->f_bsize, blocksize))); ! mwp->used = max(mwp->used, int64width(fsbtoblk((int64_t)sfsp->f_blocks - ! (int64_t)sfsp->f_bfree, sfsp->f_bsize, blocksize))); ! mwp->avail = max(mwp->avail, int64width(fsbtoblk(sfsp->f_bavail, sfsp->f_bsize, blocksize))); ! mwp->iused = max(mwp->iused, int64width((int64_t)sfsp->f_files - sfsp->f_ffree)); ! mwp->ifree = max(mwp->ifree, int64width(sfsp->f_ffree)); } /* Return the width in characters of the specified long. */ static size_t ! int64width(int64_t val) { size_t len; From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 21:30:29 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7E25616A4CE; Wed, 5 Nov 2003 21:30:29 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id F14D04400B; Wed, 5 Nov 2003 21:30:26 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id QAA12100; Thu, 6 Nov 2003 16:30:23 +1100 Date: Thu, 6 Nov 2003 16:30:22 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Brian Fundakowski Feldman In-Reply-To: <200311051925.hA5JPT6S003092@green.bikeshed.org> Message-ID: <20031106160854.H6157@gamplex.bde.org> References: <200311051925.hA5JPT6S003092@green.bikeshed.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org cc: fenner@freebsd.org Subject: Re: bpf/pcap are weird X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 05:30:29 -0000

On Wed, 5 Nov 2003, Brian Fundakowski Feldman wrote:

> Okay, this is goofy stuff and breaks a lot of code that otherwise makes
> certain assumptions about pcap/bpf that don't work on FreeBSD. Our bpf(4)
> doesn't actually care about the non-blocking fd flag, and our pcap(3)
> doesn't care at all about BIOCIMMEDIATE. Why do we have BIOCIMMEDIATE? It
> seems like it's what SHOULD be implemented with the non-blocking I/O flag
> with the exception that if using O_NONBLOCK/FIONBIO you could actually query
> for the status, whereas you can't query for BIOCIMMEDIATE since it's only a
> SET and not a GET ioctl.

Er, FreeBSD's bpf certainly cares about the non-blocking fd flag. It uses it in bpfread(), although not in any other device switch function:

	if (ioflag & IO_NDELAY) {
		BPFD_UNLOCK(d);
		return (EWOULDBLOCK);
	}

NetBSD still seems to use the old 4.4 code, which ignores the non-blocking fd flag in bpfread() and doesn't even use a dedicated non-blocking device flag (it overloads the timeout).

bpfpoll() is reported to be broken; see PR 36219. Rev.1.113 of bpf.c may have disturbed this. It removed the comment which said that bpf_ready() doesn't actually imitate resultof(FIONREAD) != 0.

I don't know anything about BIOCIMMEDIATE.
Bruce From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 22:45:32 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7C4A516A4CE; Wed, 5 Nov 2003 22:45:32 -0800 (PST) Received: from smtp01.syd.iprimus.net.au (smtp01.syd.iprimus.net.au [210.50.30.52]) by mx1.FreeBSD.org (Postfix) with ESMTP id C721D43FAF; Wed, 5 Nov 2003 22:45:31 -0800 (PST) (envelope-from tim@robbins.dropbear.id.au) Received: from robbins.dropbear.id.au (210.50.203.152) by smtp01.syd.iprimus.net.au (7.0.020) id 3F8B009E00917232; Thu, 6 Nov 2003 17:45:30 +1100 Received: by robbins.dropbear.id.au (Postfix, from userid 1000) id 20B6160FD; Thu, 6 Nov 2003 17:45:29 +1100 (EST) Date: Thu, 6 Nov 2003 17:45:28 +1100 From: Tim Robbins To: Kirk McKusick Message-ID: <20031106064528.GA1440@wombat.robbins.dropbear.id.au> References: <200311060504.hA654feN034044@beastie.mckusick.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200311060504.hA654feN034044@beastie.mckusick.com> User-Agent: Mutt/1.4.1i cc: Robert Watson cc: arch@freebsd.org cc: Peter Wemm Subject: Re: >0x7fffffff blocksize filesystem reporting X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 06:45:32 -0000 On Wed, Nov 05, 2003 at 09:04:41PM -0800, Kirk McKusick wrote: > + /* > + * Convert a new format statfs structure to an old format statfs structure. 
> + */ > + static void > + cvtstatfs(td, nsp, osp) > + struct thread *td; > + struct statfs *nsp; > + struct ostatfs *osp; > + { > + > + bzero(osp, sizeof(*osp)); > + osp->f_bsize = nsp->f_bsize; > + osp->f_iosize = nsp->f_iosize; > + osp->f_blocks = nsp->f_blocks; > + osp->f_bfree = nsp->f_bfree; > + osp->f_bavail = nsp->f_bavail; > + osp->f_files = nsp->f_files; > + osp->f_ffree = nsp->f_ffree; > + osp->f_owner = nsp->f_owner; > + osp->f_type = nsp->f_type; > + osp->f_flags = nsp->f_flags; > + osp->f_syncwrites = nsp->f_syncwrites; > + osp->f_asyncwrites = nsp->f_asyncwrites; > + osp->f_syncreads = nsp->f_syncreads; > + osp->f_asyncreads = nsp->f_asyncreads; It may be better to return LONG_MAX for some of these members than to truncate the value. Alternatively, the block size could be adjusted to ensure that f_blocks fits in a "long" even though f_blocks * f_bsize may overflow it, but this is messy and can't help if f_files or f_{sync,async}{reads,writes} are too big. > + bcopy(nsp->f_fstypename, osp->f_fstypename, MFSNAMELEN); > + bcopy(nsp->f_mntonname, osp->f_mntonname, MNAMELEN); > + bcopy(nsp->f_mntfromname, osp->f_mntfromname, MNAMELEN); On architectures where longs are not 32 bits (amd64), OMNAMELEN != MNAMELEN, so this may do the wrong thing. 
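A minimal sketch of both points, not taken from the patch: clamp the wide counters to the range of a long instead of truncating them, and bound the name copies by the destination size. The helper names clamp_to_long() and copy_name() are hypothetical, and the caller is assumed to pass the size of the old-format buffer (whatever OMNAMELEN works out to) as dstlen.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: saturate a 64-bit value to the range of a long
 * rather than silently truncating it. Where long is already 64 bits
 * the comparisons are never true and the value passes through.
 */
static long
clamp_to_long(int64_t v)
{
	if (v > (int64_t)LONG_MAX)
		return (LONG_MAX);
	if (v < (int64_t)LONG_MIN)
		return (LONG_MIN);
	return ((long)v);
}

/*
 * Hypothetical helper: copy a name into a buffer that may be smaller
 * than the source, truncating if needed and always NUL-terminating.
 * The unused tail of dst is zeroed so no stale bytes are copied out.
 */
static void
copy_name(char *dst, size_t dstlen, const char *src, size_t srclen)
{
	size_t n;

	assert(dstlen > 0);
	for (n = 0; n < srclen && n < dstlen - 1 && src[n] != '\0'; n++)
		dst[n] = src[n];
	memset(dst + n, 0, dstlen - n);
}
```

In a conversion routine this would read something like osp->f_blocks = clamp_to_long(nsp->f_blocks) and copy_name(osp->f_mntonname, sizeof(osp->f_mntonname), nsp->f_mntonname, sizeof(nsp->f_mntonname)), assuming those members keep the names used in the patch.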
Tim From owner-freebsd-arch@FreeBSD.ORG Wed Nov 5 23:58:19 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4C91D16A4CE; Wed, 5 Nov 2003 23:58:19 -0800 (PST) Received: from eth0.b.smtp.sonic.net (eth0.b.smtp.sonic.net [64.142.19.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4EDD843FD7; Wed, 5 Nov 2003 23:58:18 -0800 (PST) (envelope-from gharris@sonic.net) Received: from quadrajet.sonic.net (adsl-209-204-185-249.sonic.net [209.204.185.249])hA67wHmS030499; Wed, 5 Nov 2003 23:58:17 -0800 Received: (from guy@localhost) by quadrajet.sonic.net (8.9.3/8.9.3) id XAA08727; Wed, 5 Nov 2003 23:58:17 -0800 (PST) (envelope-from gharris) Date: Wed, 5 Nov 2003 23:58:16 -0800 From: Guy Harris To: Brian Fundakowski Feldman Message-ID: <20031105235816.E331@quadrajet.sonic.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.4i cc: arch@FreeBSD.org cc: fenner@FreeBSD.org Subject: Re: bpf/pcap are weird X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 07:58:19 -0000 > Okay, this is goofy stuff and breaks a lot of code that otherwise makes > certain assumptions about pcap/bpf that don't work on FreeBSD. Our > bpf(4) doesn't actually care about the non-blocking fd flag, and our pcap(3) > doesn't care at all about BIOCIMMEDIATE. This is a libpcap deficiency that I will probably fix at some point, as 1) some libpcap applications might want that mode and 2) the way you get that mode differs on different platforms (some platforms always implement it, e.g. Linux; other platforms have different ways of requesting it). It's in my queue along with a number of other libpcap deficiencies. > Why do we have BIOCIMMEDIATE? 
> It seems like it's what SHOULD be implemented with the non-blocking I/O > flag No. BIOCIMMEDIATE and non-blocking mode are different. BIOCIMMEDIATE mode means "make incoming packets readable immediately; don't buffer them up until either the store buffer is full or the timeout expires". This is for use in, for example, applications that are using BPF to implement network protocols, and want to be able to respond immediately to incoming packets, as opposed to, for example, packet capture applications (tcpdump, Ethereal, etc.) which don't necessarily need to immediately show or save incoming packets and which might want to try to get as many packets as possible per read on the BPF device. It does *NOT* mean "an attempt to read on this device won't block even if *no* packets are available", nor should it - applications running in BIOCIMMEDIATE mode would probably still want to block, rather than spin, if no packets are available. Non-blocking mode should mean "an attempt to read on this device won't block, even if there are no packets remaining", so it's not identical to BIOCIMMEDIATE mode. If used in conjunction with a properly-working "select()" or "poll()" - i.e., one that causes the timeout timer to start when the "select()" or "poll()" is done, so that the "select()" or "poll()" will wake up if the store buffer fills *OR* the timeout expires - then it does need to be the case that, if the "select()" or "poll()" says a read on the BPF device will succeed, it will, in fact, succeed. This could be implemented by having reads in non-blocking mode always do a buffer rotation if there are packets in the store buffer but not the hold buffer, just as is the case in BIOCIMMEDIATE mode. That's currently done in "bpf_read()" - note the "|| timed_out" in the "if" inside the "while (d->bd_hbuf == 0)" loop. 
That appears to have been introduced in 4.5, in revision 1.59.2.8, which was an MFC of revision 1.86: Make bpf's read timeout feature work more correctly with select/poll, and therefore with pthreads. I doubt there is any way to make this 100% semantically identical to the way it behaves in unthreaded programs with blocking reads, but the solution here should do the right thing for all reasonable usage patterns. The basic idea is to schedule a callout for the read timeout when a select/poll is done. When the callout fires, it ends the select if it is still in progress, or marks the state as "timed out" if the select has already ended for some other reason. Additional logic in bpfread then does the right thing in the case where the timeout has fired. Note, I co-opted the bd_state member of the bpf_d structure. It has been present in the structure since the initial import of 4.4-lite, but as far as I can tell it has never been used. PR: kern/22063 and bin/31649 PR 22063 is "bpf when used with the select system call with timeout doesn't forward packets on timeout": When bpf is accessed via libpcap with the select system call with a timeout set if a less than full buffer of packets received on the interface (and passed to bpf.c) they will never be returned to libpcap even on a timeout. OpenBSD has a partial fix for this (it gets the first packet of 9 up and leaves the other 8) which I have corrected, reported to OpenBSD and ported to FreeBSD. As a side note one of the OpenBSD people is working on a better bpf implementation and would be interested in help by someone knowledgable in the FreeBSD VM system to assist porting his code when finished to FreeBSD. (I think the "better bpf implementation" might be Michael Stolarchuk's memory-mapped BPF, but I don't know whether it ever saw the light of day.) 
PR 31649 is "libpcap doesn't work with -pthread"; the problem is that the userland pthreads library requires that "select()"/"poll()" and non-blocking reads work on anything from which you're trying to read if you can get long-term waits on it - and that wasn't the case for BPF devices. The question then is whether if *not* used with "select()" or "poll()" reads should return whatever packets are there, even if the timer hasn't expired. One could argue that it should, in which case the "if" in question should also check for "ioflag & IO_NDELAY". I don't know whether that would cause problems for any applications, though. From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 00:38:51 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4A43616A4CE; Thu, 6 Nov 2003 00:38:51 -0800 (PST) Received: from eth0.b.smtp.sonic.net (eth0.b.smtp.sonic.net [64.142.19.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 563874400E; Thu, 6 Nov 2003 00:38:50 -0800 (PST) (envelope-from gharris@sonic.net) Received: from quadrajet.sonic.net (adsl-209-204-185-249.sonic.net [209.204.185.249])hA68cnmS002917; Thu, 6 Nov 2003 00:38:49 -0800 Received: (from guy@localhost) by quadrajet.sonic.net (8.9.3/8.9.3) id AAA08793; Thu, 6 Nov 2003 00:38:48 -0800 (PST) (envelope-from gharris) Date: Thu, 6 Nov 2003 00:38:48 -0800 From: Guy Harris To: Bruce Evans Message-ID: <20031106003848.F331@quadrajet.sonic.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.4i cc: arch@freebsd.org cc: fenner@freebsd.org Subject: Re: bpf/pcap are weird X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 08:38:51 -0000 > bpfpoll() is reported to be broken; see PR 
36219.

Yes, that's the PR that indicated that "select()"/"poll()" don't, in fact, work correctly with timeouts. That was fixed in 4.5...

> Rev.1.113 of bpf.c may have disturbed this.

...but might have been re-broken.

> It removed the comment which said that
> bpf_ready() doesn't actually imitate resultof(FIONREAD) != 0.

...and it also removed the check for "d->bd_state == BPF_TIMED_OUT" that made "select()"/"poll()" work with timeouts.

And, unfortunately, it looks as if that's in 4.9 - revision 1.59.2.13 has it, and that's tagged with RELENG_4_9_0_RELEASE.

If non-blocking reads *do* cause buffer rotation if necessary, this can, at least for "select()"/"poll()"-based applications that can control the timeout in their select loop, be worked around by making the timeout timer be a timeout in "select()" or "poll()", putting the BPF device in non-blocking mode (for which there is an API in recent versions of libpcap, and in older versions you can just set the flag on the descriptor you get from "pcap_fileno()"), and reading from the BPF device (or the pcap_t) either if "select()"/"poll()" says you can *or* the "select()"/"poll()" timer has expired.

If they don't cause a buffer rotation, however, you're screwed unless there's some other way to force the buffer rotation. Turning BIOCIMMEDIATE mode on would take care of that - but it would also turn off packet buffering, which might cause too much CPU overhead when capturing on a busy network. (Or it might not; I don't know whether anybody's measured it.)

However, in 4.9, a buffer rotation will be done in "bpf_read()" if a timeout has occurred, so it looks as if that workaround would work.

In 4.x releases from 4.5 through 4.8, "select()"/"poll()" should, I think, work with timeouts, and reads after a timeout has occurred should rotate the buffers even if the timeout occurred before the "read()" (such as in a "select()"/"poll()").
In releases prior to 4.4, "select()"/"poll()" won't work with timeouts - but non-blocking reads will force a buffer rotation; if the hold buffer is empty, and IO_NDELAY is set in "ioflag", "error" will be set to EWOULDBLOCK, and it'll eventually fall through to the "rotate buffers if EWOULDBLOCK is true, the hold buffer is empty, and the store buffer isn't" code. In 4.4, "bpf_read()" was changed (revision 1.59.2.5, MFC of 1.72) so that if the hold buffer is empty, and IO_NDELAY is set in "ioflag", it'll return EWOULDBLOCK. I.e., it was changed so that non-blocking reads *won't* force a buffer rotation; the comment for 1.72 is Fix bug: a read() on a bpf device which was in non-blocking mode and had no data available returned 0. Now it returns -1 with errno set to EWOULDBLOCK (== EAGAIN) as it should. This fix makes the bpf device usable in threaded programs. (It didn't make it usable, as far as I know, because the "select()"/"poll()" behavior also had to be fixed.) I don't know whether there's a PR for that bug or not. I have vague memories of seeing the problem reported, but it might've been in one of the mailing lists. I don't remember what the symptoms were, but it *might* mean that making non-blocking reads always rotate the buffers might cause a regression. Unfortunately, having non-blocking reads not force a buffer rotation means that the workaround for the BPF "select()"/"poll()" will, I think, not work - and it's not obvious how to make it work. (This means that if I make Ethereal register the pcap_t descriptor in the GTK+ main loop, and have the main loop wait both for UI events and packet arrival, that probably won't work on FreeBSD 4.4....) 
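For concreteness, the select-timeout workaround described earlier in this message might look roughly like the sketch below. It is an illustration under stated assumptions, not libpcap or kernel code: read_with_select_timeout() is a hypothetical name, the descriptor is treated as a plain fd (for a capture it would be whatever "pcap_fileno()" returns), and whether the final read actually yields packets on a BPF device depends on the buffer-rotation behavior of the release in question.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: put fd in non-blocking mode, wait up to
 * timeout_ms in select(), then attempt a read whether select()
 * reported the descriptor readable or merely timed out.
 * Returns the number of bytes read, 0 if nothing was available
 * (or on end of file), and -1 on any other error.
 */
static ssize_t
read_with_select_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	ssize_t n;
	int flags;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return (-1);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;
	if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1)
		return (-1);

	/* Read on readiness *or* on timer expiry; never block here. */
	n = read(fd, buf, len);
	if (n == -1 && (errno == EWOULDBLOCK || errno == EAGAIN))
		return (0);
	return (n);
}
```

The timeout lives in "select()" rather than in the device, so the loop stays responsive even where the device's own read timeout does not wake up "select()".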
From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 03:44:51 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F277416A4CF; Thu, 6 Nov 2003 03:44:50 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id D7A6F43FE3; Thu, 6 Nov 2003 03:44:48 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id WAA31720; Thu, 6 Nov 2003 22:44:34 +1100 Date: Thu, 6 Nov 2003 22:44:33 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Kirk McKusick In-Reply-To: <200311060504.hA654feN034044@beastie.mckusick.com> Message-ID: <20031106211046.J7380@gamplex.bde.org> References: <200311060504.hA654feN034044@beastie.mckusick.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: Robert Watson cc: arch@freebsd.org cc: Peter Wemm Subject: Re: >0x7fffffff blocksize filesystem reporting X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 11:44:51 -0000 On Wed, 5 Nov 2003, Kirk McKusick wrote: > I have gone back and resurrected the changes for the updated statfs > structure that were discussed on arch some months ago. Because they > ... > Index: sys/sys/mount.h > =================================================================== > RCS file: /usr/ncvs/src/sys/sys/mount.h,v > retrieving revision 1.148 > diff -c -r1.148 mount.h > *** sys/sys/mount.h 1 Jul 2003 17:40:23 -0000 1.148 > --- sys/sys/mount.h 5 Nov 2003 05:10:54 -0000 > *************** > *** 63,75 **** > /* > * filesystem statistics > */ > > ! #define MFSNAMELEN 16 /* length of fs type name, including null */ > ! 
#define MNAMELEN (88 - 2 * sizeof(long)) /* size of on/from name bufs */
>
> /* XXX getfsstat.2 is out of date with write and read counter changes here. */
> /* XXX statfs.2 is out of date with read counter changes here. */
> ! struct statfs {
> 	long f_spare2; /* placeholder */
> 	long f_bsize; /* fundamental filesystem block size */
> 	long f_iosize; /* optimal transfer block size */
> --- 63,103 ----
> /*
>  * filesystem statistics
>  */
> + #define MFSNAMELEN 16 /* length of type name including null */
> + #define MNAMELEN 80 /* size of on/from name bufs */

As pointed out by tjr, there are buffer overflows from bcopy()ing MNAMELEN bytes. This is because the above MNAMELEN is larger than the old MNAMELEN on systems with sizeof(long) > 4 (the old value there is 88 - 2 * 8 = 72). It should be either the historical MNAMELEN of 90 or 88, or larger than that since we have room for expansion, or no larger than the previous smallest MNAMELEN for all machines, depending on whether the overflow bug is fixed by truncation on conversion or avoided by truncation when the name is copied in.

> + #define STATFS_VERSION 0x20030518 /* current version number */

Style bug (space instead of tab).
> + struct statfs {
> + 	u_int32_t f_version; /* structure version number */
> + 	u_int32_t f_type; /* type of filesystem */
> + 	u_int64_t f_flags; /* copy of mount exported flags */
> + 	u_int64_t f_bsize; /* filesystem fragment size */
> + 	u_int64_t f_iosize; /* optimal transfer block size */
> + 	u_int64_t f_blocks; /* total data blocks in filesystem */
> + 	u_int64_t f_bfree; /* free blocks in filesystem */
> + 	int64_t f_bavail; /* free blocks avail to non-superuser */
> + 	u_int64_t f_files; /* total file nodes in filesystem */
> + 	int64_t f_ffree; /* free nodes avail to non-superuser */
> + 	u_int64_t f_syncwrites; /* count of sync writes since mount */
> + 	u_int64_t f_asyncwrites; /* count of async writes since mount */
> + 	u_int64_t f_syncreads; /* count of sync reads since mount */
> + 	u_int64_t f_asyncreads; /* count of async reads since mount */
> + 	u_int64_t f_spare[10]; /* unused spare */
> + 	u_int32_t f_namemax; /* maximum filename length */

I dislike all these unsigned types, and to a lesser extent, typedefed types and some of the 64-bit types.

Unsigned types limit possibilities for trapping overflow if overflow actually occurs. They should not be used to get an extra bit of precision unless the extra bit is very important. The old code got this right by using only signed types. f_bavail and f_ffree need to go negative so they are correctly signed types, but this means that the extra bit for the 64-bit unsigned block and node counts cannot actually be used (since values that use it would cause overflow if most of the blocks or nodes are free).

Typedefed types are difficult to print, as shown by printf format errors in most of the formats changed in this patch. But forwards compatibility at the source level requires them.

Why 64-bit types for f_bsize and f_iosize?

If unsigned fixed-width types are used, then they should be named in a standard way (uintN_t, not u_intN_t).

There are also some standard POSIX types for block counts, etc.
These are used in POSIX's variant of statfs(). E.g., there is blkcnt_t, which is a signed integral type not yet implemented in FreeBSD. In the XSI extension there is also fsblkcnt_t, which is an _unsigned_ integral type implemented as uint64_t in FreeBSD. Its unsignedness is apparently derived from the old XSI type of unsigned long for block counts. > + uid_t f_owner; /* user that mounted the filesystem */ > + fsid_t f_fsid; /* filesystem id */ > + char f_charspare[76]; /* spare string space */ > + char f_fstypename[MFSNAMELEN]; /* filesystem type name */ > + char f_mntfromname[MNAMELEN]; /* mounted filesystem */ > + char f_mntonname[MNAMELEN]; /* directory on which mounted */ > + }; f_charspare would be better at the end. If explicit padding is needed after f_fsid_t, then it should be in terms of fsid_t spares. Similarly for spares after f_namemax and f_owner. These are only packed because uid_t happens to be 32-bit. fsid_t is 2*32-bit so it doesn't need padding either. 76 is an odd amount of padding. It gives 252 bytes of char arrays altogether and a struct size of 452 on i386's. 452 is 4 mod 8, so there are 4 bytes of unnamed padding at the end of the struct on most 64-bit arches (if not internal padding). > Index: sys/kern/vfs_syscalls.c > =================================================================== > RCS file: /usr/ncvs/src/sys/kern/vfs_syscalls.c,v > retrieving revision 1.332 > diff -c -r1.332 vfs_syscalls.c > *** sys/kern/vfs_syscalls.c 19 Oct 2003 20:41:07 -0000 1.332 > --- sys/kern/vfs_syscalls.c 5 Nov 2003 05:10:54 -0000 > ... > + /* > + * Convert a new format statfs structure to an old format statfs structure. 
> +  */
> + static void
> + cvtstatfs(td, nsp, osp)
> + 	struct thread *td;
> + 	struct statfs *nsp;
> + 	struct ostatfs *osp;
> + {
> +
> + 	bzero(osp, sizeof(*osp));
> + 	osp->f_bsize = nsp->f_bsize;
> + 	osp->f_iosize = nsp->f_iosize;
> + 	osp->f_blocks = nsp->f_blocks;
> + 	osp->f_bfree = nsp->f_bfree;
> + 	osp->f_bavail = nsp->f_bavail;
> + 	osp->f_files = nsp->f_files;
> + 	osp->f_ffree = nsp->f_ffree;

tjr suggested setting the values to LONG_MAX instead of blindly truncating them. I think POSIX has some methods for dealing with such overflows (from LFS). I think they would reduce to returning -1/EOVERFLOW here, which is not very useful.

tjr also suggested using a fake block size. This is already done in nfs, but the implementation is broken (nfs3 has similar 32-bit limits in the client-server interface).

> Index: sys/kern/vfs_bio.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v
> retrieving revision 1.420
> diff -c -r1.420 vfs_bio.c
> *** sys/kern/vfs_bio.c 4 Nov 2003 06:30:00 -0000 1.420
> --- sys/kern/vfs_bio.c 5 Nov 2003 05:10:54 -0000
> ***************
> *** 3239,3245 ****
> 	(int) m->pindex, (int)(foff >> 32),
> 	(int) foff & 0xffffffff, resid, i);
> 	if (!vn_isdisk(vp, NULL))
> ! 	printf(" iosize: %ld, lblkno: %jd, flags: 0x%x, npages: %d\n",
> 	bp->b_vp->v_mount->mnt_stat.f_iosize,
> 	(intmax_t) bp->b_lblkno,
> 	bp->b_flags, bp->b_npages);
> --- 3239,3245 ----
> 	(int) m->pindex, (int)(foff >> 32),
> 	(int) foff & 0xffffffff, resid, i);
> 	if (!vn_isdisk(vp, NULL))
> ! 	printf(" iosize: %jd, lblkno: %jd, flags: 0x%x, npages: %d\n",
> 	bp->b_vp->v_mount->mnt_stat.f_iosize,
> 	(intmax_t) bp->b_lblkno,
> 	bp->b_flags, bp->b_npages);

Example of a printf format error. The long was easy to print using %ld, but now there is a u_int64_t. Using %jd gives a sign mismatch on all machines and a size mismatch on machines with sizeof(u_int64_t) != sizeof(intmax_t).
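One conventional way to avoid both mismatches is to widen the value to uintmax_t and print it with %ju. The following is only a userland sketch of that idea, not part of the patch; format_iosize() is a made-up name used for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an unsigned 64-bit count portably by
 * widening it to uintmax_t and using the matching %ju conversion, so
 * neither the signedness nor the size of the argument is mismatched.
 */
static int
format_iosize(char *buf, size_t len, uint64_t iosize)
{
	return (snprintf(buf, len, "iosize: %ju", (uintmax_t)iosize));
}
```

The cast is what keeps the format correct even if the underlying typedef changes width later.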
> Index: bin/df/df.c > =================================================================== > RCS file: /usr/ncvs/src/bin/df/df.c,v > retrieving revision 1.51 > diff -c -r1.51 df.c > *** bin/df/df.c 13 Sep 2003 20:46:58 -0000 1.51 > --- bin/df/df.c 5 Nov 2003 19:22:11 -0000 > ... This has many more examples of printf format errors. The old code has many related bugs. Bruce From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 08:29:55 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from green.bikeshed.org (freefall.freebsd.org [216.136.204.21]) by hub.freebsd.org (Postfix) with ESMTP id 780DB16A4CE; Thu, 6 Nov 2003 08:29:55 -0800 (PST) Received: from green.bikeshed.org (localhost [127.0.0.1]) by green.bikeshed.org (8.12.10/8.12.9) with ESMTP id hA6GTscR021981; Thu, 6 Nov 2003 11:29:54 -0500 (EST) (envelope-from green@green.bikeshed.org) Received: from localhost (green@localhost)hA6GTrob021977; Thu, 6 Nov 2003 11:29:54 -0500 (EST) Message-Id: <200311061629.hA6GTrob021977@green.bikeshed.org> X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4 To: Sam Leffler In-Reply-To: Message from Sam Leffler of "Wed, 05 Nov 2003 21:25:07 PST." <200311052125.07773.sam@errno.com> From: "Brian F. Feldman" Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 06 Nov 2003 11:29:53 -0500 Sender: green@green.bikeshed.org cc: arch@FreeBSD.org cc: Warner Losh Subject: Re: my 802.11 work X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 16:29:55 -0000 Sam Leffler wrote: > On Wednesday 05 November 2003 08:22 pm, Brian Fundakowski Feldman wrote: > > For anyone who may be interested in any of the things I've mentioned so far > > for net80211 and the 802.11 hardware drivers, I've got a running diff with > > everything I've found important so far. 
Missing are changes to further
> > reduce gratuitous resetting in wi(4) and to reduce that problem at all in
> > ath(4). Please check out:
> > http://green.homeunix.org/~green/wi-fi.diffs

There's no way to do the MTAG_ABI_BPF/BPF_TAG_LINK_UNENCRYPTED trick without violating some sort of layering. I decided that since it's arguably an interface/link layer operation, bpf(4) would be the most appropriate place to put it. Without it, there's no way to get FreeBSD the functionality to do cleartext authentication on an encrypted link. This means no FreeBSD 802.1x authenticators, and not-very-good 802.1x supplicants.

Getting rid of the "WEP mode" flags is the obvious step to take when adding another WEP mode, hence ic_wep_mode. This simplifies a good deal of code, because it lets you make the decisions "WEP mode on?" "WEP mode supported?" "WEP mode not on?" with only a single comparison. Another enum in the ieee80211com shouldn't hurt anything.

"Hybrid" WEP mode is just as important as MTAG_ABI_BPF/BPF_TAG_LINK_UNENCRYPTED to the goal of supporting 802.1x. Of course, it could simply be collapsed into "on" mode if it's desired to return unencrypted packets to the userland when they're not requested, but I don't feel that's very secure. Also, I'm sure there's a performance hit for not setting EXCLUDE_UNENCRYPTED on wi(4) hardware.

ENETRESET is way too simplistic. I tried to work around one aspect of that with the IEEE80211_F_NORESETNODE flag, but it's not enough. ENETRESET is evilly large-grained. My ath(4) shouldn't be disassociating, scanning all the channels, and reassociating just because I change a WEP key. Changing from a WEP unsupported to WEP supported mode, yes. There are so many of these flags that shouldn't be resetting the card, but at least this is enough that wi(4) can act as an 802.1x authenticator.

The timeout code for hostap nodes is wrong. There's no consideration of whether the node is actually still around or not.
There's no reason to time out a node that's still around (accepts a zero-length DATA packet and ACKs it, resulting in a local TX complete interrupt if we ask for it), and there's no reason to keep a node around just because we tried to send a packet to it. If the packet actually transmitted successfully, then that's good enough, though. A received packet is also fine for marking the node active.

I've trivially converted the routine that previously sent only MGMT packets so that it can send any kind. This means the kernel can take it upon itself in the hostap code to probe for an inactive node, as mentioned above.

IEEE80211_IOC_BSSID should exist. There's tons and tons of features common to all 802.11 cards that aren't reflected by SIOCG80211/SIOCS80211, so the API should be created where it doesn't exist already.

IEEE80211_IOC_APNODEFLAGS is useful for determining if a node exists in a portable (not like wicontrol -l) way, for hostap. Obviously, it's also useful for toggling the IEEE80211_APNODEFLAGS_AUTHORIZED flag.

IEEE80211_IOC_APNODEDELETE is useful because the administrator or administration software for the hostap should be able to delete clients at will.

The extra hostap+authorization mode I called IFM_IEEE80211_HOSTAP | IFM_FLAG0 just because it's easy. Potentially there would be a hostap management API; if this existed, it would be a better place to store this option.

IEEE80211_NODE_FLAG_AUTHORIZED is useful as an extra layer of security that the administrator is in charge of, preventing or allowing specific nodes from utilizing the network unrelated to WEP-based authentication as supported in the first layer.

The really large bits which I haven't even tried to take on (since they're not strictly necessary... for me... at the moment) are replacing ENETRESET with something correct, and changing the SIOC[GS]80211 API so that the i_len and i_val arguments are both pervasive and functional in every call, just like sysctl.
The user should either have to know the length of a given call's i_data argument and be able to PROVE it by passing in i_len, or be able to query it with a NULL i_data argument which returns a valid i_len to the user. In no circumstances should i_data be passed in without i_len, and in no circumstances should i_data be overwritten with more data than the user requested with i_len. I consider these important security considerations. -- Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ <> green@FreeBSD.org \ The Power to Serve! \ Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\ From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 09:43:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4C0ED16A4CE for ; Thu, 6 Nov 2003 09:43:40 -0800 (PST) Received: from smtp.omnis.com (smtp.omnis.com [216.239.128.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id E2CF743F93 for ; Thu, 6 Nov 2003 09:43:38 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com (corp-2.ipinc.com [199.245.188.2]) by smtp-relay.omnis.com (Postfix) with ESMTP id 3F35E9BE82; Thu, 6 Nov 2003 09:37:30 -0800 (PST) From: Wes Peters Organization: Softweyr.com To: Peter Jeremy , kirk@mckusick.com Date: Thu, 6 Nov 2003 09:43:34 -0800 User-Agent: KMail/1.5.2 References: <200311041737.20467.wes@softweyr.com> <20031105015709.GC28915@dan.emsphone.com> <20031105081516.GA38016@cirb503493.alcatel.com.au> In-Reply-To: <20031105081516.GA38016@cirb503493.alcatel.com.au> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_Ghoq/V8Jqd8uEZp" Message-Id: <200311060943.34284.wes@softweyr.com> cc: arch@freebsd.org Subject: Re: newfs and mount vs. 
half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2003 17:43:40 -0000 --Boundary-00=_Ghoq/V8Jqd8uEZp Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Wednesday 05 November 2003 00:15, Peter Jeremy wrote: > On Tue, Nov 04, 2003 at 07:57:10PM -0600, Dan Nelson wrote: > >In the last episode (Nov 04), Wes Peters said: > >> I emailed Kirk about this state of affairs and he confirmed that > >> newfs was developed with operator intervention in mind. He > >> suggested employing one of the unused flags in the filesystem > >> header as a 'consistent' flag, setting it to 'not consistent' at > >> the beginning of newfs, and then updating to 'is consistent' at > >> the end. The performance hit in updating all superblock copies at > >> the end is small but noticable (< 1s on a rather slow 6GB > >> filesystem). > > > >Would writing a block of zeros to the first (or first n) superblock, > >newfs'ing, then rewriting the correct data do the same thing without > >affecting the filesystem itself? I'm thinking about 4.x and > > cross-OS portability here. > > My suggestion would be to write a non-standard magic number to > fs_magic in the primary and first backup superblock (block 32) - I > believe these are the only ones fsck will automatically search. The > "invalid" magic number means that neither mount nor fsck will > recognize the partition. Those two blocks can be re-written at the > end - the additional time should be unnoticable. The remaining > superblocks would appear valid but if someone is silly enough to > manually specify a alternate superblock in an incompletely newfs'd > filesystem, they get a neat hole in their foot. 
(A known > non-standard magic number would also allow fsck to warn that the > filesystem was incompletely newfs'd). > > I'm surprised that this bug hasn't been noticed previously. I found an unused field called "fs_state" and used that, as Kirk suggested. Here's the new patch, which changes fsck to notice the fs_state and doesn't require re-writing all of the superblocks. This patch adds a -E (generate errors) option to fsck, causing fsck to exit at various stages or to otherwise leave the state of fs_state and fs_clean in other than pristine conditions. I will, of course, commit the -E changes separately from the fs_state changes. Thanks in advance for reviewing. And yes, I did manage to attach the patch this time. Doh! -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com --Boundary-00=_Ghoq/V8Jqd8uEZp Content-Type: text/x-diff; charset="iso-8859-1"; name="fs_state.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="fs_state.patch" --- sys/ufs/ffs/ffs_vfsops.c.orig Tue Oct 14 12:23:07 2003 +++ sys/ufs/ffs/ffs_vfsops.c Tue Oct 14 12:43:59 2003 @@ -654,6 +654,12 @@ fs->fs_fmod = 0; fs->fs_flags &= ~FS_INDEXDIRS; /* no support for directory indicies */ fs->fs_flags &= ~FS_UNCLEAN; + if (fs->fs_state != 0) { + printf( +"WARNING: Filesystem on %s is incomplete, rerun newfs.\n", fs->fs_fsmnt); + error = EINVAL; + goto out; + } if (fs->fs_clean == 0) { fs->fs_flags |= FS_UNCLEAN; if (ronly || (mp->mnt_flag & MNT_FORCE) || --- lib/libc/sys/mount.2.orig Tue Oct 14 12:40:33 2003 +++ lib/libc/sys/mount.2 Tue Oct 14 12:41:10 2003 @@ -242,7 +242,7 @@ No space remains in the mount table. .It Bq Er EINVAL The super block for the file system had a bad magic -number or an out of range block size. +number or an out of range block size, or was incomplete. .It Bq Er ENOMEM Not enough memory was available to read the cylinder group information for the file system. 
--- sbin/newfs/newfs.h.orig Tue Nov 4 16:27:31 2003 +++ sbin/newfs/newfs.h Tue Nov 4 16:27:56 2003 @@ -52,6 +52,7 @@ extern int Oflag; /* build UFS1 format file system */ extern int Rflag; /* regression test */ extern int Uflag; /* enable soft updates for file system */ +extern int ErrorFlag; /* exit as if error, for testing */ extern quad_t fssize; /* file system size */ extern int sectorsize; /* bytes/sector */ extern int realsectorsize; /* bytes/sector in hardware*/ --- sbin/newfs/newfs.c.orig Tue Nov 4 16:20:42 2003 +++ sbin/newfs/newfs.c Tue Nov 4 16:27:02 2003 @@ -119,6 +119,7 @@ int Oflag = 2; /* file system format (1 => UFS1, 2 => UFS2) */ int Rflag; /* regression test */ int Uflag; /* enable soft updates for file system */ +int ErrorFlag = 0; /* exit in middle of newfs for testing */ quad_t fssize; /* file system size */ int sectorsize; /* bytes/sector */ int realsectorsize; /* bytes/sector in hardware */ @@ -156,8 +157,11 @@ off_t mediasize; while ((ch = getopt(argc, argv, - "L:NO:RS:T:Ua:b:c:d:e:f:g:h:i:m:o:s:")) != -1) + "EL:NO:RS:T:Ua:b:c:d:e:f:g:h:i:m:o:s:")) != -1) switch (ch) { + case 'E': + ErrorFlag++; + break; case 'L': volumelabel = optarg; i = -1; --- sbin/newfs/mkfs.c.orig Tue Oct 14 13:53:55 2003 +++ sbin/newfs/mkfs.c Thu Nov 6 09:01:02 2003 @@ -388,8 +388,8 @@ sblock.fs_pendinginodes = 0; sblock.fs_fmod = 0; sblock.fs_ronly = 0; - sblock.fs_state = 0; - sblock.fs_clean = 1; + sblock.fs_state = 0xdeadbeef; + sblock.fs_clean = 0; sblock.fs_id[0] = (long)utime; sblock.fs_id[1] = newfs_random(); sblock.fs_fsmnt[0] = '\0'; @@ -448,11 +448,27 @@ chdummy, SBLOCKSIZE); } } + if (!Nflag) + sbwrite(&disk, 0); + if (ErrorFlag == 1) { + printf("** Exiting on ErrorFlag 1\n"); + exit(0); + } /* * Now build the cylinders group blocks and * then print out indices of cylinder groups. - */ + * The superblock backups in the cylinder groups + * are created clean & stable so we don't have + * to make another pass cleaning them up. 
+ */ + if (ErrorFlag == 5) + printf("** Leaving superblock backups dirty on ErrorFlag 5\n"); + else { + sblock.fs_state = 0; + sblock.fs_clean = 1; + } + printf("super-block backups (for fsck -b #) at:\n"); i = 0; width = charsperline(); @@ -492,6 +508,10 @@ printf("\n"); if (Nflag) exit(0); + if (ErrorFlag == 2) { + printf("** Exiting on ErrorFlag 2\n"); + exit(0); + } /* * Now construct the initial file system, * then write out the super-block. @@ -503,6 +523,22 @@ sblock.fs_old_cstotal.cs_nifree = sblock.fs_cstotal.cs_nifree; sblock.fs_old_cstotal.cs_nffree = sblock.fs_cstotal.cs_nffree; } + + /* + * Update the primary superblock setting the state to + * consistent and clean (unless the user told us not to). + */ + if (ErrorFlag == 3) { + printf("** Not clean on ErrorFlag 3\n"); + sblock.fs_clean = 0; + } else if (ErrorFlag == 4) { + printf("** Not stable on ErrorFlag 4\n"); + sblock.fs_state = 0xdeadface; + } else { + sblock.fs_state = 0; + sblock.fs_clean = 1; + } + if (!Nflag) sbwrite(&disk, 0); for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) --- sbin/fsck_ffs/setup.c.orig Wed Nov 5 15:50:12 2003 +++ sbin/fsck_ffs/setup.c Thu Nov 6 08:36:58 2003 @@ -307,6 +307,11 @@ bflag); return (0); } + if (sblock.fs_state != 0) { + fprintf(stderr, "superblock %d is not finished\n", + bflag); + return (0); + } } else { for (i = 0; sblock_try[i] != -1; i++) { super = sblock_try[i] / dev_bsize; @@ -317,9 +322,11 @@ (sblock.fs_magic == FS_UFS2_MAGIC && sblock.fs_sblockloc == sblock_try[i])) && sblock.fs_ncg >= 1 && + sblock.fs_state == 0 && sblock.fs_bsize >= MINBSIZE && - sblock.fs_bsize >= sizeof(struct fs)) + sblock.fs_bsize >= sizeof(struct fs)) { break; + } } if (sblock_try[i] == -1) { fprintf(stderr, "Cannot find file system superblock\n"); --Boundary-00=_Ghoq/V8Jqd8uEZp-- From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 16:02:05 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by 
hub.freebsd.org (Postfix) with ESMTP id BE6F516A4CF; Thu, 6 Nov 2003 16:02:05 -0800 (PST) Received: from beastie.mckusick.com (beastie.mckusick.com [209.31.233.184]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7929443FEC; Thu, 6 Nov 2003 16:02:04 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Received: from beastie.mckusick.com (localhost [127.0.0.1]) by beastie.mckusick.com (8.12.8/8.12.3) with ESMTP id hA7021eN035691; Thu, 6 Nov 2003 16:02:01 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Message-Id: <200311070002.hA7021eN035691@beastie.mckusick.com> To: Tim Robbins In-Reply-To: Your message of "Thu, 06 Nov 2003 17:45:28 +1100." <20031106064528.GA1440@wombat.robbins.dropbear.id.au> Date: Thu, 06 Nov 2003 16:02:01 -0800 From: Kirk McKusick cc: Robert Watson cc: arch@freebsd.org cc: Peter Wemm Subject: Re: >0x7fffffff blocksize filesystem reporting X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 00:02:05 -0000 > Date: Thu, 6 Nov 2003 17:45:28 +1100 > From: Tim Robbins > To: Kirk McKusick > Cc: Peter Wemm , Robert Watson , > arch@freebsd.org > Subject: Re: >0x7fffffff blocksize filesystem reporting > X-ASK-Info: Whitelist match > > On Wed, Nov 05, 2003 at 09:04:41PM -0800, Kirk McKusick wrote: > > > + /* > > + * Convert a new format statfs structure to an old format statfs structure. 
> > + */ > > + static void > > + cvtstatfs(td, nsp, osp) > > + struct thread *td; > > + struct statfs *nsp; > > + struct ostatfs *osp; > > + { > > + > > + bzero(osp, sizeof(*osp)); > > + osp->f_bsize = nsp->f_bsize; > > + osp->f_iosize = nsp->f_iosize; > > + osp->f_blocks = nsp->f_blocks; > > + osp->f_bfree = nsp->f_bfree; > > + osp->f_bavail = nsp->f_bavail; > > + osp->f_files = nsp->f_files; > > + osp->f_ffree = nsp->f_ffree; > > + osp->f_owner = nsp->f_owner; > > + osp->f_type = nsp->f_type; > > + osp->f_flags = nsp->f_flags; > > + osp->f_syncwrites = nsp->f_syncwrites; > > + osp->f_asyncwrites = nsp->f_asyncwrites; > > + osp->f_syncreads = nsp->f_syncreads; > > + osp->f_asyncreads = nsp->f_asyncreads; > > It may be better to return LONG_MAX for some of these members than to > truncate the value. Alternatively, the block size could be adjusted > to ensure that f_blocks fits in a "long" even though f_blocks * f_bsize > may overflow it, but this is messy and can't help if f_files or > f_{sync,async}{reads,writes} are too big. > > > + bcopy(nsp->f_fstypename, osp->f_fstypename, MFSNAMELEN); > > + bcopy(nsp->f_mntonname, osp->f_mntonname, MNAMELEN); > > + bcopy(nsp->f_mntfromname, osp->f_mntfromname, MNAMELEN); > > On architectures where longs are not 32 bits (amd64), OMNAMELEN != MNAMELEN, > so this may do the wrong thing. > > > Tim You make two good points. Here is my revised diff for cvtstatfs: + /* + * Convert a new format statfs structure to an old format statfs structure. 
+  */
+ static void
+ cvtstatfs(td, nsp, osp)
+ 	struct thread *td;
+ 	struct statfs *nsp;
+ 	struct ostatfs *osp;
+ {
+
+ 	bzero(osp, sizeof(*osp));
+ 	osp->f_bsize = MIN(nsp->f_bsize, LONG_MAX);
+ 	osp->f_iosize = MIN(nsp->f_iosize, LONG_MAX);
+ 	osp->f_blocks = MIN(nsp->f_blocks, LONG_MAX);
+ 	osp->f_bfree = MIN(nsp->f_bfree, LONG_MAX);
+ 	osp->f_bavail = MIN(nsp->f_bavail, LONG_MAX);
+ 	osp->f_files = MIN(nsp->f_files, LONG_MAX);
+ 	osp->f_ffree = MIN(nsp->f_ffree, LONG_MAX);
+ 	osp->f_owner = nsp->f_owner;
+ 	osp->f_type = nsp->f_type;
+ 	osp->f_flags = nsp->f_flags;
+ 	osp->f_syncwrites = MIN(nsp->f_syncwrites, LONG_MAX);
+ 	osp->f_asyncwrites = MIN(nsp->f_asyncwrites, LONG_MAX);
+ 	osp->f_syncreads = MIN(nsp->f_syncreads, LONG_MAX);
+ 	osp->f_asyncreads = MIN(nsp->f_asyncreads, LONG_MAX);
+ 	bcopy(nsp->f_fstypename, osp->f_fstypename,
+ 	    MIN(MFSNAMELEN, OMNAMELEN));
+ 	bcopy(nsp->f_mntonname, osp->f_mntonname,
+ 	    MIN(MNAMELEN, OMNAMELEN));
+ 	bcopy(nsp->f_mntfromname, osp->f_mntfromname,
+ 	    MIN(MNAMELEN, OMNAMELEN));

From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 16:25:06 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D9D1416A4CE; Thu, 6 Nov 2003 16:25:06 -0800 (PST) Received: from beastie.mckusick.com (beastie.mckusick.com [209.31.233.184]) by mx1.FreeBSD.org (Postfix) with ESMTP id DF76044005; Thu, 6 Nov 2003 16:25:05 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Received: from beastie.mckusick.com (localhost [127.0.0.1]) by beastie.mckusick.com (8.12.8/8.12.3) with ESMTP id hA70P4eN035719; Thu, 6 Nov 2003 16:25:05 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Message-Id: <200311070025.hA70P4eN035719@beastie.mckusick.com> To: Bruce Evans In-Reply-To: Your message of "Thu, 06 Nov 2003 22:44:33 +1100."
<20031106211046.J7380@gamplex.bde.org> Date: Thu, 06 Nov 2003 16:25:04 -0800 From: Kirk McKusick cc: Robert Watson cc: arch@freebsd.org cc: Peter Wemm Subject: Re: >0x7fffffff blocksize filesystem reporting X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 00:25:07 -0000

> Date: Thu, 6 Nov 2003 22:44:33 +1100 (EST)
> From: Bruce Evans
> To: Kirk McKusick
> cc: Peter Wemm , Robert Watson ,
> arch@freebsd.org
> Subject: Re: >0x7fffffff blocksize filesystem reporting
> X-ASK-Info: Whitelist match
>
> On Wed, 5 Nov 2003, Kirk McKusick wrote:
>
> > + #define MNAMELEN 80 /* size of on/from name bufs */
>
> As pointed out by tjr, there are buffer overflows from bcopy()ing MNAMELEN
> bytes. This is because the above MNAMELEN is larger than the old MNAMELEN
> on systems with sizeof(long) > 4. It should be either the historical
> MNAMELEN of 90 or 88, or larger than that since we have room for expansion,
> or no larger than the previous smallest MNAMELEN for all machines, depending
> on whether the overflow bug is fixed by truncation on conversion or avoided
> by truncation when the name is copied in.

Per my earlier message to this list, I now copy the smaller of MNAMELEN and OMNAMELEN. Per your suggestion, I have increased MNAMELEN to 88 (see revised statfs structure below).

> > + #define STATFS_VERSION 0x20030518 /* current version number */
>
> Style bug (space instead of tab).
>
> > + struct statfs {
> > + 	u_int32_t f_version; /* structure version number */
> > + 	u_int32_t f_type; /* type of filesystem */
> > + 	u_int64_t f_flags; /* copy of mount exported flags */
> > + 	u_int64_t f_bsize; /* filesystem fragment size */
> > + 	u_int64_t f_iosize; /* optimal transfer block size */
> > + 	u_int64_t f_blocks; /* total data blocks in filesystem */
> > + 	u_int64_t f_bfree; /* free blocks in filesystem */
> > + 	int64_t f_bavail; /* free blocks avail to non-superuser */
> > + 	u_int64_t f_files; /* total file nodes in filesystem */
> > + 	int64_t f_ffree; /* free nodes avail to non-superuser */
> > + 	u_int64_t f_syncwrites; /* count of sync writes since mount */
> > + 	u_int64_t f_asyncwrites; /* count of async writes since mount */
> > + 	u_int64_t f_syncreads; /* count of sync reads since mount */
> > + 	u_int64_t f_asyncreads; /* count of async reads since mount */
> > + 	u_int64_t f_spare[10]; /* unused spare */
> > + 	u_int32_t f_namemax; /* maximum filename length */
>
> I dislike all these unsigned types, and to a lesser extent, typedefed types
> and some of the 64-bit types.
>
> Unsigned types limit possibilities for trapping overflow if overflow
> actually occurs. They should not be used to get an extra bit of
> precision unless the extra bit is very important. The old code got
> this right by using only signed types. f_bavail and f_ffree need to
> go negative so they are correctly signed types, but this means that
> the extra bit for the 64-bit unsigned block and node counts cannot
> actually be used (since values that use it would cause overflow if
> most of the blocks or nodes are free).
>
> Typedefed types are difficult to print, as shown by printf format errors
> in most of the formats changed in this patch. But forwards compatibility
> at the source level requires them.

I tend to agree with you about using signed types.
However, the general sentiment when this was discussed on the arch
list several months ago was that unsigned types should be used. So,
I went with the majority (or at least most vocal) opinion there. The
fixed sizes are for the reasons that you note.

> Why 64-bit types for f_bsize and f_iosize?

It is conceivable that 32 bits would not be enough, so why risk
getting it wrong when it is so easy to fix now?

> If unsigned fixed-width types are used, then they should be named in
> a standard way (uintN_t, not u_intN_t).

Done (see revised statfs structure below).

> There are also some standard POSIX types for block counts, etc. These
> are used in POSIX's variant of statfs(). E.g., there is blkcnt_t, which
> is a signed integral type not yet implemented in FreeBSD. In the XSI
> extension there is also fsblkcnt_t, which is an _unsigned_ integral
> type implemented as uint64_t in FreeBSD. Its unsignedness is apparently
> derived from the old XSI type of unsigned long for block counts.
>
> > +     uid_t     f_owner;                  /* user that mounted the filesystem */
> > +     fsid_t    f_fsid;                   /* filesystem id */
> > +     char      f_charspare[76];          /* spare string space */
> > +     char      f_fstypename[MFSNAMELEN]; /* filesystem type name */
> > +     char      f_mntfromname[MNAMELEN];  /* mounted filesystem */
> > +     char      f_mntonname[MNAMELEN];    /* directory on which mounted */
> > + };
>
> f_charspare would be better at the end. If explicit padding is needed
> after f_fsid, then it should be in terms of fsid_t spares. Similarly
> for spares after f_namemax and f_owner. These are only packed because
> uid_t happens to be 32-bit. fsid_t is 2*32-bit so it doesn't need
> padding either. 76 is an odd amount of padding. It gives 252 bytes
> of char arrays altogether and a struct size of 452 on i386's. 452 is
> 4 mod 8, so there are 4 bytes of unnamed padding at the end of the
> struct on most 64-bit arches (if not internal padding).

I miscalculated the size of fsid_t. I agree that the structure size
should be 0 mod 8.
I changed the f_charspare to 80 per your suggestion. I put the spare
character space between the int32's and the character arrays so that
it could be used for either additional int32's or new character
arrays. We could use it for extra int32's even at the end, but I feel
it is more intuitive to keep the int32's together.

> > Index: sys/kern/vfs_syscalls.c
> > ===================================================================
> > RCS file: /usr/ncvs/src/sys/kern/vfs_syscalls.c,v
> > retrieving revision 1.332
> > diff -c -r1.332 vfs_syscalls.c
> > *** sys/kern/vfs_syscalls.c 19 Oct 2003 20:41:07 -0000 1.332
> > --- sys/kern/vfs_syscalls.c 5 Nov 2003 05:10:54 -0000
> > ...
> > + /*
> > +  * Convert a new format statfs structure to an old format statfs structure.
> > +  */
> > + static void
> > + cvtstatfs(td, nsp, osp)
> > +     struct thread *td;
> > +     struct statfs *nsp;
> > +     struct ostatfs *osp;
> > + {
> > +
> > +     bzero(osp, sizeof(*osp));
> > +     osp->f_bsize = nsp->f_bsize;
> > +     osp->f_iosize = nsp->f_iosize;
> > +     osp->f_blocks = nsp->f_blocks;
> > +     osp->f_bfree = nsp->f_bfree;
> > +     osp->f_bavail = nsp->f_bavail;
> > +     osp->f_files = nsp->f_files;
> > +     osp->f_ffree = nsp->f_ffree;
>
> tjr suggested setting the values to LONG_MAX instead of blindly truncating
> them. I think POSIX has some methods for dealing with such overflows
> (from LFS). I think they would reduce to returning -1/EOVERFLOW here,
> which is not very useful.
>
> tjr also suggested using a fake block size. This is already done in nfs,
> but the implementation is broken (nfs3 has similar 32-bit limits in the
> client-server interface).

Per my earlier message, I now cap the size in the old structure to
LONG_MAX. I chose not to play games with the block size as the old
interface will be going away and I would rather leave it as it has
historically been done.
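The capping described above could look roughly like the sketch below.
This is only an illustration, not the code from the patch: the helper
names cap_u64_to_long() and cap_s64_to_long() are hypothetical, and the
lower bound on the signed helper is an extra precaution that the
paragraph above does not mention.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): cap an unsigned 64-bit
 * count at LONG_MAX so it fits the "long" fields of the old structure.
 */
static long
cap_u64_to_long(uint64_t v)
{
	return (v > (uint64_t)LONG_MAX ? LONG_MAX : (long)v);
}

/*
 * Hypothetical helper (not from the patch): the same for the signed
 * fields such as f_bavail.  Clamping at LONG_MIN is an assumption
 * added here for symmetry.
 */
static long
cap_s64_to_long(int64_t v)
{
	if (v > (int64_t)LONG_MAX)
		return (LONG_MAX);
	if (v < (int64_t)LONG_MIN)
		return (LONG_MIN);
	return ((long)v);
}
```

With helpers like these, an assignment such as
osp->f_blocks = cap_u64_to_long(nsp->f_blocks) would saturate instead
of silently truncating. On machines where long is already 64 bits the
signed helper changes nothing.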
> > Index: sys/kern/vfs_bio.c > > =================================================================== > > RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v > > retrieving revision 1.420 > > diff -c -r1.420 vfs_bio.c > > *** sys/kern/vfs_bio.c 4 Nov 2003 06:30:00 -0000 1.420 > > --- sys/kern/vfs_bio.c 5 Nov 2003 05:10:54 -0000 > > *************** > > *** 3239,3245 **** > > (int) m->pindex, (int)(foff >> 32), > > (int) foff & 0xffffffff, resid, i); > > if (!vn_isdisk(vp, NULL)) > > ! printf(" iosize: %ld, lblkno: %jd, flags: 0x%x, npages: %d\n", > > bp->b_vp->v_mount->mnt_stat.f_iosize, > > (intmax_t) bp->b_lblkno, > > bp->b_flags, bp->b_npages); > > --- 3239,3245 ---- > > (int) m->pindex, (int)(foff >> 32), > > (int) foff & 0xffffffff, resid, i); > > if (!vn_isdisk(vp, NULL)) > > ! printf(" iosize: %jd, lblkno: %jd, flags: 0x%x, npages: %d\n", > > bp->b_vp->v_mount->mnt_stat.f_iosize, > > (intmax_t) bp->b_lblkno, > > bp->b_flags, bp->b_npages); > > Example of a printf format error. The long was easy to print using %ld, > but now there is a a u_int64_t. Using %jd gives a sign mismatch on all > machines and a size mismatch on machines with > sizeof(u_int64_t) != sizeof(intmax_t). So true, I need to do a lot of casting to (intmax_t). I wish there were a better way, sigh. > > Index: bin/df/df.c > > =================================================================== > > RCS file: /usr/ncvs/src/bin/df/df.c,v > > retrieving revision 1.51 > > diff -c -r1.51 df.c > > *** bin/df/df.c 13 Sep 2003 20:46:58 -0000 1.51 > > --- bin/df/df.c 5 Nov 2003 19:22:11 -0000 > > ... > > This has many more examples of printf format errors. The old code has > many related bugs. > > Bruce Thanks for your prompt feedback. The revised statfs structure is given below. 
	Kirk McKusick

=-=-=-=-=-=

+ #define MFSNAMELEN      16              /* length of type name including null */
+ #define MNAMELEN        88              /* size of on/from name bufs */
+ #define STATFS_VERSION  0x20030518      /* current version number */
+ struct statfs {
+     uint32_t  f_version;                /* structure version number */
+     uint32_t  f_type;                   /* type of filesystem */
+     uint64_t  f_flags;                  /* copy of mount exported flags */
+     uint64_t  f_bsize;                  /* filesystem fragment size */
+     uint64_t  f_iosize;                 /* optimal transfer block size */
+     uint64_t  f_blocks;                 /* total data blocks in filesystem */
+     uint64_t  f_bfree;                  /* free blocks in filesystem */
+     int64_t   f_bavail;                 /* free blocks avail to non-superuser */
+     uint64_t  f_files;                  /* total file nodes in filesystem */
+     int64_t   f_ffree;                  /* free nodes avail to non-superuser */
+     uint64_t  f_syncwrites;             /* count of sync writes since mount */
+     uint64_t  f_asyncwrites;            /* count of async writes since mount */
+     uint64_t  f_syncreads;              /* count of sync reads since mount */
+     uint64_t  f_asyncreads;             /* count of async reads since mount */
+     uint64_t  f_spare[10];              /* unused spare */
+     uint32_t  f_namemax;                /* maximum filename length */
+     uid_t     f_owner;                  /* user that mounted the filesystem */
+     fsid_t    f_fsid;                   /* filesystem id */
+     char      f_charspare[80];          /* spare string space */
+     char      f_fstypename[MFSNAMELEN]; /* filesystem type name */
+     char      f_mntfromname[MNAMELEN];  /* mounted filesystem */
+     char      f_mntonname[MNAMELEN];    /* directory on which mounted */
+ };

From owner-freebsd-arch@FreeBSD.ORG Thu Nov 6 16:52:52 2003
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D5B3116A4CE
	for ; Thu, 6 Nov 2003 16:52:52 -0800 (PST)
Received: from smtp.omnis.com (smtp.omnis.com [216.239.128.26])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 52DD643F93
	for ; Thu, 6 Nov 2003 16:52:52 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com (corp-2.ipinc.com [199.245.188.2])
by smtp-relay.omnis.com (Postfix) with ESMTP id 742629BE99; Thu, 6 Nov 2003 16:46:41 -0800 (PST) From: Wes Peters Organization: Softweyr.com To: Bruce Evans Date: Thu, 6 Nov 2003 16:52:48 -0800 User-Agent: KMail/1.5.2 References: <200311041737.20467.wes@softweyr.com> <20031105213950.Y1738@gamplex.bde.org> In-Reply-To: <20031105213950.Y1738@gamplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200311061652.48948.wes@softweyr.com> cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 00:52:52 -0000 On Wednesday 05 November 2003 03:18, Bruce Evans wrote: > On Tue, 4 Nov 2003, Wes Peters wrote: > > > > I emailed Kirk about this state of affairs and he confirmed that > > newfs was developed with operator intervention in mind. He > > suggested employing one of the unused flags in the filesystem > > header as a 'consistent' flag, setting it to 'not consistent' at > > the beginning of newfs, and then updating to 'is consistent' at the > > end. The performance hit in updating all superblock copies at the > > end is small but noticable (< 1s on a rather slow 6GB filesystem). > > There is no need to use a new flag. Just set the magic number to a > value different from both FS_UFS1_MAGIC and FS_UFS2_MAGIC, e.g., to > 0, until newfs is nearly finished. I specifically don't want to do that because I want the state "interrupted newfs operation" to be discernable from the state "something stomped on your superblock." This I believe better shows that the superblock is valid but the filesystem is not (yet). The name fs_state suggests someone was thinking of recording some sort of state in here that was never implemented. 
I've simply used it to record states 'newfs operation completed' and 'newfs operation not completed.' -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Fri Nov 7 01:57:30 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0226216A4CE; Fri, 7 Nov 2003 01:57:30 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3AC0243FE1; Fri, 7 Nov 2003 01:57:28 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id UAA22156; Fri, 7 Nov 2003 20:57:20 +1100 Date: Fri, 7 Nov 2003 20:57:19 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Kirk McKusick In-Reply-To: <200311070025.hA70P4eN035719@beastie.mckusick.com> Message-ID: <20031107201129.M2926@gamplex.bde.org> References: <200311070025.hA70P4eN035719@beastie.mckusick.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: Robert Watson cc: arch@FreeBSD.org cc: Peter Wemm Subject: Re: >0x7fffffff blocksize filesystem reporting X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 09:57:30 -0000 On Thu, 6 Nov 2003, Kirk McKusick wrote: > > From: Bruce Evans > > > > On Wed, 5 Nov 2003, Kirk McKusick wrote: > > > > > + #define MNAMELEN 80 /* size of on/from name bufs */ > > > > As pointed out by tjr, there are buffer overflows from bcopy() MNAMELEN > > bytes. ... > > Per my earlier message to this list, I now copy the smaller of > MNAMELEN and OMNAMELEN. Per your suggestion, I have increased > MNAMELEN to 88 (see revised statfs structure below). OK. 
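For readers following along, the "copy the smaller of MNAMELEN and
OMNAMELEN" fix quoted above might be sketched as follows. This is an
assumption-laden illustration, not the committed code: copy_mntname()
and NAME_MIN() are hypothetical names, the OMNAMELEN value is a
placeholder chosen for the example, and forcing NUL termination is an
extra step that the quoted text does not confirm.

```c
#include <string.h>

#define MNAMELEN	88	/* new size, from the revised structure */
#define OMNAMELEN	80	/* placeholder value for the old size */

#define NAME_MIN(a, b)	((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: copy a mount name from a new-format buffer of
 * MNAMELEN bytes into an old-format buffer of OMNAMELEN bytes without
 * reading or writing past the end of either one.
 */
static void
copy_mntname(char *dst_old, const char *src_new)
{
	size_t n = NAME_MIN(MNAMELEN, OMNAMELEN);

	memset(dst_old, 0, OMNAMELEN);	/* no stale bytes left behind */
	memcpy(dst_old, src_new, n);	/* never more than either buffer */
	dst_old[n - 1] = '\0';		/* assumption: truncate with a NUL */
}
```

The point of taking the minimum is that the same code stays safe
whichever of the two buffers happens to be smaller on a given machine.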
> > > + struct statfs { > > > + u_int32_t f_version; /* structure version number */ > > > + u_int32_t f_type; /* type of filesystem */ > > > + u_int64_t f_flags; /* copy of mount exported flags */ > > > + u_int64_t f_bsize; /* filesystem fragment size */ > > > ... > > > > I disklike all these unsigned types, and to a lesser extent, typedefed types > > and some of the 64-bit types. [Actually, i like unsigned types for bitmaps and therefore for f_flags.] > > ... > I tend to agree with you about using signed types. However, the > general sentiment when this was discussed on the arch list several > months ago were that unsigned types should be used. So, I went with > majority (or at least most vocal) opinion there. The fixed sizes > are for the reasons that you note. Sigh. Perhaps I should be more vocal :-). > > Why 64-bit types for f_bsize and f_iosize? > > It is concievable that 32-bits would not be enough, so why risk > getting it wrong when it is so easy to fix now. Well, the type for a (closely related if not the same) block size is also used in struct stat, so this could not be changed without much larger effect that would result from changing struct stat. We currently use the following types for st_blksize, and POSIX has requirements on it: struct ostat: int32_t st_blksize struct stat: uint32_t st_blksize struct nstat: uint32_t st_blksize POSIX (XSI): blkcnt_t st_blksize POSIX (XSI): blkcnt_t shall be a signed integer type : /* XXX: missing blkcnt_t, blksize_t */ The XXX needs to be expanded to say that we have a sign mismatch for the missing blkcnt_t. POSIX doesn't use blksize_t in its version of statfs() (i.e., statvfs). It just uses "unsigned long" for sizes and flags. See our implementation in if you don't have the standard handy. > > If unsigned fixed-width types are used, then they should be named in > > a standard way (uintN_t, not u_intN_t). > > Done (see revised statfs structure below). OK. 
I believe fixed-width fields _are_ the right thing here, since we want binary compatibility. Perhaps you should extend this to the only fields that aren't fixed-width now. These are: uid_t f_owner; /* user that mounted the filesystem */ fsid_t f_fsid; /* filesystem id */ dinode.h already use fixed-width fields ids for the same reason. The fields should be wider than we would ever need them to be. dinode.h just uses u_int32_t, which might not be enough, but whatever is here doesn't need to be larger. > > ... > > f_charspare would be better at the end. If explicit padding is needed > > ... > > I miss-calculated the size of fsid_t. I agree that the structure should > be mod 8 == 0. I changed the f_charspare to 80 per your suggestion. I > put the spare character space between the int32's and the character > arrays so that it could be used for either additions int32's or new > character arrays. We could use it for extra int32's even at the end, > but I feel it is more intuitive to keep the int32's together. OK. > > > Index: sys/kern/vfs_syscalls.c > > > =================================================================== > > > RCS file: /usr/ncvs/src/sys/kern/vfs_syscalls.c,v > > > retrieving revision 1.332 > > > diff -c -r1.332 vfs_syscalls.c > > > *** sys/kern/vfs_syscalls.c 19 Oct 2003 20:41:07 -0000 1.332 > > > --- sys/kern/vfs_syscalls.c 5 Nov 2003 05:10:54 -0000 > > > ... > > > + /* > > > + * Convert a new format statfs structure to an old format statfs structure. 
> > > + */ > > > + static void > > > + cvtstatfs(td, nsp, osp) > > > + struct thread *td; > > > + struct statfs *nsp; > > > + struct ostatfs *osp; > > > + { > > > + > > > + bzero(osp, sizeof(*osp)); > > > + osp->f_bsize = nsp->f_bsize; > > > + osp->f_iosize = nsp->f_iosize; > > > + osp->f_blocks = nsp->f_blocks; > > > + osp->f_bfree = nsp->f_bfree; > > > + osp->f_bavail = nsp->f_bavail; > > > + osp->f_files = nsp->f_files; > > > + osp->f_ffree = nsp->f_ffree; > > > > tjr suggested setting the values to LONG_MAX instead of blindly truncating > > ... > > Per my earlier message, I now cap the size in the old structure to > LONG_MAX. I chose not to play games with the block size as the old > ... Strictly, the signed ones also need to be limited by LONG_MIN from below (in case root uses more than LONG_MIN reserved blocks). > > > Index: sys/kern/vfs_bio.c > > > =================================================================== > > > RCS file: /usr/ncvs/src/sys/kern/vfs_bio.c,v > > > retrieving revision 1.420 > > > diff -c -r1.420 vfs_bio.c > > > *** sys/kern/vfs_bio.c 4 Nov 2003 06:30:00 -0000 1.420 > > > --- sys/kern/vfs_bio.c 5 Nov 2003 05:10:54 -0000 > > > *************** > > > *** 3239,3245 **** > > > (int) m->pindex, (int)(foff >> 32), > > > (int) foff & 0xffffffff, resid, i); > > > if (!vn_isdisk(vp, NULL)) > > > ! printf(" iosize: %ld, lblkno: %jd, flags: 0x%x, npages: %d\n", > > > bp->b_vp->v_mount->mnt_stat.f_iosize, > > > (intmax_t) bp->b_lblkno, > > > bp->b_flags, bp->b_npages); > > > --- 3239,3245 ---- > > > (int) m->pindex, (int)(foff >> 32), > > > (int) foff & 0xffffffff, resid, i); > > > if (!vn_isdisk(vp, NULL)) > > > ! printf(" iosize: %jd, lblkno: %jd, flags: 0x%x, npages: %d\n", > > > bp->b_vp->v_mount->mnt_stat.f_iosize, > > > (intmax_t) bp->b_lblkno, > > > bp->b_flags, bp->b_npages); > > > > Example of a printf format error. The long was easy to print using %ld, > > but now there is a a u_int64_t. 
Using %jd gives a sign mismatch on all
> > machines and a size mismatch on machines with
> > sizeof(u_int64_t) != sizeof(intmax_t).
>
> So true, I need to do a lot of casting to (intmax_t). I wish there
> were a better way, sigh.

Unfortunately, existing practice didn't keep up with needs here, so
C99 couldn't standardize anything good. I hoped for something like
sfio's method (which IIRC uses %I to put parameter sizes in the format
string), with compiler support to rewrite literal format strings to
supply these sizes automatically if requested. Format strings in
message catalogs are harder to handle.

> Thanks for your prompt feedback. The revised statfs structure is given
> below.

OK except for the unsignedness and POSIX points mentioned above.

I think statfs() should be as compatible with statvfs() as possible.
It need not use the POSIX types fsblkcnt_t and fsfilcnt_t directly,
but shouldn't be gratuitously different. Since fsblkcnt_t and
fsfilcnt_t are unsigned, this requires unsigned types for statfs() too.
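On the printf format error discussed earlier in this message, one way
to avoid both the sign and the size mismatch is to cast to the widest
type of the matching signedness and use the C99 "j" length modifier.
The sketch below is an illustration only: format_iosize() is a
hypothetical helper, not something in the tree.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an unsigned 64-bit I/O size and a signed
 * 64-bit logical block number.  %ju pairs with uintmax_t and %jd with
 * intmax_t, so each cast matches its conversion in sign and in size.
 */
static int
format_iosize(char *buf, size_t len, uint64_t iosize, int64_t lblkno)
{
	return (snprintf(buf, len, "iosize: %ju, lblkno: %jd",
	    (uintmax_t)iosize, (intmax_t)lblkno));
}
```

The casts are still needed at every call site, so this does not remove
the tedium complained about above; it only makes the formats correct.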
Bruce From owner-freebsd-arch@FreeBSD.ORG Fri Nov 7 03:01:39 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5099616A4CE for ; Fri, 7 Nov 2003 03:01:39 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 96F4943FB1 for ; Fri, 7 Nov 2003 03:01:29 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id WAA28796; Fri, 7 Nov 2003 22:01:22 +1100 Date: Fri, 7 Nov 2003 22:01:21 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Wes Peters In-Reply-To: <200311061652.48948.wes@softweyr.com> Message-ID: <20031107211239.O3343@gamplex.bde.org> References: <200311041737.20467.wes@softweyr.com> <20031105213950.Y1738@gamplex.bde.org> <200311061652.48948.wes@softweyr.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@FreeBSD.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Nov 2003 11:01:39 -0000 On Thu, 6 Nov 2003, Wes Peters wrote: > On Wednesday 05 November 2003 03:18, Bruce Evans wrote: > > On Tue, 4 Nov 2003, Wes Peters wrote: > > > > > > I emailed Kirk about this state of affairs and he confirmed that > > > newfs was developed with operator intervention in mind. He > > > suggested employing one of the unused flags in the filesystem > > > header as a 'consistent' flag, setting it to 'not consistent' at > > > the beginning of newfs, and then updating to 'is consistent' at the > > > end. 
The performance hit in updating all superblock copies at the
> > > end is small but noticeable (< 1s on a rather slow 6GB filesystem).
> >
> > There is no need to use a new flag. Just set the magic number to a
> > value different from both FS_UFS1_MAGIC and FS_UFS2_MAGIC, e.g., to
> > 0, until newfs is nearly finished.
>
> I specifically don't want to do that because I want the state
> "interrupted newfs operation" to be discernable from the state
> "something stomped on your superblock." This I believe better shows
> that the superblock is valid but the filesystem is not (yet).

That's not a very useful distinction, especially for your application
where the disk contents are disposable and you re-newfs it a lot. After
a crash the complete state (as given by the disk contents) may be
almost anywhere between its initial and final values, depending on
buffering etc., so it would be very difficult just to determine what
it is if you needed it.

Some cases can be discerned anyway, depending on how much erasing of
metadata newfs does when it starts. E.g., if there was an ffs file
system on the disk, then this will be recorded in the disk label (unless
that feature has been axed). newfs doesn't rewrite the label until
just before it finishes, so the old label can be looked at to determine
what was there. Writing the label last may be a bug though. Perhaps
newfs should erase all the primary metadata for the old filesystem
when it starts so as to minimise the window where there may be
conflicting metadata.

Clearing the magic number works better because it requires no kernel
changes and no changes to applications other than ufs. In particular,
half-baked file systems formatted with the changed newfs work right
under all versions of FreeBSD (i.e., they don't work and don't cause
panics), and utilities like dumpfs and fsdb don't need to be changed
to display and/or edit the newly used field.
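The ordering argued for above can be sketched with a toy model. This is
only an illustration of the idea, not newfs code: struct toy_sb and the
toy_*() functions are hypothetical stand-ins, and only the two magic
constants are the real UFS values.

```c
#include <stdint.h>

#define FS_UFS1_MAGIC	0x011954	/* UFS1 fast filesystem magic number */
#define FS_UFS2_MAGIC	0x19540119	/* UFS2 fast filesystem magic number */

/* Hypothetical stand-in for the superblock; only the magic matters here. */
struct toy_sb {
	int32_t	magic;
};

/* What existing kernels and utilities effectively check before use. */
static int
toy_mountable(const struct toy_sb *sb)
{
	return (sb->magic == FS_UFS1_MAGIC || sb->magic == FS_UFS2_MAGIC);
}

/* Start of newfs: write a superblock whose magic matches neither value. */
static void
toy_newfs_begin(struct toy_sb *sb)
{
	sb->magic = 0;
}

/* Last step of newfs: only now does the filesystem become recognizable. */
static void
toy_newfs_finish(struct toy_sb *sb)
{
	sb->magic = FS_UFS2_MAGIC;
}
```

If the run is interrupted anywhere between the two calls, the existing
magic check rejects the half-built filesystem with no new code in the
kernel or in the utilities.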
> The name fs_state suggests someone was thinking of recording some sort > of state in here that was never implemented. I've simply used it to > record states 'newfs operation completed' and 'newfs operation not > completed.' Its comment also suggests that it was for validating fs_clean. This goes back to at least FreeBSD-1 where both fs_state and fs_clean are unused. fs_clean wasn't used in 4.4BSD-Lite1 either. Use of it was added in FreeBSD and/or NetBSD and picked up by Lite2. I am nervous about using yet another variable for things related to state. There is a large enough mess for fs_clean already. We now also have FS_UNCLEAN in fs_flags (which were also unused until relatively recently). As I understand it (the details were simpler when I understood them all), FS_UNCLEAN was originally fs_clean done right except it probably belongs in fs_state. It was just fs_clean inverted and put in a bitmap. Now FS_UNCLEAN is tangled up with FS_NEEDSFSCK and is not simply fs_clean inverted. Bruce From owner-freebsd-arch@FreeBSD.ORG Fri Nov 7 23:30:04 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C31B316A4CE for ; Fri, 7 Nov 2003 23:30:04 -0800 (PST) Received: from beastie.mckusick.com (64118101118.sierratel.com [64.118.101.118]) by mx1.FreeBSD.org (Postfix) with ESMTP id E1F7043FBD for ; Fri, 7 Nov 2003 23:30:02 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Received: from beastie.mckusick.com (localhost [127.0.0.1]) by beastie.mckusick.com (8.12.8/8.12.3) with ESMTP id hA7NmgaG000289; Fri, 7 Nov 2003 15:48:42 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Message-Id: <200311072348.hA7NmgaG000289@beastie.mckusick.com> To: Bruce Evans In-Reply-To: Your message of "Fri, 07 Nov 2003 22:01:21 +1100." 
<20031107211239.O3343@gamplex.bde.org> Date: Fri, 07 Nov 2003 15:48:42 -0800 From: Kirk McKusick cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Nov 2003 07:30:04 -0000 > Date: Fri, 7 Nov 2003 22:01:21 +1100 (EST) > From: Bruce Evans > To: Wes Peters > Cc: arch@freebsd.org > Subject: Re: newfs and mount vs. half-baked disks > X-ASK-Info: Whitelist match > > That's not a very useful distinction, especially for your application > where the disk contents is disposable and you re-newfs it a lot. After > a crash the complete state (as given by the disk contents) may be > almost anywhwere between its initial and final values, depending on > buffering etc., so it would be very difficult just to determine what > it is if you needed it. > > Some cases can be discerned anyway, depending on how much erasing of > metadata newfs does when it starts. E.g., if there was an ffs file > system on the disk, then this will be recorded in the disk label unless > that feature has been axed). newfs doesn't rewrite the label until > just before it finishes, so the old label can be looked at to determine > what was there. Writing the label last may be a bug though. Perhaps > newfs should erase all the primary metadata for the old filesystem > when it starts so as to minimise the window where there may be > conflicting metadata. You cannot depend on the disk label as the disklabel is going away or at least being wholly overhauled with GEOM. In particular, the existing disk label only has a 2^32 block count which is insufficient for filesystems larger than 2Tb. > Clearing the magic number works better because it requires no kernel > changes and no changes to applications other than ufs. 
In particular, > half-baked file systems formatted with the changed newfs work right > under all versions of FreeBSD (i.e., they don't work and don't cause > panics), and utilities like dumpfs and fsdb don't need to be changed > to display and/or edit the newly used field. > > Bruce I concur that knocking out the magic number will have the benefit of making all the existing utilities and kernels refuse to inspect / mount the filesystem. Kirk McKusick From owner-freebsd-arch@FreeBSD.ORG Sat Nov 8 00:35:35 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B97FF16A4CE for ; Sat, 8 Nov 2003 00:35:35 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5A57343FE0 for ; Sat, 8 Nov 2003 00:35:34 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id TAA21631; Sat, 8 Nov 2003 19:35:17 +1100 Date: Sat, 8 Nov 2003 19:35:16 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Kirk McKusick In-Reply-To: <200311072348.hA7NmgaG000289@beastie.mckusick.com> Message-ID: <20031108191433.J608@gamplex.bde.org> References: <200311072348.hA7NmgaG000289@beastie.mckusick.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Nov 2003 08:35:35 -0000 On Fri, 7 Nov 2003, Kirk McKusick wrote: > > From: Bruce Evans > > ... > > Some cases can be discerned anyway, depending on how much erasing of > > metadata newfs does when it starts. 
E.g., if there was an ffs file
> > system on the disk, then this will be recorded in the disk label (unless
> > that feature has been axed). newfs doesn't rewrite the label until
> > just before it finishes, so the old label can be looked at to determine
> > what was there. Writing the label last may be a bug though. Perhaps
> > newfs should erase all the primary metadata for the old filesystem
> > when it starts so as to minimise the window where there may be
> > conflicting metadata.
>
> You cannot depend on the disk label as the disklabel is going away
> or at least being wholly overhauled with GEOM. In particular, the
> existing disk label only has a 2^32 block count which is insufficient
> for filesystems larger than 2TB.

I don't use GEOM, so the label won't be going away for me. Anyway,
there is no dependency (the label is just one of the things that one
might examine to recover a crashed disk), and any overhaul by GEOM
would have to duplicate the functionality of storing metadata about
the superblocks somewhere outside the superblocks. (I actually store
metadata about file systems in (backups of) disk files in /var/backups.
Normal backups provide inadequate backups of metadata.)

The block count is in units of sector size, so disks much larger than
2TB can be supported by disklabel using (fake if necessary) sector
sizes larger than 512. File systems need to use similarly large block
(fragment for ffs) sizes, and some patches are needed for reading
superblocks if the sector size is larger than 8K. Since ffs uses a
block size of 16K by default, a sector size of 16K is not unreasonable
and this is sufficient for disks smaller than 64TB.
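The arithmetic behind those limits is simple enough to check: a 32-bit
count addresses 2^32 sectors, so 512-byte sectors give 2^41 bytes (2TB)
and 16K sectors give 2^46 bytes (64TB). A small sketch, where
max_label_bytes() is a hypothetical name used only for this
illustration:

```c
#include <stdint.h>

/*
 * Hypothetical helper: the number of bytes addressable with a 32-bit
 * sector count (2^32 sectors) for a given sector size in bytes.
 */
static uint64_t
max_label_bytes(uint32_t sector_size)
{
	return ((UINT64_C(1) << 32) * sector_size);
}
```

An 8K sector size, the largest mentioned above as working without
patches to the superblock reading code, gives 2^45 bytes, i.e. 32TB.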
Bruce From owner-freebsd-arch@FreeBSD.ORG Sat Nov 8 12:59:50 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B68E516A4CE for ; Sat, 8 Nov 2003 12:59:50 -0800 (PST) Received: from smtp.omnis.com (smtp.omnis.com [216.239.128.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0981B43FF7 for ; Sat, 8 Nov 2003 12:59:50 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id 2B18B5B625; Sat, 8 Nov 2003 12:53:22 -0800 (PST) From: Wes Peters Organization: Softweyr To: Bruce Evans Date: Sat, 8 Nov 2003 12:59:48 -0800 User-Agent: KMail/1.5.4 References: <200311041737.20467.wes@softweyr.com> <200311061652.48948.wes@softweyr.com> <20031107211239.O3343@gamplex.bde.org> In-Reply-To: <20031107211239.O3343@gamplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200311081259.48214.wes@softweyr.com> cc: arch@FreeBSD.org Subject: Re: newfs and mount vs. half-baked disks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Nov 2003 20:59:50 -0000 On Friday 07 November 2003 03:01 am, Bruce Evans wrote: > On Thu, 6 Nov 2003, Wes Peters wrote: > > On Wednesday 05 November 2003 03:18, Bruce Evans wrote: > > > On Tue, 4 Nov 2003, Wes Peters wrote: > > > > I emailed Kirk about this state of affairs and he confirmed that > > > > newfs was developed with operator intervention in mind. 
He > > > > suggested employing one of the unused flags in the filesystem > > > > header as a 'consistent' flag, setting it to 'not consistent' at > > > > the beginning of newfs, and then updating to 'is consistent' at > > > > the end. The performance hit in updating all superblock copies > > > > at the end is small but noticable (< 1s on a rather slow 6GB > > > > filesystem). > > > > > > There is no need to use a new flag. Just set the magic number to a > > > value different from both FS_UFS1_MAGIC and FS_UFS2_MAGIC, e.g., to > > > 0, until newfs is nearly finished. > > > > I specifically don't want to do that because I want the state > > "interrupted newfs operation" to be discernable from the state > > "something stomped on your superblock." This I believe better shows > > that the superblock is valid but the filesystem is not (yet). > > That's not a very useful distinction, especially for your application > where the disk contents is disposable and you re-newfs it a lot. Oh, but after the newfs we take different actions depending on whether we think we're creating a new filesystem or restoring an old one. In particular, if we are restoring a previous filesystem, there are references to restored data that need to be checked for validity. None of these references exist on a new system, so we can avoid the cost of searching for them. > After > a crash the complete state (as given by the disk contents) may be > almost anywhwere between its initial and final values, depending on > buffering etc., so it would be very difficult just to determine what > it is if you needed it. > > Some cases can be discerned anyway, depending on how much erasing of > metadata newfs does when it starts. E.g., if there was an ffs file > system on the disk, then this will be recorded in the disk label unless > that feature has been axed). newfs doesn't rewrite the label until > just before it finishes, so the old label can be looked at to determine > what was there. 
> Writing the label last may be a bug though. Perhaps
> newfs should erase all the primary metadata for the old filesystem
> when it starts so as to minimise the window where there may be
> conflicting metadata.
>
> Clearing the magic number works better because it requires no kernel
> changes and no changes to applications other than ufs. In particular,
> half-baked file systems formatted with the changed newfs work right
> under all versions of FreeBSD (i.e., they don't work and don't cause
> panics), and utilities like dumpfs and fsdb don't need to be changed
> to display and/or edit the newly used field.

Actually, changing the magic number to a different non-zero magic value
has all the benefits you suggest, plus the benefit of signifying what
I'm looking for (except for a tiny window of vulnerability where a disk
*could* have the alternate magic number written in the superblock
location by chance).

> > The name fs_state suggests someone was thinking of recording some
> > sort of state in here that was never implemented. I've simply used
> > it to record states 'newfs operation completed' and 'newfs operation
> > not completed.'
>
> Its comment also suggests that it was for validating fs_clean. This
> goes back to at least FreeBSD-1 where both fs_state and fs_clean are
> unused. fs_clean wasn't used in 4.4BSD-Lite1 either. Use of it was
> added in FreeBSD and/or NetBSD and picked up by Lite2. I am nervous
> about using yet another variable for things related to state. There
> is a large enough mess for fs_clean already. We now also have
> FS_UNCLEAN in fs_flags (which were also unused until relatively
> recently). As I understand it (the details were simpler when I
> understood them all), FS_UNCLEAN was originally fs_clean done right
> except it probably belongs in fs_state. It was just fs_clean inverted
> and put in a bitmap. Now FS_UNCLEAN is tangled up with FS_NEEDSFSCK
> and is not simply fs_clean inverted.

Yes, trying to understand the interactions of fs_state and FS_UNCLEAN
made my head hurt. Badly.

I'll do a bit of testing with the alternate magic number and report
what I observe, but probably not until Monday. I left my testing disk
at work and I have too much sysadmin work at home to work on code this
weekend. Sigh.

--
Where am I, and what am I doing in this handbasket?
Wes Peters                                      wes@softweyr.com
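
[Editorial sketch, not part of the original message.] The alternate magic
number idea discussed above could look roughly like the following. This is
a minimal in-memory illustration only: the struct, the function names and
both magic values are hypothetical placeholders, not the real definitions
from fs.h and not what newfs actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical placeholder values; NOT the real FS_UFS1_MAGIC/FS_UFS2_MAGIC. */
#define SKETCH_MAGIC_VALID      0x53424f4bu  /* stands in for a normal fs magic */
#define SKETCH_MAGIC_INCOMPLETE 0x4e455746u  /* "newfs started, not finished"  */

enum sketch_state {
    SKETCH_FS_READY,           /* newfs ran to completion              */
    SKETCH_NEWFS_INTERRUPTED,  /* superblock valid, filesystem is not  */
    SKETCH_SB_DAMAGED          /* something stomped on the superblock  */
};

/* Hypothetical, heavily reduced stand-in for a superblock. */
struct sketch_sb {
    uint32_t magic;
    uint32_t ncg;              /* number of cylinder groups */
};

/* First step of the sketched newfs: mark the superblock as incomplete. */
void sketch_newfs_begin(struct sketch_sb *sb, uint32_t ncg)
{
    assert(sb != NULL);
    memset(sb, 0, sizeof(*sb));
    sb->magic = SKETCH_MAGIC_INCOMPLETE;
    sb->ncg = ncg;
}

/* Last step of the sketched newfs: switch to the normal magic number. */
void sketch_newfs_finish(struct sketch_sb *sb)
{
    assert(sb != NULL);
    sb->magic = SKETCH_MAGIC_VALID;
}

/* Tell the three states apart using only the magic number. */
enum sketch_state sketch_classify(const struct sketch_sb *sb)
{
    assert(sb != NULL);
    if (sb->magic == SKETCH_MAGIC_VALID)
        return SKETCH_FS_READY;
    if (sb->magic == SKETCH_MAGIC_INCOMPLETE)
        return SKETCH_NEWFS_INTERRUPTED;
    return SKETCH_SB_DAMAGED;
}
```

Because anything that only recognizes the normal magic would reject the
incomplete value, the "no kernel changes" property argued for in the quoted
text is kept, while a tool that knows the second value can still distinguish
an interrupted newfs from a clobbered superblock. The small chance of the
alternate value appearing on disk by accident, noted above, remains.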