From owner-freebsd-arch@FreeBSD.ORG Sun Jan 4 03:11:59 2004
Date: Sun, 4 Jan 2004 22:11:51 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Scott Long
cc: arch@freebsd.org
Subject: Re: Simple patch: Make DFLTPHYS an option
Message-ID: <20040104211704.O582@gamplex.bde.org>
In-Reply-To: <3FF7BD89.4080406@freebsd.org>
References: <20040103.153644.107852018.imp@bsdimp.com>
 <3FF7967A.1090401@freebsd.org> <3FF7BD89.4080406@freebsd.org>

On Sun, 4 Jan 2004, Scott Long wrote:

> Bruce Evans wrote:
> > On Sat, 3 Jan 2004, Scott Long wrote:
> >
> >> The key, though, is to ensure that the block system is actually
> >> honoring the per-device disk.d_maxsize variable.  I'm not sure if
> >> it is right now.
> >
> > It at least used to work (up to MAXPHYS).  The ad driver used a max
> > i/o size of 128K until recently.  This has rotted back to 64K for
> > some reason (64K is encoded as DFLTPHYS in the non-dma case and as
> > 64 * 1024 in the dma case).
>
> I've seen evidence lately that this might be broken, but I need to
> track it down further.

Do you mean sizes other than DFLTPHYS, or the ad driver?  For ad, I
remember seeing the commit that reduced the size, but I couldn't find
it easily.  It seems to have been just the big ATAng commit.

I don't know of any problems with i/o size maxes different from the
defaults, except for the one in spec_getpages().  I/O sizes of up to
(VM_INITIAL_PAGEIN * PAGE_SIZE) bytes must work for disk devices,
since spec_getpages() doesn't honor dev->si_iosize_max.  This value
accidentally defaults to the same value as DFLTPHYS on machines with
4K pages and to the same value as MAXPHYS on machines with 8K pages.
Thus the "maximum" given by dev->si_iosize_max cannot actually be the
maximum on any machine if it is < DFLTPHYS, and the usual default of
DFLTPHYS is never the actual maximum on non-broken machines with 8K
pages.  Most disk drivers handle this by splitting up large i/o's
into smaller ones internally.

physio() does similar splitting (based on si_iosize_max), so
si_iosize_max is not very useful for disks.  physio() would do better
just to split up based on MAXPHYS (since large sizes only occur if
the user requests them).  Clustering may benefit from using a smaller
size (since a smaller size may actually be better, and users can't
control it).  physio() needs si_iosize_max mainly to avoid wrong
splitting for non-disk devices (mainly tapes).
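To make the splitting concrete, here is a minimal userland sketch of
the loop (only a sketch: the real splitting lives in physio() and the
drivers, and "max_io" and "do_transfer" are made-up names standing in
for si_iosize_max and the actual i/o path):

#include <stddef.h>
#include <stdio.h>

/*
 * Sketch of splitting one large transfer into chunks no larger than
 * the device's advertised maximum (si_iosize_max in the kernel;
 * "max_io" here is a stand-in).
 */
static void
do_transfer(size_t offset, size_t resid, size_t max_io)
{
        size_t chunk;

        while (resid > 0) {
                chunk = resid < max_io ? resid : max_io;
                /* The kernel would issue one i/o of "chunk" bytes here. */
                printf("i/o at offset %zu, size %zu\n", offset, chunk);
                offset += chunk;
                resid -= chunk;
        }
}

int
main(void)
{
        /* A 1M raw read against a device claiming a 64K (DFLTPHYS) max. */
        do_transfer(0, 1024 * 1024, 64 * 1024);
        return (0);
}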
> >> Also, increasing MAXPHYS will lead to your KVA being chewed up
> >> quite quickly, which in turn will lead to unpleasant panics.  A
> >> lot of work needs to go in to fixing this; increasing the value
> >> here has little value even to people who shun seatbelts.
> >
> > Not all that quickly.  MAXPHYS affects mainly pbufs, and there are
> > a limited number of them (256 max?), and their kva is statically
> > allocated.  256 times the current MAXPHYS gives 16M.  This could
> > easily be increased by a factor of up to about 8 without
> > necessarily breaking things, e.g., by stealing 112MB from buffer
> > kva using VM_BCACHE_SIZE if the default normal-buffer kva size is
> > large.  (If it is small, then there should be kva to spare anyway;
> > otherwise there would be no space to spare on systems with more
> > RAM, where the buffer kva size is larger.)
>
> VFS, softupdates, UFS_DIRHASH, etc, all contribute to KVA being
> eaten faster than it used to be.

Don't use them then :-).  (I mostly don't.)

> Even with smarter tuning of common culprits like maxvnodes, KVA is
> still under a lot of pressure.

This depends on the memory size.  I use VM_BCACHE_SIZE = 512M and
have no problems fitting everything else in the remaining 512M, on a
machine with 1GB.  With more physical memory, it becomes harder to
fit everything in without kludges.  (The default BKVASIZE and
VM_BCACHE_SIZE are already kludged to take 1/4 as much space as they
should, although this is not necessary on machines with little
physical memory or with more KVA than i386's have.)

Bruce
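PS: for concreteness, the pbuf arithmetic above as a sketch.  The
constants are just the assumptions that reproduce the 16M figure (256
pbufs at 64K of kva each); substitute the real pbuf count and MAXPHYS
for the tree at hand.

#include <stdio.h>

/* Assumed constants, chosen to reproduce the 16M figure above. */
#define NPBUF           256             /* assumed cap on pbufs */
#define PBUF_KVA        (64 * 1024)     /* assumed kva per pbuf */

int
main(void)
{
        printf("static pbuf kva now:  %dM\n",
            NPBUF * PBUF_KVA / (1024 * 1024));
        /* 128M total, i.e. 112M more to steal from buffer kva. */
        printf("after an 8x increase: %dM\n",
            NPBUF * 8 * PBUF_KVA / (1024 * 1024));
        return (0);
}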