From owner-freebsd-arch Mon Feb 5 12:47:44 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8327237B491; Mon, 5 Feb 2001 12:47:25 -0800 (PST)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id f15Kl7S09686;
	Mon, 5 Feb 2001 12:47:07 -0800 (PST)
Date: Mon, 5 Feb 2001 12:47:07 -0800
From: Alfred Perlstein
To: "Justin T. Gibbs"
Cc: Randell Jesup, Matt Dillon, Matthew Jacob, Mike Smith,
	Dag-Erling Smorgrav, Dan Nelson, Seigo Tanimura, arch@FreeBSD.ORG
Subject: Re: Bumping up {MAX,DFLT}*PHYS (was Re: Bumping up {MAX,DFL}*SIZ in i386)
Message-ID: <20010205124707.Y26076@fw.wintelcom.net>
References: <200102052006.f15K6bO49659@aslan.scsiguy.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200102052006.f15K6bO49659@aslan.scsiguy.com>; from gibbs@scsiguy.com on Mon, Feb 05, 2001 at 01:06:37PM -0700
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Justin T. Gibbs [010205 12:08] wrote:
> >> (2) Modify the 'struct buf' b_pages[] array to instead be a pointer
> >>     to an array. Include the original static array under another name
> >>     for compatibility purposes and have the init code default to
> >>     assigning b_pages to the original embedded static array.
> >>
> >>     Then the physio code could be adjusted to dynamically MALLOC the
> >>     necessary pages array if the static one in the supplied buffer is
> >>     insufficient.
> >
> > So, how reasonable is this? It seems like a pretty good solution,
> >but I'm far from up-to-speed on the internals here.
>
> I'd rather allow bufs (or bios) to be chained and let the block devices
> decide how to break them up. This simplifies the clustering code too
> as you avoid all of the VM operations to combine bufs into a single cluster
> buf.
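
[Editorial sketch of the quoted proposal (2), not code from the thread. It
assumes a simplified, hypothetical `struct xbuf` in place of the real
`struct buf`, a made-up embedded array size `EMBEDDED_PAGES`, and userland
`calloc`/`free` in place of the kernel MALLOC. The helper names are
illustrative only.]

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EMBEDDED_PAGES 16   /* hypothetical size of the original static array */

struct page;                /* opaque stand-in for a VM page */

/* Hypothetical, simplified stand-in for 'struct buf'. */
struct xbuf {
    struct page **b_pages;                          /* now a pointer to an array */
    struct page  *b_pages_embedded[EMBEDDED_PAGES]; /* original array, renamed */
    int           b_maxpages;                       /* capacity of b_pages */
};

/* Init code: default b_pages to the embedded static array. */
static void xbuf_init(struct xbuf *bp)
{
    memset(bp->b_pages_embedded, 0, sizeof(bp->b_pages_embedded));
    bp->b_pages = bp->b_pages_embedded;
    bp->b_maxpages = EMBEDDED_PAGES;
}

/*
 * What a physio-like caller might do: allocate a larger array only when
 * the embedded one is insufficient. Returns 0 on success, -1 on failure.
 */
static int xbuf_reserve_pages(struct xbuf *bp, int npages)
{
    struct page **bigger;

    assert(npages >= 0);
    if (npages <= bp->b_maxpages)
        return 0;                       /* current array is large enough */
    bigger = calloc((size_t)npages, sizeof(*bigger));
    if (bigger == NULL)
        return -1;
    /* Preserve any page pointers already recorded. */
    memcpy(bigger, bp->b_pages, (size_t)bp->b_maxpages * sizeof(*bigger));
    if (bp->b_pages != bp->b_pages_embedded)
        free(bp->b_pages);
    bp->b_pages = bigger;
    bp->b_maxpages = npages;
    return 0;
}

/* Free a dynamically allocated array and fall back to the embedded one. */
static void xbuf_release_pages(struct xbuf *bp)
{
    if (bp->b_pages != bp->b_pages_embedded)
        free(bp->b_pages);
    bp->b_pages = bp->b_pages_embedded;
    bp->b_maxpages = EMBEDDED_PAGES;
}
```

[Code that only ever touches the embedded array keeps working unchanged,
which appears to be the compatibility point of the proposal.]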

One of the suggestions that Poul-Henning made was to have the device
somehow specify an optimal clustering strategy, including bounds and
sizes. For instance, an NFS commit request could be megabytes in size,
while an NFS write may not want any clustering at all. A RAID request
might want to ask for a megabyte of data, but have it in a range on
the device level.

Currently (I think) we only cluster based on logical file offsets. It
would be interesting to allow drivers to do callbacks into the FS to
ask for blocks physically adjacent to the blocks being written. This
is because a 64k block of any file may actually be spread out across
any position on the device. Even though UFS tries to reduce
fragmentation, in the worst case we do the VM ops to cluster blocks
that are not physically contiguous.

I think the simplest way to do this would be to rip out the current
clustering code and provide helper routines for the devices to get
adjacent blocks, either logically via VOP or physically via some VFS
mechanism.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
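
[Editorial sketch of the "helper routine" idea from the message above, not
code from the thread. It assumes a hypothetical in-memory table of blocks
carrying both a logical block number (lbn) and a physical block number
(pbn), and a hypothetical driver-supplied `struct cluster_hint` that bounds
the cluster size. A real implementation would have to ask the filesystem
for the mapping; here the table is simply searched.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one pending block. */
struct blk {
    long lbn;   /* logical block number within the file */
    long pbn;   /* physical block number on the device */
};

/* Hypothetical driver hint: 1 means "no clustering at all". */
struct cluster_hint {
    size_t max_blocks;
};

/* Linear search for the block at a given physical address, or -1. */
static int find_by_pbn(const struct blk *blks, size_t n, long pbn)
{
    for (size_t i = 0; i < n; i++)
        if (blks[i].pbn == pbn)
            return (int)i;
    return -1;
}

/*
 * Starting from blks[seed], collect the indices of blocks that follow it
 * contiguously on the device, up to hint->max_blocks entries in total.
 * Logical order is ignored on purpose. Returns the number of indices
 * written to out[], which must have room for hint->max_blocks entries.
 */
static size_t gather_physical_run(const struct blk *blks, size_t n,
    size_t seed, const struct cluster_hint *hint, size_t *out)
{
    size_t count = 0;
    long next;

    assert(seed < n);
    assert(hint->max_blocks >= 1);

    out[count++] = seed;
    next = blks[seed].pbn + 1;
    while (count < hint->max_blocks) {
        int idx = find_by_pbn(blks, n, next);
        if (idx < 0)
            break;              /* physical run ends here */
        out[count++] = (size_t)idx;
        next++;
    }
    return count;
}
```

[With blocks at pbn 100, 500, 101, 102 and 900, seeding at the first block
with a hint of 8 gathers the blocks at 100, 101 and 102, even though the
second of those is not logically adjacent to the first. A hint of 1 returns
only the seed block, matching the "no clustering" case.]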