From owner-freebsd-hackers Thu Oct  3 10:22:55 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3)
	id KAA09297 for hackers-outgoing; Thu, 3 Oct 1996 10:22:55 -0700 (PDT)
Received: from minnow.render.com (render.demon.co.uk [158.152.30.118])
	by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id KAA09101;
	Thu, 3 Oct 1996 10:20:04 -0700 (PDT)
Received: from minnow.render.com (minnow.render.com [193.195.178.1])
	by minnow.render.com (8.6.12/8.6.9) with SMTP id SAA26614;
	Thu, 3 Oct 1996 18:18:41 +0100
Date: Thu, 3 Oct 1996 18:18:38 +0100 (BST)
From: Doug Rabson
To: Heo Sung-Gwan
cc: freebsd-hackers@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject: Re: vnode and cluster read-ahead
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

On Fri, 4 Oct 1996, Heo Sung-Gwan wrote:

>
> John Dyson writes:
> >>
> >> You could maintain a number of 'pending readahead' structures indexed by
> >> vnode and block number.  Each call to cluster_read would check for a
> >> pending readahead by hashing.  For efficiency, keep a pointer to the last
> >> readahead structure used by cluster_read in the vnode in place of the
> >> existing in-vnode readahead data.  Should be no slower than the current
> >> system for single process reads and it saves 4 bytes per vnode :-).
> >
> > Pretty cool idea.  I am remembering now that this deficiency in our read
> > ahead code is well known.  This might be something really good for 2.3 or
> > 3.1 :-).  (Unless someone else wants to implement it -- hint hint :-)).
> >
>
> I suggest a new idea.  The fields for read-ahead (maxra, lenra, etc.) are
> kept in the file structure, or in another structure (e.g. Doug Rabson's
> readahead structure) pointed to by a new field in the file structure.  The
> vnode also gets a new field to hold a pointer to the file structure.  This
> vnode field is filled in on every read system call with the pointer to the
> file structure, in vn_read() in kern/vfs_vnops.c.  Then the file structure
> can be accessed through the vnode in cluster_read.

Not all the vnodes in the system are associated with file structures.  The
NFS server uses vnodes directly, along with some other oddities like exec
and coredumps.  If we optimise cluster_read for normal open files, we should
try to avoid pessimising it for the other vnode users in the system.

>
> Because system calls are nonpreemptive, the pointer to the file structure
> in the vnode is not changed until the current read system call has finished.

I have vain hopes of a future kernel which is multithreaded, and introducing
a new complication to that is not a good idea IMHO.  In addition, multiple
userland threads could fool a system where readaheads were calculated per
open file.

>
> This method removes the hashing by vnode and block number.

For the common single-reader case, the vnode would cache a pointer to the
readahead structure, avoiding the hash.  The hash would be a simple O(1)
operation anyway for the multiple-reader case and so should not be a real
performance problem.

>
> Is it really possible?

A friend of mine always used to answer, 'Anything is possible, after all
it's only software' to that question :-).

--
Doug Rabson, Microsoft RenderMorphics Ltd.	Mail:  dfr@render.com
						Phone: +44 171 734 3761
						FAX:   +44 171 734 6426
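
[A minimal user-space sketch of the hashed pending-readahead scheme discussed
above.  All names here (ra_entry, ra_lookup, toy_cluster_read, v_lastra) are
invented for illustration, not actual kernel identifiers; the cluster_read()
of that era keeps its sequential-read state in the vnode itself, which is
exactly the limitation being discussed.  The sketch only shows the lookup
path: readahead state keyed by (vnode, expected next block), reached through
a per-vnode cached pointer first and the hash chain second.]

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define RA_HASH_SIZE 64

struct vnode;

/* One pending readahead, keyed by (vnode, next expected block). */
struct ra_entry {
	struct ra_entry	*ra_next;	/* hash chain link */
	struct vnode	*ra_vp;		/* vnode this readahead belongs to */
	long		 ra_nextblk;	/* block we expect to be asked for next */
	int		 ra_len;	/* current readahead window in blocks */
};

/* Cut-down vnode: just the cached hint discussed in the thread. */
struct vnode {
	struct ra_entry	*v_lastra;	/* last readahead entry this vnode used */
};

static struct ra_entry *ra_hash[RA_HASH_SIZE];

static unsigned
ra_hashval(struct vnode *vp, long blk)
{
	return (unsigned)((((uintptr_t)vp >> 4) ^ (uintptr_t)blk) % RA_HASH_SIZE);
}

/* Unlink an entry from the bucket it is currently chained on. */
static void
ra_unlink(struct ra_entry *ra)
{
	struct ra_entry **pp = &ra_hash[ra_hashval(ra->ra_vp, ra->ra_nextblk)];

	while (*pp != ra)
		pp = &(*pp)->ra_next;
	*pp = ra->ra_next;
}

/*
 * Find (or create) the pending readahead for (vp, blk).  The cached
 * v_lastra pointer makes the single sequential reader a one-compare hit;
 * other readers fall back to the O(1) hash chain.
 */
static struct ra_entry *
ra_lookup(struct vnode *vp, long blk)
{
	struct ra_entry *ra = vp->v_lastra;
	unsigned h = ra_hashval(vp, blk);

	if (ra == NULL || ra->ra_nextblk != blk) {
		for (ra = ra_hash[h]; ra != NULL; ra = ra->ra_next)
			if (ra->ra_vp == vp && ra->ra_nextblk == blk)
				break;
	}
	if (ra == NULL) {
		ra = calloc(1, sizeof(*ra));
		ra->ra_vp = vp;
		ra->ra_nextblk = blk;
		ra->ra_len = 1;
		ra->ra_next = ra_hash[h];
		ra_hash[h] = ra;
	}
	vp->v_lastra = ra;
	return ra;
}

/* Toy stand-in for the sequential-detection part of cluster_read(). */
static void
toy_cluster_read(struct vnode *vp, long blk)
{
	struct ra_entry *ra = ra_lookup(vp, blk);
	unsigned h;

	printf("vp %p blk %3ld: readahead window %d blocks\n",
	    (void *)vp, blk, ra->ra_len);

	/* Grow the window and rekey the entry under the next expected block. */
	if (ra->ra_len < 8)
		ra->ra_len *= 2;
	ra_unlink(ra);
	ra->ra_nextblk = blk + 1;
	h = ra_hashval(vp, blk + 1);
	ra->ra_next = ra_hash[h];
	ra_hash[h] = ra;
}

int
main(void)
{
	struct vnode v1 = { 0 }, v2 = { 0 };

	/* Two interleaved sequential readers no longer reset each other. */
	for (long b = 0; b < 4; b++) {
		toy_cluster_read(&v1, b);
		toy_cluster_read(&v2, 100 + b);
	}
	return 0;
}

The v_lastra hint is what keeps the single sequential reader as cheap as the
existing in-vnode fields; only when more than one consumer reads through the
same vnode does the hash chain actually get walked, and even then it is O(1).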