Date:      Tue, 31 May 2005 13:05:26 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Dominic Marks <dom@goodforbusiness.co.uk>
Cc:        freebsd-fs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org, banhalmi@field.hu
Subject:   Re: i386/68719: [usb] USB 2.0 mobil rack+ fat32 performance problem
Message-ID:  <20050531115604.S91592@delplex.bde.org>
In-Reply-To: <200505301609.11857.dom@goodforbusiness.co.uk>
References:  <200505271328.58072.dom@goodforbusiness.co.uk> <20050530155609.Q1473@epsplex.bde.org> <20050530193711.I843@epsplex.bde.org> <200505301609.11857.dom@goodforbusiness.co.uk>

On Mon, 30 May 2005, Dominic Marks wrote:

> On Monday 30 May 2005 11:11, Bruce Evans wrote:
>> The main problem is that VOP_BMAP() is not fully implemented for msdosfs.
>> msdosfs_bmap() only has a stub which pretends that clustering is never
>> possible:
>
> If I understand what is supposed to be done here (I looked at cd9660 but
> I don't know if the rules are different from msdos), a_runp should be set
> to the extent of contiguous blocks from the current position within the
> same region? I put some debugging into msdosfs_bmap and here it is copied:

cd9660 is deceptively simple here because (I think) it allocates files
in perfectly contiguous extents.

msdosfs, ffs^ufs and ext2fs have to do considerable work to map even a
single block.  The details are in pcbmap() for msdosfs.  (The name of this
function dates from when msdosfs was named pcfs.)  I think msdosfs_bmap()
just needs to call this function for each block following the start block
until a discontiguity is hit or a limit (*) is reached.
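The loop described above can be sketched in userland C. This is only an illustration under stated assumptions, not the msdosfs code: lookup_block() is a hypothetical stand-in for a single-block lookup such as pcbmap(), count_run_ahead() is a made-up name, and the block table simply reproduces the layout from the debugging output quoted in this thread (8 sectors per block is inferred from that output; the layout after lblkno 13 is assumed).

```c
#include <assert.h>

#define NBLOCKS           16  /* assumed file length in blocks */
#define SECTORS_PER_BLOCK 8   /* inferred: blkno advances by 8 per lblkno */

/*
 * Hypothetical single-block lookup standing in for something like
 * pcbmap().  Maps a logical block to a device block; returns 0 on
 * success and -1 outside the file.  The file has a discontiguity
 * between logical blocks 12 and 13.
 */
static int
lookup_block(long lbn, long *bnp)
{
	if (lbn < 0 || lbn >= NBLOCKS)
		return (-1);
	if (lbn <= 12)
		*bnp = 6374276 + lbn * SECTORS_PER_BLOCK;
	else
		*bnp = 13146156 + (lbn - 13) * SECTORS_PER_BLOCK;
	return (0);
}

/*
 * Count the blocks following lbn that are physically contiguous with
 * it, stopping at a discontiguity, a failed lookup, or maxrun blocks.
 */
static int
count_run_ahead(long lbn, int maxrun)
{
	long bn, next;
	int run;

	assert(maxrun >= 0);
	if (lookup_block(lbn, &bn) != 0)
		return (0);
	for (run = 0; run < maxrun; run++) {
		if (lookup_block(lbn + run + 1, &next) != 0)
			break;
		if (next != bn + (long)(run + 1) * SECTORS_PER_BLOCK)
			break;
	}
	return (run);
}
```

With this table, count_run_ahead(5, 15) stops at the jump after logical block 12, and count_run_ahead(12, 15) finds nothing to read ahead.  A real implementation would store the result through a_runp and would also have to respect the end of the file.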

ufs and ext2fs have an optimized and obfuscated version of this, with
multiple blocks looked up at once and the single-block lookup implemented
as a multiple-block lookup with a count of 1.  I doubt that this
optimization is significant even for ufs, at least now that CPUs are
10 to 100 times as fast relative to I/O as when it was implemented.
However it is easier to optimize for msdosfs since there are no
indirect blocks.

All of cd9660, ufs and ext2fs have a whole file *_bmap.c for bmapping.
ext2_bmaparray() is simplest, but bmapping in ext2fs and ufs is so
similar that misspelling ext2_getlbns() as ufs_getlbns() in 1 caller
is harmless.

(*) The correct limit is mnt_iosize_max bytes.  cd9660 uses the wrong
limit of MAXBSIZE.
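As a rough illustration of turning a byte limit into a read-ahead count: the helper below is hypothetical (max_run_blocks() is not a kernel function), and the sizes in the usage note are example values only.  It assumes the run count excludes the starting block, so one block is subtracted.

```c
/*
 * Hypothetical helper: convert an I/O size limit in bytes (for example
 * the mount's mnt_iosize_max) into the maximum number of blocks that
 * may follow the starting block in one clustered transfer.
 */
static int
max_run_blocks(long iosize_max, long bsize)
{
	if (bsize <= 0 || iosize_max < bsize)
		return (0);	/* no room for any read-ahead */
	return ((int)(iosize_max / bsize) - 1);
}
```

For example, a 64K limit with 4K blocks gives a maximum run of 15 blocks after the start block.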

> (fsz is dep->de_FileSize)
>
> msdosfs_bmap: fsz  81047  blkno  6374316  lblkno 5
> ...
> msdosfs_bmap: fsz  81047  blkno  6374364  lblkno 11
> msdosfs_bmap: fsz  81047  blkno  6374372  lblkno 12 # A1
> msdosfs_bmap: fsz  81047  blkno 13146156  lblkno 13 # A2
> msdosfs_bmap: fsz  81047  blkno 13146156  lblkno 14
> ...
>
> I should compute the position of the boundary illustrated in A1 and set
> that as the read-ahead value, until setting a new value at A2. Perhaps this
> should only be done for particularly large files? I will look at the other
> _bmap routines to see what they do.

Better to do it for all files.  For small files there are just fewer
blocks to check for contiguity.

> I am still confused as to how reading blsize * 16 actually improved
> the transfer rate after a long period of making it worse. Perhaps it
> is related to the buffer resource problem you describe below.

Could be.  The buffer cache layer doesn't handle either overlapping
buffers or variant buffer sizes very well.  Buffer sizes of (blsize *
16) mixed with buffer sizes of blsize for msdosfs and 16K for ffs may
exercise both of these.

Bruce


