From owner-freebsd-fs@FreeBSD.ORG Mon May 30 15:07:58 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AFA8816A41C; Mon, 30 May 2005 15:07:58 +0000 (GMT) (envelope-from dom@goodforbusiness.co.uk) Received: from mail.helenmarks.co.uk (mail.helenmarks.co.uk [82.68.196.22]) by mx1.FreeBSD.org (Postfix) with ESMTP id D193243D4C; Mon, 30 May 2005 15:07:57 +0000 (GMT) (envelope-from dom@goodforbusiness.co.uk) Received: from localhost (localhost [127.0.0.1]) by mail.helenmarks.co.uk (Postfix) with ESMTP id 87BE0222403; Mon, 30 May 2005 16:07:52 +0100 (BST) Received: from mail.helenmarks.co.uk ([127.0.0.1]) by localhost (mail.helenmarks.co.uk [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 56399-07; Mon, 30 May 2005 16:07:47 +0100 (BST) Received: from egg.helenmarks.co.uk (egg.helenmarks.co.uk [192.168.15.3]) by mail.helenmarks.co.uk (Postfix) with ESMTP id 7EC90222402; Mon, 30 May 2005 16:07:47 +0100 (BST) From: Dominic Marks Organization: GoodforBusiness.co.uk To: Bruce Evans Date: Mon, 30 May 2005 16:09:11 +0100 User-Agent: KMail/1.8 References: <200505271328.58072.dom@goodforbusiness.co.uk> <20050530155609.Q1473@epsplex.bde.org> <20050530193711.I843@epsplex.bde.org> In-Reply-To: <20050530193711.I843@epsplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200505301609.11857.dom@goodforbusiness.co.uk> X-Virus-Scanned: By ClamAV 0.80 Cc: freebsd-fs@freebsd.org, freebsd-gnats-submit@freebsd.org, banhalmi@field.hu Subject: Re: i386/68719: [usb] USB 2.0 mobil rack+ fat32 performance problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 May 2005 15:07:58 -0000 On Monday 30 May 2005 11:11, Bruce Evans wrote: > On Mon, 30 May 2005, Bruce Evans wrote: > > On Sun, 29 May 2005, Dominic Marks wrote: > >> I have been experimenting in msdosfs_read and I have managed to come up > >> with > >> something that works, but I'm sure it is flawed. On large file reads it > >> will > >> ... > >> %% > >> Index: msdosfs_vnops.c > >> =================================================================== > >> RCS file: /usr/cvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v > >> retrieving revision 1.149.2.1 > >> diff -u -r1.149.2.1 msdosfs_vnops.c > >> --- msdosfs_vnops.c 31 Jan 2005 23:25:56 -0000 1.149.2.1 > >> +++ msdosfs_vnops.c 29 May 2005 14:10:18 -0000 > >> @@ -565,14 +567,21 @@ > >> error = bread(pmp->pm_devvp, lbn, blsize, NOCRED, > >> &bp); > >> } else { > >> blsize = pmp->pm_bpcluster; > >> - rablock = lbn + 1; > >> - if (seqcount > 1 && > >> - de_cn2off(pmp, rablock) < dep->de_FileSize) { > >> - rasize = pmp->pm_bpcluster; > >> - error = breadn(vp, lbn, blsize, > >> - &rablock, &rasize, 1, NOCRED, &bp); > >> + /* XXX what is the best value for crsize? */ > >> + crsize = blsize * nblks > MAXBSIZE ? MAXBSIZE : > >> blsize * nblks; > >> + if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERR) == 0) { > >> + error = cluster_read(vp, dep->de_FileSize, > >> lbn, > >> + crsize, NOCRED, uio->uio_resid, > >> seqcount, &bp); > > > > crsize should be just the block size (cluster size in msdosfs and > > blsize variable here) according to this code in all other file systems. > > ... > > The main problem is that VOP_BMAP() is not fully implemented for msdosfs. > msdosfs_bmap() only has a stub which pretends that clustering ins never > possible: > > % /* > % * vp - address of vnode file the file > % * bn - which cluster we are interested in mapping to a filesystem block > number. % * vpp - returns the vnode for the block special file holding the > filesystem % * containing the file of interest > % * bnp - address of where to return the filesystem relative block number > % */ > > This comment rotted in 1994 when 4.4BSD packed the args into a struct > and added the a_runp and a_runb args to support clustering. > > % static int > % msdosfs_bmap(ap) > % struct vop_bmap_args /* { > % struct vnode *a_vp; > % daddr_t a_bn; > % struct vnode **a_vpp; > % daddr_t *a_bnp; > % int *a_runp; > % int *a_runb; > % } */ *ap; > % { > % struct denode *dep = VTODE(ap->a_vp); > % daddr_t blkno; > % int error; > % > % if (ap->a_vpp != NULL) > % *ap->a_vpp = dep->de_devvp; > % if (ap->a_bnp == NULL) > % return (0); > % if (ap->a_runp) { > % /* > % * Sequential clusters should be counted here. > ^^^^^^^^^ > % */ > % *ap->a_runp = 0; > % } > % if (ap->a_runb) { > % *ap->a_runb = 0; > % } > % error = pcbmap(dep, ap->a_bn, &blkno, 0, 0); > % *ap->a_bnp = blkno; > % return (error); > % } If I understand what is supposed to be done here (I looked at cd9660 but I don't know if the rules are different from msdos), a_runp should be set to the extent of contiguous blocks from the current position within the same region? I put some debugging into msdosfs_bmap and here it is copied: (fsz is dep->de_FileSize) msdosfs_bmap: fsz 81047 blkno 6374316 lblkno 5 msdosfs_bmap: fsz 81047 blkno 6374324 lblkno 6 msdosfs_bmap: fsz 81047 blkno 6374332 lblkno 7 msdosfs_bmap: fsz 81047 blkno 6374340 lblkno 8 msdosfs_bmap: fsz 81047 blkno 6374348 lblkno 9 msdosfs_bmap: fsz 81047 blkno 6374356 lblkno 10 msdosfs_bmap: fsz 81047 blkno 6374364 lblkno 11 msdosfs_bmap: fsz 81047 blkno 6374372 lblkno 12 # A1 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 13 # A2 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 14 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 15 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 16 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 17 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 18 msdosfs_bmap: fsz 81047 blkno 13146156 lblkno 19 I should compute the position of the boundary illustrated in A1 I should set that to the read ahead value, until setting a new value at A2, perhaps this should only be done for particularly large files? I will look at the other _bmap routines to see what they do. > Here is a cleaned up version of the patch to add (not actually working) > clustering to msdosfs_read(). > > %%% > Index: msdosfs_vnops.c > =================================================================== > RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v > retrieving revision 1.147 > diff -u -2 -r1.147 msdosfs_vnops.c > --- msdosfs_vnops.c 4 Feb 2004 21:52:53 -0000 1.147 > +++ msdosfs_vnops.c 30 May 2005 08:57:02 -0000 > @@ -541,5 +555,7 @@ > if (uio->uio_offset >= dep->de_FileSize) > break; > + blsize = pmp->pm_bpcluster; > lbn = de_cluster(pmp, uio->uio_offset); > + rablock = lbn + 1; > /* > * If we are operating on a directory file then be sure to > @@ -556,15 +573,15 @@ > break; > error = bread(pmp->pm_devvp, lbn, blsize, NOCRED, &bp); > + } else if (de_cn2off(pmp, rablock) >= dep->de_FileSize) { > + error = bread(vp, lbn, blsize, NOCRED, &bp); > + } else if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERR) == 0) { > + error = cluster_read(vp, dep->de_FileSize, lbn, blsize, > + NOCRED, uio->uio_resid, seqcount, &bp); > + } else if (seqcount > 1) { > + rasize = blsize; > + error = breadn(vp, lbn, > + blsize, &rablock, &rasize, 1, NOCRED, &bp); > } else { > - blsize = pmp->pm_bpcluster; > - rablock = lbn + 1; > - if (seqcount > 1 && > - de_cn2off(pmp, rablock) < dep->de_FileSize) { > - rasize = pmp->pm_bpcluster; > - error = breadn(vp, lbn, blsize, > - &rablock, &rasize, 1, NOCRED, &bp); > - } else { > - error = bread(vp, lbn, blsize, NOCRED, &bp); > - } > + error = bread(vp, lbn, blsize, NOCRED, &bp); > } > if (error) { > %%% > > I rearranged the code to be almost lexically identical with that in > ffs_read(). Thanks, I will use this as a basis for any other things I try. > I only tested this on a relatively fast ATA drive. It made little > difference. Most writes were clustered to give a block size of 64K > and write speed of over 40+MB/s until the disk is nearly full, but > reads weren't clustered with or without the patch so the block size > remained at the fs block size (4K); the drive handles this block size > mediocrely and gave a read speed of 20+MB/sec. (The drive is a WDC > 1200JB-00CRA1. This drive has the interesting behaviour of giving > almost the same mediocre read speed for all block sizes between 2.5K > and 19.5K. A block size 20K gives maximal speed which is about twice > as fast as the speed for a block size of 19.5K.) I am still confused as to how reading blsize * 16 actually improved the transfer rate after a long period of making it worse. Perhaps it is related to the buffer resource problem you describe below. > Both reading and writing a 1GB file to/from msdosfs caused noticable > buffer resource problems. Accesses to other file systems on the same > disk sometimes blocked for many seconds. I have debugging code in > getblk(). It reported that a process waited 17 seconds in or near > getblk(). The process only stopped waiting because I suspended the > process accessing msdosfs. This may be a local bug. I'll look for buffer resource statistics in the system tools and measure those. There are no obvious signs, to me, that the systems are in any specific difficulties while running the transfers. > Bruce Thanks a lot for the answers and code, -- Dominic GoodforBusiness.co.uk I.T. Services for SMEs in the UK.