From owner-freebsd-fs Tue Mar 13 8:28:36 2001
Message-ID: <38E52A0B1357D411A42400805FA79384019631CE@shrcmsgb.tdh.qntm.com>
From: Stephen Byan
To: "'Alfred Perlstein'", Matthew Jacob
Cc: "Justin T. Gibbs", Soren Schmidt, Kevin Oberman, scsi@FreeBSD.ORG, fs@FreeBSD.ORG, dillon@FreeBSD.ORG
Subject: RE: Disk I/O problem in 4.3-BETA
Date: Tue, 13 Mar 2001 08:26:10 -0800

Alfred Perlstein [mailto:bright@wintelcom.net] wrote:

> This would allow us to use writecaching for data, but force
> stable storage for meta-data. I think we'd also want to use
> this for forced data sync (fsync(2) and files opened with O_SYNC).

I do hope this gets implemented. I did something similar at Hitachi in
our OSF/1 port to the S/390 back around 1992. The disk controllers had
a substantial amount of volatile RAM cache, and a lesser amount of
NV-RAM cache. We directed the metadata writes to NV-RAM, and the data
to volatile cache (with a flush at partition close, of course). Since
the hit rate on metadata writes in UFS is very high, even with a small
NV cache, we were able to get substantial speedups in metadata
intensive operations such as a recursive directory copy.

We also implemented sequential I/O hinting, detected by the read-ahead
mechanism in the file system.
Passing this hint down allowed the controller to do a better job of
cache management: sequential I/O recycled the buffers after they had
been read or written, rather than aging them through the LRU list, so
sequential reads and writes didn't trash the cache.

For NFS v2, it's also helpful to be able to mark the write I/Os as
non-cacheable, thus hinting them toward NV-RAM.

In Windows NT, NTFS uses the SCSI Force Unit Access (FUA) bit on its
metadata writes (log writes for sure; it should also use FUA for lazy
writes of metadata, or else there is a race condition in recycling the
log entry, no? But I don't know whether it actually uses FUA for lazy
writes of metadata). NT doesn't hint the SCSI CDBs with sequential
access information, but such info is available in the IRP presented to
the SCSI class driver, and a filter driver could do some magic...

If NT and FreeBSD both support hinting of metadata writes, it's only a
matter of time before the hardware support appears.

Regards,
-Steve

Steve Byan
Design Engineer
Quantum Corporation
MS 1-3/E23
333 South Street
Shrewsbury, MA 01545
(508)770-3414
fax: (508)770-2604
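
As an illustration of the FUA hint discussed in this message (not code
from the thread), the sketch below fills in a SCSI WRITE(10) CDB and
sets the Force Unit Access bit only for metadata writes, so that the
target must put those blocks on the medium before reporting completion
while ordinary data may stay in the write cache. The function name
build_write10() and the is_metadata flag are hypothetical; only the
CDB layout (opcode 0x2A, FUA in byte 1 bit 3, big-endian LBA and
transfer length) is taken from the SCSI block command set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_WRITE_10     0x2A  /* WRITE(10) operation code */
#define SCSI_WRITE_10_FUA 0x08  /* byte 1, bit 3: Force Unit Access */

/*
 * Fill a 10-byte WRITE(10) CDB.
 *
 * Hypothetical helper for illustration: when is_metadata is nonzero the
 * FUA bit is set, asking the target not to complete the command until
 * the blocks are on the medium. Data writes leave FUA clear.
 */
static void
build_write10(uint8_t cdb[10], uint32_t lba, uint16_t nblocks,
    int is_metadata)
{
    assert(cdb != NULL);

    memset(cdb, 0, 10);               /* group number and control stay 0 */
    cdb[0] = SCSI_WRITE_10;
    cdb[1] = is_metadata ? SCSI_WRITE_10_FUA : 0;

    cdb[2] = (uint8_t)(lba >> 24);    /* logical block address, big-endian */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;

    cdb[7] = (uint8_t)(nblocks >> 8); /* transfer length, big-endian */
    cdb[8] = (uint8_t)nblocks;
}
```

A file system or driver layer that already knows whether a buffer holds
metadata could pass that knowledge down as the flag; how the flag would
actually travel through the FreeBSD or NT I/O stack is not shown here.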