From owner-freebsd-hackers Thu Nov 12 22:34:51 1998
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id WAA29867
	for freebsd-hackers-outgoing; Thu, 12 Nov 1998 22:34:51 -0800 (PST)
	(envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id WAA29860
	for ; Thu, 12 Nov 1998 22:34:44 -0800 (PST)
	(envelope-from grog@freebie.lemis.com)
Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137])
	by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id RAA27675;
	Fri, 13 Nov 1998 17:03:30 +1030 (CST)
Received: (from grog@localhost)
	by freebie.lemis.com (8.9.1/8.9.0) id RAA02232;
	Fri, 13 Nov 1998 17:03:29 +1030 (CST)
Message-ID: <19981113170329.Y781@freebie.lemis.com>
Date: Fri, 13 Nov 1998 17:03:29 +1030
From: Greg Lehey
To: Matthew Dillon
Cc: Bernd Walter, Mike Smith, hackers@FreeBSD.ORG
Subject: Re: [Vinum] Stupid benchmark: newfsstone
References: <199811100638.WAA00637@dingo.cdrom.com>
	<19981111103028.L18183@freebie.lemis.com>
	<19981111040654.07145@cicely.de>
	<19981111134546.D20374@freebie.lemis.com>
	<19981111085152.55040@cicely.de>
	<19981111183546.D20849@freebie.lemis.com>
	<19981111194157.06719@cicely.de>
	<19981112184509.K463@freebie.lemis.com>
	<199811130555.VAA01110@apollo.backplane.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.91.1i
In-Reply-To: <199811130555.VAA01110@apollo.backplane.com>; from Matthew Dillon on Thu, Nov 12, 1998 at 09:55:59PM -0800
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thursday, 12 November 1998 at 21:55:59 -0800, Matthew Dillon wrote:
> :
> :OK, so you want to have 4 15 kB reads, and you expect a performance
> :improvement because of it.
> :
> :Let's consider the hardware: a good modern disk has a disk transfer
> :rate of 10 MB/s and a rotational speed of 7200 rpm.  Let's look at the
> :times involved:
> :
> :                 rotational     transfer time     total
> :                 latency
> :
> :1 disk/60 kB       4.2 ms           6 ms         10.2 ms
> :4 disks/15 kB      7.8 ms           1.5 ms        9.3 ms
> :
> :Huh?  Why the difference in rotational latency?  If you're reading
>
>     This is only relevant in the non-parallel random-seek-read case.  Is
>     that the case you are talking about?

Yes.  Most people don't install striped volumes for sequential reads
(though Justin has shown an exception).

>     In the linear-read case the disk's read lookahead cache absorbs
>     the rotational latency.
>
>     In the parallel random-seek-read case (i.e. a large concurrent
>     load on the disks from different processes), it's irrelevant
>     because although the per-request latency is higher due to lack
>     of spindle synchronization, each disk is still individually
>     pipelined, so if disk #1 finishes its portion of the request
>     before disk #2, disk #1 starts work on some other concurrent
>     request, and so forth.
>
>     It may seem relevant, but remember that the important resource number
>     for striped disks is transactions/sec, not the per-transaction
>     latency.

Sure, but your argument views the speed from the viewpoint of a
single transaction.  The fact is that with multiple parallel
transactions going on, each physical access involves a seek,
rotational latency and the transfer itself.
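(As a back-of-the-envelope check of the per-request figures in the
table quoted above, here is a small throwaway C program.  It is only
a sketch: it assumes the 10 MB/s transfer rate and 7200 rpm quoted
there, and it simply takes the 7.8 ms latency for four unsynchronised
spindles from the table rather than deriving it.)

    /*
     * Rough check of the per-request numbers in the table above.
     * Assumptions: 10 MB/s media transfer rate, 7200 rpm, a single
     * random read; the 7.8 ms four-disk latency is taken as quoted.
     */
    #include <stdio.h>

    int
    main(void)
    {
        double rpm = 7200.0;                  /* rotational speed */
        double rate_kb_s = 10.0 * 1000.0;     /* transfer rate in kB/s */
        double rev_ms = 60.0 * 1000.0 / rpm;  /* one revolution: ~8.3 ms */
        double avg_lat_ms = rev_ms / 2.0;     /* average latency: ~4.2 ms */

        /* 1 disk, one 60 kB transfer */
        double xfer60_ms = 60.0 / rate_kb_s * 1000.0;   /* 6 ms */
        printf("1 disk/60 kB:  %.1f + %.1f = %.1f ms\n",
               avg_lat_ms, xfer60_ms, avg_lat_ms + xfer60_ms);

        /*
         * 4 disks, 15 kB each.  Without spindle sync the request only
         * completes when the slowest disk has delivered, so the
         * effective latency lies somewhere between the 4.2 ms average
         * and a full 8.3 ms revolution; the table uses 7.8 ms.
         */
        double xfer15_ms = 15.0 / rate_kb_s * 1000.0;   /* 1.5 ms */
        double lat4_ms = 7.8;                           /* from the table */
        printf("4 disks/15 kB: %.1f + %.1f = %.1f ms\n",
               lat4_ms, xfer15_ms, lat4_ms + xfer15_ms);
        return 0;
    }

Compiled and run, that reproduces the 10.2 ms and 9.3 ms totals from
the table, which is all it is meant to show.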
Let's go back and look at that again from the parallel access point
of view:

                  rotational        transfer time    total
                  latency (total)      (total)

1 disk/60 kB         4.2 ms             6 ms         10.2 ms
4 disks/15 kB       16.8 ms             6 ms         22.8 ms

This is the (idealized) time that the disks involved are occupied
with the transaction.

>     In the large concurrent load case the transactions/sec is not
>     compromised by the lack of a spindle sync.

Correct.

>     It should also be noted that the case where you might think that
>     per-transaction latency is important... the database case,
>     actually isn't, because concurrent load is typically the most
>     important factor for a database.

I wasn't even thinking that far, but agreed.

I don't think there's much point in discussing this matter from a
purely theoretical viewpoint: I intend to do some testing when I have
Vinum in fit shape, and then I'll publish results.  What I have done
here (even for sequential access) shows a significant performance
improvement for large stripes, even in the area where no
"aggregation" is possible (transfer size < stripe size * number of
disks).

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message