Date:      Tue, 20 Oct 1998 11:28:16 -0600 (MDT)
From:      "Kenneth D. Merry" <ken@plutotech.com>
To:        keefe@cse.psu.edu (Thomas F Keefe)
Cc:        freebsd-scsi@FreeBSD.ORG
Subject:   Re: Sequential Disk I/O
Message-ID:  <199810201728.LAA04393@panzer.plutotech.com>
In-Reply-To: <199810201548.LAA18346@remulak.cse.psu.edu> from Thomas F Keefe at "Oct 20, 98 11:48:45 am"

Thomas F Keefe wrote...
> > Thomas F. Keefe wrote...
> > > I am trying to write the logging portion
> > > of a database and have had trouble getting
> > > good performance. I am trying to avoid seek
> > > and rotational latency by writing consecutive
> > > 512 byte blocks to the disk.
> > 
> > You'd get better performance by writing larger blocks, most likely.
> 
> Yes, and I may eventually need to do this.
> However, I want to understand why the performance is
> so much worse. I can see that there will be more overhead
> involved, but it seems that if the requests are submitted
> in order and in time the throughput should approach
> what I would get writing large blocks.
>  
> > > Here are some details. I am using
> > > Mach/Lites, with the drivers for the Adaptec 
> > > aic7xxx controller ported from FreeBSD 
> > > (the version from around 9/97).
> > 
> > Is this with the old or new SCSI layer?  The SCSI layer in FreeBSD was
> > replaced in mid-September with a new, CAM-based SCSI layer.
> 
> This is the pre-CAM driver that I ported.
> I moved the portion of the driver specific to the adapter. 
> (That is, the code under /usr/src/sys/i386/scsi/aic7xxx.[ch]
> and a few other files.)

Ahh, okay.

> > > Is it possible to achieve sequential I/O rates
> > > (i.e., no seek latency and no rotational latency)
> > > with small write requests? Any insight you can 
> > > provide will be appreciated. Thanks.
> > 
> > One thing to keep in mind is that you won't be able to achieve very good
> > performance at all with small block sizes unless you're able to get a large
> > number of tagged transactions to the drive at one time.
> 
> I can understand the need for a large number of requests
> for random access patterns, but if the best possible request
> (the adjacent sector) is available to the adapter (and perhaps
> the drive) why are more needed?

You need to "fill the pipe".  i.e., there should be a constant stream of
requests to the drive.  If you've only got a few requests outstanding to the
drive at any one time, the drive may be sitting idle for a time.  This is
because there is a certain amount of latency involved in actually getting
transactions to and from the drive.
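
To illustrate what I mean (just a sketch with made-up names, not code
from any of the drivers involved): keep handing tagged commands down
until the target's openings are used up, and refill the moment one
completes.

#include <stddef.h>

#define TARGET_OPENINGS 64      /* tagged commands allowed outstanding */

struct request {
        struct request *next;
        /* CDB, data pointer, transfer length, etc. */
};

struct target {
        struct request *pending;        /* queued, not yet at the drive */
        int outstanding;                /* commands at the drive right now */
};

/* Adapter-specific: hand one tagged command to the hardware. */
extern void issue_tagged_command(struct request *req);

/*
 * Hand commands down until the openings are exhausted, so the drive
 * always has the next few writes in hand before it finishes the
 * current one.
 */
static void
target_start(struct target *t)
{
        struct request *req;

        while (t->outstanding < TARGET_OPENINGS && t->pending != NULL) {
                req = t->pending;
                t->pending = req->next;
                t->outstanding++;
                issue_tagged_command(req);
        }
}

/* Called from the interrupt handler as each command completes. */
void
target_done(struct target *t)
{
        t->outstanding--;
        target_start(t);        /* refill right away; don't let the drive idle */
}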

Another thing to keep in mind is that the disk is less efficient when
writing tiny amounts of data at a time.  It may do better if it can
coalesce a number of adjacent smaller blocks into one larger write.
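
In code, the idea is something like this (again a sketch with made-up
names, assuming the queue is kept sorted by block number):

#include <stddef.h>

#define SECTOR_SIZE     512
#define MAX_COALESCE    128     /* cap a combined write at 64KB */

struct buf {
        unsigned long blkno;    /* starting sector on disk */
        char *data;             /* SECTOR_SIZE bytes */
};

/*
 * Count how many buffers at the front of the queue are physically
 * adjacent on disk.  That run can go out as one large write: one
 * command's worth of overhead instead of up to 128 commands for the
 * same 64KB of data.
 */
static size_t
contiguous_run(const struct buf *bufs, size_t n)
{
        size_t run;

        if (n == 0)
                return (0);
        for (run = 1; run < n && run < MAX_COALESCE; run++) {
                if (bufs[run].blkno != bufs[run - 1].blkno + 1)
                        break;
        }
        return (run);
}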

> > I don't know what your SCSI subsystem is based on (you only mentioned
> > porting the Adaptec driver, not which Adaptec driver or what your SCSI
> > subsystem is like), but if it is based on the old FreeBSD SCSI subsystem,
> > you'll only be able to have 4 transactions outstanding to the disk at once.
> 
> I am using the SCSI layer from Mach, but have modified it
> to allow more than one outstanding request per target. The
> modifications are based loosely on the pre-CAM SCSI layer
> used in FreeBSD (i.e., each target has a number of openings
> and requests are started if the openings for the target have
> not been exhausted).

Well, hopefully the number of openings is large enough.  (by large enough,
I mean around 64 per device)

> > Another problem is that your disk supposedly has horrible tagged queueing
> > performance.  If you're trying to get good throughput with small
> > transactions, I'd suggest using a drive that behaves a little better.  Most
> > IBM and Seagate disks work pretty well.
> > 
> > For instance, I've got an IBM Ultrastar 9ZX (9G, 10000RPM).  Running iozone
> > with 512 byte blocks for a 256MB file, I get:
> > 
> > IOZONE performance measurements:
> >         8054322 bytes/second for writing the file
> >         14070326 bytes/second for reading the file
> > 
> > With 64K blocks, I get:
> > 
> > IOZONE performance measurements:
> >         14559211 bytes/second for writing the file
> >         15929410 bytes/second for reading the file
> > 
> > This is on a filesystem of course, not a raw device.  And this is with CAM,
> > not the old SCSI subsystem.  So it is possible to get better performance
> > with small block sizes.  You probably just need to get a better disk and
> > make sure you can handle more outstanding tagged transactions.
> 
> I also get better performance when using the file system. I guess
> that is because the writes shipped to the controller are 
> actually 8KB blocks.

Actually, on my system, it looks like they vary between 55KB and 64KB.
Note that I do have softupdates on the filesystem in question, and that may
have some impact on performance.

> Do these disks have write caching turned on? I have shut mine
> off and this makes a big difference in the throughput.
> I can get several MB/sec with it turned on and only about
> 50KB/sec with it off.

Oddly enough, that's with write caching turned off.  I generally run with
write caching turned on, but I've never modified the write cache enable
settings on this disk.  (most of my Quantum disks come with write caching
turned on)
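
For what it's worth, the setting in question is the WCE bit in the SCSI
caching mode page (page 0x08).  Here's a sketch of how you'd test it,
given the page data returned by a MODE SENSE (offsets are into the page
itself, after the mode parameter header):

#include <sys/types.h>

#define CACHING_PAGE    0x08
#define WCE_BIT         0x04    /* byte 2, bit 2: write cache enable */
#define RCD_BIT         0x01    /* byte 2, bit 0: read cache disable */

/*
 * Returns 1 if write caching is enabled, 0 if not, -1 if this isn't
 * the caching page.  The top two bits of byte 0 are PS/reserved, so
 * mask them off before comparing the page code.
 */
int
write_cache_enabled(const u_char *page)
{
        if ((page[0] & 0x3f) != CACHING_PAGE)
                return (-1);
        return ((page[2] & WCE_BIT) != 0);
}

To change the setting, you'd flip the bit and send the page back with a
MODE SELECT; under CAM, camcontrol can display and edit that page
directly.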

Here are the numbers with write caching turned on:

==============================
256MB file, 512 byte blocks:

IOZONE performance measurements:
        8289442 bytes/second for writing the file
        14316557 bytes/second for reading the file

256MB file, 64KB blocks:

IOZONE performance measurements:
        8481791 bytes/second for writing the file
        15877882 bytes/second for reading the file

==============================

Oddly enough, write performance with the larger block sizes isn't nearly as
good with write caching turned on.  I'm not sure whether this is the disk,
or an effect of softupdates' caching policy, or what.  I'm sure Terry will
have a theory.  I suppose if I disabled softupdates and then re-ran the
test, I could find out for sure.

Here are numbers with write caching and softupdates turned off:

==============================
256MB file, 512 byte blocks:

IOZONE performance measurements:
        8423569 bytes/second for writing the file
        14310594 bytes/second for reading the file

It looked like most of the 512 byte blocks still got coalesced into 55-64KB
blocks, even with softupdates turned off.

256MB file, 64KB blocks:

IOZONE performance measurements:
        13580924 bytes/second for writing the file
        15981273 bytes/second for reading the file
==============================

Here are the numbers with write caching on, and softupdates off:

==============================
256MB file, 512 byte blocks:

IOZONE performance measurements:
        8492273 bytes/second for writing the file
        14394528 bytes/second for reading the file

256MB file, 64KB blocks:

IOZONE performance measurements:
        12408717 bytes/second for writing the file
        15988710 bytes/second for reading the file

==============================

The numbers you should probably look at are the ones with write caching and
softupdates turned off.

I would certainly recommend getting a better disk, since I think that may
be a big reason behind your poor performance.

One interesting thing that can be gleaned from the numbers above is that
write caching and softupdates seem to screw each other up a little bit for
sequential I/O with large block sizes.  (write performance with 64KB block
sizes was decent in all cases except when both softupdates and write caching
were enabled)

Ken
-- 
Kenneth Merry
ken@plutotech.com
