Date:      Fri, 11 Jul 2008 21:20:51 +0000 (UTC)
From:      Scott Long <scottl@FreeBSD.org>
To:        src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   cvs commit: src/sys/dev/ciss ciss.c cissio.h cissreg.h cissvar.h
Message-ID:  <200807112121.m6BLL70t031426@repoman.freebsd.org>

scottl      2008-07-11 21:20:51 UTC

  FreeBSD src repository

  Modified files:
    sys/dev/ciss         ciss.c cissio.h cissreg.h cissvar.h 
  Log:
  SVN rev 180454 on 2008-07-11 21:20:51Z by scottl
  
  A number of significant enhancements to the ciss driver:
  
  1.  The FreeBSD driver was setting an interrupt coalesce delay of 1000us
  for reasons that I can only speculate on.  This was hurting everything
  from lame sequential I/O "benchmarks" to legitimate filesystem metadata
  operations that relied on serialized barrier writes.  One of my
  filesystem tests went from taking 35s to complete down to 6s.
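
  As a rough illustration of why (1) matters for serialized I/O, the
  sketch below models a workload in which each request is issued only
  after the previous one completes, so every request pays the full
  coalesce delay before its completion interrupt is delivered.  The
  function name and the numbers in the note that follows are assumptions
  for illustration only; they are not taken from the driver or from the
  tests described here.

```c
#include <stdint.h>

/*
 * Hypothetical model, not driver code: total time in microseconds for
 * nreqs strictly serialized requests.  With only one request in flight,
 * nothing else can trigger the interrupt early, so each request is
 * assumed to wait the full coalesce delay on top of its service time.
 */
uint64_t
serialized_total_us(uint64_t nreqs, uint64_t service_us, uint64_t coalesce_us)
{
	return (nreqs * (service_us + coalesce_us));
}
```

  With 1000 serialized requests and an assumed 200us service time, this
  model gives 1,200,000us with a 1000us delay and 200,000us with no
  delay.  Real hardware is more complicated, but the per-request penalty
  is the point.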
  
  2.  Implemented the Performant transport method.  Without the fix in
  (1), I saw almost no difference.  With it, my filesystem tests showed
  another 5-10% improvement in speed.  It was hard to measure CPU
  utilization in any meaningful way, so it's not clear if there was a
  benefit there, though there should have been since the interrupt handler
  was reduced from 2 or more PCI reads down to 1.
  
  3.  Implemented MSI-X.  Without any docs on this, I was just taking a
  guess, and it appears to only work with the Performant method.  This
  could be a programming or understanding mistake on my part.  While this
  by itself made almost no difference to performance since the Performant
  method already eliminated most of the synchronous reads over the PCI
  bus, it did allow the CISS hardware to stop sharing its interrupt with
  the USB hardware, which in turn allowed the driver to become decoupled
  from the Giant-locked USB driver stack.  This increased performance by
  almost 20%.  The MSI-X setup was done with 4 vectors allocated, but only
  1 vector used, since the Performant method was told to use only 1 of 4
  queues.  Fiddling with this might make it work with the simpleq method;
  I'm not sure.  I did not implement MSI since I have no MSI-specific
  hardware in my test lab.
  
  4.  Improved the locking in the driver and trimmed some data structures.
  This didn't improve test times in any measurable way, but it does look
  like it gave a minor improvement to CPU usage when many
  processes/threads were doing I/O in parallel.  Again, this was hard to
  accurately test.
  
  Revision  Changes    Path
  1.90      +388 -93   src/sys/dev/ciss/ciss.c
  1.6       +19 -0     src/sys/dev/ciss/cissio.h
  1.17      +36 -0     src/sys/dev/ciss/cissreg.h
  1.12      +51 -57    src/sys/dev/ciss/cissvar.h
