Date:      Fri, 16 Aug 2013 00:40:43 +0300
From:      Alexander Motin <>
To:        FreeBSD SCSI <>,,  "Justin T. Gibbs" <>, Scott Long <>, ken <>,  Jeff Roberson <>, Steven Hartland <>
Subject:   New CAM locking preview
Message-ID:  <>

Over the last few weeks I've made substantial progress on my CAM locking 
work. In fact, at this moment I think I've tied up all loose ends well 
enough to consider the new design viable and the implementation worth 
further testing and bug fixing. So I would like to ask for review of my 
work from everybody who is interested in CAM internals.

In short, my idea was to split the single per-SIM lock, which creates 
heavy contention under high IOPS, into several smaller ones. The design 
I've finally chosen includes the following locks:
  1) New per-device (per-LUN) locks to protect the state of the devices and 
their respective periphs. In most cases peripheral drivers just use that 
lock instead of the SIM lock used before, so code modification is minimal and 
  2) New per-target lock to protect the list of LUNs fetched from the device.
  3) Old single per-SIM lock to protect SIM driver internals, but only 
that. No parts of CAM itself use that lock. Keeping it for SIMs preserves 
API and hopefully ABI compatibility. Reducing its scope reduces 
contention.
  4) New per-SIM lock to protect the SIM and device command queues. That 
allows queued commands to be executed from any context, independent of 
other locks. This lock also serializes access to the sim_action() method 
for most commands, which mostly avoids busy spinning on the SIM lock.
  5) New per-bus locks to protect target, device and periph reference 
counters. This allows paths to be created and destroyed in any possible 
context, independent of other locks.
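
To make the split easier to picture, here is a minimal userland sketch of 
where each of the five locks could live. It uses pthread mutexes instead of 
kernel mutexes, and every type, field and function name in it (sk_sim, 
sk_bus, sk_target, sk_device and so on) is hypothetical, made up for this 
illustration and not taken from the camlock branch.

```c
/* Userland sketch only: all names here are hypothetical, not the
 * identifiers used in the actual CAM code. */
#include <assert.h>
#include <pthread.h>

struct sk_sim {
    pthread_mutex_t sim_lock;    /* 3) SIM driver internals only */
    pthread_mutex_t queue_lock;  /* 4) SIM and device command queues */
    int queued;                  /* queued commands, under queue_lock */
};

struct sk_bus {
    pthread_mutex_t ref_lock;    /* 5) target/device/periph refcounts */
    int refs;                    /* path references, under ref_lock */
};

struct sk_target {
    pthread_mutex_t luns_lock;   /* 2) list of LUNs on this target */
    int nluns;
};

struct sk_device {
    pthread_mutex_t dev_lock;    /* 1) device and periph state */
    int open_count;
};

static void sk_init(struct sk_sim *sim, struct sk_bus *bus,
                    struct sk_target *tgt, struct sk_device *dev)
{
    pthread_mutex_init(&sim->sim_lock, NULL);
    pthread_mutex_init(&sim->queue_lock, NULL);
    pthread_mutex_init(&bus->ref_lock, NULL);
    pthread_mutex_init(&tgt->luns_lock, NULL);
    pthread_mutex_init(&dev->dev_lock, NULL);
    sim->queued = 0;
    bus->refs = 0;
    tgt->nluns = 0;
    dev->open_count = 0;
}

/* Periph-style state change: only the per-device lock 1) is taken. */
static int sk_device_open(struct sk_device *dev)
{
    int n;

    pthread_mutex_lock(&dev->dev_lock);
    n = ++dev->open_count;
    pthread_mutex_unlock(&dev->dev_lock);
    return n;
}

/* Queueing a command: only the queue lock 4) is taken. */
static int sk_queue_command(struct sk_sim *sim)
{
    int n;

    pthread_mutex_lock(&sim->queue_lock);
    n = ++sim->queued;
    pthread_mutex_unlock(&sim->queue_lock);
    return n;
}

/* Taking a path reference: only the per-bus lock 5) is taken. */
static int sk_path_ref(struct sk_bus *bus)
{
    int n;

    pthread_mutex_lock(&bus->ref_lock);
    n = ++bus->refs;
    pthread_mutex_unlock(&bus->ref_lock);
    return n;
}
```

In this sketch each operation touches exactly one lock, so two threads 
working on different devices, or one queueing a command while another takes 
a path reference, never wait on each other.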

The numbers above also define the intended lock ordering: while holding 
per-device lock 1) it is allowed to acquire SIM lock 3), but not the 
reverse. Cases where the opposite is required (command completions and 
async events) are handled by queuing events to several completion 
threads. The rest of the locks are self-contained and are not really 
expected to nest.
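
The ordering rule and the queuing workaround can be sketched in userland 
roughly as follows. This is only an illustration under stated assumptions: 
it uses pthread mutexes, a small fixed-size FIFO, and hypothetical names 
(lo_device, lo_sim, lo_doneq, lo_submit, lo_complete, lo_drain) that do not 
come from the camlock branch. lo_drain() stands in for the body of one 
completion thread and is called directly here to keep the sketch short.

```c
/* Userland sketch only: all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define LO_DQ_MAX 16

struct lo_device {
    pthread_mutex_t dev_lock;          /* lock 1) */
    int completed;
};

struct lo_sim {
    pthread_mutex_t sim_lock;          /* lock 3) */
    int submitted;
};

struct lo_doneq {
    pthread_mutex_t lock;              /* protects only this queue */
    struct lo_device *items[LO_DQ_MAX];
    size_t head, count;                /* simple ring buffer (FIFO) */
};

static void lo_init(struct lo_device *dev, struct lo_sim *sim,
                    struct lo_doneq *q)
{
    pthread_mutex_init(&dev->dev_lock, NULL);
    pthread_mutex_init(&sim->sim_lock, NULL);
    pthread_mutex_init(&q->lock, NULL);
    dev->completed = 0;
    sim->submitted = 0;
    q->head = q->count = 0;
}

/* Allowed order: device lock 1) first, then SIM lock 3). */
static void lo_submit(struct lo_device *dev, struct lo_sim *sim)
{
    pthread_mutex_lock(&dev->dev_lock);
    pthread_mutex_lock(&sim->sim_lock);
    sim->submitted++;
    pthread_mutex_unlock(&sim->sim_lock);
    pthread_mutex_unlock(&dev->dev_lock);
}

/* A completion arrives with the SIM lock held. Taking the device lock
 * here would invert the order, so the event is only queued.
 * Returns 1 on success, 0 if the queue is full. */
static int lo_complete(struct lo_sim *sim, struct lo_doneq *q,
                       struct lo_device *dev)
{
    int ok = 0;

    pthread_mutex_lock(&sim->sim_lock);
    pthread_mutex_lock(&q->lock);
    if (q->count < LO_DQ_MAX) {
        q->items[(q->head + q->count) % LO_DQ_MAX] = dev;
        q->count++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    pthread_mutex_unlock(&sim->sim_lock);
    return ok;
}

/* Body of a completion thread: no SIM lock is held here, so taking
 * the device lock is safe. Returns the number of events handled. */
static size_t lo_drain(struct lo_doneq *q)
{
    size_t handled = 0;

    for (;;) {
        struct lo_device *dev;

        pthread_mutex_lock(&q->lock);
        if (q->count == 0) {
            pthread_mutex_unlock(&q->lock);
            break;
        }
        dev = q->items[q->head];
        q->head = (q->head + 1) % LO_DQ_MAX;
        q->count--;
        pthread_mutex_unlock(&q->lock);

        pthread_mutex_lock(&dev->dev_lock);
        dev->completed++;
        pthread_mutex_unlock(&dev->dev_lock);
        handled++;
    }
    return handled;
}
```

The point of the sketch is that no code path ever holds the SIM lock while 
waiting for a device lock: the completion side touches only the SIM lock 
and the queue's own lock, and the device lock is taken later from a 
context that holds neither.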

All these changes combined with GEOM direct dispatch (it will be the next 
separate project) make it possible to double system performance in disk 
I/O microbenchmarks compared to present head, the same as was announced 
at the 2013-05 DevSummit: . Tests 
without the GEOM changes also show a performance improvement, but it is 
limited to the level of 5-20% by a heavy bottleneck at the GEOM 
g_up/g_down threads.

Project sources can be found in the SVN projects/camlock branch: . Many early changes 
from that branch are already integrated into head, so to simplify review 
the remaining patches for changes before r254059 were manually remade and 
can be found here: .

These changes do not require controller driver modifications, keeping 
KPIs and hopefully KBIs intact, but create a base for later work to use 
the multiqueue capabilities of new controllers.

This work is sponsored by iXsystems, Inc.

Alexander Motin
