From owner-freebsd-scsi@FreeBSD.ORG  Sun Feb  3 17:12:36 2013
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 89FEC696
 for <freebsd-scsi@FreeBSD.org>; Sun,  3 Feb 2013 17:12:36 +0000 (UTC)
 (envelope-from jau@oxit.fi)
Received: from smtp.oxit.fi (smtp.oxit.fi [193.185.41.132])
 by mx1.freebsd.org (Postfix) with ESMTP id 01BBD15B
 for <freebsd-scsi@FreeBSD.org>; Sun,  3 Feb 2013 17:12:35 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by smtp.oxit.fi (Postfix) with ESMTP id 5502E6C226F;
 Sun,  3 Feb 2013 19:04:03 +0200 (EET)
X-Virus-Scanned: Debian amavisd-new at smtp.oxit.fi
Received: from smtp.oxit.fi ([127.0.0.1])
 by localhost (huskvarna.oxit.fi [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id Pd+FxkbSdu6T; Sun,  3 Feb 2013 19:03:57 +0200 (EET)
Received: from [192.168.1.131] (ip193-64-26-115.cust.eunet.fi [193.64.26.115])
 by smtp.oxit.fi (Postfix) with ESMTPSA id D0E4F6C053F;
 Sun,  3 Feb 2013 19:03:56 +0200 (EET)
Message-ID: <510E987C.4090509@oxit.fi>
Date: Sun, 03 Feb 2013 19:03:56 +0200
From: "Jukka A. Ukkonen" <jau@oxit.fi>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/20130110 Thunderbird/17.0.2
MIME-Version: 1.0
To: freebsd-scsi@FreeBSD.org
Subject: Re: Multiple FreeBSD SCSI Hosts
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Joerg Wunsch <j@uriah.heep.sax.de>
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Feb 2013 17:12:36 -0000


Hello all,

I have been browsing through these old messages about the
SCSI RESERVE/RELEASE operations in FreeBSD.

What nobody seems to have quite realized at the time is that
a multiply attached SCSI device can be used as the trusted
3rd party for a "cluster group" or for anything which should
be active only at one node at any particular time.
E.g. Solaris cluster groups do exactly that. If one node gets
the reservation through, no other node will until the first
successful one either releases the device or crashes.
The reserved LUN can be either an otherwise unused small
storage unit or one that is going to be anyhow mounted and
unmounted as the cluster group dictates.
The same method would work also for selecting the leader
among Shared QFS metadata servers and for other similar
purposes.

The reservation should definitely not be bundled only inside
the mount operations. Instead it should be possible to trigger
reserve and release through ioctl() or through a separate
system call. This is because sometimes the feature might be
used for unmounted raw devices or for devices which could
be logically mounted on multiple systems while still being busy
for all systems but one.
E.g. something in the style of Shared QFS could use a SCSI
reservation on its metadata volumes, which need not be visible
to the users as separate file systems at all.

Anyhow, I could see considerably more use for this particular
SCSI feature than just locking a mounted tape drive, were
it implemented for devices other than sa and somehow
exported to user space. Obviously it should be available
to root only, though.
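
For illustration, the commands involved are small. The following is a minimal sketch in Python, assuming the classic 6-byte RESERVE (opcode 0x16) and RELEASE (opcode 0x17) commands and the RESERVATION CONFLICT status byte (0x18). The helper names are made up for this example, and actually sending the CDB to a device (through a pass-through device or a new ioctl) is deliberately left out.

```python
# SCSI-2 style reservation commands (6-byte CDBs).
RESERVE_6 = 0x16
RELEASE_6 = 0x17

# SCSI status bytes relevant to a reservation attempt.
STATUS_GOOD = 0x00
STATUS_RESERVATION_CONFLICT = 0x18


def build_cdb6(opcode: int) -> bytes:
    """Return a 6-byte CDB with every field after the opcode left at zero."""
    return bytes([opcode, 0, 0, 0, 0, 0])


def won_reservation(scsi_status: int) -> bool:
    """Hypothetical helper: True when RESERVE completed with GOOD status."""
    return scsi_status == STATUS_GOOD


reserve_cdb = build_cdb6(RESERVE_6)
release_cdb = build_cdb6(RELEASE_6)
```

A node that gets GOOD status back from RESERVE would become the active one; every other node would see RESERVATION CONFLICT for as long as the reservation is held.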

Cheers,
--jau


From owner-freebsd-scsi@FreeBSD.ORG  Sun Feb  3 17:35:31 2013
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 57A1C338
 for <freebsd-scsi@freebsd.org>; Sun,  3 Feb 2013 17:35:31 +0000 (UTC)
 (envelope-from mjacob@freebsd.org)
Received: from ns1.feral.com (ns1.feral.com [192.67.166.1])
 by mx1.freebsd.org (Postfix) with ESMTP id 3DB2925B
 for <freebsd-scsi@freebsd.org>; Sun,  3 Feb 2013 17:35:29 +0000 (UTC)
Received: from [192.168.135.7] (quaver.net [76.14.49.207])
 (authenticated bits=0)
 by ns1.feral.com (8.14.5/8.14.4) with ESMTP id r13HZI1A032716
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
 Sun, 3 Feb 2013 09:35:23 -0800 (PST)
 (envelope-from mjacob@freebsd.org)
Message-ID: <510E9FD1.5070907@freebsd.org>
Date: Sun, 03 Feb 2013 09:35:13 -0800
From: Matthew Jacob <mjacob@freebsd.org>
Organization: FreeBSD
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130107 Thunderbird/17.0.2
MIME-Version: 1.0
To: freebsd-scsi@freebsd.org, j@uriah.heep.sax.de
Subject: Re: Multiple FreeBSD SCSI Hosts
References: <510E987C.4090509@oxit.fi>
In-Reply-To: <510E987C.4090509@oxit.fi>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (ns1.feral.com [192.67.166.1]); Sun, 03 Feb 2013 09:35:23 -0800 (PST)
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: mjacob@freebsd.org
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 03 Feb 2013 17:35:31 -0000

On 2/3/2013 9:03 AM, Jukka A. Ukkonen wrote:
>
> a multiply attached SCSI device can be used as the trusted 
For SANs or iSCSI this can make some sense, but only if you really
really trust the release mechanism (which I don't in any heterogeneous
environment).

The other question to raise is how do you sensibly represent the disks
to the non-winner node and field failed attempts to operate on the
shared disk? In other words, how do you percolate RESERVATION CONFLICT
errors up to the application level?
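
To make the question concrete, one possible approach is to translate the SCSI status of a failed command into an errno value the application can test for. The sketch below is in Python and purely illustrative: the function name is hypothetical, and the choice of EBUSY for RESERVATION CONFLICT is an assumption made for this example, not a description of what CAM does.

```python
import errno

STATUS_GOOD = 0x00
STATUS_RESERVATION_CONFLICT = 0x18


def status_to_errno(scsi_status: int) -> int:
    """Hypothetical mapping from a SCSI status byte to an errno value."""
    if scsi_status == STATUS_GOOD:
        return 0
    if scsi_status == STATUS_RESERVATION_CONFLICT:
        # Assumption: report "held by another node" as EBUSY.
        return errno.EBUSY
    # Everything else is lumped together as a generic I/O error here.
    return errno.EIO
```

With a distinct error code, an application on the non-winner node could tell "the other node owns this disk" apart from a real media or transport failure.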


From owner-freebsd-scsi@FreeBSD.ORG  Mon Feb  4 11:06:51 2013
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@FreeBSD.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 509219A8
 for <freebsd-scsi@FreeBSD.org>; Mon,  4 Feb 2013 11:06:51 +0000 (UTC)
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 42B28D1C
 for <freebsd-scsi@FreeBSD.org>; Mon,  4 Feb 2013 11:06:51 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.6/8.14.6) with ESMTP id r14B6pUV028898
 for <freebsd-scsi@FreeBSD.org>; Mon, 4 Feb 2013 11:06:51 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.6/8.14.6/Submit) id r14B6of6028896
 for freebsd-scsi@FreeBSD.org; Mon, 4 Feb 2013 11:06:50 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 4 Feb 2013 11:06:50 GMT
Message-Id: <201302041106.r14B6of6028896@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@freebsd.org>
To: freebsd-scsi@FreeBSD.org
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Feb 2013 11:06:51 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/171650  scsi       [da] da(4) driver does not recognize end of cciss (Sma
o kern/169403  scsi       [cam] [patch] CAM layer, I/O starvation, no fairness
o kern/165982  scsi       [mpt] mpt instability, drive resets, and losses on Fre
o kern/165740  scsi       [cam] SCSI code must drain callbacks before free
o kern/163713  scsi       [aic7xxx] [patch] Add Adaptec29329LPE to aic79xx_pci.c
o kern/162256  scsi       [mpt] QUEUE FULL EVENT and 'mpt_cam_event: 0x0'
o kern/161809  scsi       [cam] [patch] set kern.cam.boot_delay via build option
o kern/157770  scsi       [iscsi] [panic] iscsi_initiator panic
o kern/154432  scsi       [xpt] run_interrupt_driven_hooks: still waiting after 
o kern/153514  scsi       [cam] [panic] CAM related panic
o docs/151336  scsi       Missing documentation of scsi_ and ata_ functions in c
s kern/149927  scsi       [cam] hard drive not stopped before removing power dur
o kern/148083  scsi       [aac] Strange device reporting
o kern/147704  scsi       [mpt] sys/dev/mpt: new chip revision, partially unsupp
o kern/145768  scsi       [mpt] can't perform I/O on SAS based SAN disk in freeb
o kern/144648  scsi       [aac] Strange values of speed and bus width in dmesg
o kern/142351  scsi       [mpt] LSILogic driver performance problems
o kern/134488  scsi       [mpt] MPT SCSI driver probes max. 8 LUNs per device
o kern/132206  scsi       [mpt] system panics on boot when mirroring and 2nd dri
o kern/130621  scsi       [mpt] tranfer rate is inscrutable slow when use lsi213
o kern/129602  scsi       [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452  scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi       [isp] isp(4) target driver crashes kernel when set up 
o kern/127717  scsi       [ata] [patch] [request] - support write cache toggling
o kern/123674  scsi       [ahc] ahc driver dumping
o kern/123520  scsi       [ahd] unable to boot from net while using ahd
o sparc/121676 scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi       [sg] scsi_sg incompatible with scanners
o kern/120247  scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s 
o kern/114597  scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/92798   scsi       [ahc] SCSI problem with timeouts
o kern/90282   scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi       wire down of scsi devices conflicts with config
s kern/57398   scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi       [panic] SCSI U320 on SMP server won't run faster than 
o kern/44587   scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/39388   scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/35234   scsi       World access to /dev/pass? (for scanner) requires acce

45 problems total.


From owner-freebsd-scsi@FreeBSD.ORG  Tue Feb  5 02:36:09 2013
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 821)
 id A853C2E1; Tue,  5 Feb 2013 02:36:09 +0000 (UTC)
Date: Tue, 5 Feb 2013 02:36:09 +0000
From: John <jwd@FreeBSD.org>
To: freebsd-scsi@freebsd.org
Subject: Increase mps sequential read performance with ZFS/zvol
Message-ID: <20130205023609.GA99100@FreeBSD.org>
References: <510E987C.4090509@oxit.fi>
 <510E9FD1.5070907@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <510E9FD1.5070907@freebsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Feb 2013 02:36:09 -0000

Hi Folks,

   I'm in the process of putting together another ZFS server and
after running some sequential read performance tests I'm thinking
things could be better. It's running 9.1-stable from late January:

FreeBSD vprzfs30p.unx.sas.com 9.1-STABLE FreeBSD 9.1-STABLE #1 r246079M

   I have two HP D2700 shelves populated with 600GB drives connected
to a pair of LSI 9207-8e HBA cards installed in a Dell R620 with 128GB
of RAM, with the OS installed on an internal RAID volume. The shelves are
dual channel, each LSI card with a channel through both shelves.

   Gmultipath is used to bind the disks such that each disk can be
addressed by either controller and the I/O balanced.

   The zfs pool consists of 24 mirrors, each pair one from each shelf.
The multipaths are rotated such that I/O is balanced between shelves
and controllers.

   For testing, two 300GB zvols are created, each almost full:

NAME               USED  AVAIL  REFER  MOUNTPOINT
pool0             1.46T  11.4T    31K  /pool0
pool0/lun000004    301G  11.4T   261G  -
pool0/lun000005    301G  11.4T   300G  -

  Running a simple dd test:

# dd if=/dev/zvol/pool0/lun000005 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 278.554656 secs (1156406975 bytes/sec)

  The drives are spread and balanced across four 6Gb/s channels, so 1.1GB/s
seems a bit slow. Note that changing the bs= option makes no real difference.
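
As a rough cross-check of that impression, the dd figures can be compared with a link ceiling. This Python sketch assumes each of the four channels is a single 6 Gb/s SAS lane with 8b/10b encoding; if each channel is really a x4 wide port, the ceiling is four times higher and the gap is correspondingly larger.

```python
# Figures copied from the dd output above.
RECORDS = 614400
BLOCK_SIZE = 512 * 1024          # bs=512k
SECONDS = 278.554656

total_bytes = RECORDS * BLOCK_SIZE
rate = total_bytes / SECONDS     # bytes per second

# Assumption: four single 6 Gb/s SAS lanes with 8b/10b encoding,
# i.e. 8 payload bits for every 10 bits on the wire.
LINKS = 4
payload_per_link = 6e9 * 8 / 10 / 8   # bytes per second per lane
ceiling = LINKS * payload_per_link

print(f"dd rate : {rate / 1e9:.2f} GB/s")
print(f"ceiling : {ceiling / 1e9:.2f} GB/s")
print(f"used    : {rate / ceiling:.0%}")
```

Under these assumptions a single dd uses a little under half of the payload bandwidth of the links.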

   Now, if I run 2 'dd' operations against different zvols in parallel:

# dd if=/dev/zvol/pool0/lun000005 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 278.605380 secs (1156196435 bytes/sec)

# dd if=/dev/zvol/pool0/lun000004 of=/dev/null bs=512k
614400+0 records in
614400+0 records out
322122547200 bytes transferred in 282.065008 secs (1142015274 bytes/sec)

  This tells me the I/O subsystem has plenty of headroom available,
such that the first 'dd' operation could run faster.

  I've included some basic config information below. No kmem values in
/boot/loader.conf.  I did play around with block_cap but it made no
difference.  It seems like something is holding the system back.

  Thanks for any ideas.

-John

Output from top during a single dd run:

    5 root         11  -8    -     0K   208K zvol:i  1   5:11 41.65% zfskern
    0 root        350  -8    0     0K  5600K -       5   3:59 15.23% kernel
 1784 root          1  26    0  9944K  2072K CPU1    1   0:31 13.87% dd

The zvol:io state appears to be a simple wait loop, waiting
for outstanding I/O requests to complete. How can I get more I/O
requests going?

Sample of the highest number of I/O requests per controller:

dev.mps.0.io_cmds_highwater: 207
dev.mps.1.io_cmds_highwater: 126


   IOCFACTS (identical):

mps0: <LSI SAS2308> port 0xec00-0xecff mem 0xdaff0000-0xdaffffff,0xdaf80000-0xdafbffff irq 48 at device 0.0 on pci5
mps0: Doorbell= 0x22000000
mps0: mps_wait_db_ack: successfull count(2), timeout(5)
mps0: Doorbell= 0x12000000
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: mps_wait_db_ack: successfull count(1), timeout(5)
mps0: IOCFacts  :
        MsgVersion: 0x200
        HeaderVersion: 0x1b00
        IOCNumber: 0
        IOCExceptions: 0x0
        MaxChainDepth: 128
        WhoInit: ROM BIOS
        NumberOfPorts: 1
        RequestCredit: 10240
        ProductID: 0x2214
        IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
        FWVersion= 15-0-0-0
        IOCRequestFrameSize: 32
        MaxInitiators: 32
        MaxTargets: 1024
        MaxSasExpanders: 64
        MaxEnclosures: 65
        ProtocolFlags: 3<ScsiTarg,ScsiInit>
        HighPriorityCredit: 128
        MaxReplyDescriptorPostQueueDepth: 65504
        ReplyFrameSize: 32
        MaxVolumes: 0
        MaxDevHandle: 1128
        MaxPersistentEntries: 128
mps0: Firmware: 15.00.00.00, Driver: 14.00.00.01-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>


And some output from 'gstat -f Z -I 300ms'

dT: 0.302s  w: 0.300s  filter: Z
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    202    202  25450    2.6      0      0    0.0   25.5| multipath/Z0
    1    202    202  25046    6.2      0      0    0.0   36.6| multipath/Z2
    7    185    185  23735    6.3      0      0    0.0   33.1| multipath/Z4
    0    212    212  27125    5.4      0      0    0.0   30.4| multipath/Z6
    0    169    169  21616    5.0      0      0    0.0   28.1| multipath/Z8
    0    162    162  20768    5.0      0      0    0.0   25.7| multipath/Z10
    0    175    175  22463    6.0      0      0    0.0   30.4| multipath/Z12
    0    192    192  24582    4.4      0      0    0.0   32.1| multipath/Z14
    2    169    169  21616    3.3      0      0    0.0   18.8| multipath/Z16
    4    169    169  20808    4.1      0      0    0.0   23.0| multipath/Z18
    2    195    195  24602    4.5      0      0    0.0   28.5| multipath/Z20
    5    172    172  22039    4.4      0      0    0.0   22.7| multipath/Z22
    0    166    166  21192    3.7      0      0    0.0   20.2| multipath/Z24
    7    179    179  22887    5.4      0      0    0.0   27.8| multipath/Z26
    7    172    172  22039    3.5      0      0    0.0   23.1| multipath/Z28
    0    192    192  24582    3.8      0      0    0.0   25.5| multipath/Z30
    1    175    175  22463    6.0      0      0    0.0   30.5| multipath/Z32
    1    182    182  22907    3.9      0      0    0.0   25.6| multipath/Z34
    0    212    212  27125    6.3      0      0    0.0   32.7| multipath/Z36
    0    179    179  22483    4.8      0      0    0.0   27.5| multipath/Z38
    2    185    185  23735    4.6      0      0    0.0   30.0| multipath/Z40
    0    179    179  22887    4.5      0      0    0.0   28.2| multipath/Z42
    3    195    195  25006    4.4      0      0    0.0   32.3| multipath/Z44
    3    192    192  24582    4.0      0      0    0.0   30.5| multipath/Z46
    0      0      0      0    0.0      0      0    0.0    0.0| multipath/Z48
    0    179    179  22887    4.7      0      0    0.0   31.0| multipath/Z1
    0    185    185  23331    4.1      0      0    0.0   24.8| multipath/Z3
    0    175    175  21639    5.3      0      0    0.0   28.2| multipath/Z5
    4    162    162  20768    5.1      0      0    0.0   26.6| multipath/Z7
    0    195    195  25006    3.5      0      0    0.0   23.4| multipath/Z9
    3    179    179  22887    5.0      0      0    0.0   25.7| multipath/Z11
    4    159    159  20344    4.9      0      0    0.0   23.7| multipath/Z13
    4    166    166  21192    4.3      0      0    0.0   25.1| multipath/Z15
    0    169    169  21616    3.9      0      0    0.0   24.7| multipath/Z17
    7    189    189  23334    4.2      0      0    0.0   25.7| multipath/Z19
    4    169    169  21212    4.3      0      0    0.0   28.1| multipath/Z21
    0    159    159  20344    5.3      0      0    0.0   25.8| multipath/Z23
    5    185    185  23316    4.1      0      0    0.0   26.0| multipath/Z25
    0    192    192  24582    4.9      0      0    0.0   30.6| multipath/Z27
    0    172    172  22039    5.5      0      0    0.0   27.4| multipath/Z29
    4    166    166  21192    4.2      0      0    0.0   23.7| multipath/Z31
    0    169    169  20778    3.5      0      0    0.0   22.2| multipath/Z33
    2    172    172  21232    5.1      0      0    0.0   29.4| multipath/Z35
    3    169    169  21616    2.9      0      0    0.0   20.1| multipath/Z37
    0    179    179  22887    5.2      0      0    0.0   32.0| multipath/Z39
    0    212    212  26721    5.4      0      0    0.0   31.7| multipath/Z41
    2    175    175  22463    4.4      0      0    0.0   28.0| multipath/Z43
    0    179    179  22887    3.6      0      0    0.0   18.2| multipath/Z45
    0    179    179  22887    4.3      0      0    0.0   28.3| multipath/Z47
    0      0      0      0    0.0      0      0    0.0    0.0| multipath/Z49
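
The gstat columns allow a rough estimate of how many reads each disk has in flight on average, using Little's law (outstanding requests = request rate x latency). The Python sketch below uses only the first four rows of the table above as sample data; the function names are made up for this example.

```python
# (name, reads per second, kB per second, ms per read) copied from the
# first four gstat rows above.
ROWS = [
    ("multipath/Z0", 202, 25450, 2.6),
    ("multipath/Z2", 202, 25046, 6.2),
    ("multipath/Z4", 185, 23735, 6.3),
    ("multipath/Z6", 212, 27125, 5.4),
]


def in_flight(reads_per_sec: float, ms_per_read: float) -> float:
    """Little's law: mean outstanding requests = arrival rate * latency."""
    return reads_per_sec * ms_per_read / 1000.0


def request_kb(reads_per_sec: float, kb_per_sec: float) -> float:
    """Mean size of one read in kB."""
    return kb_per_sec / reads_per_sec


for name, rps, kbps, ms in ROWS:
    print(f"{name}: {in_flight(rps, ms):.2f} in flight, "
          f"{request_kb(rps, kbps):.0f} kB per read")
```

For these rows the estimate stays below two outstanding reads per disk, each read being roughly 125 kB, which is far from the 255 tags each disk advertises.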

Each individual disk on the system shows the capability of 255 tags:

# camcontrol tags da0 -v
(pass2:mps0:0:10:0): dev_openings  255
(pass2:mps0:0:10:0): dev_active    0
(pass2:mps0:0:10:0): devq_openings 255
(pass2:mps0:0:10:0): devq_queued   0
(pass2:mps0:0:10:0): held          0
(pass2:mps0:0:10:0): mintags       2
(pass2:mps0:0:10:0): maxtags       255


zpool:

# zpool status
  pool: pool0
 state: ONLINE
  scan: none requested
config:

	NAME               STATE     READ WRITE CKSUM
	pool0              ONLINE       0     0     0
	  mirror-0         ONLINE       0     0     0
	    multipath/Z0   ONLINE       0     0     0
	    multipath/Z1   ONLINE       0     0     0
	  mirror-1         ONLINE       0     0     0
	    multipath/Z2   ONLINE       0     0     0
	    multipath/Z3   ONLINE       0     0     0
	  mirror-2         ONLINE       0     0     0
	    multipath/Z4   ONLINE       0     0     0
	    multipath/Z5   ONLINE       0     0     0
	  mirror-3         ONLINE       0     0     0
	    multipath/Z6   ONLINE       0     0     0
	    multipath/Z7   ONLINE       0     0     0
	  mirror-4         ONLINE       0     0     0
	    multipath/Z8   ONLINE       0     0     0
	    multipath/Z9   ONLINE       0     0     0
	  mirror-5         ONLINE       0     0     0
	    multipath/Z10  ONLINE       0     0     0
	    multipath/Z11  ONLINE       0     0     0
...
	  mirror-21        ONLINE       0     0     0
	    multipath/Z42  ONLINE       0     0     0
	    multipath/Z43  ONLINE       0     0     0
	  mirror-22        ONLINE       0     0     0
	    multipath/Z44  ONLINE       0     0     0
	    multipath/Z45  ONLINE       0     0     0
	  mirror-23        ONLINE       0     0     0
	    multipath/Z46  ONLINE       0     0     0
	    multipath/Z47  ONLINE       0     0     0
	spares
	  multipath/Z48    AVAIL   
	  multipath/Z49    AVAIL   

errors: No known data errors


From owner-freebsd-scsi@FreeBSD.ORG  Tue Feb  5 21:18:39 2013
Return-Path: <owner-freebsd-scsi@FreeBSD.ORG>
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 58B01DAE;
 Tue,  5 Feb 2013 21:18:39 +0000 (UTC) (envelope-from ken@kdm.org)
Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81])
 by mx1.freebsd.org (Postfix) with ESMTP id 161578BB;
 Tue,  5 Feb 2013 21:18:38 +0000 (UTC)
Received: from nargothrond.kdm.org (localhost [127.0.0.1])
 by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id r15LGhCO075656;
 Tue, 5 Feb 2013 14:16:43 -0700 (MST)
 (envelope-from ken@nargothrond.kdm.org)
Received: (from ken@localhost)
 by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id r15LGgLf075655;
 Tue, 5 Feb 2013 14:16:42 -0700 (MST) (envelope-from ken)
Date: Tue, 5 Feb 2013 14:16:42 -0700
From: "Kenneth D. Merry" <ken@freebsd.org>
To: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
Subject: Re: Max Queue depth of HBA limited to 256 ?
Message-ID: <20130205211642.GA75343@nargothrond.kdm.org>
References: <B2FD678A64EAAD45B089B123FDFC3ED75E68FD6158@inbmail01.lsi.com>
 <20130121170529.GA64188@nargothrond.kdm.org>
 <B2FD678A64EAAD45B089B123FDFC3ED75E68FD616E@inbmail01.lsi.com>
 <B2FD678A64EAAD45B089B123FDFC3ED75E68FD6296@inbmail01.lsi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <B2FD678A64EAAD45B089B123FDFC3ED75E68FD6296@inbmail01.lsi.com>
User-Agent: Mutt/1.4.2i
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>, "McConnell,
 Stephen" <Stephen.McConnell@lsi.com>, "jhb@freebsd.org" <jhb@freebsd.org>
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-scsi>
List-Post: <mailto:freebsd-scsi@freebsd.org>
List-Help: <mailto:freebsd-scsi-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-scsi>,
 <mailto:freebsd-scsi-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Feb 2013 21:18:39 -0000


I'm able to get more than 255 commands outstanding to the controller in my
configuration.

For example:

dev.mps.0.%desc: LSI SAS2116
dev.mps.0.%driver: mps
dev.mps.0.%location: slot=6 function=0 handle=\_SB_.PCI0.S30_
dev.mps.0.%pnpinfo: vendor=0x1000 device=0x0064 subvendor=0x1000
subdevice=0x30c0 class=0x010700
dev.mps.0.%parent: pci0
dev.mps.0.debug_level: 4
dev.mps.0.disable_msix: 0
dev.mps.0.disable_msi: 0
dev.mps.0.firmware_version: 13.00.01.00
dev.mps.0.driver_version: 14.00.00.01-fbsd
dev.mps.0.io_cmds_active: 442
dev.mps.0.io_cmds_highwater: 464
dev.mps.0.chain_free: 354
dev.mps.0.chain_free_lowwater: 181
dev.mps.0.max_chains: 2048
dev.mps.0.chain_alloc_fail: 0

io_cmds_highwater is 464.  Can you get more than 255 commands outstanding
if you use more than 1 target?

This is with 272 'dd' processes doing 1MB reads to 16 2TB and 3TB SAS
drives behind 2 3Gb Maxim expanders:

<SEAGATE ST32000444SS 0006>        at scbus2 target 144 lun 0 (pass4,sg4,da0)
<SEAGATE ST32000444SS 0006>        at scbus2 target 145 lun 0 (pass5,sg5,da1)
<SEAGATE ST33000650SS 0003>        at scbus2 target 146 lun 0 (pass6,sg6,da2)
<SEAGATE ST33000650SS 0003>        at scbus2 target 147 lun 0 (pass7,sg7,da3)
<SEAGATE ST32000444SS 0006>        at scbus2 target 148 lun 0 (pass8,sg8,da4)
<SEAGATE ST32000444SS 0006>        at scbus2 target 149 lun 0 (pass9,sg9,da5)
<SEAGATE ST33000650SS 0003>        at scbus2 target 150 lun 0 (pass10,sg10,da6)
<SEAGATE ST33000650SS 0003>        at scbus2 target 151 lun 0 (pass11,sg11,da7)
<SEAGATE ST32000444SS 0006>        at scbus2 target 152 lun 0 (pass12,sg12,da8)
<SEAGATE ST32000444SS 0006>        at scbus2 target 153 lun 0 (pass13,sg13,da9)
<SEAGATE ST32000444SS 0006>        at scbus2 target 154 lun 0 (pass14,sg14,da10)
<SEAGATE ST32000444SS 0006>        at scbus2 target 155 lun 0 (pass15,sg15,da11)
<SEAGATE ST33000650SS 0003>        at scbus2 target 156 lun 0 (pass16,sg16,da12)
<SEAGATE ST33000650SS 0003>        at scbus2 target 157 lun 0 (pass17,sg17,da13)
<SEAGATE ST33000650SS 0003>        at scbus2 target 158 lun 0 (pass18,sg18,da14)
<SEAGATE ST33000650SS 0003>        at scbus2 target 159 lun 0 (pass19,sg19,da15)

i.e. 17 iterations of this:

((i=0)); while [ $i -le 15 ]; do dd if=/dev/da$i of=/dev/null bs=1m & ((i++)); done
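
The count of 272 processes follows directly from running that loop 17 times over 16 drives. A small Python sketch that only enumerates the commands (it does not run them; the function name is made up for this example):

```python
DRIVES = 16        # da0 .. da15, as in the loop above
ITERATIONS = 17    # the loop is started 17 times


def dd_commands(drives: int, iterations: int) -> list[str]:
    """List every dd invocation the repeated loop would start."""
    return [
        f"dd if=/dev/da{i} of=/dev/null bs=1m"
        for _ in range(iterations)
        for i in range(drives)
    ]


commands = dd_commands(DRIVES, ITERATIONS)
readers_per_drive = len(commands) // DRIVES
```

Each drive therefore has 17 concurrent sequential readers.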

The individual drives see varying numbers of tags, but nowhere near the
maximum:

[root@storage-domain ~]# camcontrol tags da15 -v
(pass19:mps0:0:159:0): dev_openings  230
(pass19:mps0:0:159:0): dev_active    25
(pass19:mps0:0:159:0): devq_openings 230
(pass19:mps0:0:159:0): devq_queued   0
(pass19:mps0:0:159:0): held          0
(pass19:mps0:0:159:0): mintags       2
(pass19:mps0:0:159:0): maxtags       255
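
When comparing many drives it can help to read these fields mechanically. The Python sketch below parses the 'camcontrol tags -v' output shown above and checks that dev_openings plus dev_active adds up to maxtags, which is the case in the samples in this thread; the parser is a hypothetical helper, not part of camcontrol.

```python
import re

SAMPLE = """\
(pass19:mps0:0:159:0): dev_openings  230
(pass19:mps0:0:159:0): dev_active    25
(pass19:mps0:0:159:0): devq_openings 230
(pass19:mps0:0:159:0): devq_queued   0
(pass19:mps0:0:159:0): held          0
(pass19:mps0:0:159:0): mintags       2
(pass19:mps0:0:159:0): maxtags       255
"""


def parse_tags(text: str) -> dict[str, int]:
    """Turn 'camcontrol tags -v' lines into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\(([^)]*)\):\s+(\w+)\s+(\d+)\s*$", line)
        if m:
            fields[m.group(2)] = int(m.group(3))
    return fields


tags = parse_tags(SAMPLE)
in_use = tags["dev_active"]
free = tags["dev_openings"]
print(f"{in_use} of {tags['maxtags']} tags in use, {free} free")
```

The same relation holds for the mrsas output quoted further down: 832 + 192 and 881 + 143 both give 1024, and the two dev_active values there add up to 335.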

What kind of drive is the target?

Ken

On Wed, Jan 23, 2013 at 00:44:31 +0530, Desai, Kashyap wrote:
> LSI h/w needs more outstanding commands in FW to get better perf counts compared to other OSes.
> 
> Please suggest whether what I have observed is a limitation of FreeBSD or whether we can tune it in the driver.
> My goal is to pump ~1000 outstanding IOs to the HBA. I see that it never goes beyond 255.
> 
> Thanks,
> Kashyap
> 
> > -----Original Message-----
> > From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-
> > scsi@freebsd.org] On Behalf Of Desai, Kashyap
> > Sent: Monday, January 21, 2013 11:18 PM
> > To: Kenneth D. Merry
> > Cc: freebsd-scsi@freebsd.org; jhb@freebsd.org; McConnell, Stephen
> > Subject: RE: Max Queue depth of HBA limited to 256 ?
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Kenneth D. Merry [mailto:ken@freebsd.org]
> > > Sent: Monday, January 21, 2013 10:35 PM
> > > To: Desai, Kashyap
> > > Cc: freebsd-scsi@freebsd.org; McConnell, Stephen; Saxena, Sumit;
> > > jhb@freebsd.org
> > > Subject: Re: Max Queue depth of HBA limited to 256 ?
> > >
> > > On Mon, Jan 21, 2013 at 20:15:47 +0530, Desai, Kashyap wrote:
> > > > Hi,
> > > >
> > > > I was trying to check a few things on an LSI controller, where we
> > > > have support for a queue depth of more than 256.
> > > > I added default maxtags in scsi/scsi_xpt.c as below (because I don't
> > > > want maxtags to restrict any outstanding commands to the LSI HBA).
> > > >
> > > >     {
> > > >         /* Default tagged queuing parameters for all devices */
> > > >         {
> > > >           T_ANY, SIP_MEDIA_REMOVABLE|SIP_MEDIA_FIXED,
> > > >           /*vendor*/"*", /*product*/"*", /*revision*/"*"
> > > >         },
> > > >         /*quirks*/0, /*mintags*/2, /*maxtags*/1024      <--- Default
> > > > maxtags were 256. I increased it to 1024.
> > > >     },
> > > >
> > > >
> > > > LSI's SAS-HBA and MR-HBA can support more than 256 outstanding
> > > > commands in firmware, but for some reason I am not able to pump
> > > > more than 256 outstanding commands to the HBA.
> > > >
> > > > I used "rawio -p 256 /dev/da1" and more /dev/daX devices in a loop.
> > > > I have a sysctl parameter in the driver to display outstanding "FW
> > > > commands". The max value for FW outstanding only goes up to 256.
> > > >
> > > > Also, in another mail thread with the subject "mfi driver
> > > > performance", I found that folks talk about tuning queue depth,
> > > > _but_ nobody discussed increasing it beyond 256. Is there any
> > > > limitation in FreeBSD?
> > > >
> > >
> > > As Jim pointed out, one thing to check is the values passed into
> > > cam_sim_alloc().  In the case of the mps(4) driver, the calculation is
> > > in mps_attach():
> > >
> > > sc->num_reqs = MIN(MPS_REQ_FRAMES, sc->facts->RequestCredit);
> > >
> > > What is reported for the RequestCredit on this particular adapter?
> > >
> > > The other question is, what does 'camcontrol tags daX -v' show when
> > > you are running the test?
> > 
> > Below is output of camcontrol tags da1 -v.
> > 
> > dhcp-135-24-192-127# camcontrol tags da13 -v
> > (pass13:mrsas0:0:13:0): dev_openings  1024
> > (pass13:mrsas0:0:13:0): dev_active    0
> > (pass13:mrsas0:0:13:0): devq_openings 1024
> > (pass13:mrsas0:0:13:0): devq_queued   0
> > (pass13:mrsas0:0:13:0): held          0
> > (pass13:mrsas0:0:13:0): mintags       2
> > (pass13:mrsas0:0:13:0): maxtags       1024
> > dhcp-135-24-192-127# camcontrol tags da1 -v
> > (pass1:mrsas0:0:1:0): dev_openings  1024
> > (pass1:mrsas0:0:1:0): dev_active    0
> > (pass1:mrsas0:0:1:0): devq_openings 1024
> > (pass1:mrsas0:0:1:0): devq_queued   0
> > (pass1:mrsas0:0:1:0): held          0
> > (pass1:mrsas0:0:1:0): mintags       2
> > (pass1:mrsas0:0:1:0): maxtags       1024
> > 
> > Value 1024 is hard coded for my testing. In the MegaRAID controller and
> > SAS-HBA, the driver reads the max commands value from FW, similar to
> > "RequestCredit". Different FW has different values, but they are always
> > above 255.
> > 
> > 
> > When I run IOs, dev_active stays in the range 0-255 only. See the
> > output below from when I run IOs on /dev/da1 and /dev/da13. I expect
> > the total dev_openings should go beyond 255, which is not happening.
> > 
> > 
> > dhcp-135-24-192-127# camcontrol tags da1 -v
> > (pass1:mrsas0:0:1:0): dev_openings  832
> > (pass1:mrsas0:0:1:0): dev_active    192
> > (pass1:mrsas0:0:1:0): devq_openings 832
> > (pass1:mrsas0:0:1:0): devq_queued   0
> > (pass1:mrsas0:0:1:0): held          0
> > (pass1:mrsas0:0:1:0): mintags       2
> > (pass1:mrsas0:0:1:0): maxtags       1024
> > dhcp-135-24-192-127# camcontrol tags da13 -v
> > (pass13:mrsas0:0:13:0): dev_openings  881
> > (pass13:mrsas0:0:13:0): dev_active    143
> > (pass13:mrsas0:0:13:0): devq_openings 881
> > (pass13:mrsas0:0:13:0): devq_queued   0
> > (pass13:mrsas0:0:13:0): held          0
> > (pass13:mrsas0:0:13:0): mintags       2
> > (pass13:mrsas0:0:13:0): maxtags       1024
> > 
> > 
> > 
> > 
> > Jim:
> > Below is my API call. I have hard coded "queue_depth" = 1024.
> > 
> >     sc->sim_0 = cam_sim_alloc(mrsas_action, mrsas_poll, "mrsas", sc,
> >         device_get_unit(sc->mrsas_dev), &sc->sim_lock, queue_depth,
> >         queue_depth, devq);
> > 
> > ~ Kashyap
> > 
> > >
> > > Ken
> > > --
> > > Kenneth Merry
> > > ken@FreeBSD.ORG

-- 
Kenneth Merry
ken@FreeBSD.ORG