From owner-freebsd-stable@freebsd.org  Sun Mar  6 21:27:37 2016
Date: Mon, 7 Mar 2016 00:27:33 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Scott Long
Cc: freebsd-stable@freebsd.org
Subject: Re: kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
Message-ID: <20160306212733.GJ11654@zxy.spb.ru>
References: <20160306194555.GC94639@zxy.spb.ru> <0F0C78F4-6FE2-43BA-B503-AA04A79F2E70@samsco.org>
In-Reply-To: <0F0C78F4-6FE2-43BA-B503-AA04A79F2E70@samsco.org>

On Sun, Mar 06, 2016 at 01:10:42PM -0800, Scott Long wrote:

> Hi,
>
> The message is harmless; it's a reminder that you should tune the kernel for your workload. When the message is triggered, it means that a command was deferred, likely for only a few microseconds, and then everything moved on as normal.
>
> A command uses anywhere from zero to a few dozen chain frames per I/O, depending on the size of the I/O. The chain frame memory is allocated at boot so that it's always available, not allocated on the fly. When I wrote this driver, I felt it would be wasteful to reserve memory for the worst-case scenario of all large I/Os by default, so I put in this deferral system with a console reminder to prompt tuning.
>
> Yes, you actually do have 900 I/Os outstanding. The controller buffers the I/O requests and allows the system to queue up much more than SATA disks could accept on their own. It's debatable whether this is good or bad, but it's tunable as well.
>
> Anyway, the messages should not cause alarm. Either tune up the chain frame count, or tune down the max I/O count.

I don't know whether it is related, but I see a dramatic performance drop at the time these messages appear. How can I calculate the number of buffers I need? I have very heavy I/O. Are the chain frames allocated once for all controllers, or separately for each controller?
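For reference, if I read mps(4) correctly, the tunable named in the message is set from /boot/loader.conf; raising it would look something like this (4096 here is only an illustrative doubling of the 2048 default, not a recommendation):

# /boot/loader.conf -- illustrative value only: double the default of 2048 chain frames
hw.mps.max_chains="4096"

After a reboot, dev.mps.N.chain_free at idle should report roughly the new total.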
> On Mar 6, 2016, at 11:45 AM, Slawa Olhovchenkov wrote:
> >
> > I am using 10-STABLE r295539 and LSI SAS2008.
> >
> > mps0: port 0x8000-0x80ff mem 0xdfc00000-0xdfc03fff,0xdfb80000-0xdfbbffff irq 32 at device 0.0 on pci2
> > mps0: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps0: IOCCapabilities: 185c
> > mps1: port 0x7000-0x70ff mem 0xdf600000-0xdf603fff,0xdf580000-0xdf5bffff irq 34 at device 0.0 on pci3
> > mps1: Firmware: 17.00.01.00, Driver: 20.00.00.00-fbsd
> > mps1: IOCCapabilities: 185c
> > mps2: port 0xf000-0xf0ff mem 0xfba00000-0xfba03fff,0xfb980000-0xfb9bffff irq 50 at device 0.0 on pci129
> > mps2: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps2: IOCCapabilities: 185c
> > mps3: port 0xe000-0xe0ff mem 0xfb400000-0xfb403fff,0xfb380000-0xfb3bffff irq 56 at device 0.0 on pci130
> > mps3: Firmware: 15.00.00.00, Driver: 20.00.00.00-fbsd
> > mps3: IOCCapabilities: 185c
> >
> > Some time ago I started seeing messages like this in the log:
> >
> > Mar 6 22:28:27 edge02 kernel: mps3: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:28:28 edge02 kernel: mps1: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:28:28 edge02 kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:29:39 edge02 kernel: mps0: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:30:07 edge02 kernel: mps3: Out of chain frames, consider increasing hw.mps.max_chains.
> > Mar 6 22:30:09 edge02 kernel: mps1: Out of chain frames, consider increasing hw.mps.max_chains.
> >
> > This is during peak hour. I tried to monitor it:
> >
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 70
> > dev.mps.3.io_cmds_active: 901
> > dev.mps.2.chain_free: 504
> > dev.mps.2.io_cmds_active: 725
> > dev.mps.1.chain_free: 416
> > dev.mps.1.io_cmds_active: 896
> > dev.mps.0.chain_free: 39
> > dev.mps.0.io_cmds_active: 12
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 412
> > dev.mps.3.io_cmds_active: 572
> > dev.mps.2.chain_free: 718
> > dev.mps.2.io_cmds_active: 687
> > dev.mps.1.chain_free: 211
> > dev.mps.1.io_cmds_active: 906
> > dev.mps.0.chain_free: 65
> > dev.mps.0.io_cmds_active: 144
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 500
> > dev.mps.3.io_cmds_active: 629
> > dev.mps.2.chain_free: 623
> > dev.mps.2.io_cmds_active: 676
> > dev.mps.1.chain_free: 251
> > dev.mps.1.io_cmds_active: 907
> > dev.mps.0.chain_free: 139
> > dev.mps.0.io_cmds_active: 144
> >
> > [...]
> >
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 1874
> > dev.mps.3.io_cmds_active: 78
> > dev.mps.2.chain_free: 1888
> > dev.mps.2.io_cmds_active: 64
> > dev.mps.1.chain_free: 1922
> > dev.mps.1.io_cmds_active: 42
> > dev.mps.0.chain_free: 1936
> > dev.mps.0.io_cmds_active: 48
> > root@edge02:/ # sysctl dev.mps | grep -e chain_free: -e io_cmds_active
> > dev.mps.3.chain_free: 1890
> > dev.mps.3.io_cmds_active: 78
> > dev.mps.2.chain_free: 1890
> > dev.mps.2.io_cmds_active: 82
> > dev.mps.1.chain_free: 1729
> > dev.mps.1.io_cmds_active: 150
> > dev.mps.0.chain_free: 1893
> > dev.mps.0.io_cmds_active: 57
> >
> > What does this mean? Why, with 2048 chains allocated per controller, do I see only 65 chains free while just 144 commands are active?
> > How did I get 976 active commands? I am using SATA HDDs that support only 32 NCQ tags; with 8 ports that is a maximum of 256 outstanding commands per controller.
> >
> > How can I resolve this issue?
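P.S. In case it is useful, this is the rough script I plan to use to watch the counters during peak hour. It is only a sketch built on the dev.mps.N.chain_free sysctls shown above and assumes four controllers (mps0-mps3):

#!/bin/sh
# Sample dev.mps.N.chain_free once a second for 10 minutes and print the
# lowest value seen per controller.  A minimum at or near zero during peak
# hour suggests hw.mps.max_chains is too small for this workload.
min0=""; min1=""; min2=""; min3=""
n=0
while [ "$n" -lt 600 ]; do
    for c in 0 1 2 3; do
        v=$(sysctl -n dev.mps.${c}.chain_free)
        eval cur=\$min${c}
        if [ -z "$cur" ] || [ "$v" -lt "$cur" ]; then
            eval min${c}=$v
        fi
    done
    n=$((n + 1))
    sleep 1
done
for c in 0 1 2 3; do
    eval echo "mps${c}: lowest chain_free seen: \$min${c}"
done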