Date: Mon, 6 Jun 2016 17:19:02 -0500
From: list-news <list-news@mindpackstudios.com>
To: freebsd-scsi@freebsd.org
Subject: Re: Avago LSI SAS 3008 & Intel SSD Timeouts
Message-ID: <986e03a7-5dc8-f5e0-5a17-4bf49459f905@mindpackstudios.com>
In-Reply-To: <08C01646-9AF3-4E89-A545-C051A284E039@sarenet.es>

System was running solid all weekend with camcontrol tags set to 1 for each device, zero errors.
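In case it's useful, applying that is just a loop over the pool members; a minimal sketch (the device names are the ones that show up in my gstat output below, adjust for your own pool):

    #!/bin/sh
    # Drop the CAM tagged command (queue) depth to 1 on each disk.
    # Device list is specific to my setup; edit to match your pool.
    for d in da2 da4 da6 da7; do
        camcontrol tags "${d}" -N 1
    done

The same loop with -N 255 puts them back where they started.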
Last week I did try

    # sysctl kern.cam.da.X.delete_method=DISABLE

for each drive, but it still threw errors. Also, I tried out bio_delete_disable earlier today. First I reset tags back to 255 for each device, since they are currently at 1:

    # camcontrol tags daX -N 255

then:

    # sysctl vfs.zfs.vdev.bio_delete_disable=1

A few minutes later:

Jun 6 12:28:36 s18 kernel: (da2:mpr0:0:12:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 577 command timeout cm 0xfffffe0001351550 ccb 0xfffff804e78e3800 target 12, handle(0x000c)
Jun 6 12:28:36 s18 kernel: mpr0: At enclosure level 0, slot 4, connector name ( )
Jun 6 12:28:36 s18 kernel: mpr0: timedout cm 0xfffffe0001351550 allocated tm 0xfffffe0001322150
Jun 6 12:28:36 s18 kernel: (noperiph:mpr0:0:4294967295:0): SMID 1 Aborting command 0xfffffe0001351550
Jun 6 12:28:36 s18 kernel: mpr0: Sending reset from mprsas_send_abort for target ID 12
Jun 6 12:28:36 s18 kernel: (da2:mpr0:0:12:0): READ(10). CDB: 28 00 18 45 1c c0 00 00 08 00 length 4096 SMID 583 command timeout cm 0xfffffe0001351d30 ccb 0xfffff806b9556800 target 12, handle(0x000c)
Jun 6 12:28:36 s18 kernel: mpr0: At enclosure level 0, slot 4, connector name ( )
Jun 6 12:28:36 s18 kernel: mpr0: queued timedout cm 0xfffffe0001351d30 for processing by tm 0xfffffe0001322150
...

During the 60 second hang:

    # gstat -do
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d    o/s   ms/o   %busy Name
       70      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da2
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da4
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da6
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da7

Also during the 60 second hang:

    # camcontrol tags da3 -v
    (pass2:mpr0:0:12:0): dev_openings  248
    (pass2:mpr0:0:12:0): dev_active    7
    (pass2:mpr0:0:12:0): allocated     7
    (pass2:mpr0:0:12:0): queued        0
    (pass2:mpr0:0:12:0): held          0
    (pass2:mpr0:0:12:0): mintags       2
    (pass2:mpr0:0:12:0): maxtags       255

Also during the 60 second hang:

    # sysctl dev.mpr
    dev.mpr.0.spinup_wait_time: 3
    dev.mpr.0.chain_alloc_fail: 0
    dev.mpr.0.enable_ssu: 1
    dev.mpr.0.max_chains: 2048
    dev.mpr.0.chain_free_lowwater: 2022
    dev.mpr.0.chain_free: 2048
    dev.mpr.0.io_cmds_highwater: 71
    dev.mpr.0.io_cmds_active: 4
    dev.mpr.0.driver_version: 09.255.01.00-fbsd
    dev.mpr.0.firmware_version: 10.00.03.00
    dev.mpr.0.disable_msi: 0
    dev.mpr.0.disable_msix: 0
    dev.mpr.0.debug_level: 895
    dev.mpr.0.%parent: pci1
    dev.mpr.0.%pnpinfo: vendor=0x1000 device=0x0097 subvendor=0x15d9 subdevice=0x0808 class=0x010700
    dev.mpr.0.%location: pci0:1:0:0 handle=\_SB_.PCI0.BR1A.H000
    dev.mpr.0.%driver: mpr
    dev.mpr.0.%desc: Avago Technologies (LSI) SAS3008
    dev.mpr.%parent:

Something else that may be worth considering: I ran fio & bonnie++ for about an hour of heavy IO (with tags still set to 255, drive busy showing 90-100%). No errors. Then I fired up my application (a threaded Java/Postgres application), and within minutes:

    # gstat -do
    dT: 1.002s  w: 1.000s
     L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d    o/s   ms/o   %busy Name
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da2
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da4
       71      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da6
        0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| da7

Error:

Jun 6 13:36:15 s18 kernel: (da6:mpr0:0:16:0): WRITE(10). CDB: 2a 00 30 65 13 90 00 00 10 00
Jun 6 13:36:15 s18 kernel: (da6:mpr0:0:16:0): CAM status: SCSI Status Error
Jun 6 13:36:15 s18 kernel: (da6:mpr0:0:16:0): SCSI status: Check Condition
Jun 6 13:36:15 s18 kernel: (da6:mpr0:0:16:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jun 6 13:36:15 s18 kernel: (da6:mpr0:0:16:0): Retrying command (per sense data)
...

And again 2 minutes later:

Jun 6 13:38:43 s18 kernel: (da2:mpr0:0:12:0): WRITE(10). CDB: 2a 00 21 66 63 58 00 00 10 00
Jun 6 13:38:43 s18 kernel: (da2:mpr0:0:12:0): CAM status: SCSI Status Error
Jun 6 13:38:43 s18 kernel: (da2:mpr0:0:12:0): SCSI status: Check Condition
Jun 6 13:38:43 s18 kernel: (da2:mpr0:0:12:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jun 6 13:38:43 s18 kernel: (da2:mpr0:0:12:0): Retrying command (per sense data)
...

And again 3 minutes later:

Jun 6 13:41:29 s18 kernel: (da7:mpr0:0:18:0): WRITE(10). CDB: 2a 00 33 44 b5 b8 00 00 10 00
...

    # camcontrol tags daX -N 1

And now, after 15 minutes, zero errors.

Putting some thoughts together on this, which may or may not be off base (please feel free to correct me), I've noticed the following:

1) There doesn't seem to be any indication of what causes a drive to time out. The command that fails in the error log is one of READ(10), WRITE(10), ATA COMMAND PASS THROUGH(16), or SYNCHRONIZE CACHE(10). As I understand it, that is just the command that was being executed, timed out, and got retried, not necessarily what caused the drive to lock up.

2) When my application runs, it hammers postgres pretty hard, and when postgres is running I get the errors. fio & bonnie++ don't give me errors; daily use of the system doesn't give me errors. I'm assuming postgresql is sending far more of a certain type of command to the IO subsystem than those other applications, and the first command that comes to mind is fsync.

3) I turned fsync off in postgresql.conf (I'm brave for science!), then ran my application again with tags at 255, 100% CPU load, and 70-80% drive busy.

1.5 hours later at full load - finally, a single timeout:

Jun 6 16:31:33 s18 kernel: (da2:mpr0:0:12:0): READ(10). CDB: 28 00 2d 50 1b 78 00 00 08 00 length 4096 SMID 556 command timeout cm 0xfffffe000134f9c0 ccb 0xfffff83aa5b25000 target 12, handle(0x000c)

I ran it for another 20 minutes with no additional timeouts. I assume an fsync turns into a zfs -> cam -> SYNCHRONIZE CACHE command for each device, and postgres sends that command considerably more often than a typical application (at least with fsync turned on in postgresql.conf). That would explain why the error is rare when fsync is turned off or when only occasional fsyncs are being sent (i.e. typical system usage), yet when fsync is sent repeatedly the errors start happening every few minutes.
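If anyone wants to poke at that theory without postgres in the picture, something like this ought to generate a steady stream of fsyncs against the pool (just a sketch, untested here; /tank/fsync-test is a placeholder path, point it at a dataset on the affected pool):

    #!/bin/sh
    # Rewrite a small file and force it to stable storage in a tight loop,
    # using the base system's fsync(1) utility. If the theory is right, each
    # pass should end up as a cache flush on the pool's devices.
    f=/tank/fsync-test
    while :; do
        dd if=/dev/zero of="${f}" bs=8k count=1 2>/dev/null
        fsync "${f}"
    done

Run alongside some ordinary read/write load, that should trip the timeouts a lot faster than fio or bonnie++ did, if the fsync angle is really what matters.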
The only explanation I can come up with for why setting tags to 1 eliminates the errors entirely is that it must have something to do with the Intel drives not handling parallel commands from CAM well when one (or more) of those commands is SYNCHRONIZE CACHE.

Thoughts?

Thanks,
-Kyle

On 6/6/16 3:42 AM, Borja Marcos wrote:
>> On 03 Jun 2016, at 23:49, Steven Hartland wrote:
>>
>> First thing would be to run gstat with -d to see if you're actually stacking up deletes, a symptom of which can be r/w dropping to zero.
>>
>> If you are seeing significant deletes it could be a FW issue on the drives.
>
> Hmm. I've suffered that badly with Intel P3500 NVMe drives, which suffer at least from a driver problem: trims are not coalesced.
> However, I didn't experience command timeouts. Reads and, especially, writes stalled badly.
>
> A quick test for trim-related trouble is setting the sysctl variable vfs.zfs.vdev.bio_delete_disable to 1. It doesn't require
> a reboot and you can quickly compare results.
>
> In my case, a somewhat similar problem in an IBM server was caused by a faulty LSI3008 card, it seems. As I didn't have spare LSI3008 cards
> at the time, I replaced it with an LSI2008 and everything works perfectly. Before anyone chimes in suggesting card incompatibility of some sort,
> I have a twin system with an LSI3008 working like a charm. ;)
>
> Borja.
>
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"