Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 2 May 2013 11:23:19 -0600
From:      "Kenneth D. Merry" <ken@freebsd.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, trasz@freebsd.org
Subject:   Re: svn commit: r249939 - head/sys/cam/scsi
Message-ID:  <20130502172319.GA6226@nargothrond.kdm.org>
In-Reply-To: <8323FBAA5BD84C7D948801DB8E15D12D@multiplay.co.uk>
References:  <201304261617.r3QGH58Q048395@svn.freebsd.org> <20130429225641.GA1375@nargothrond.kdm.org> <F701AE7B956C498AB89649768B748D81@multiplay.co.uk> <8C38141DA56442218C71EDCBA9790CF8@multiplay.co.uk> <20130501035832.GB79864@nargothrond.kdm.org> <8323FBAA5BD84C7D948801DB8E15D12D@multiplay.co.uk>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, May 02, 2013 at 18:09:29 +0100, Steven Hartland wrote:
> 
> ----- Original Message ----- 
> From: "Kenneth D. Merry" <ken@freebsd.org>
> 
> >On Tue, Apr 30, 2013 at 04:16:49 +0100, Steven Hartland wrote:
> >>----- Original Message ----- 
> >>From: "Steven Hartland"
> >>>>On Fri, Apr 26, 2013 at 16:17:05 +0000, Steven Hartland wrote:
> >>>>>Author: smh
> >>>>>Date: Fri Apr 26 16:17:04 2013
> >>>>>New Revision: 249939
> >>>>>URL: http://svnweb.freebsd.org/changeset/base/249939
> >>>>>
> >>>>>Log:
> >>>>>  Added available delete methods discovery during device probe, 
> >>>>>  including the
> >>>>>  maximum sizes for said methods, which are used when processing 
> >>>>>  BIO_DELETE
> >>>>>  requests. This includes updating UNMAP support discovery to be based 
> >>on
> >>>>>  SBC-3 T10/1799-D Revision 31 specification.
> >>>>>  Added ATA TRIM support to cam scsi devices via ATA Pass-Through(16)
> >>>>>  sys/cam/scsi/scsi_da.c:
> >>>>>          - Added ATA Data Set Management TRIM support via ATA 
> >>>>>          Pass-Through(16)
> >>>>>            as a delete_method
> >>>>>
> >>>>
> >>>>This adds a lot of unnecessary verbosity for devices that don't support 
> >>>>ATA
> >>>>passthrough.  For example:
> >>>>
> >>>>(da7:iscsi4:0:0:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 
> >>02 >>00 00 00 00 00 00 00 40 ec 00
> >>>>(da7:iscsi4:0:0:0): CAM status: SCSI Status Error
> >>>>(da7:iscsi4:0:0:0): SCSI status: Check Condition
> >>>>(da7:iscsi4:0:0:0): Retrying command (per sense data)
> >>>>(2:2:0:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 02 00 00 
> >>00 >>00 00 00 00 40 ec 00
> >>>>(2:2:0:0): Tag: 0x00f6, Type: 1
> >>>>(2:2:0:0): CTL Status: SCSI Error
> >>>>(2:2:0:0): SCSI Status: Check Condition
> >>>>(2:2:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command 
> >>>>operation code)
> >>>>(2:2:0:0): Command byte 0 is invalid
> >>>>
> >>>>(da8:iscsi4:0:0:1): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 
> >>02 >>00 00 00 00 00 00 00 40 ec 00
> >>>>(da8:iscsi4:0:0:1): CAM status: SCSI Status Error
> >>>>(da8:iscsi4:0:0:1): SCSI status: Check Condition
> >>>>(da8:iscsi4:0:0:1): Retrying command (per sense data)
> >>>>(da8:iscsi4:0:0:1): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 
> >>02 >>00 00 00 00 00 00 00 40 ec 00
> >>>>(da8:iscsi4:0:0:1): CAM status: SCSI Status Error
> >>>>(da8:iscsi4:0:0:1): SCSI status: Check Condition
> >>>>(da8:iscsi4:0:0:1): Error 5, Retries exhausted
> >>>>
> >>>>That is with CTL and and trasz's new iSCSI initiator, but you should 
> >>see >>it
> >>>>with any CTL configuration.  (And probably with any controller or device
> >>>>that doesn't support ATA passthrough.)
> >>>>
> >>>>So, please:
> >>>>- Check for the presence of VPD page 0x89 before sending an ATA
> >>>>  passthrough command.  The spec (sat3r03 in this case) says that
> >>>>  it "shall" be implemented, so I think we can count on that.
> >>>>- If the target returns an illegal request sense key, don't retry
> >>>>  again.  The target will keep returning illegal request
> >>>
> >>>Thanks for the report Ken I'll check this on a card I know doesn't 
> >>support
> >>>pass-through.
> >>
> >>I checked with an areca controller, which I know doesn't support pass16, 
> >>and
> >>all is quiet during boot / probe.
> >>
> >>I can provoke output at the app layer with camcontrol identify da0, which 
> >>is
> >>to be expected, but nothing in /var/log/messages.
> >>
> >>daerror handler in scsi_da.c definitely has SF_QUIET_IR so Illegal Request
> >>should never be output.
> >
> >You're correct, it should not print out any messages.
> >
> >>I can't provoke this using standard CAM devices even when they don't 
> >>support
> >>ATA Pass-Through (16) so I think this is actually an issue with CTL:
> >>1. Being verbose when it shouldn't (ctl_process_done - line 12571?)
> >
> >That part at least is by design -- it prints out rate-limited messages so
> >you know when the initiator is sending commands CTL doesn't support.  It
> >comes in handy for figuring out initiator problems sometimes.
> >
> >>2. Retrying Illegal Request commands when it shouldn't (I didn't experince
> >>this on r250032)
> >
> >CTL doesn't do any retries, it leaves that up to the initiator.
> >
> >I finally tracked down the problem.  The issue is that the iSCSI initiator
> >wasn't setting the CAM_AUTOSNS_VALID flag in the CCB status field.  As a
> >result, only the CAM status and SCSI status fields were being printed.
> >SF_QUIET_IR
> >
> >>The attached patch implements a check for ATA Information VPD before
> >>using ATA Pass-Through (16). It's had limited testing but I have
> >>confirmed it eliminates the use of pass16 under CTL and still enables
> >>ATA TRIM under on mpt for SATA disks, so looks promising.
> >>
> >>If you could test and let me know if it works for you Ken that would
> >>be appreciated.
> >
> >Yes, it does work and eliminates the error message, thanks!  I think it is
> >also a better way to go, just in case we run into a target that has an
> >issue with the ATA passthrough CDB.
> 
> A slightly modified version of this is now in the current r250181
> http://svnweb.freebsd.org/changeset/base/250181
> 
> >>The test case I used for CTL was:-
> >>kldload ctl
> >>ctladm create -b ramdisk -s 10485760
> >>ctladm port -o on
> >>
> >>As a side note while test "kldload ctl" caused kernel panic the once.
> >
> >What was the panic message?
> 
> Unread portion of the kernel message buffer:
> ctl: CAM Target Layer loaded
> 
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x20:0xffffffff8029c860
> stack pointer           = 0x28:0xffffff823b8dc840
> frame pointer           = 0x28:0xffffff823b8dc910
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 4 (xpt_thrd)
> 
> >This doesn't look like a CTL-specific problem (not on the surface), but
> >rather a more general CAM problem:
> >
> >>#8  0xffffffff8073e1f2 in calltrap () at 
> >>/usr/home/smh/freebsd/base/head/sys/amd64/amd64/exception.S:228
> >>#9  0xffffffff8029c860 in cam_periph_alloc 
> >>(periph_ctor=0xffffffff802af410 <proberegister>, 
> >>periph_oninvalidate=0xfffffe0019cfa200, periph_dtor=<value optimized 
> >>out>, periph_start=0xfffffe0015980a90, name=<value optimized out>, 
> >>type=2159638184,
> >>   path=0xfffffe0015ad79a0, ac_callback=<value optimized out>, 
> >>   code=<value optimized out>) at 
> >>   /usr/home/smh/freebsd/base/head/sys/cam/cam_periph.c:227
> >>#10 0xffffffff802aed6b in scsi_scan_lun (request_ccb=<value optimized 
> >>out>) at /usr/home/smh/freebsd/base/head/sys/cam/scsi/scsi_xpt.c:2266
> >>#11 0xffffffff802b2859 in scsi_scan_bus (periph=0x0, 
> >>request_ccb=0xfffffe00234df000) at 
> >>/usr/home/smh/freebsd/base/head/sys/cam/scsi/scsi_xpt.c:1969
> >>#12 0xffffffff802a66c5 in xpt_scanner_thread (dummy=<value optimized 
> >>out>) at /usr/home/smh/freebsd/base/head/sys/cam/cam_xpt.c:2411
> >>#13 0xffffffff8055cde4 in fork_exit (callout=0xffffffff802a6650 
> >><xpt_scanner_thread>, arg=0x0, frame=0xffffff823b8dcc00) at 
> >>/usr/home/smh/freebsd/base/head/sys/kern/kern_fork.c:991
> >>#14 0xffffffff8073e72e in fork_trampoline () at 
> >>/usr/home/smh/freebsd/base/head/sys/amd64/amd64/exception.S:602
> >>#15 0x0000000000000000 in ?? ()
> >
> >If this is on a stock FreeBSD/head, it looks like it failed on this line in
> >cam_periph_alloc():
> 
> Apart from those patches yes stock FreeBSD/head.

Okay.

> >       cur_periph = TAILQ_FIRST(&(*p_drv)->units);
> >
> >Perhaps there's an issue with the ctl peripheral driver getting added to
> >the list of peripheral drivers (via periphdriver_register()).  The topology
> >lock isn't held in periphdriver_register(), so it's possible that the
> >thread was in between the peripheral driver list traversal and the line
> >above when the peripheral was added.
> >
> >It's more likely that the scan request was generated as a result of the CTL
> >ports going online.  And that means that the peripheral driver registration
> >should have long since been done.
> >
> >If you can reproduce the problem, you can probably figure out where the
> >issue is.
> 
> I've not been able to reproduce I'm afraid, seems you cant kldunload ctl
> so needs a reboot between loads, tried 3 times but nothing.

Thanks for trying, I'll be on the lookout for any other reports.

> I did however notice the following after turning ports off, which may
> indicated a reference leak:-
> 
> May  2 17:06:25 head kernel: (da2:ctl2cam0:0:1:0): lost device - 0 
> outstanding, 2 refs
> May  2 17:06:25 head kernel: (pass6:ctl2cam0:0:1:0): lost device
> May  2 17:06:25 head kernel: (pass6:ctl2cam0:0:1:0): removing device entry
> May  2 17:06:25 head kernel: (da2:ctl2cam0:0:1:0): removing device entry

The references are normal.  The da(4) driver keeps one reference for GEOM
until GEOM calls it back after cleaning up its resources.  (That's the
reference released in dadiskgonecb().)  It also has a reference for each
open call, and acquires references during probe for things like the sysctl
variable initialization.

The key thing is the last line, "removing device entry", since it means
that every context that has a reference into the da(4) driver has cleaned
it up so it can be removed.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130502172319.GA6226>