From owner-svn-src-all@FreeBSD.ORG Fri Sep 17 21:53:56 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0FFE1065670; Fri, 17 Sep 2010 21:53:56 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id BFC668FC18; Fri, 17 Sep 2010 21:53:56 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id o8HLruDu042949; Fri, 17 Sep 2010 21:53:56 GMT (envelope-from ken@svn.freebsd.org) Received: (from ken@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id o8HLruj7042946; Fri, 17 Sep 2010 21:53:56 GMT (envelope-from ken@svn.freebsd.org) Message-Id: <201009172153.o8HLruj7042946@svn.freebsd.org> From: "Kenneth D. Merry" Date: Fri, 17 Sep 2010 21:53:56 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r212802 - head/sys/dev/mps X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Sep 2010 21:53:56 -0000 Author: ken Date: Fri Sep 17 21:53:56 2010 New Revision: 212802 URL: http://svn.freebsd.org/changeset/base/212802 Log: Fix a couple of mps problems. When the driver is completely saturated with commands (1024 in the case of the SAS2008 in my test system), I/O stops. If we tell CAM that we have one less command slot than we have actually allocated, everything works fine. We also need a few extra command slots to allow for aborts and other task management commands to be sent down. This needs more investigation to determine the root cause, but for now this fixes things in my testing. mps.c: Change a printf() to mps_printf(). mps_sas.c: Subtract 5 command slots when we tell CAM how many commands we can handle. Add some commented-out logic to print the contents the CDBs for timed-out commands. This can help in debugging devices that are timing out. This will be uncommented once I bring some CAM changes in. Reported by: Andrew Boyer Modified: head/sys/dev/mps/mps.c head/sys/dev/mps/mps_sas.c Modified: head/sys/dev/mps/mps.c ============================================================================== --- head/sys/dev/mps/mps.c Fri Sep 17 19:20:39 2010 (r212801) +++ head/sys/dev/mps/mps.c Fri Sep 17 21:53:56 2010 (r212802) @@ -1416,7 +1416,7 @@ mps_data_cb(void *arg, bus_dma_segment_t chain = mps_alloc_chain(sc); if (chain == NULL) { /* Resource shortage, roll back! */ - printf("out of chain frames\n"); + mps_printf(sc, "out of chain frames\n"); return; } Modified: head/sys/dev/mps/mps_sas.c ============================================================================== --- head/sys/dev/mps/mps_sas.c Fri Sep 17 19:20:39 2010 (r212801) +++ head/sys/dev/mps/mps_sas.c Fri Sep 17 21:53:56 2010 (r212802) @@ -596,6 +596,7 @@ mps_attach_sas(struct mps_softc *sc) { struct mpssas_softc *sassc; int error = 0; + int num_sim_reqs; mps_dprint(sc, MPS_TRACE, "%s\n", __func__); @@ -605,15 +606,30 @@ mps_attach_sas(struct mps_softc *sc) sc->sassc = sassc; sassc->sc = sc; - if ((sassc->devq = cam_simq_alloc(sc->num_reqs)) == NULL) { + /* + * Tell CAM that we can handle 5 fewer requests than we have + * allocated. If we allow the full number of requests, all I/O + * will halt when we run out of resources. Things work fine with + * just 1 less request slot given to CAM than we have allocated. + * We also need a couple of extra commands so that we can send down + * abort, reset, etc. requests when commands time out. Otherwise + * we could wind up in a situation with sc->num_reqs requests down + * on the card and no way to send an abort. + * + * XXX KDM need to figure out why I/O locks up if all commands are + * used. + */ + num_sim_reqs = sc->num_reqs - 5; + + if ((sassc->devq = cam_simq_alloc(num_sim_reqs)) == NULL) { mps_dprint(sc, MPS_FAULT, "Cannot allocate SIMQ\n"); error = ENOMEM; goto out; } sassc->sim = cam_sim_alloc(mpssas_action, mpssas_poll, "mps", sassc, - device_get_unit(sc->mps_dev), &sc->mps_mtx, sc->num_reqs, sc->num_reqs, - sassc->devq); + device_get_unit(sc->mps_dev), &sc->mps_mtx, num_sim_reqs, + num_sim_reqs, sassc->devq); if (sassc->sim == NULL) { mps_dprint(sc, MPS_FAULT, "Cannot allocate SIM\n"); error = EINVAL; @@ -930,6 +946,9 @@ mpssas_scsiio_timeout(void *data) struct mps_softc *sc; struct mps_command *cm; struct mpssas_target *targ; +#if 0 + char cdb_str[(SCSI_MAX_CDBLEN * 3) + 1]; +#endif cm = (struct mps_command *)data; sc = cm->cm_sc; @@ -954,6 +973,22 @@ mpssas_scsiio_timeout(void *data) xpt_print(ccb->ccb_h.path, "SCSI command timeout on device handle " "0x%04x SMID %d\n", targ->handle, cm->cm_desc.Default.SMID); + /* + * XXX KDM this is useful for debugging purposes, but the existing + * scsi_op_desc() implementation can't handle a NULL value for + * inq_data. So this will remain commented out until I bring in + * those changes as well. + */ +#if 0 + xpt_print(ccb->ccb_h.path, "Timed out command: %s. CDB %s\n", + scsi_op_desc((ccb->ccb_h.flags & CAM_CDB_POINTER) ? + ccb->csio.cdb_io.cdb_ptr[0] : + ccb->csio.cdb_io.cdb_bytes[0], NULL), + scsi_cdb_string((ccb->ccb_h.flags & CAM_CDB_POINTER) ? + ccb->csio.cdb_io.cdb_ptr : + ccb->csio.cdb_io.cdb_bytes, cdb_str, + sizeof(cdb_str))); +#endif /* Inform CAM about the timeout and that recovery is starting. */ #if 0