From owner-freebsd-scsi@FreeBSD.ORG Mon Dec 1 11:07:02 2008
Delivered-To: freebsd-scsi@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id A16501065673
    for ; Mon, 1 Dec 2008 11:07:02 +0000 (UTC)
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 8D09E8FC1C
    for ; Mon, 1 Dec 2008 11:07:02 +0000 (UTC)
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id mB1B72sY052681
    for ; Mon, 1 Dec 2008 11:07:02 GMT
    (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.3/8.14.3/Submit) id mB1B721f052677
    for freebsd-scsi@FreeBSD.org; Mon, 1 Dec 2008 11:07:02 GMT
    (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 1 Dec 2008 11:07:02 GMT
Message-Id: <200812011107.mB1B721f052677@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
    owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-scsi@FreeBSD.org
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Mon, 01 Dec 2008 11:07:02 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/128452  scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi       [isp] isp(4) target driver crashes kernel when set up
o kern/126866  scsi       [isp] [panic] kernel panic on card initialization
o kern/124667  scsi       [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674  scsi       [ahc] ahc driver dumping
o kern/123666  scsi       [aac] attach fails with Adaptec SAS RAID 3805 controll
o sparc/121676 scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi       [sg] scsi_sg incompatible with scanners
o kern/120247  scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668  scsi       [cam] [patch] certain errors are too verbose comparing
o kern/114597  scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/94838   scsi       Kernel panic while mounting SD card with lock switch o
o kern/92798   scsi       [ahc] SCSI problem with timeouts
o kern/90282   scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi       wire down of scsi devices conflicts with config
s kern/57398   scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi       [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895   scsi       wierd kernel / device driver bug
o kern/39388   scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/38828   scsi       [dpt] [request] DPT PM2012B/90 doesn't work
o kern/35234   scsi       World access to /dev/pass? (for scanner) requires acce

29 problems total.

From owner-freebsd-scsi@FreeBSD.ORG Wed Dec 3 15:51:53 2008
Delivered-To: freebsd-scsi@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 825A01065676
    for ; Wed, 3 Dec 2008 15:51:53 +0000 (UTC)
    (envelope-from jh@saunalahti.fi)
Received: from emh07.mail.saunalahti.fi (emh07.mail.saunalahti.fi [62.142.5.117])
    by mx1.freebsd.org (Postfix) with ESMTP id 11A478FC1C
    for ; Wed, 3 Dec 2008 15:51:52 +0000 (UTC)
    (envelope-from jh@saunalahti.fi)
Received: from saunalahti-vams (vs3-10.mail.saunalahti.fi [62.142.5.94])
    by emh07-2.mail.saunalahti.fi (Postfix) with SMTP id 5785E18D957;
    Wed, 3 Dec 2008 17:33:02 +0200 (EET)
Received: from emh03.mail.saunalahti.fi ([62.142.5.109])
    by vs3-10.mail.saunalahti.fi ([62.142.5.94])
    with SMTP (gateway) id A05D61529CD; Wed, 03 Dec 2008 17:33:02 +0200
Received: from a91-153-125-115.elisa-laajakaista.fi
    (a91-153-125-115.elisa-laajakaista.fi [91.153.125.115])
    by emh03.mail.saunalahti.fi (Postfix) with SMTP id 342BF158AAC;
    Wed, 3 Dec 2008 17:32:58 +0200 (EET)
Date: Wed, 3 Dec 2008 17:32:58 +0200
From: Jaakko Heinonen
To: bug-followup@FreeBSD.org, per.qu@email.it
Message-ID: <20081203153258.GA3249@a91-153-125-115.elisa-laajakaista.fi>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Antivirus: VAMS
Cc: freebsd-scsi@FreeBSD.org
Subject: Re: kern/88823: [modules] [atapicam] atapicam - kernel trap 12 on
    loading and unloading
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Wed, 03 Dec 2008 15:51:53 -0000

Hi,

There is a CAM(4)/pass(4) bug which causes passcleanup() (in
sys/cam/scsi/scsi_pass.c) to call destroy_dev(9) with the device mutex
held.
It's not allowed to call destroy_dev() with a non-sleepable lock such
as a mutex held, because destroy_dev() may sleep. Here's the call trace:

destroy_dev(c7b28400,0,c569f754,c7b15080,f46c6a38,...) at destroy_dev+0x10
passcleanup(c7b15080,c0b8f83b,c0bdf975,c585d058,c0e5afe0,...) at passcleanup+0x2e
camperiphfree(c7b15080,0,f46c6a58,c0477b7d,c7b15080,...) at camperiphfree+0xc2
cam_periph_invalidate(c7b15080,c59328d0,f46c6a8c,c0492b4a,c7b15080,...) at cam_periph_invalidate+0x3e
cam_periph_async(c7b15080,100,f46c6b18,0,0,...) at cam_periph_async+0x2d
passasync(c7b15080,100,f46c6b18,0,c7ae0000,...) at passasync+0xca
xpt_async_bcast(0,4,c0b6dbbf,11a5,c7b428c0,...) at xpt_async_bcast+0x32
xpt_async(100,f46c6b18,0,10,c575ccb8,...) at xpt_async+0x194
xpt_bus_deregister(0,0,c7b75b30,378,c577fc00,...) at xpt_bus_deregister+0x4e
free_softc(c577fe64,0,c7b75b30,103,c7b18100,...) at free_softc+0xe1
atapi_cam_detach(c7b18100,c7b30858,c0caa340,9a4,1,...) at atapi_cam_detach+0x7f
device_detach(c7b18100,c081bf09,c7691840,1,c7b760f8,...) at device_detach+0x8c
devclass_delete_driver(c554b6c0,c7b7610c,c0bd0dfd,2d,0,...) at devclass_delete_driver+0x91
driver_module_handler(c7692040,1,c7b760f8,ef,c7692040,...) at driver_module_handler+0xdf
module_unload(c7692040,0,253,250,f46c6c40,...) at module_unload+0x75
linker_file_unload(c7ab9d00,0,c0bcf326,400,c7b73000,...) at linker_file_unload+0xc9
kern_kldunload(c5957460,6,0,f46c6d2c,c0b11ff3,...) at kern_kldunload+0xd5
kldunloadf(c5957460,f46c6cf8,8,c0bd96d0,c0cad660,...) at kldunloadf+0x2b

Calling xpt_bus_deregister() in atapicam results in this code path.
xpt_bus_deregister() must be called with the device mutex held.

The following change fixes the atapicam problem; however, the patch may
be incorrect because I am not sure whether passcleanup() is always
called with the lock held. I have tried the patch with atapicam(4) and
umass(4) (both use pass(4)).
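
To make the locking issue concrete, here is a minimal user-space sketch
of the "drop the lock around a call that may sleep, then retake it"
pattern. It is not kernel code: a pthread mutex stands in for
periph->sim->mtx, and every fake_* name as well as run_cleanup_demo()
is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a peripheral guarded by its SIM lock. */
struct fake_periph {
	pthread_mutex_t	lock;		/* plays the role of periph->sim->mtx */
	bool		lock_held;	/* bookkeeping so the sketch can check itself */
	bool		dev_destroyed;
};

static void
fake_lock(struct fake_periph *p)
{
	pthread_mutex_lock(&p->lock);
	p->lock_held = true;
}

static void
fake_unlock(struct fake_periph *p)
{
	p->lock_held = false;
	pthread_mutex_unlock(&p->lock);
}

/* Stand-in for a call that may sleep: the lock must not be held here. */
static void
fake_destroy_dev(struct fake_periph *p)
{
	assert(!p->lock_held);
	p->dev_destroyed = true;
}

/* Entered with the lock held and returns with it held again. */
static void
fake_cleanup(struct fake_periph *p)
{
	assert(p->lock_held);
	fake_unlock(p);		/* drop the lock before the sleeping call */
	fake_destroy_dev(p);
	fake_lock(p);		/* callers expect the lock on return */
}

/* Runs the sequence once; true if the device is gone and the lock is held. */
static bool
run_cleanup_demo(void)
{
	struct fake_periph p;
	bool ok;

	pthread_mutex_init(&p.lock, NULL);
	p.lock_held = false;
	p.dev_destroyed = false;

	fake_lock(&p);
	fake_cleanup(&p);
	ok = p.dev_destroyed && p.lock_held;
	fake_unlock(&p);
	pthread_mutex_destroy(&p.lock);
	return (ok);
}
```

Dropping the lock opens a window in which other threads can run, and
the pattern is only valid if every caller really does hold the lock on
entry, which is the open question about passcleanup().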

%%%
Index: sys/cam/scsi/scsi_pass.c
===================================================================
--- sys/cam/scsi/scsi_pass.c	(revision 185331)
+++ sys/cam/scsi/scsi_pass.c	(working copy)
@@ -167,7 +167,9 @@ passcleanup(struct cam_periph *periph)
 
 	devstat_remove_entry(softc->device_stats);
 
+	mtx_unlock(periph->sim->mtx);
 	destroy_dev(softc->dev);
+	mtx_lock(periph->sim->mtx);
 
 	if (bootverbose) {
 		xpt_print(periph->path, "removing device entry\n");
%%%

There are also other bugs involved in unloading the atapicam module.

* If there are pending hcbs, the kernel will panic on unload. There's
  an obvious bug in free_softc(): it uses TAILQ_FOREACH() instead of
  TAILQ_FOREACH_SAFE(). However, fixing that is not enough. There are
  additional problem(s) and I don't have a fix for them. Here's a patch
  that changes it to refuse to detach if there are pending hcbs:

%%%
Index: sys/dev/ata/atapi-cam.c
===================================================================
--- sys/dev/ata/atapi-cam.c	(revision 185519)
+++ sys/dev/ata/atapi-cam.c	(working copy)
@@ -254,6 +254,13 @@ atapi_cam_detach(device_t dev)
     struct atapi_xpt_softc *scp = device_get_softc(dev);
 
     mtx_lock(&scp->state_lock);
+    /*
+     * XXX: Detaching when pending hcbs exist is broken.
+     */
+    if (!TAILQ_EMPTY(&scp->pending_hcbs)) {
+        mtx_unlock(&scp->state_lock);
+        return (EBUSY);
+    }
     xpt_freeze_simq(scp->sim, 1 /*count*/);
     scp->flags |= DETACHING;
     mtx_unlock(&scp->state_lock);
@@ -882,11 +889,11 @@ free_hcb(struct atapi_hcb *hcb)
 static void
 free_softc(struct atapi_xpt_softc *scp)
 {
-    struct atapi_hcb *hcb;
+    struct atapi_hcb *hcb, *tmp_hcb;
 
     if (scp != NULL) {
         mtx_lock(&scp->state_lock);
-        TAILQ_FOREACH(hcb, &scp->pending_hcbs, chain) {
+        TAILQ_FOREACH_SAFE(hcb, &scp->pending_hcbs, chain, tmp_hcb) {
             free_hcb_and_ccb_done(hcb, CAM_UNREC_HBA_ERROR);
         }
         if (scp->path != NULL) {
%%%

* cd(4) doesn't tolerate disappearing devices well. There's code in
  cdoninvalidate() to invalidate further I/O operations, but calling,
  for example, d_close causes a crash. Thus you can't unmount a file
  system after the device has disappeared. This patch makes it survive
  unmounting.

%%%
Index: sys/cam/scsi/scsi_cd.c
===================================================================
--- sys/cam/scsi/scsi_cd.c	(revision 185331)
+++ sys/cam/scsi/scsi_cd.c	(working copy)
@@ -382,6 +382,9 @@ cdoninvalidate(struct cam_periph *periph
 		camq_remove(&softc->changer->devq, softc->pinfo.index);
 
 	disk_gone(softc->disk);
+	softc->disk->d_drv1 = NULL;
+	softc->disk->d_close = NULL; /* allow closing the disk */
+
 	xpt_print(periph->path, "lost device\n");
 }
 
%%%

-- 
Jaakko

From owner-freebsd-scsi@FreeBSD.ORG Fri Dec 5 02:50:47 2008
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 40F251065673
    for ; Fri, 5 Dec 2008 02:50:47 +0000 (UTC)
    (envelope-from jpaetzel@FreeBSD.org)
Received: from mail.tcbug.org (mail.tcbug.org [216.243.150.78])
    by mx1.freebsd.org (Postfix) with ESMTP id 1FEFD8FC14
    for ; Fri, 5 Dec 2008 02:50:47 +0000 (UTC)
    (envelope-from jpaetzel@FreeBSD.org)
Received: from roadrash.tcbug.org (c-24-118-145-206.hsd1.mn.comcast.net [24.118.145.206])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by mail.tcbug.org (Postfix) with ESMTPSA id 2A1C3169E464;
    Thu, 4 Dec 2008 20:31:34 -0600 (CST)
Message-ID: <493892AE.8060607@FreeBSD.org>
Date: Thu, 04 Dec 2008 20:32:14 -0600
From: Josh Paetzel
User-Agent: Thunderbird 2.0.0.18 (Macintosh/20081105)
MIME-Version: 1.0
To: shakeb ainul
References: <124704c40811270052s1d215d24kc7b057da17a1cb83@mail.gmail.com>
In-Reply-To: <124704c40811270052s1d215d24kc7b057da17a1cb83@mail.gmail.com>
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-scsi@freebsd.org
Subject: Re: Problem with RAID1 Disk on Freebsd
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Fri, 05 Dec 2008 02:50:47 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

shakeb ainul wrote:
> Hi,
>
> I am a developer with one of the top IT companies in Asia. I have the
> following problem with my RAID 1 server.
>
> Last week, the only disk of my server failed due to unknown reason and it
> required a reboot of the server. Following error messages were logged in:
>
> /var/log/messages
>
> Nov 21 02:18:17 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI port 2
> Nov 21 02:20:58 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI port 2
> Nov 21 02:22:37 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI port 2
> Nov 21 02:31:01 server1 kernel: ciss0: *** Physical drive failure: SCSI port 2 ID 1
> Nov 21 02:31:01 server1 kernel: ciss0: *** State change, logical drive 0
> Nov 21 02:31:01 server1 kernel: ciss0: logical drive 0 (da0) changed status OK->interim recovery, spare status 0x0
>
> Attached is the dmesg.boot file of my server.
>
> Please advise on what could be the possible causes for this fault and what
> can we do to ensure it does not happen again in future.
>
> Thanks in anticipation.
>
> Regards,
> SHAKEB AINUL

- From what I can tell based on the information you've provided a drive
failed in such a way that the controller tried stepping the bus down.
When that didn't help it faulted the drive out of the array.

But you don't say, and it doesn't seem discernable from the information
you provided, what happened after that. Did the server hang or reboot
or something?

- --
Thanks,
Josh Paetzel

PGP: 8A48 EF36 5E9F 4EDA 5ABC 11B4 26F9 01F1 27AF AECB
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkk4kq4ACgkQJvkB8SevrsurUACeLxHdjvTxdo6IMxJrAbvJh9mK
63cAnR2iW1eOKCv8bEV77VeKM8oCSumh
=GJe2
-----END PGP SIGNATURE-----

From owner-freebsd-scsi@FreeBSD.ORG Fri Dec 5 10:30:06 2008
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 10EB51065670
    for ; Fri, 5 Dec 2008 10:30:06 +0000 (UTC)
    (envelope-from gjb@semihalf.com)
Received: from semihalf.com (semihalf.com [206.130.101.55])
    by mx1.freebsd.org (Postfix) with ESMTP id C82FC8FC12
    for ; Fri, 5 Dec 2008 10:30:05 +0000 (UTC)
    (envelope-from gjb@semihalf.com)
Received: from mail.semihalf.com (mail.semihalf.com [83.15.139.206])
    by semihalf.com (8.13.1/8.13.1) with ESMTP id mB5AJ3xs028230
    for ; Fri, 5 Dec 2008 03:19:04 -0700
Message-ID: <4939002E.9070001@semihalf.com>
Date: Fri, 05 Dec 2008 11:19:26 +0100
From: Grzegorz Bernacki
MIME-Version: 1.0
To: freebsd-scsi@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: USB stick probing
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Fri, 05 Dec 2008 10:30:06 -0000

Hi,

We have a problem with discovering a USB stick.
We see the following output after inserting the stick:

(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): error 6
(da0:umass-sim0:0:0:0): Unretryable Error
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Removable Direct Access SCSI-2 device
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: UNIT ATTENTION, Not ready to ready change

I turned on some debugging to see exactly which command fails, and this
is what the sequence of commands looks like:
- INQUIRY
- INQUIRY
- TEST UNIT READY, which fails with Not Ready
- READ CAPACITY, which fails a few times and then we get the error.

My knowledge of CAM/XPT is very limited, so I have some questions
regarding device probing by CAM/XPT:

1) TEST UNIT READY is sent in the PROBE_TUR_FOR_NEGOTIATION state, but
   we don't check for SCSI errors. So if TEST UNIT READY fails, we just
   go on to the next state without retrying the command. Why don't we
   care whether the device is ready? Shouldn't we check for errors and
   retry the command? Is there any reason why we skip error checking?

2) After the USB stick is inserted we start from the PROBE_INQUIRY
   state. Is that the expected behaviour? I thought we should start
   from PROBE_TUR.

I am going to check the status of the command in
PROBE_TUR_FOR_NEGOTIATION and retry the command if it fails. Is that a
good solution? Maybe it can be solved in some other, easier way.

Thanks in advance,
Grzesiek
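
The retry idea raised in question 1 can be sketched in plain user-space
C as a bounded retry around a readiness check. This is only an
illustration of the control flow, not CAM code: enum tur_status,
tur_fn, wait_unit_ready(), struct fake_stick and fake_tur() are all
hypothetical names, and the fake device simply reports "not ready" a
fixed number of times.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; these are not CAM or SCSI status values. */
enum tur_status { TUR_READY, TUR_NOT_READY };

/* Hypothetical callback that issues one TEST UNIT READY. */
typedef enum tur_status (*tur_fn)(void *ctx);

/*
 * Issue the readiness check up to max_tries times.  Returns true as
 * soon as the device reports ready, false if every attempt failed.
 * *tries_used receives the number of attempts actually made.
 */
static bool
wait_unit_ready(tur_fn tur, void *ctx, int max_tries, int *tries_used)
{
	int i;

	for (i = 1; i <= max_tries; i++) {
		if (tur(ctx) == TUR_READY) {
			*tries_used = i;
			return (true);
		}
	}
	*tries_used = (max_tries > 0) ? max_tries : 0;
	return (false);
}

/* Fake device for the sketch: not ready for the first few polls. */
struct fake_stick {
	int	polls_until_ready;
};

static enum tur_status
fake_tur(void *ctx)
{
	struct fake_stick *s = ctx;

	if (s->polls_until_ready > 0) {
		s->polls_until_ready--;
		return (TUR_NOT_READY);
	}
	return (TUR_READY);
}
```

A real change would presumably have to drive the retry from the probe
state machine's command completion path and bound it with a retry count
or timeout, rather than spin in a loop as this sketch does.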