From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 07:45:06 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3D44616A4D0; Sun, 18 Jan 2004 07:45:06 -0800 (PST) Received: from phantom.cris.net (phantom.cris.net [212.110.130.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id 36D4143D69; Sun, 18 Jan 2004 07:44:52 -0800 (PST) (envelope-from ru@FreeBSD.org.ua) Received: from phantom.cris.net (ru@localhost [127.0.0.1]) by phantom.cris.net (8.12.10/8.12.10) with ESMTP id i0IFirjm032431; Sun, 18 Jan 2004 17:44:56 +0200 (EET) (envelope-from ru@FreeBSD.org.ua) Received: (from ru@localhost) by phantom.cris.net (8.12.10/8.12.10/Submit) id i0IFinPN032425; Sun, 18 Jan 2004 17:44:49 +0200 (EET) (envelope-from ru) Date: Sun, 18 Jan 2004 17:44:48 +0200 From: Ruslan Ermilov To: Paul Twohey Message-ID: <20040118154447.GA32115@FreeBSD.org.ua> References: Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="lrZ03NoBR/3+SXJZ" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.5.1i cc: freebsd-hackers@freebsd.org cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 15:45:06 -0000 --lrZ03NoBR/3+SXJZ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jan 16, 2004 at 04:09:34PM -0800, Paul Twohey wrote: [...] > --------------------------------------------------------- > [BUG] > /u2/engler/mc/freebsd/sys/i386/compile/GENERIC/../../../dev/dpt/dpt_scsi.= c:1542:dpt_attach:ERROR:LEAK:1542:1571: pointer=3Ddevq from RO=3Dcam_simq_a= lloc(-1) [s=3D21,pop=3D21,pr=3D0.99] [rank=3Dmed] leaked! [z=3D1.0] [succes= s=3D3] >=20 > int i; >=20 > /* > * Create the device queue for our SIM. > */ > Start ---> > devq =3D cam_simq_alloc(dpt->max_dccbs); >=20 > ... DELETED 23 lines ... >=20 >=20 > } > if (i > 0) > EVENTHANDLER_REGISTER(shutdown_final, dptshutdown, > dpt, SHUTDOWN_PRI_DEFAULT); > Error ---> > return (i); > } >=20 > int > --------------------------------------------------------- We aren't leaking "devq" here, it's freed (if necessary) by setting the second cam_sim_free() argument to true: if (xpt_bus_register(dpt->sims[i], i) !=3D CAM_SUCCESS) { cam_sim_free(dpt->sims[i], /*free_devq*/i =3D=3D 0); break; } But we're missing the proper NULL checking, here's the fix: %%% Index: dpt_scsi.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /home/ncvs/src/sys/dev/dpt/dpt_scsi.c,v retrieving revision 1.45 diff -u -p -r1.45 dpt_scsi.c --- dpt_scsi.c 24 Aug 2003 17:46:04 -0000 1.45 +++ dpt_scsi.c 18 Jan 2004 15:39:13 -0000 @@ -1553,6 +1553,8 @@ dpt_attach(dpt_softc_t *dpt) dpt->sims[i] =3D cam_sim_alloc(dpt_action, dpt_poll, "dpt", dpt, dpt->unit, /*untagged*/2, /*tagged*/dpt->max_dccbs, devq); + if (dpt->sims[i] =3D=3D NULL) + break; if (xpt_bus_register(dpt->sims[i], i) !=3D CAM_SUCCESS) { cam_sim_free(dpt->sims[i], /*free_devq*/i =3D=3D 0); break; %%% --=20 Ruslan Ermilov FreeBSD committer ru@FreeBSD.org --lrZ03NoBR/3+SXJZ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFACqnvUkv4P6juNwoRAmc2AJ4yZOY/4fv1WzHuGBEtrFzVYHmRiACeMSY6 /ucH2Zb2vN73gaOViebu77U= =g3Hd -----END PGP SIGNATURE----- --lrZ03NoBR/3+SXJZ-- From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 08:08:13 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 68CA016A4CE; Sun, 18 Jan 2004 08:08:13 -0800 (PST) Received: from phantom.cris.net (phantom.cris.net [212.110.130.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id 56CD943D31; Sun, 18 Jan 2004 08:08:06 -0800 (PST) (envelope-from ru@FreeBSD.org.ua) Received: from phantom.cris.net (ru@localhost [127.0.0.1]) by phantom.cris.net (8.12.10/8.12.10) with ESMTP id i0IG86jm032701; Sun, 18 Jan 2004 18:08:09 +0200 (EET) (envelope-from ru@FreeBSD.org.ua) Received: (from ru@localhost) by phantom.cris.net (8.12.10/8.12.10/Submit) id i0IG834v032696; Sun, 18 Jan 2004 18:08:03 +0200 (EET) (envelope-from ru) Date: Sun, 18 Jan 2004 18:08:02 +0200 From: Ruslan Ermilov To: Paul Twohey Message-ID: <20040118160802.GC32115@FreeBSD.org.ua> References: <20040118154447.GA32115@FreeBSD.org.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="SO98HVl1bnMOfKZd" Content-Disposition: inline In-Reply-To: <20040118154447.GA32115@FreeBSD.org.ua> User-Agent: Mutt/1.5.5.1i cc: freebsd-hackers@freebsd.org cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 16:08:13 -0000 --SO98HVl1bnMOfKZd Content-Type: multipart/mixed; boundary="yLVHuoLXiP9kZBkt" Content-Disposition: inline --yLVHuoLXiP9kZBkt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Jan 18, 2004 at 05:44:48PM +0200, Ruslan Ermilov wrote: > On Fri, Jan 16, 2004 at 04:09:34PM -0800, Paul Twohey wrote: > [...] > > --------------------------------------------------------- > > [BUG] > > /u2/engler/mc/freebsd/sys/i386/compile/GENERIC/../../../dev/dpt/dpt_scs= i.c:1542:dpt_attach:ERROR:LEAK:1542:1571: pointer=3Ddevq from RO=3Dcam_simq= _alloc(-1) [s=3D21,pop=3D21,pr=3D0.99] [rank=3Dmed] leaked! [z=3D1.0] [succ= ess=3D3] > >=20 > > int i; > >=20 > > /* > > * Create the device queue for our SIM. > > */ > > Start ---> > > devq =3D cam_simq_alloc(dpt->max_dccbs); > >=20 > > ... DELETED 23 lines ... > >=20 > >=20 > > } > > if (i > 0) > > EVENTHANDLER_REGISTER(shutdown_final, dptshutdown, > > dpt, SHUTDOWN_PRI_DEFAULT); > > Error ---> > > return (i); > > } > >=20 > > int > > --------------------------------------------------------- >=20 > We aren't leaking "devq" here, it's freed (if necessary) by setting > the second cam_sim_free() argument to true: >=20 > if (xpt_bus_register(dpt->sims[i], i) !=3D CAM_SUCCESS) { > cam_sim_free(dpt->sims[i], /*free_devq*/i =3D=3D = 0); > break; > } >=20 > But we're missing the proper NULL checking, here's the fix: >=20 > %%% > Index: dpt_scsi.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > RCS file: /home/ncvs/src/sys/dev/dpt/dpt_scsi.c,v > retrieving revision 1.45 > diff -u -p -r1.45 dpt_scsi.c > --- dpt_scsi.c 24 Aug 2003 17:46:04 -0000 1.45 > +++ dpt_scsi.c 18 Jan 2004 15:39:13 -0000 > @@ -1553,6 +1553,8 @@ dpt_attach(dpt_softc_t *dpt) > dpt->sims[i] =3D cam_sim_alloc(dpt_action, dpt_poll, "dpt", > dpt, dpt->unit, /*untagged*/2, > /*tagged*/dpt->max_dccbs, devq); > + if (dpt->sims[i] =3D=3D NULL) > + break; > if (xpt_bus_register(dpt->sims[i], i) !=3D CAM_SUCCESS) { > cam_sim_free(dpt->sims[i], /*free_devq*/i =3D=3D 0); > break; > %%% >=20 Bah, but with this patch that avoids the NULL pointer dereference we start leaking devq. Attached is a more complete patch, and for dev/irr/irr.c too. Cheers, --=20 Ruslan Ermilov FreeBSD committer ru@FreeBSD.org --yLVHuoLXiP9kZBkt Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p Index: dpt/dpt_scsi.c =================================================================== RCS file: /home/ncvs/src/sys/dev/dpt/dpt_scsi.c,v retrieving revision 1.45 diff -u -p -r1.45 dpt_scsi.c --- dpt/dpt_scsi.c 24 Aug 2003 17:46:04 -0000 1.45 +++ dpt/dpt_scsi.c 18 Jan 2004 15:51:44 -0000 @@ -1553,6 +1553,11 @@ dpt_attach(dpt_softc_t *dpt) dpt->sims[i] = cam_sim_alloc(dpt_action, dpt_poll, "dpt", dpt, dpt->unit, /*untagged*/2, /*tagged*/dpt->max_dccbs, devq); + if (dpt->sims[i] == NULL) { + if (i == 0) + cam_simq_free(devq); + break; + } if (xpt_bus_register(dpt->sims[i], i) != CAM_SUCCESS) { cam_sim_free(dpt->sims[i], /*free_devq*/i == 0); break; Index: iir/iir.c =================================================================== RCS file: /home/ncvs/src/sys/dev/iir/iir.c,v retrieving revision 1.9 diff -u -p -r1.9 iir.c --- iir/iir.c 26 Sep 2003 15:36:47 -0000 1.9 +++ iir/iir.c 18 Jan 2004 15:52:04 -0000 @@ -510,6 +510,11 @@ iir_attach(struct gdt_softc *gdt) gdt->sims[i] = cam_sim_alloc(iir_action, iir_poll, "iir", gdt, gdt->sc_hanum, /*untagged*/2, /*tagged*/GDT_MAXCMDS, devq); + if (gdt->sims[i] == NULL) { + if (i == 0) + cam_simq_free(devq); + break; + } if (xpt_bus_register(gdt->sims[i], i) != CAM_SUCCESS) { cam_sim_free(gdt->sims[i], /*free_devq*/i == 0); break; --yLVHuoLXiP9kZBkt-- --SO98HVl1bnMOfKZd Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFACq9iUkv4P6juNwoRAghWAKCBpqGJmtW1g7vOJS15YgKfg/782QCeImr/ aZ5eUYh2kvOaSBl5zcFd4mE= =j+I+ -----END PGP SIGNATURE----- --SO98HVl1bnMOfKZd-- From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 10:45:04 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 329B216A4CE; Sun, 18 Jan 2004 10:45:04 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0CAE743D2F; Sun, 18 Jan 2004 10:45:00 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0IIiv82096390; Sun, 18 Jan 2004 10:44:58 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0IIivlQ096389; Sun, 18 Jan 2004 10:44:57 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 10:44:57 -0800 (PST) From: Matthew Dillon Message-Id: <200401181844.i0IIivlQ096389@apollo.backplane.com> To: Ruslan Ermilov References: <20040118160802.GC32115@FreeBSD.org.ua> cc: freebsd-hackers@freebsd.org cc: Paul Twohey cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 18:45:04 -0000 These cam_sim_alloc() calls seem to be fairly critical to the operation of DPT and friends, why is it even possible for them to return NULL in the first place and what would be the effect of a (properly handled) NULL return if it did occur at this point? -Matt Matthew Dillon :> > * Create the device queue for our SIM. :> > */ :> > Start ---> :> > devq =3D cam_simq_alloc(dpt->max_dccbs); :> >=20 :... :> Index: dpt_scsi.c :> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= :=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= :=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D :> RCS file: /home/ncvs/src/sys/dev/dpt/dpt_scsi.c,v :> retrieving revision 1.45 :> diff -u -p -r1.45 dpt_scsi.c :> --- dpt_scsi.c 24 Aug 2003 17:46:04 -0000 1.45 :> +++ dpt_scsi.c 18 Jan 2004 15:39:13 -0000 :> @@ -1553,6 +1553,8 @@ dpt_attach(dpt_softc_t *dpt) :> dpt->sims[i] =3D cam_sim_alloc(dpt_action, dpt_poll, "dpt", :> dpt, dpt->unit, /*untagged*/2, :> /*tagged*/dpt->max_dccbs, devq); :> + if (dpt->sims[i] =3D=3D NULL) :> + break; :> if (xpt_bus_register(dpt->sims[i], i) !=3D CAM_SUCCESS) { :> cam_sim_free(dpt->sims[i], /*free_devq*/i =3D=3D 0); :> break; :> %%% :>=20 :Bah, but with this patch that avoids the NULL pointer dereference :we start leaking devq. Attached is a more complete patch, and for :dev/irr/irr.c too. : : :Cheers, :--=20 :Ruslan Ermilov :FreeBSD committer :ru@FreeBSD.org : :--yLVHuoLXiP9kZBkt :Content-Type: text/plain; charset=us-ascii :Content-Disposition: attachment; filename=p : :Index: dpt/dpt_scsi.c :=================================================================== :RCS file: /home/ncvs/src/sys/dev/dpt/dpt_scsi.c,v :retrieving revision 1.45 :diff -u -p -r1.45 dpt_scsi.c :--- dpt/dpt_scsi.c 24 Aug 2003 17:46:04 -0000 1.45 :+++ dpt/dpt_scsi.c 18 Jan 2004 15:51:44 -0000 :@@ -1553,6 +1553,11 @@ dpt_attach(dpt_softc_t *dpt) : dpt->sims[i] = cam_sim_alloc(dpt_action, dpt_poll, "dpt", : dpt, dpt->unit, /*untagged*/2, : /*tagged*/dpt->max_dccbs, devq); :+ if (dpt->sims[i] == NULL) { :+ if (i == 0) :+ cam_simq_free(devq); :+ break; :+ } : if (xpt_bus_register(dpt->sims[i], i) != CAM_SUCCESS) { : cam_sim_free(dpt->sims[i], /*free_devq*/i == 0); : break; :Index: iir/iir.c :=================================================================== :RCS file: /home/ncvs/src/sys/dev/iir/iir.c,v :retrieving revision 1.9 :diff -u -p -r1.9 iir.c :--- iir/iir.c 26 Sep 2003 15:36:47 -0000 1.9 :+++ iir/iir.c 18 Jan 2004 15:52:04 -0000 :@@ -510,6 +510,11 @@ iir_attach(struct gdt_softc *gdt) : gdt->sims[i] = cam_sim_alloc(iir_action, iir_poll, "iir", : gdt, gdt->sc_hanum, /*untagged*/2, : /*tagged*/GDT_MAXCMDS, devq); :+ if (gdt->sims[i] == NULL) { :+ if (i == 0) :+ cam_simq_free(devq); :+ break; :+ } : if (xpt_bus_register(gdt->sims[i], i) != CAM_SUCCESS) { : cam_sim_free(gdt->sims[i], /*free_devq*/i == 0); : break; : :--yLVHuoLXiP9kZBkt-- : :--SO98HVl1bnMOfKZd :Content-Type: application/pgp-signature :Content-Disposition: inline : :-----BEGIN PGP SIGNATURE----- :Version: GnuPG v1.2.4 (FreeBSD) : :iD8DBQFACq9iUkv4P6juNwoRAghWAKCBpqGJmtW1g7vOJS15YgKfg/782QCeImr/ :aZ5eUYh2kvOaSBl5zcFd4mE= :=j+I+ :-----END PGP SIGNATURE----- From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 11:53:05 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1589B16A4CE for ; Sun, 18 Jan 2004 11:53:05 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 5055B43D5C for ; Sun, 18 Jan 2004 11:53:00 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 63113 invoked by uid 1002); 18 Jan 2004 19:52:57 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 18 Jan 2004 19:52:57 -0000 Message-ID: <400AE3AB.1070102@freebsd.org> Date: Sun, 18 Jan 2004 12:51:07 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Matthew Dillon References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> In-Reply-To: <200401181844.i0IIivlQ096389@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-hackers@freebsd.org cc: Paul Twohey cc: Ruslan Ermilov cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 19:53:05 -0000 Matthew Dillon wrote: > These cam_sim_alloc() calls seem to be fairly critical to the operation > of DPT and friends, why is it even possible for them to return NULL > in the first place and what would be the effect of a (properly handled) > NULL return if it did occur at this point? > > -Matt > Matthew Dillon > cam_sim_alloc() is vital to the operation of any CAM driver. However, cam_sim_alloc() mallocs it's data structures with the M_NOWAIT flag, so it is possible for it to fail and have to return NULL. The reason it uses the M_NOWAIT flag is because there is no restrictions on when driver attach events happen, though they all happen in normal process or kthread context these days (except at boot, but if you have to sleep for memory during boot, you have a lot of other problems). Scott From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 11:57:18 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id ADC6D16A4CE; Sun, 18 Jan 2004 11:57:18 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7FB8B43D2D; Sun, 18 Jan 2004 11:57:17 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0IJvF82096884; Sun, 18 Jan 2004 11:57:15 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0IJvFTe096883; Sun, 18 Jan 2004 11:57:15 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 11:57:15 -0800 (PST) From: Matthew Dillon Message-Id: <200401181957.i0IJvFTe096883@apollo.backplane.com> To: Scott Long References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> cc: freebsd-hackers@freebsd.org cc: Paul Twohey cc: Ruslan Ermilov cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 19:57:18 -0000 : :Matthew Dillon wrote: :> These cam_sim_alloc() calls seem to be fairly critical to the operation :> of DPT and friends, why is it even possible for them to return NULL :> in the first place and what would be the effect of a (properly handled) :> NULL return if it did occur at this point? :> :> -Matt :> Matthew Dillon :> : : :cam_sim_alloc() is vital to the operation of any CAM driver. However, :cam_sim_alloc() mallocs it's data structures with the M_NOWAIT flag, so :it is possible for it to fail and have to return NULL. The reason it :uses the M_NOWAIT flag is because there is no restrictions on when :driver attach events happen, though they all happen in normal process :or kthread context these days (except at boot, but if you have to sleep :for memory during boot, you have a lot of other problems). : :Scott So, the question becomes: If one were to use M_WAITOK is it possible for a cam_sim_alloc() call for driver A to stall an I/O operation occuring on driver B ? It's the I/O stalls that cause memory deadlocks. Allocations that do not cause I/O stalls on unrelated devices (e.g. your swap) will not cause memory allocation deadlocks. I know cam uses some helper threads so I am not entirely sure about the context the cam_sim_alloc() calls are being made in, but if they do not create I/O stalls for already-operational SCSI devices then I am inclined (in DFly anyway) to simply make the malloc in cam_sim_alloc() M_WAITOK. -Matt Matthew Dillon From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 12:29:08 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C879116A4CE for ; Sun, 18 Jan 2004 12:29:08 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 2091343D2D for ; Sun, 18 Jan 2004 12:29:04 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 64397 invoked by uid 1002); 18 Jan 2004 20:29:03 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 18 Jan 2004 20:29:03 -0000 Message-ID: <400AEC20.70709@freebsd.org> Date: Sun, 18 Jan 2004 13:27:12 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Matthew Dillon References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> In-Reply-To: <200401181957.i0IJvFTe096883@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-hackers@freebsd.org cc: Paul Twohey cc: Ruslan Ermilov cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 20:29:08 -0000 Matthew Dillon wrote: > : > :Matthew Dillon wrote: > :> These cam_sim_alloc() calls seem to be fairly critical to the operation > :> of DPT and friends, why is it even possible for them to return NULL > :> in the first place and what would be the effect of a (properly handled) > :> NULL return if it did occur at this point? > :> > :> -Matt > :> Matthew Dillon > :> > : > : > :cam_sim_alloc() is vital to the operation of any CAM driver. However, > :cam_sim_alloc() mallocs it's data structures with the M_NOWAIT flag, so > :it is possible for it to fail and have to return NULL. The reason it > :uses the M_NOWAIT flag is because there is no restrictions on when > :driver attach events happen, though they all happen in normal process > :or kthread context these days (except at boot, but if you have to sleep > :for memory during boot, you have a lot of other problems). > : > :Scott > > So, the question becomes: If one were to use M_WAITOK is it possible > for a cam_sim_alloc() call for driver A to stall an I/O operation > occuring on driver B ? > > It's the I/O stalls that cause memory deadlocks. Allocations that do > not cause I/O stalls on unrelated devices (e.g. your swap) will not > cause memory allocation deadlocks. > > I know cam uses some helper threads so I am not entirely sure about > the context the cam_sim_alloc() calls are being made in, but if they > do not create I/O stalls for already-operational SCSI devices then I > am inclined (in DFly anyway) to simply make the malloc in > cam_sim_alloc() M_WAITOK. > > -Matt > Matthew Dillon > > In the 4.x case, so long as the driver doesn't do an splcam() or somehow block hardware interrupts before calling cam_sim_alloc() you are probably fine. For 5.x, you might run into Giant problems. Scott From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 13:57:26 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AE93E16A4CE; Sun, 18 Jan 2004 13:57:26 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3226643D3F; Sun, 18 Jan 2004 13:57:25 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0ILvN82097288; Sun, 18 Jan 2004 13:57:23 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0ILvNQe097287; Sun, 18 Jan 2004 13:57:23 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 13:57:23 -0800 (PST) From: Matthew Dillon Message-Id: <200401182157.i0ILvNQe097287@apollo.backplane.com> To: Scott Long References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> <400AEC20.70709@freebsd.org> cc: freebsd-hackers@freebsd.org cc: Paul Twohey cc: Ruslan Ermilov cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 21:57:26 -0000 :> I know cam uses some helper threads so I am not entirely sure about :> the context the cam_sim_alloc() calls are being made in, but if they :> do not create I/O stalls for already-operational SCSI devices then I :> am inclined (in DFly anyway) to simply make the malloc in :> cam_sim_alloc() M_WAITOK. :> :> -Matt :> Matthew Dillon :> :> : :In the 4.x case, so long as the driver doesn't do an splcam() or somehow :block hardware interrupts before calling cam_sim_alloc() you are :probably fine. For 5.x, you might run into Giant problems. : :Scott Well, I don't see how a spl or Giant could possibly have anything to do with memory deadlocks. Both are dropped when a thread blocks so the worst that happens is that you add some latency. The culprit is almost guarenteed to be blocking in the interrupt threads themselves or blocking in serialized multi-device-handling threads such as some of CAM's helper threads. Blocking in either could deadlock the system in a low memory situation. But what people seem to have done... using M_NOWAIT with very little regard for the side effects that occur when malloc() might then fail, is not the right solution. If the CAM code cannot use a blocking malloc for a critical structure allocation then it certainly can't use a non-blocking malloc that might then fail as a workaround! Some other solution is needed for those situations (something like the MPIPE solution I came up with to guarentee the availability of I/O request structures in interrupt service routines). What it comes down to for cam_sim_alloc() is, again, the context in which it is called. Can it be called from a serialized cam thread or an interrupt thread in a way that could potential block I/O operations for devices other then the one trying to attach? If so then there's a real problem that needs to be solved. If not then M_WAITOK can be safely used in this particular situation and the NULL case no longer needs to be worried about. -Matt Matthew Dillon From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 14:08:50 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1ACEE16A4CE for ; Sun, 18 Jan 2004 14:08:50 -0800 (PST) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 2212743D6A for ; Sun, 18 Jan 2004 14:08:38 -0800 (PST) (envelope-from scottl@freebsd.org) Received: (qmail 69148 invoked by uid 1002); 18 Jan 2004 22:08:37 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 18 Jan 2004 22:08:37 -0000 Message-ID: <400B0377.4070405@freebsd.org> Date: Sun, 18 Jan 2004 15:06:47 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031103 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Matthew Dillon References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> <400AEC20.70709@freebsd.org> <200401182157.i0ILvNQe097287@apollo.backplane.com> In-Reply-To: <200401182157.i0ILvNQe097287@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-hackers@FreeBSD.org cc: Paul Twohey cc: Ruslan Ermilov cc: scsi@FreeBSD.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 22:08:50 -0000 Matthew Dillon wrote: > :> I know cam uses some helper threads so I am not entirely sure about > :> the context the cam_sim_alloc() calls are being made in, but if they > :> do not create I/O stalls for already-operational SCSI devices then I > :> am inclined (in DFly anyway) to simply make the malloc in > :> cam_sim_alloc() M_WAITOK. > :> > :> -Matt > :> Matthew Dillon > :> > :> > : > :In the 4.x case, so long as the driver doesn't do an splcam() or somehow > :block hardware interrupts before calling cam_sim_alloc() you are > :probably fine. For 5.x, you might run into Giant problems. > : > :Scott > > Well, I don't see how a spl or Giant could possibly have anything to > do with memory deadlocks. Both are dropped when a thread blocks so the > worst that happens is that you add some latency. CAM doesn't use a kthread in 4.x. It uses it's own SWI hooks. If you call splcam(), then you will block those from running, and no CAM I/O will complete until you call splx(). That's why I say that it's ok to use M_WAITOK so long as you don't block CAM from running. If you want to add a WAITOK/NOWAIT flag parameter to cam_sim_alloc(), that might be a good solution. Scott From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 14:38:34 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DB25116A4CF; Sun, 18 Jan 2004 14:38:34 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id A2C6243D41; Sun, 18 Jan 2004 14:38:32 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0IMcQ82097544; Sun, 18 Jan 2004 14:38:31 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0IMcQYZ097543; Sun, 18 Jan 2004 14:38:26 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 14:38:26 -0800 (PST) From: Matthew Dillon Message-Id: <200401182238.i0IMcQYZ097543@apollo.backplane.com> To: Scott Long , freebsd-hackers@freebsd.org, Paul Twohey , scsi@freebsd.org References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> <200401182157.i0ILvNQe097287@apollo.backplane.com> Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 22:38:35 -0000 Well, this is fun. There are over 460 files in the 5.x source tree (360 in DFly) that make calls to malloc(... M_NOWAIT), and so far about 80% of the calls that I've reviewed generate inappropriate side effects when/if a failure occurs. CAM is the biggest violator... it even has a few panic() conditionals if a malloc(... M_NOWAIT) fails. Not Fun! The only reason it works at all is because M_NOWAIT actually does appear to allow malloc() to block in a number of situations (such as on VM object and map mutexes), and M_NOWAIT triggers VM_ALLOC_INTERRUPT which allows kmem_malloc() to dig into the free page reserve. So in 5.x M_NOWAIT allocations will actually work most of the time.. well, at least until something exhausts the free page reserve at just the wrong time, which is quite possible to do considering how much code is being allowed to dig into the reserve now. M_NOWAIT is being used pretty much as if it were M_WAITOK|M_USE_RESERVE most of the time, especially considering the side effect situation when such allocations fail. I don't think M_WAITOK|M_USE_RESERVE would be any less reliable, actually. It looks like the whole paradigm has shifted away from the original definition of M_NOWAIT to something that is more like a cross between M_NOWAIT, M_WAITOK, and M_USE_RESERVE. This creates a conundrum for me. In DFly M_NOWAIT really means M_NOWAIT, so I am going to have to do something about all the improper M_NOWAIT use in the source base. I'm amazed we haven't had more crashes but even in DFly M_NOWAIT failures due to, e.g. not being able to get the kernel_map lock non-blocking, do not occur all that often. -Matt From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 14:44:54 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9A61316A4CE; Sun, 18 Jan 2004 14:44:54 -0800 (PST) Received: from sasami.jurai.net (sasami.jurai.net [66.92.160.223]) by mx1.FreeBSD.org (Postfix) with ESMTP id 63EB643D39; Sun, 18 Jan 2004 14:44:53 -0800 (PST) (envelope-from winter@jurai.net) Received: from sasami.jurai.net (winter@sasami.jurai.net [66.92.160.223]) by sasami.jurai.net (8.12.9/8.12.9) with ESMTP id i0IMiqdi027690; Sun, 18 Jan 2004 17:44:52 -0500 (EST) (envelope-from winter@jurai.net) Date: Sun, 18 Jan 2004 17:44:52 -0500 (EST) From: "Matthew N. Dodd" To: Ruslan Ermilov In-Reply-To: <20040118154447.GA32115@FreeBSD.org.ua> Message-ID: <20040118174428.X90982@sasami.jurai.net> References: <20040118154447.GA32115@FreeBSD.org.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-hackers@freebsd.org cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 22:44:54 -0000 On Sun, 18 Jan 2004, Ruslan Ermilov wrote: > But we're missing the proper NULL checking, here's the fix: ... I've already dealt with this. -- 10 40 80 C0 00 FF FF FF FF C0 00 00 00 00 10 AA AA 03 00 00 00 08 00 From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 14:57:20 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6B76016A4CE; Sun, 18 Jan 2004 14:57:20 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9349343D54; Sun, 18 Jan 2004 14:57:10 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0IMv682097664; Sun, 18 Jan 2004 14:57:07 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0IMv63i097663; Sun, 18 Jan 2004 14:57:06 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 14:57:06 -0800 (PST) From: Matthew Dillon Message-Id: <200401182257.i0IMv63i097663@apollo.backplane.com> To: Scott Long , freebsd-hackers@freebsd.org, Paul Twohey , scsi@freebsd.org References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> <200401182238.i0IMcQYZ097543@apollo.backplane.com> Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 22:57:20 -0000 : M_NOWAIT is being used pretty much as if it were M_WAITOK|M_USE_RESERVE : most of the time, especially considering the side effect situation when : such allocations fail. I don't think M_WAITOK|M_USE_RESERVE would be : any less reliable, actually. It looks like the whole paradigm has : shifted away from the original definition of M_NOWAIT to something that : is more like a cross between M_NOWAIT, M_WAITOK, and M_USE_RESERVE. oops, don't take that literally. M_USE_RESERVE means something else. M_NOWAIT is triggering VM_ALLOC_INTERRUPT which is allowed to dig into the free (vm) page reserve. Another interesting thing I've found, and correct me if I'm wrong, but it looks like when the 5.x slab allocator allocates M_NOWAIT memory that newly allocated zone becomes available for normal M_WAITOK allocations as well. This is something DFly's slab allocator does too (I have a big XXX comment for it), and the 4.x allocator too I think. That could create an exhaustion issue. -Matt Matthew Dillon From owner-freebsd-scsi@FreeBSD.ORG Sun Jan 18 15:37:16 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 16FA116A4CE; Sun, 18 Jan 2004 15:37:16 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 753FF43D2F; Sun, 18 Jan 2004 15:37:14 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i0INbD82097879; Sun, 18 Jan 2004 15:37:13 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i0INbDtj097878; Sun, 18 Jan 2004 15:37:13 -0800 (PST) (envelope-from dillon) Date: Sun, 18 Jan 2004 15:37:13 -0800 (PST) From: Matthew Dillon Message-Id: <200401182337.i0INbDtj097878@apollo.backplane.com> To: Scott Long , freebsd-hackers@freebsd.org, Paul Twohey , scsi@freebsd.org References: <20040118160802.GC32115@FreeBSD.org.ua> <200401181844.i0IIivlQ096389@apollo.backplane.com> <400AE3AB.1070102@freebsd.org> <200401181957.i0IJvFTe096883@apollo.backplane.com> <200401182257.i0IMv63i097663@apollo.backplane.com> Subject: Re3: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 23:37:16 -0000 More research... correct me if I am wrong but it appears that the 5.x kmem_malloc() code may have some issues. If you look down at around line 349 there is a comment: /* * Note: if M_NOWAIT specified alone, allocate from * interrupt-safe queues only (just the free list). If * M_USE_RESERVE is also specified, we can also * allocate from the cache. Neither of the latter two * flags may be specified from an interrupt since interrupts * are not allowed to mess with the cache queue. */ if ((flags & (M_NOWAIT|M_USE_RESERVE)) == M_NOWAIT) pflags = VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED; else pflags = VM_ALLOC_SYSTEM | VM_ALLOC_WIRED; Here's the problem... the problem is that malloc(...M_NOWAIT) is used by interrupts not only to avoid blocking, but also to avoid messing with the VM Page 'cache' queue. But in 5.x it is possible for non-interrupt threads to preempt other non-interrupt threads indirectly (due to an interrupt trying to get a mutex that a non-interrupt thread currently holds). Am I correct? But the non-interrupt thread will almost certainly be making memory allocations with M_WAITOK, which means that a preempting thread *CAN* wind up pulling pages out of the 'cache' queue. Now, my understanding is that 5.x's mutexes around the VM system means that this, in fact, will work just fine. So, that means that the above comment is no longer correct, right? In fact, interrupts *should* be able to allocate pages from the VM page 'cache' queue in 5.x now. This leads to the obvious conclusion that 'critical' code, such as the CAM code, which cannot afford to block but which also does terrible things when an M_NOWAIT allocation fails should be able to use (M_WAITOK|M_USE_RESERVE|M_USE_INTERRUPT_RESERVE) and this would result in far safer operation then the current M_NOWAIT use results in. (M_USE_INTERRUPT_RESERVE would be a new M_* flag that allows the system to exhaust the entire free page reserve if necessary and has the same effect as M_NOWAIT had before, but the combination of flags would now allow interrupt-time allocations to also allocate from the cache queue making it virtually impossible for such allocations to fail and that, combined with M_WAITOK, would allow all NULL checks to be removed. It could actually be considered a critical error for the above flags combination to deadlock. -Matt From owner-freebsd-scsi@FreeBSD.ORG Mon Jan 19 06:57:56 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9FFCC16A4CE for ; Mon, 19 Jan 2004 06:57:56 -0800 (PST) Received: from phantom.cris.net (phantom.cris.net [212.110.130.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id 936DE43D58 for ; Mon, 19 Jan 2004 06:57:50 -0800 (PST) (envelope-from ru@FreeBSD.org.ua) Received: from phantom.cris.net (ru@localhost [127.0.0.1]) by phantom.cris.net (8.12.10/8.12.10) with ESMTP id i0JEu6jm046113; Mon, 19 Jan 2004 16:56:06 +0200 (EET) (envelope-from ru@FreeBSD.org.ua) Received: (from ru@localhost) by phantom.cris.net (8.12.10/8.12.10/Submit) id i0JEu3Pg046098; Mon, 19 Jan 2004 16:56:03 +0200 (EET) (envelope-from ru) Date: Mon, 19 Jan 2004 16:56:03 +0200 From: Ruslan Ermilov To: "Matthew N. Dodd" Message-ID: <20040119145603.GN41159@FreeBSD.org.ua> References: <20040118154447.GA32115@FreeBSD.org.ua> <20040118174428.X90982@sasami.jurai.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="muT+E17Lr9urPYYJ" Content-Disposition: inline In-Reply-To: <20040118174428.X90982@sasami.jurai.net> User-Agent: Mutt/1.5.5.1i cc: freebsd-hackers@freebsd.org cc: scsi@freebsd.org Subject: Re: [CHECKER] bugs in FreeBSD X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Jan 2004 14:57:56 -0000 --muT+E17Lr9urPYYJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Jan 18, 2004 at 05:44:52PM -0500, Matthew N. Dodd wrote: > On Sun, 18 Jan 2004, Ruslan Ermilov wrote: > > But we're missing the proper NULL checking, here's the fix: > ... >=20 > I've already dealt with this. >=20 Neat, this works much better! Can you please propagate your fix to dev/iir/iir.c:iir_attach()? Cheers, --=20 Ruslan Ermilov FreeBSD committer ru@FreeBSD.org --muT+E17Lr9urPYYJ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFAC/ADUkv4P6juNwoRAsocAJ9vo3rWqAnzlfhj9R+5jpbjEV4k7gCdET5A IurQzwuzqCJF7UtuH8T8Bns= =8ZMi -----END PGP SIGNATURE----- --muT+E17Lr9urPYYJ-- From owner-freebsd-scsi@FreeBSD.ORG Mon Jan 19 09:33:03 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9EA2E16A4CE for ; Mon, 19 Jan 2004 09:33:03 -0800 (PST) Received: from leticia.terra.com.br (leticia.terra.com.br [200.154.55.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 67C5B43D48 for ; Mon, 19 Jan 2004 09:33:02 -0800 (PST) (envelope-from eick.jac@terra.com.br) Received: from canela.terra.com.br (canela.terra.com.br [200.176.3.79]) by leticia.terra.com.br (Postfix) with ESMTP id DD6793CEAF for ; Mon, 19 Jan 2004 15:33:00 -0200 (BRST) Received: from eicke (unknown [200.162.114.126]) (authenticated user eick.jac) by canela.terra.com.br (Postfix) with ESMTP id 6D92C14814B for ; Mon, 19 Jan 2004 15:33:00 -0200 (BRST) Message-ID: <00e301c3deb1$d8299f90$0905a8c0@alellyxbr.com.br> From: "Eicke" To: Date: Mon, 19 Jan 2004 15:29:50 -0200 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2720.3000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2727.1300 Subject: LSI SCSI X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Jan 2004 17:33:03 -0000 Hi folks, I am trying to test a server (AMD 64) with the folowing SCSI controler configuration: LSI 531030 dual-channel Ultra320 SCSI, dual-channel ATA-100 I visited http://www.freebsd.org/cgi/man.cgi?query=mpt&sektion=4&manpath=FreeBSD+5.2-R ELEASE to confirm the hardware support. When I tried to install the FreeBSD an error occours: NO DISKS FOUND! Could you help me? Regards. Eicke. From owner-freebsd-scsi@FreeBSD.ORG Mon Jan 19 11:04:44 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AEAFA16A4CE for ; Mon, 19 Jan 2004 11:04:44 -0800 (PST) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id B83A943D6B for ; Mon, 19 Jan 2004 11:03:42 -0800 (PST) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (peter@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.10/8.12.10) with ESMTP id i0JJ31FR063737 for ; Mon, 19 Jan 2004 11:03:01 -0800 (PST) (envelope-from owner-bugmaster@freebsd.org) Received: (from peter@localhost) by freefall.freebsd.org (8.12.10/8.12.10/Submit) id i0JJ31Au063731 for scsi@freebsd.org; Mon, 19 Jan 2004 11:03:01 -0800 (PST) (envelope-from owner-bugmaster@freebsd.org) Date: Mon, 19 Jan 2004 11:03:01 -0800 (PST) Message-Id: <200401191903.i0JJ31Au063731@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f From: FreeBSD bugmaster To: scsi@FreeBSD.org Subject: Current problem reports assigned to you X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Jan 2004 19:04:44 -0000 Current FreeBSD problem reports Critical problems Serious problems Non-critical problems S Submitted Tracker Resp. Description ------------------------------------------------------------------------------- f [1999/12/21] kern/15608 scsi acd0 / cd0 give inconsistent errors on em 1 problem total. From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 08:48:31 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D32B016A4CE for ; Tue, 20 Jan 2004 08:48:31 -0800 (PST) Received: from web21108.mail.yahoo.com (web21108.mail.yahoo.com [216.136.227.110]) by mx1.FreeBSD.org (Postfix) with SMTP id DCA0343D81 for ; Tue, 20 Jan 2004 08:48:30 -0800 (PST) (envelope-from materribile@yahoo.com) Message-ID: <20040120164830.39726.qmail@web21108.mail.yahoo.com> Received: from [64.19.133.100] by web21108.mail.yahoo.com via HTTP; Tue, 20 Jan 2004 08:48:30 PST Date: Tue, 20 Jan 2004 08:48:30 -0800 (PST) From: Mark Terribile To: scsi@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: Does this document (also) describe FreeBSD SCSI? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 16:48:31 -0000 Hi, This document: http://www.cs.arizona.edu/computer.help/policy/DIGITAL_unix/AA-PS3GD-TET1_html/camosf2.html claims to describe the SCSI subsystem in ``Digital UNIX'' but it looks a lot like FreeBSD. Does anyone know if there is a genetic relationship, or how reliable a guide it is? Mark Terribile materribile@yahoo.com __________________________________ Do you Yahoo!? Yahoo! Hotjobs: Enter the "Signing Bonus" Sweepstakes http://hotjobs.sweepstakes.yahoo.com/signingbonus From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 09:45:19 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0AD1116A4CE for ; Tue, 20 Jan 2004 09:45:19 -0800 (PST) Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2A41043D8B for ; Tue, 20 Jan 2004 09:44:56 -0800 (PST) (envelope-from ken@panzer.kdm.org) Received: from panzer.kdm.org (localhost [127.0.0.1]) by panzer.kdm.org (8.12.9/8.12.5) with ESMTP id i0KHit64013892; Tue, 20 Jan 2004 10:44:55 -0700 (MST) (envelope-from ken@panzer.kdm.org) Received: (from ken@localhost) by panzer.kdm.org (8.12.9/8.12.5/Submit) id i0KHitJ4013891; Tue, 20 Jan 2004 10:44:55 -0700 (MST) (envelope-from ken) Date: Tue, 20 Jan 2004 10:44:54 -0700 From: "Kenneth D. Merry" To: Mark Terribile Message-ID: <20040120174454.GA13828@panzer.kdm.org> References: <20040120164830.39726.qmail@web21108.mail.yahoo.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040120164830.39726.qmail@web21108.mail.yahoo.com> User-Agent: Mutt/1.4.1i cc: scsi@freebsd.org Subject: Re: Does this document (also) describe FreeBSD SCSI? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 17:45:19 -0000 On Tue, Jan 20, 2004 at 08:48:30 -0800, Mark Terribile wrote: > Hi, > > This document: > > http://www.cs.arizona.edu/computer.help/policy/DIGITAL_unix/AA-PS3GD-TET1_html/camosf2.html > > claims to describe the SCSI subsystem in ``Digital UNIX'' but it looks a lot > like > FreeBSD. Does anyone know if there is a genetic relationship, or how reliable > a > guide it is? FreeBSD and DEC UNIX/Tru64 have both implemented the ANSI CAM spec. Thus the reason that DEC's implementation looks a lot like the FreeBSD SCSI layer. (They're based on the same spec.) The spec is here: http://www.t10.org/ftp/t10/drafts/cam/cam-r12b.pdf So the DEC CAM paper may be somewhat applicable to FreeBSD, but certainly not 100% applicable. Ken -- Kenneth Merry ken@kdm.org From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 13:18:01 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4FCA816A4CE for ; Tue, 20 Jan 2004 13:18:01 -0800 (PST) Received: from mail.vicor-nb.com (bigwoop.vicor-nb.com [208.206.78.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8C8F043D54 for ; Tue, 20 Jan 2004 13:18:00 -0800 (PST) (envelope-from julian@vicor.com) Received: from vicor.com (julian.vicor-nb.com [208.206.78.97]) by mail.vicor-nb.com (Postfix) with ESMTP id 79B867A3D4 for ; Tue, 20 Jan 2004 13:18:00 -0800 (PST) Message-ID: <400D9B08.70005@vicor.com> Date: Tue, 20 Jan 2004 13:18:00 -0800 From: Julian Elischer Organization: VICOR User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030516 X-Accept-Language: en, hu MIME-Version: 1.0 To: SCSI@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Subject: CAM//SCSI disk timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 21:18:01 -0000 If we are talkign to a raid array and it discovers that it has to do some maintanance (e.g. declare a physical disk bad or recreate some data on the fly) it is possible that transactions to the raid that are outstanding may tak a lot longer than usual. In particular, we've seen the case where a log entry in teh Raid controller's log that indicates that a drive has been declared bad, is often accompanied in FreeBSD with a set of CAM /scsi-disk timeouts and things degrade from there... Where can I find the timeouts in force fo rscsi disks and teh retry policies? I need to compare the theoretical behaviour of FreeBSD CAM with the theoretical timeouts needed for the raid array, given a disk failure.. From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 13:33:34 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8C4E216A55D for ; Tue, 20 Jan 2004 13:33:34 -0800 (PST) Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5864443D2D for ; Tue, 20 Jan 2004 13:33:32 -0800 (PST) (envelope-from ken@panzer.kdm.org) Received: from panzer.kdm.org (localhost [127.0.0.1]) by panzer.kdm.org (8.12.9/8.12.5) with ESMTP id i0KLXV64016018; Tue, 20 Jan 2004 14:33:31 -0700 (MST) (envelope-from ken@panzer.kdm.org) Received: (from ken@localhost) by panzer.kdm.org (8.12.9/8.12.5/Submit) id i0KLXVea016017; Tue, 20 Jan 2004 14:33:31 -0700 (MST) (envelope-from ken) Date: Tue, 20 Jan 2004 14:33:31 -0700 From: "Kenneth D. Merry" To: Julian Elischer Message-ID: <20040120213330.GA15900@panzer.kdm.org> References: <400D9B08.70005@vicor.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <400D9B08.70005@vicor.com> User-Agent: Mutt/1.4.1i cc: SCSI@freebsd.org Subject: Re: CAM//SCSI disk timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 21:33:34 -0000 On Tue, Jan 20, 2004 at 13:18:00 -0800, Julian Elischer wrote: > > > If we are talkign to a raid array and it discovers that it has to > do some maintanance (e.g. declare a physical disk bad or recreate > some data on the fly) it is possible that transactions to the raid > that are outstanding may tak a lot longer than usual. In particular, > we've seen the case where a log entry in teh Raid controller's log > that indicates that a drive has been declared bad, is often accompanied > in FreeBSD with a set of CAM /scsi-disk timeouts > and things degrade from there... > > Where can I find the timeouts in force fo rscsi disks and teh retry > policies? > > I need to compare the theoretical behaviour of FreeBSD CAM with the > theoretical timeouts needed for the raid array, given a disk failure.. You can tweak the timeout and retry count values with the kern.cam.da.retry_count and kern.cam.da.default_timeout sysctl variables. They are also available as loader tunables. Ken -- Kenneth Merry ken@kdm.org From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 13:41:38 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9F43A16A4CE for ; Tue, 20 Jan 2004 13:41:38 -0800 (PST) Received: from mail.vicor-nb.com (bigwoop.vicor-nb.com [208.206.78.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id D8DA643D31 for ; Tue, 20 Jan 2004 13:41:37 -0800 (PST) (envelope-from julian@vicor.com) Received: from vicor.com (julian.vicor-nb.com [208.206.78.97]) by mail.vicor-nb.com (Postfix) with ESMTP id B14E17A436; Tue, 20 Jan 2004 13:41:37 -0800 (PST) Message-ID: <400DA091.70803@vicor.com> Date: Tue, 20 Jan 2004 13:41:37 -0800 From: Julian Elischer Organization: VICOR User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030516 X-Accept-Language: en, hu MIME-Version: 1.0 To: "Kenneth D. Merry" References: <400D9B08.70005@vicor.com> <20040120213330.GA15900@panzer.kdm.org> In-Reply-To: <20040120213330.GA15900@panzer.kdm.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: SCSI@freebsd.org Subject: Re: CAM//SCSI disk timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 21:41:38 -0000 Kenneth D. Merry wrote: > You can tweak the timeout and retry count values with the > kern.cam.da.retry_count and kern.cam.da.default_timeout sysctl variables. thanks.. sysctl kern.cam.da kern.cam.da.retry_count: 4 kern.cam.da.default_timeout: 60 <---- I assume this is in seconds? kern.cam.da.no_6_byte: 0 > > They are also available as loader tunables. > > Ken From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 13:48:49 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CDD4D16A4CE for ; Tue, 20 Jan 2004 13:48:49 -0800 (PST) Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169]) by mx1.FreeBSD.org (Postfix) with ESMTP id A7DFC43D6A for ; Tue, 20 Jan 2004 13:48:46 -0800 (PST) (envelope-from ken@panzer.kdm.org) Received: from panzer.kdm.org (localhost [127.0.0.1]) by panzer.kdm.org (8.12.9/8.12.5) with ESMTP id i0KLmk64016192; Tue, 20 Jan 2004 14:48:46 -0700 (MST) (envelope-from ken@panzer.kdm.org) Received: (from ken@localhost) by panzer.kdm.org (8.12.9/8.12.5/Submit) id i0KLmjXD016191; Tue, 20 Jan 2004 14:48:46 -0700 (MST) (envelope-from ken) Date: Tue, 20 Jan 2004 14:48:45 -0700 From: "Kenneth D. Merry" To: Julian Elischer Message-ID: <20040120214845.GA16176@panzer.kdm.org> References: <400D9B08.70005@vicor.com> <20040120213330.GA15900@panzer.kdm.org> <400DA091.70803@vicor.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <400DA091.70803@vicor.com> User-Agent: Mutt/1.4.1i cc: SCSI@freebsd.org Subject: Re: CAM//SCSI disk timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 21:48:49 -0000 On Tue, Jan 20, 2004 at 13:41:37 -0800, Julian Elischer wrote: > > > Kenneth D. Merry wrote: > > >You can tweak the timeout and retry count values with the > >kern.cam.da.retry_count and kern.cam.da.default_timeout sysctl variables. > > thanks.. > > sysctl kern.cam.da > kern.cam.da.retry_count: 4 > kern.cam.da.default_timeout: 60 <---- I assume this is in seconds? > kern.cam.da.no_6_byte: 0 Yes, it's in seconds. Ken -- Kenneth Merry ken@kdm.org From owner-freebsd-scsi@FreeBSD.ORG Tue Jan 20 14:14:30 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F143F16A4CE for ; Tue, 20 Jan 2004 14:14:30 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by mx1.FreeBSD.org (Postfix) with ESMTP id 644BC43D75 for ; Tue, 20 Jan 2004 14:12:39 -0800 (PST) (envelope-from freebsd-scsi@m.gmane.org) Received: from list by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 1Aj46w-00033q-00 for ; Tue, 20 Jan 2004 23:12:38 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-scsi@freebsd.org Received: from sea.gmane.org ([80.91.224.252]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 1Aj46v-00033i-00 for ; Tue, 20 Jan 2004 23:12:37 +0100 Received: from news by sea.gmane.org with local (Exim 3.35 #1 (Debian)) id 1Aj46v-000136-00 for ; Tue, 20 Jan 2004 23:12:37 +0100 From: Jesse Guardiani Date: Tue, 20 Jan 2004 17:12:31 -0500 Organization: WingNET Lines: 124 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7Bit X-Complaints-To: usenet@sea.gmane.org User-Agent: KNode/0.7.2 X-Mail-Copies-To: never Sender: news Subject: adaptec 2940u/uw dump card state ends X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: jesse@wingnet.net List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Jan 2004 22:14:31 -0000 Howdy list, I'm having trouble with a Sony AIT TSL-SA300C autochanger attached to an adaptec 2940u/uw SCSI card and a Seagate AIT tape drive. Hopefully someone can shed some light on my problem. The physical (as in cable) SCSI topology looks like the following, with terminated devices marked by a (T): (T)AIT Autochanger <-> 2940u/uw <-> Seagate AIT Tape(T) I have termination manually turned off in the 2940u/uw's software configuration utility. The Seagate AIT tape drive is terminated via jumper (Term PWR is off, and SCSI Term is ON, if I remember correctly). The Sony Autochanger is terminated using a long yellow external wide SCSI ribbon cable. The cable has a little black thing on the end that I am ASS-U-MEing is a resistor pack for termination. The Autochanger's PDF indicates that it is not capable of active termination, which makes sense because it's external and you can turn it off. Admittedly, termination may be the issue. I'm relatively new to SCSI, and I may not have gotten it right. In particular, I don't have a clue what the TERM PWR jumpers do on devices that also include a SCSI TERM jumper... Here is the problem I'm having: The first time I issue a `chio move drive 0 slot 0` command, I get this: (ch0:ahc0:0:2:1): SCB 0xe - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahc0: Dumping Card State while idle, at SEQADDR 0x7 Card was paused ACCUM = 0xb, SINDEX = 0x27, DINDEX = 0x23, ARG_2 = 0x3 HCNT = 0x0 SCBPTR = 0x1 SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x1]:(P_BUSFREE) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) SBLKCTL[0x2]:(SELWIDE) SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIE D) SSTAT0[0x5]:(DMADONE|SDONE) SSTAT1[0xa]:(PHASECHG|BUSFREE) SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTI MO) SXFRCTL0[0x80]:(DFON) DFCNTRL[0x0] DFSTATUS[0x2d]:(FIFOEMP|DFTHRESH|HDONE|FIFOQW DEMP) STACK: 0x0 0x169 0x199 0x3 SCB count = 20 Kernel NEXTQSCB = 7 Card NEXTQSCB = 7 QINFIFO entries: Waiting Queue entries: Disconnected Queue entries: 1:14 QOUTFIFO entries: Sequencer Free SCB List: 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Sequencer SCB Info: 0 SCB_CONTROL[0xc0]:(DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff] 1 SCB_CONTROL[0x44]:(DISCONNECTED|DISCENB) SCB_SCSIID[0x27] SCB_LUN[0x1] SCB_TAG[0xe] 2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] CB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID) SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff] Pending list: 14 SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x27] SCB_LUN[0x1] Kernel Free SCB list: 8 15 16 17 18 19 0 1 2 3 4 5 6 9 13 12 11 10 Untagged Q(2): 14 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> (ch0:ahc0:0:2:1): Queuing a BDR SCB (ch0:ahc0:0:2:1): Bus Device Reset Message Sent (ch0:ahc0:0:2:1): no longer in timeout, status = 34b ahc0: Bus Device Reset on A:2. 1 SCBs aborted (ahc0:A:2:1): refuses WIDE negotiation. Using 8bit transfers (ahc0:A:2:1): refuses synchronous negotiation. Using asynchronous transfers Jan 20 16:39:59 billmax /kernel: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>> >>>>>>>>> So far, all further commands have been successfull. Here's my uname: # uname -a FreeBSD billmax.int.wingnet.net 4.9-RELEASE FreeBSD 4.9-RELEASE #0: Sun Jan 18 18:29:28 EST 2004 jesse@billmax.int.wingnet.net:/usr/src/sys/compile/BILLMAX i386 Any suggestions? Thanks! -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net From owner-freebsd-scsi@FreeBSD.ORG Thu Jan 22 00:45:58 2004 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DC79816A4DD for ; Thu, 22 Jan 2004 00:45:58 -0800 (PST) Received: from mail.dti.supsi.ch (mail.die.supsi.ch [193.5.153.13]) by mx1.FreeBSD.org (Postfix) with ESMTP id 603C243D45 for ; Thu, 22 Jan 2004 00:45:54 -0800 (PST) (envelope-from roberto.nunnari@supsi.ch) Received: from supsi.ch (pcm2027.dti.supsi.ch [193.5.152.27]) by mail.dti.supsi.ch (8.11.6/8.11.6) with ESMTP id i0M8jqX22945 for ; Thu, 22 Jan 2004 09:45:52 +0100 Message-ID: <400F8DC0.2000605@supsi.ch> Date: Thu, 22 Jan 2004 09:45:52 +0100 From: Roberto Nunnari User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-scsi@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Subject: problem with scsi tape streamer X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Jan 2004 08:45:59 -0000 Hi all. I'll try to post here at scsi.. as on current I got no answer. I've got some problem with a SCSI DDS-4 tape streamer.. The system identifies it as: Removable Sequential Access SCSI-3 device while the BIOS sais: Python 06408-XXX 80.0 It just doesn't behave as expected... the same problem is present since I first installed 5.0-RELEASE, then with 5.1-RELEASE and now with 5.2-RELEASE.. so.. last week I installed another OS on that same machine.. just to make sure it's not a hardware problem.. and it worked without a problem.. went back to FreeBSD and the problem is again there.. It seams that if I use a new tape, then I can dump on it.. but if I try to overwrite the tape, it starts behaving funny. also.. anytime I try to write to a non new tape, it will make errors and turn on the dirt led. The result is that it's not possible to make any reliable dump/restore on it. The server and the tape drive are younger than 1 year old and the tapes are of good quality. I'd be grateful if somebody could look into these logs and tell me what's the problem... Please ask me if you want more details. Thank you. ********************************************* web.dti.supsi.ch# uname -a FreeBSD web.dti.supsi.ch 5.2-RELEASE FreeBSD 5.2-RELEASE #0: Tue Jan 13 14:28:57 CET 2004 root@web.dti.supsi.ch:/usr/obj/usr/src/sys/WEB i386 web.dti.supsi.ch# cat mydump00.sh #!/bin/sh /sbin/dump -0uLaf /dev/nsa0 / /sbin/dump -0uLaf /dev/nsa0 /usr /sbin/dump -0uLaf /dev/nsa0 /var web.dti.supsi.ch# df Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/amrd0s1a 257838 60070 177142 25% / devfs 1 1 0 100% /dev /dev/amrd0s1e 257838 12 237200 0% /tmp /dev/amrd0s1f 32123334 5850974 23702494 20% /usr /dev/amrd0s1d 257838 54726 182486 23% /var B:/usr/users 17782536 6544189 11062576 37% /usr/users web.dti.supsi.ch# dmesg Copyright (c) 1992-2004 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 5.2-RELEASE #0: Tue Jan 13 14:28:57 CET 2004 root@web.dti.supsi.ch:/usr/obj/usr/src/sys/WEB Preloaded elf kernel "/boot/kernel/kernel" at 0xc098c000. Preloaded elf module "/boot/kernel/acpi.ko" at 0xc098c21c. Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2392.30-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0xf27 Stepping = 7 Features=0xbfebfbff Hyperthreading: 2 logical CPUs real memory = 1073610752 (1023 MB) avail memory = 1033445376 (985 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 ioapic0: Changing APIC ID to 4 ioapic1: Changing APIC ID to 5 ioapic2: Changing APIC ID to 6 ioapic2: WARNING: intbase 72 != expected base 48 ioapic3: Changing APIC ID to 7 ioapic3: WARNING: intbase 120 != expected base 96 ioapic4: Changing APIC ID to 8 ioapic0 irqs 0-23 on motherboard ioapic1 irqs 24-47 on motherboard ioapic2 irqs 72-95 on motherboard ioapic3 irqs 120-143 on motherboard ioapic4 irqs 144-167 on motherboard Pentium Pro MTRR support enabled npx0: [FAST] npx0: on motherboard npx0: INT 16 interface acpi0: on motherboard pcibios: BIOS version 2.10 Using $PIR table, 12 entries at 0xc00fc160 acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 acpi_cpu0: on acpi0 acpi_cpu1: on acpi0 acpi_cpu2: on acpi0 acpi_cpu3: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: at device 2.0 on pci0 pcib1: could not get PCI interrupt routing table for \\_SB_.PCI0.PCI2 - AE_NOT_FOUND pci1: on pcib1 pci1: at device 28.0 (no driver attached) pcib2: at device 29.0 on pci1 pci2: on pcib2 ahc0: port 0xec00-0xecff mem 0xfeaff000-0xfeafffff irq 24 at device 2.0 on pci2 aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs ahc1: port 0xe800-0xe8ff mem 0xfeafe000-0xfeafefff irq 25 at device 2.1 on pci2 aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs pci1: at device 30.0 (no driver attached) pcib3: at device 31.0 on pci1 pci3: on pcib3 em0: port 0xdce0-0xdcff mem 0xfe8c0000-0xfe8dffff,0xfe8e0000-0xfe8fffff irq 28 at device 1.0 on pci3 em0: Speed:N/A Duplex:N/A pcib4: at device 3.0 on pci0 pcib4: could not get PCI interrupt routing table for \\_SB_.PCI0.PCI3 - AE_NOT_FOUND pci4: on pcib4 pci4: at device 28.0 (no driver attached) pcib5: at device 29.0 on pci4 pci5: on pcib5 pci4: at device 30.0 (no driver attached) pcib6: at device 31.0 on pci4 pci6: on pcib6 pcib7: at device 4.0 on pci0 pcib7: could not get PCI interrupt routing table for \\_SB_.PCI0.PCI4 - AE_NOT_FOUND pci7: on pcib7 pci7: at device 28.0 (no driver attached) pcib8: at device 29.0 on pci7 pci8: on pcib8 amr0: mem 0xfbff0000-0xfbffffff irq 120 at device 8.0 on pci8 amr0: Firmware 2.24, BIOS 1.03, 128MB RAM pci7: at device 30.0 (no driver attached) pcib9: at device 31.0 on pci7 pci10: on pcib9 uhci0: port 0xbce0-0xbcff irq 16 at device 29.0 on pci0 usb0: on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered pcib10: at device 30.0 on pci0 pci11: on pcib10 pci11: at device 4.0 (no driver attached) isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0xfc00-0xfc0f,0-0x3,0-0x7,0-0x3,0-0x7 at device 31.1 on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata0: [MPSAFE] ata1: at 0x170 irq 15 on atapci0 ata1: [MPSAFE] fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0 fdc0: FIFO enabled, 8 bytes threshold fd0: <1440-KB 3.5" drive> on fdc0 drive 0 atkbdc0: port 0x64,0x60 irq 1 on acpi0 atkbd0: flags 0x1 irq 1 on atkbdc0 kbd0 at atkbd0 psm0: irq 12 on atkbdc0 psm0: model Generic PS/2 mouse, device ID 0 sio0 port 0x3f8-0x3ff irq 4 on acpi0 sio0: type 16550A sio1 port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A ppc0 port 0x778-0x77f,0x378-0x37f irq 7 drq 1 on acpi0 ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: on ppc0 plip0: on ppbus0 lpt0: on ppbus0 lpt0: Interrupt-driven port ppi0: on ppbus0 orm0: