From owner-freebsd-scsi@FreeBSD.ORG Sun Aug 24 13:40:08 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 02DD416A4BF; Sun, 24 Aug 2003 13:40:08 -0700 (PDT) Received: from arthur.nitro.dk (port324.ds1-khk.adsl.cybercity.dk [212.242.113.79]) by mx1.FreeBSD.org (Postfix) with ESMTP id 564CA43F75; Sun, 24 Aug 2003 13:40:07 -0700 (PDT) (envelope-from simon@arthur.nitro.dk) Received: by arthur.nitro.dk (Postfix, from userid 1000) id B9D2A10BF89; Sun, 24 Aug 2003 22:40:05 +0200 (CEST) Date: Sun, 24 Aug 2003 22:40:05 +0200 From: "Simon L. Nielsen" To: freebsd-scsi@FreeBSD.org Message-ID: <20030824204004.GD399@FreeBSD.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="2Z2K0IlrPCVsbNpk" Content-Disposition: inline User-Agent: Mutt/1.5.4i cc: groudier@FreeBSD.org Subject: Is 53C875A supported by sym(4) ? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Aug 2003 20:40:08 -0000 --2Z2K0IlrPCVsbNpk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello I'm trying to find out whether the sym(4) driver supports 53C875A, so the hardware notes/sym(4) manual page can be updated. The hardware notes were updated two years ago to say that the 53C875A is supported [1], but when I look at the driver source code it doesn't look like 53C875A is supported. In sys/dev/sym/sym_defs.h there isn't a PCI ID for 53C875A (which should be 0x13 from what I can gather from [2] and [3]), but I found a newer version of the sym driver [2] which as far as I can see does support the 53C875A. So, does the sym(4) driver in -CURRENT support the 53C875A, or was it just planned support that was never integrated into -CURRENT ? [1] http://cvsweb.freebsd.org/src/release/texts/Attic/HARDWARE.TXT#rev1.98 [2] http://people.freebsd.org/~groudier/sym-2.1.13-20010916.tar.gz [3] http://pciids.sourceforge.net/iii/?i=3D1000 --=20 Simon L. Nielsen FreeBSD Documentation Team --2Z2K0IlrPCVsbNpk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (FreeBSD) iD8DBQE/SSKkh9pcDSc1mlERAjtRAJ9eXy9EOHwojalFjgaqVnOlmHOdpQCeKUAI kOBodZkXtVfuomG7Bvc/GjY= =dE8/ -----END PGP SIGNATURE----- --2Z2K0IlrPCVsbNpk-- From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 25 04:17:05 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8B6E916A4BF for ; Mon, 25 Aug 2003 04:17:05 -0700 (PDT) Received: from siralan.org (12-223-225-141.client.insightbb.com [12.223.225.141]) by mx1.FreeBSD.org (Postfix) with ESMTP id B69F643FE0 for ; Mon, 25 Aug 2003 04:17:04 -0700 (PDT) (envelope-from mikes@siralan.org) Received: from siralan.org (localhost [127.0.0.1]) by siralan.org (8.12.9/8.12.9) with ESMTP id h7PBH2i1018125; Mon, 25 Aug 2003 06:17:03 -0500 (EST) (envelope-from mikes@siralan.org) Received: (from mikes@localhost) by siralan.org (8.12.9/8.12.9/Submit) id h7PBH1SA018124; Mon, 25 Aug 2003 06:17:01 -0500 (EST) From: "Michael L. Squires" Message-Id: <200308251117.h7PBH1SA018124@siralan.org> In-Reply-To: <3EF37A16.15316.2772DF95@localhost> "from Dan Langille at Jun 20, 2003 09:18:14 pm" To: Dan Langille Date: Mon, 25 Aug 2003 06:17:01 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII cc: freebsd-scsi@freebsd.org Subject: Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Aug 2003 11:17:05 -0000 I ran "tapetest.c" under 5.1-RELEASE (without pthread under 5.1-RELEASE, first part with pthread under 5.1-RELEASE, second part (scan) under 5.1-RELEASE-p2). 216012 blocks written/read without pthreads 217073 blocks written/read with pthreads (same!) Hardware is SuperMicro P6DGH, dual PIII/850, onboard AIC7896, DLT4000 tape drive, Seagate ST423451W 23GB HD. --------------------------results---------------------------------------------- dmesg: FreeBSD 5.1-RELEASE #1: Sat Aug 23 14:56:45 EST 2003 root@mikes.siralan.org:/usr/obj/usr/src/sys/MIKES CPU: Intel Pentium III (851.93-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x686 Stepping = 6 ahc0: port 0xe400-0xe4ff mem 0xffafe000-0xffafefff irq 10 at device 14.0 on pci0 aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs ahc1: port 0xe800-0xe8ff mem 0xffaff000-0xffafffff irq 10 at device 14.1 on pci0 aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs sa0 at ahc1 bus 0 target 4 lun 0 sa0: Removable Sequential Access SCSI-2 device sa0: 10.000MB/s transfers (10.000MHz, offset 15) output from tapetest.c (edited) weof_dev Wrote EOF to /dev/sa0 Write failed. Last block written=216012. stat=0 ERR=Unknown error: 0 *rewind Rewound /dev/sa0 *scan Starting scan at file 0 216012 blocks of 64512 bytes in file 0 End of File mark. End of File mark. End of tape Total files=1, blocks=216012, bytes = 1050464256 ioctl MTWEOF error on /dev/sa0. ERR=Input/output error. Bad status from weof -1. ERR=Input/output error Write failed. Last block written=217073. stat=-1 ERR=No space left on device *rewind Rewound /dev/sa0 *scan Starting scan at file 0 Bad status from read -1. ERR=Input/output error 217073 blocks of 64512 bytes in file 0 * From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 25 09:13:21 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF5A916A4BF for ; Mon, 25 Aug 2003 09:13:21 -0700 (PDT) Received: from bast.unixathome.org (bast.unixathome.org [66.11.174.150]) by mx1.FreeBSD.org (Postfix) with ESMTP id 126C943F93 for ; Mon, 25 Aug 2003 09:13:19 -0700 (PDT) (envelope-from dan@langille.org) Received: from wocker (wocker.unixathome.org [192.168.0.99]) by bast.unixathome.org (Postfix) with ESMTP id 773673D29; Mon, 25 Aug 2003 12:13:18 -0400 (EDT) From: "Dan Langille" To: "Michael L. Squires" Date: Mon, 25 Aug 2003 12:14:00 -0400 MIME-Version: 1.0 Message-ID: <3F49FD88.8464.1539E125@localhost> Priority: normal In-reply-to: <200308251117.h7PBH1SA018124@siralan.org> References: <3EF37A16.15316.2772DF95@localhost> "from Dan Langille at Jun 20, 2003 09:18:14 pm" X-mailer: Pegasus Mail for Windows (v4.02a) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body cc: freebsd-scsi@freebsd.org Subject: Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Aug 2003 16:13:22 -0000 On 25 Aug 2003 at 6:17, Michael L. Squires wrote: > I ran "tapetest.c" under 5.1-RELEASE (without pthread under 5.1-RELEASE, > first part with pthread under 5.1-RELEASE, second part (scan) under > 5.1-RELEASE-p2). I don't understand what the above means or how it relates to the results below. Could you elaborate please? Thanks. > 216012 blocks written/read without pthreads > 217073 blocks written/read with pthreads (same!) > > Hardware is SuperMicro P6DGH, dual PIII/850, onboard AIC7896, DLT4000 tape > drive, Seagate ST423451W 23GB HD. > > --------------------------results---------------------------------------------- > > dmesg: > > FreeBSD 5.1-RELEASE #1: Sat Aug 23 14:56:45 EST 2003 > root@mikes.siralan.org:/usr/obj/usr/src/sys/MIKES > CPU: Intel Pentium III (851.93-MHz 686-class CPU) > Origin = "GenuineIntel" Id = 0x686 Stepping = 6 > > ahc0: port 0xe400-0xe4ff mem 0xffafe000-0xffafefff irq 10 at device 14.0 on pci0 > aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs > ahc1: port 0xe800-0xe8ff mem 0xffaff000-0xffafffff irq 10 at device 14.1 on pci0 > aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs > > sa0 at ahc1 bus 0 target 4 lun 0 > sa0: Removable Sequential Access SCSI-2 device > sa0: 10.000MB/s transfers (10.000MHz, offset 15) > > output from tapetest.c (edited) > > weof_dev > Wrote EOF to /dev/sa0 > Write failed. Last block written=216012. stat=0 ERR=Unknown error: 0 > > *rewind > Rewound /dev/sa0 > *scan > Starting scan at file 0 > 216012 blocks of 64512 bytes in file 0 > End of File mark. > End of File mark. > End of tape > Total files=1, blocks=216012, bytes = 1050464256 > > ioctl MTWEOF error on /dev/sa0. ERR=Input/output error. > Bad status from weof -1. ERR=Input/output error > Write failed. Last block written=217073. stat=-1 ERR=No space left on device > > *rewind > Rewound /dev/sa0 > *scan > Starting scan at file 0 > Bad status from read -1. ERR=Input/output error > 217073 blocks of 64512 bytes in file 0 > * > > -- Dan Langille : http://www.langille.org/ From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 25 11:06:51 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1430116A4BF for ; Mon, 25 Aug 2003 11:06:51 -0700 (PDT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 38E0243FBD for ; Mon, 25 Aug 2003 11:06:49 -0700 (PDT) (envelope-from owner-bugmaster@freebsd.org) Received: from freefall.freebsd.org (peter@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.9/8.12.9) with ESMTP id h7PI6nUp035971 for ; Mon, 25 Aug 2003 11:06:49 -0700 (PDT) (envelope-from owner-bugmaster@freebsd.org) Received: (from peter@localhost) by freefall.freebsd.org (8.12.9/8.12.9/Submit) id h7PI6mS8035965 for scsi@freebsd.org; Mon, 25 Aug 2003 11:06:48 -0700 (PDT) Date: Mon, 25 Aug 2003 11:06:48 -0700 (PDT) Message-Id: <200308251806.h7PI6mS8035965@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f From: FreeBSD bugmaster To: scsi@FreeBSD.org Subject: Current problem reports assigned to you X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Aug 2003 18:06:51 -0000 Current FreeBSD problem reports Critical problems Serious problems Non-critical problems S Submitted Tracker Resp. Description ------------------------------------------------------------------------------- f [1999/12/21] kern/15608 scsi acd0 / cd0 give inconsistent errors on em 1 problem total. From owner-freebsd-scsi@FreeBSD.ORG Mon Aug 25 20:47:07 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 19DE816A4BF for ; Mon, 25 Aug 2003 20:47:07 -0700 (PDT) Received: from pd6mo2so.prod.shaw.ca (shawidc-mo1.cg.shawcable.net [24.71.223.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 391C243FB1 for ; Mon, 25 Aug 2003 20:47:06 -0700 (PDT) (envelope-from chi.lam@shaw.ca) Received: from pd2mr1so.prod.shaw.ca (pd2mr1so-ser.prod.shaw.ca [10.0.141.110])2003)) with ESMTP id <0HK700IFFJTM1M@l-daemon> for freebsd-scsi@freebsd.org; Mon, 25 Aug 2003 21:46:34 -0600 (MDT) Received: from pn2ml1so.prod.shaw.ca (pn2ml1so-qfe0.prod.shaw.ca [10.0.121.145]) by l-daemon (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003)) with ESMTP id <0HK700BR1JTM4W@l-daemon> for freebsd-scsi@freebsd.org; Mon, 25 Aug 2003 21:46:34 -0600 (MDT) Received: from lithium (h24-86-239-180.ed.shawcable.net [24.86.239.180]) by l-daemon (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003)) with SMTP id <0HK7004R1JTLCM@l-daemon> for freebsd-scsi@freebsd.org; Mon, 25 Aug 2003 21:46:34 -0600 (MDT) Date: Mon, 25 Aug 2003 21:48:19 -0600 From: "Chi V. Lam" To: freebsd-scsi@freebsd.org Message-id: <00bc01c36b84$e39bcec0$050ca8c0@lithium> MIME-version: 1.0 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-Priority: 3 X-MSMail-priority: Normal Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 7BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.1 Subject: Serveraid II not seeing all the hard drive space X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: "Chi V. Lam" List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Aug 2003 03:47:07 -0000 I have a netfinity 5500 M20, serveraid II, 4x9gb, raid 5. Booted the 5.1-release cd, drive detected fine but it's only seeing 1453M totals of the raid 5. I though it was the firmware, i try 3.50C, 6.00, and the latest 6.10 from the ibm site, no help. Anyone got the serveraid running with freebsd current? Chi From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 05:27:26 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E241216A4BF for ; Wed, 27 Aug 2003 05:27:26 -0700 (PDT) Received: from siralan.org (12-223-227-231.client.insightbb.com [12.223.227.231]) by mx1.FreeBSD.org (Postfix) with ESMTP id E1E7D43FA3 for ; Wed, 27 Aug 2003 05:27:25 -0700 (PDT) (envelope-from mikes@siralan.org) Received: from siralan.org (localhost [127.0.0.1]) by siralan.org (8.12.9/8.12.9) with ESMTP id h7RCRMBR000137; Wed, 27 Aug 2003 07:27:22 -0500 (EST) (envelope-from mikes@siralan.org) Received: (from mikes@localhost) by siralan.org (8.12.9/8.12.9/Submit) id h7RCLUsP001665; Wed, 27 Aug 2003 07:21:30 -0500 (EST) From: "Michael L. Squires" Message-Id: <200308271221.h7RCLUsP001665@siralan.org> In-Reply-To: <200308251117.h7PBH1SA018124@siralan.org> "from Michael L. Squires at Aug 25, 2003 06:17:01 am" To: dan@langille.org Date: Wed, 27 Aug 2003 07:21:30 -0500 (EST) X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII cc: FreeBSD SCSI Subject: Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 12:27:27 -0000 I ran "tapetest.c" under 4.8-STABLE and did not observe any difference between the version compiled with pthreads and the version compiled without pthreads. Without pthreads taptest wrote and read 141,776 blocks on a DLT III (10/20GB) tape in a DLT4000 drive; with pthreads it wrote and read 142,879 blocks. Results using 5.1-RELEASE running on another system were similar, i.e., no difference between the number of blocks written and read using either pthreads or not using pthreads. 4.8-STABLE is running on a Supermicro P6DGH with dual PII/300's, onboard Adaptec U2W controller, ADIC VLS DLT changer with DLT4000 tape drive. 5.1-RELEASE is running on a Supermicro P6DGH with dual PIII/850's, onboard Adaptec U2W controller, single DLT4000 tape drive. MLS >From dmesg: ---------------------------------------------------------------------------- FreeBSD 4.8-STABLE #1: Mon Aug 4 21:05:33 EST 2003 root@newserv.siralan.org:/usr/obj/usr/src/sys/NEWSERV Timecounter "i8254" frequency 1193182 Hz CPU: Pentium II/Pentium II Xeon/Celeron (300.68-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x634 Stepping = 4 Features=0x80fbff real memory = 268304384 (262016K bytes) config> q avail memory = 255660032 (249668K bytes) FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000 ahc0: port 0xe000-0xe0ff mem 0xfebfd000-0xfebfdfff irq 10 at device 14.0 on pci0 aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs ahc1: port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 10 at device 14.1 on pci0 aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs SMP: AP CPU #1 Launched! sa0 at ahc2 bus 0 target 2 lun 0 sa0: Removable Sequential Access SCSI-2 device sa0: 10.000MB/s transfers (10.000MHz, offset 15) ---------------------------------------------------------------------------- Tapetest output: no pthread: Write failed. Last block writen=141776. stat=0 ERR=Unknown error: 0 End of tape Total files=1, blocks=141776, bytes = 556318720 with pthread: rawfill: Write failed. Last block written=142879. stat=-1 ERR No space left on device scan: Bad status from read -1. ERR=Input/output error 142879 blocks of 64512 bytes in file 0 Mike Squires From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 07:45:46 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8F7EF16A4BF for ; Wed, 27 Aug 2003 07:45:46 -0700 (PDT) Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48]) by mx1.FreeBSD.org (Postfix) with ESMTP id 561BB43FBF for ; Wed, 27 Aug 2003 07:45:44 -0700 (PDT) (envelope-from kern@sibbald.com) Received: from [192.168.68.112] (rufus [192.168.68.112]) by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h7REjUE16203; Wed, 27 Aug 2003 16:45:30 +0200 From: Kern Sibbald To: mikes@siralan.org In-Reply-To: <3F4C6F55.17392.1EC66A86@localhost> References: <3F4C6F55.17392.1EC66A86@localhost> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-dAJqc8fFSnJgp6UHNjAB" Organization: Message-Id: <1061995529.1258.273.camel@rufus> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 27 Aug 2003 16:45:29 +0200 cc: freebsd-scsi@freebsd.org Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 14:45:46 -0000 --=-dAJqc8fFSnJgp6UHNjAB Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello, Many thanks for testing this ... It seems to me that your tests, clearly indicate that there is a problem even though you had no data loss. When you ran without -pthread, the status received by the program was a 0 with 141776 blocks written. This is correct. When you ran with -pthread, the status received by the program was a -1 with 142879 blocks written. This is "not correct". To me, that shows very clearly that with -pthread the 0 status was lost and more blocks were written. In fact, in this case so many blocks were written that the tape was not properly terminated with an EOF (actually two EOF marks). I suspect that you did not get missing data because your drive is newer than Dan's and more robust in that it let you write to the hard end of file mark without losing data. Note, however,=20 that with -pthread, the volume is not correctly=20 terminated because the weof got an error and=20 was not written, then when reading back the data,=20 you do get all the blocks, but the terminating EOF is not there. Consequently, the tape is in a "non-correct" state. In any case, I wouldn't want my data written to tapes without a proper termination since a program re-reading the tape could get very confused. Best regards, Kern On Wed, 2003-08-27 at 14:44, Dan Langille wrote: > This just in, and I'm just heading out. >=20 > ------- Forwarded message follows ------- > From: "Michael L. Squires" > Subject: Re: SCSI tape data loss > To: dan@langille.org > Date sent: Wed, 27 Aug 2003 07:21:30 -0500 (EST) > Copies to: FreeBSD SCSI >=20 > I ran "tapetest.c" under 4.8-STABLE and did not observe any difference > between the version compiled with pthreads and the version compiled > without pthreads. >=20 > Without pthreads taptest wrote and read 141,776 blocks on a DLT III (10/2= 0GB) > tape in a DLT4000 drive; with pthreads it wrote and read 142,879 blocks. >=20 > Results using 5.1-RELEASE running on another system were similar, i.e., n= o=20 > difference between the number of blocks written and read using either pth= reads=20 > or not using pthreads. >=20 > 4.8-STABLE is running on a Supermicro P6DGH with dual PII/300's, onboard=20 > Adaptec U2W controller, ADIC VLS DLT changer with DLT4000 tape drive. > 5.1-RELEASE is running on a Supermicro P6DGH with dual PIII/850's, onboar= d > Adaptec U2W controller, single DLT4000 tape drive. >=20 > MLS >=20 > >From dmesg: > -------------------------------------------------------------------------= --- > FreeBSD 4.8-STABLE #1: Mon Aug 4 21:05:33 EST 2003 > root@newserv.siralan.org:/usr/obj/usr/src/sys/NEWSERV > Timecounter "i8254" frequency 1193182 Hz > CPU: Pentium II/Pentium II Xeon/Celeron (300.68-MHz 686-class CPU) > Origin =3D "GenuineIntel" Id =3D 0x634 Stepping =3D 4 > Features=3D0x80fbff > real memory =3D 268304384 (262016K bytes) > config> q > avail memory =3D 255660032 (249668K bytes) >=20 > FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs > cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000 > cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000 > io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000 >=20 > ahc0: port 0xe000-0xe0ff mem 0xf= ebfd000-0xfebfdfff irq 10 at device 14.0 on pci0 > aic7896/97: Ultra2 Wide Channel A, SCSI Id=3D7, 32/253 SCBs > ahc1: port 0xe400-0xe4ff mem 0xf= ebfe000-0xfebfefff irq 10 at device 14.1 on pci0 > aic7896/97: Ultra2 Wide Channel B, SCSI Id=3D7, 32/253 SCBs >=20 > SMP: AP CPU #1 Launched! >=20 > sa0 at ahc2 bus 0 target 2 lun 0 > sa0: Removable Sequential Access SCSI-2 device=20 > sa0: 10.000MB/s transfers (10.000MHz, offset 15) > -------------------------------------------------------------------------= --- >=20 > Tapetest output: >=20 > no pthread: >=20 > Write failed. Last block writen=3D141776. stat=3D0 ERR=3DUnknown error:= 0 > End of tape > Total files=3D1, blocks=3D141776, bytes =3D 556318720 >=20 > with pthread: >=20 > rawfill: > Write failed. Last block written=3D142879. stat=3D-1 ERR No space left o= n device >=20 > scan: > Bad status from read -1. ERR=3DInput/output error > 142879 blocks of 64512 bytes in file 0 >=20 > Mike Squires >=20 > ------- End of forwarded message ------- --=-dAJqc8fFSnJgp6UHNjAB Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA/TMQJNgfoSvWqwEgRAinRAKDJGJGykJzNpwRTt/6E8CsAox8DWgCeKtQS 7t9cmUPLU/kfuQcIt4bc6cY= =xxfL -----END PGP SIGNATURE----- --=-dAJqc8fFSnJgp6UHNjAB-- From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 11:06:55 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2037116A4BF for ; Wed, 27 Aug 2003 11:06:55 -0700 (PDT) Received: from rootlabs.com (root.org [67.118.192.226]) by mx1.FreeBSD.org (Postfix) with SMTP id 88E9543FBD for ; Wed, 27 Aug 2003 11:06:54 -0700 (PDT) (envelope-from nate@rootlabs.com) Received: (qmail 31828 invoked by uid 1000); 27 Aug 2003 18:06:55 -0000 Date: Wed, 27 Aug 2003 11:06:55 -0700 (PDT) From: Nate Lawson To: Kern Sibbald In-Reply-To: <1061995529.1258.273.camel@rufus> Message-ID: <20030827110534.J31798@root.org> References: <3F4C6F55.17392.1EC66A86@localhost> <1061995529.1258.273.camel@rufus> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-scsi@freebsd.org Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 18:06:55 -0000 On Wed, 27 Aug 2003, Kern Sibbald wrote: > Many thanks for testing this ... > > It seems to me that your tests, clearly indicate that > there is a problem even though you had no data loss. > > When you ran without -pthread, the status > received by the program was a 0 with 141776 > blocks written. This is correct. > > When you ran with -pthread, the status > received by the program was a -1 with 142879 > blocks written. This is "not correct". > > To me, that shows very clearly that with -pthread > the 0 status was lost and more blocks were written. > In fact, in this case so many blocks were written > that the tape was not properly terminated with > an EOF (actually two EOF marks). Here is a response I got by forwarding this to the pthreads maintainer: > A return status of 0 from write is not interpreted as an End-Of-Tape. > The threads library isn't smart enough to know that the file > is a tape device and that a 0 status should break it out of the > loop. Thus, it continues writing. > > Use libkse :-) > > -- > Dan Eischen -Nate From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 11:14:54 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 23DBF16A4BF for ; Wed, 27 Aug 2003 11:14:54 -0700 (PDT) Received: from bast.unixathome.org (bast.unixathome.org [66.11.174.150]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6A99C43FCB for ; Wed, 27 Aug 2003 11:14:53 -0700 (PDT) (envelope-from dan@langille.org) Received: from wocker (wocker.unixathome.org [192.168.0.99]) by bast.unixathome.org (Postfix) with ESMTP id E904D3D28; Wed, 27 Aug 2003 14:14:52 -0400 (EDT) From: "Dan Langille" To: Nate Lawson Date: Wed, 27 Aug 2003 14:15:47 -0400 MIME-Version: 1.0 Message-ID: <3F4CBD13.545.1FF6190E@localhost> Priority: normal References: <1061995529.1258.273.camel@rufus> In-reply-to: <20030827110534.J31798@root.org> X-mailer: Pegasus Mail for Windows (v4.02a) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body cc: freebsd-scsi@freebsd.org cc: Kern Sibbald Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 18:14:54 -0000 On 27 Aug 2003 at 11:06, Nate Lawson wrote: > On Wed, 27 Aug 2003, Kern Sibbald wrote: > > Many thanks for testing this ... > > > > It seems to me that your tests, clearly indicate that > > there is a problem even though you had no data loss. > > > > When you ran without -pthread, the status > > received by the program was a 0 with 141776 > > blocks written. This is correct. > > > > When you ran with -pthread, the status > > received by the program was a -1 with 142879 > > blocks written. This is "not correct". > > > > To me, that shows very clearly that with -pthread > > the 0 status was lost and more blocks were written. > > In fact, in this case so many blocks were written > > that the tape was not properly terminated with > > an EOF (actually two EOF marks). > > Here is a response I got by forwarding this to the pthreads maintainer: > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > The threads library isn't smart enough to know that the file > > is a tape device and that a 0 status should break it out of the > > loop. Thus, it continues writing. > > > > Use libkse :-) > > > > -- > > Dan Eischen Nate: thanks for getting in touch with him. It is interesting to note that the code works OK on Linux and Solaris. Why is FreeBSD different in this case? Kern: I can't comment on libkse. I don't know it and I don't know what effect it would have on Bacula. -- Dan Langille : http://www.langille.org/ From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 11:29:55 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6660B16A4BF for ; Wed, 27 Aug 2003 11:29:55 -0700 (PDT) Received: from rootlabs.com (root.org [67.118.192.226]) by mx1.FreeBSD.org (Postfix) with SMTP id 83A6043FB1 for ; Wed, 27 Aug 2003 11:29:54 -0700 (PDT) (envelope-from nate@rootlabs.com) Received: (qmail 31967 invoked by uid 1000); 27 Aug 2003 18:29:55 -0000 Date: Wed, 27 Aug 2003 11:29:55 -0700 (PDT) From: Nate Lawson To: Dan Langille In-Reply-To: <3F4CBD13.545.1FF6190E@localhost> Message-ID: <20030827112748.Y31947@root.org> References: <1061995529.1258.273.camel@rufus> <3F4CBD13.545.1FF6190E@localhost> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-scsi@freebsd.org cc: Kern Sibbald Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 18:29:55 -0000 On Wed, 27 Aug 2003, Dan Langille wrote: > On 27 Aug 2003 at 11:06, Nate Lawson wrote: > > Here is a response I got by forwarding this to the pthreads maintainer: > > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > > The threads library isn't smart enough to know that the file > > > is a tape device and that a 0 status should break it out of the > > > loop. Thus, it continues writing. > > > > > > Use libkse :-) > > > > > > -- > > > Dan Eischen > > Nate: thanks for getting in touch with him. > > It is interesting to note that the code works OK on Linux and > Solaris. Why is FreeBSD different in this case? I don't know. Our pthreads implementation is purely userland so it's likely that it is difficult to differentiate a non-blocking read from an EOF. > Kern: I can't comment on libkse. I don't know it and I don't know > what effect it would have on Bacula. libkse is a drop-in replacement for libpthreads. Unfortunately, it's only on 5.x. -Nate From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 11:35:57 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 523B116A4F2 for ; Wed, 27 Aug 2003 11:35:57 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2B84943FCB for ; Wed, 27 Aug 2003 11:35:50 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h7RIZntp016015; Wed, 27 Aug 2003 14:35:49 -0400 (EDT) Date: Wed, 27 Aug 2003 14:35:49 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Nate Lawson In-Reply-To: <20030827112748.Y31947@root.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-scsi@freebsd.org cc: Kern Sibbald Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: deischen@freebsd.org List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 18:35:57 -0000 On Wed, 27 Aug 2003, Nate Lawson wrote: > On Wed, 27 Aug 2003, Dan Langille wrote: > > On 27 Aug 2003 at 11:06, Nate Lawson wrote: > > > Here is a response I got by forwarding this to the pthreads maintainer: > > > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > > > The threads library isn't smart enough to know that the file > > > > is a tape device and that a 0 status should break it out of the > > > > loop. Thus, it continues writing. > > > > > > > > Use libkse :-) > > > > > > > > -- > > > > Dan Eischen > > > > Nate: thanks for getting in touch with him. > > > > It is interesting to note that the code works OK on Linux and > > Solaris. Why is FreeBSD different in this case? > > I don't know. Our pthreads implementation is purely userland so it's > likely that it is difficult to differentiate a non-blocking read from an > EOF. Correct. -- Dan Eischen From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 12:24:44 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3BBF016A4BF for ; Wed, 27 Aug 2003 12:24:44 -0700 (PDT) Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48]) by mx1.FreeBSD.org (Postfix) with ESMTP id DA04043FD7 for ; Wed, 27 Aug 2003 12:24:41 -0700 (PDT) (envelope-from kern@sibbald.com) Received: from [192.168.68.112] (rufus [192.168.68.112]) by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h7RJN3E17169; Wed, 27 Aug 2003 21:23:03 +0200 From: Kern Sibbald To: Nate Lawson In-Reply-To: <20030827110534.J31798@root.org> References: <3F4C6F55.17392.1EC66A86@localhost> <1061995529.1258.273.camel@rufus> <20030827110534.J31798@root.org> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-esrvvy33W9WLocWBVFPf" Organization: Message-Id: <1062012182.1227.363.camel@rufus> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 27 Aug 2003 21:23:03 +0200 cc: freebsd-scsi@freebsd.org Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 19:24:44 -0000 --=-esrvvy33W9WLocWBVFPf Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello, Below is my response: On Wed, 2003-08-27 at 20:06, Nate Lawson wrote: > On Wed, 27 Aug 2003, Kern Sibbald wrote: > > Many thanks for testing this ... > > > > It seems to me that your tests, clearly indicate that > > there is a problem even though you had no data loss. > > > > When you ran without -pthread, the status > > received by the program was a 0 with 141776 > > blocks written. This is correct. > > > > When you ran with -pthread, the status > > received by the program was a -1 with 142879 > > blocks written. This is "not correct". > > > > To me, that shows very clearly that with -pthread > > the 0 status was lost and more blocks were written. > > In fact, in this case so many blocks were written > > that the tape was not properly terminated with > > an EOF (actually two EOF marks). >=20 > Here is a response I got by forwarding this to the pthreads maintainer: > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > The threads library isn't smart enough to know that the file > > is a tape device and that a 0 status should break it out of the > > loop. Thus, it continues writing. Well, the threads library currently may not be smart enough to figure out that I want and need the 0 status, but I sure hope that you or someone else will figure out how to make it smarter. Best regards, Kern > > > > Use libkse :-) > > > > -- > > Dan Eischen >=20 > -Nate --=-esrvvy33W9WLocWBVFPf Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA/TQUWNgfoSvWqwEgRAhWAAKDWB/qPWlZnu1Z6T18pzeh/ySERewCg6Utc 73gODZnEoGAA43UXvWqfA7Q= =vBd5 -----END PGP SIGNATURE----- --=-esrvvy33W9WLocWBVFPf-- From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 12:27:59 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 579DF16A4BF for ; Wed, 27 Aug 2003 12:27:59 -0700 (PDT) Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48]) by mx1.FreeBSD.org (Postfix) with ESMTP id ADD1843FAF for ; Wed, 27 Aug 2003 12:27:57 -0700 (PDT) (envelope-from kern@sibbald.com) Received: from [192.168.68.112] (rufus [192.168.68.112]) by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h7RJRGE17180; Wed, 27 Aug 2003 21:27:16 +0200 From: Kern Sibbald To: Dan Langille In-Reply-To: <3F4CBD13.545.1FF6190E@localhost> References: <1061995529.1258.273.camel@rufus> <3F4CBD13.545.1FF6190E@localhost> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-A4pdlsyMtgx5kdp47sdd" Organization: Message-Id: <1062012436.1258.367.camel@rufus> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 27 Aug 2003 21:27:16 +0200 cc: freebsd-scsi@freebsd.org Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 19:27:59 -0000 --=-A4pdlsyMtgx5kdp47sdd Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello Dan, On Wed, 2003-08-27 at 20:15, Dan Langille wrote: > On 27 Aug 2003 at 11:06, Nate Lawson wrote: >=20 > > On Wed, 27 Aug 2003, Kern Sibbald wrote: > > > Many thanks for testing this ... > > > > > > It seems to me that your tests, clearly indicate that > > > there is a problem even though you had no data loss. > > > > > > When you ran without -pthread, the status > > > received by the program was a 0 with 141776 > > > blocks written. This is correct. > > > > > > When you ran with -pthread, the status > > > received by the program was a -1 with 142879 > > > blocks written. This is "not correct". > > > > > > To me, that shows very clearly that with -pthread > > > the 0 status was lost and more blocks were written. > > > In fact, in this case so many blocks were written > > > that the tape was not properly terminated with > > > an EOF (actually two EOF marks). > >=20 > > Here is a response I got by forwarding this to the pthreads maintainer: > > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > > The threads library isn't smart enough to know that the file > > > is a tape device and that a 0 status should break it out of the > > > loop. Thus, it continues writing. > > > > > > Use libkse :-) > > > > > > -- > > > Dan Eischen >=20 > Nate: thanks for getting in touch with him. >=20 > It is interesting to note that the code works OK on Linux and=20 > Solaris. Why is FreeBSD different in this case? >=20 > Kern: I can't comment on libkse. I don't know it and I don't know=20 > what effect it would have on Bacula. I cannot comment on libkse either since I don't know what it is, and it is not indexed in the FreeBSD man pages -- at least not under libkse. =20 Best regards, Kern --=-A4pdlsyMtgx5kdp47sdd Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA/TQYUNgfoSvWqwEgRAuEQAKCSFP+1EBD7jtXKQHnEGDLpJP85hwCeL6F/ LkM8JJWDnl6vW2zc2A9EG74= =huqD -----END PGP SIGNATURE----- --=-A4pdlsyMtgx5kdp47sdd-- From owner-freebsd-scsi@FreeBSD.ORG Wed Aug 27 12:38:19 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9A36B16A4BF for ; Wed, 27 Aug 2003 12:38:19 -0700 (PDT) Received: from matou.sibbald.com (matou.sibbald.com [195.202.201.48]) by mx1.FreeBSD.org (Postfix) with ESMTP id CD39243F75 for ; Wed, 27 Aug 2003 12:38:17 -0700 (PDT) (envelope-from kern@sibbald.com) Received: from [192.168.68.112] (rufus [192.168.68.112]) by matou.sibbald.com (8.11.6/8.11.6) with ESMTP id h7RJbWE17201; Wed, 27 Aug 2003 21:37:32 +0200 From: Kern Sibbald To: Nate Lawson In-Reply-To: <20030827112748.Y31947@root.org> References: <1061995529.1258.273.camel@rufus> <3F4CBD13.545.1FF6190E@localhost> <20030827112748.Y31947@root.org> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-EuLDxxy+ZrwF8iBR5gJ0" Organization: Message-Id: <1062013051.1226.376.camel@rufus> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 27 Aug 2003 21:37:31 +0200 cc: freebsd-scsi@freebsd.org Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Aug 2003 19:38:19 -0000 --=-EuLDxxy+ZrwF8iBR5gJ0 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello, On Wed, 2003-08-27 at 20:29, Nate Lawson wrote: > On Wed, 27 Aug 2003, Dan Langille wrote: > > On 27 Aug 2003 at 11:06, Nate Lawson wrote: > > > Here is a response I got by forwarding this to the pthreads maintaine= r: > > > > A return status of 0 from write is not interpreted as an End-Of-Tap= e. > > > > The threads library isn't smart enough to know that the file > > > > is a tape device and that a 0 status should break it out of the > > > > loop. Thus, it continues writing. > > > > > > > > Use libkse :-) > > > > > > > > -- > > > > Dan Eischen > > > > Nate: thanks for getting in touch with him. > > > > It is interesting to note that the code works OK on Linux and > > Solaris. Why is FreeBSD different in this case? >=20 > I don't know. Our pthreads implementation is purely userland so it's > likely that it is difficult to differentiate a non-blocking read from an > EOF. Perhaps FreeBSD is different from Linux, but my understanding of non-blocking reads is that you get a -1 status with errno set to EWOULDBLOCK (or EAGAIN), while an EOF should return a 0 status. Best regards, Kern >=20 > > Kern: I can't comment on libkse. I don't know it and I don't know > > what effect it would have on Bacula. >=20 > libkse is a drop-in replacement for libpthreads. Unfortunately, it's onl= y > on 5.x. >=20 > -Nate --=-EuLDxxy+ZrwF8iBR5gJ0 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQA/TQh7NgfoSvWqwEgRAu+FAKDhEIlefLc/sDpZh53NfdYFhWOnRwCgkovS oOlnKnH5C9pKBqEpM/Q2BFs= =eoFo -----END PGP SIGNATURE----- --=-EuLDxxy+ZrwF8iBR5gJ0-- From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 10:48:58 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EFBE816A4BF for ; Thu, 28 Aug 2003 10:48:58 -0700 (PDT) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1725043FF7 for ; Thu, 28 Aug 2003 10:48:58 -0700 (PDT) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h7SHmstp015243; Thu, 28 Aug 2003 13:48:56 -0400 (EDT) Date: Thu, 28 Aug 2003 13:48:54 -0400 (EDT) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Nate Lawson In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-scsi@freebsd.org cc: Kern Sibbald Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: deischen@freebsd.org List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 17:48:59 -0000 On Wed, 27 Aug 2003, Daniel Eischen wrote: > On Wed, 27 Aug 2003, Nate Lawson wrote: > > > On Wed, 27 Aug 2003, Dan Langille wrote: > > > On 27 Aug 2003 at 11:06, Nate Lawson wrote: > > > > Here is a response I got by forwarding this to the pthreads maintainer: > > > > > A return status of 0 from write is not interpreted as an End-Of-Tape. > > > > > The threads library isn't smart enough to know that the file > > > > > is a tape device and that a 0 status should break it out of the > > > > > loop. Thus, it continues writing. > > > > > > > > > > Use libkse :-) > > > > > > > > > > -- > > > > > Dan Eischen > > > > > > Nate: thanks for getting in touch with him. > > > > > > It is interesting to note that the code works OK on Linux and > > > Solaris. Why is FreeBSD different in this case? > > > > I don't know. Our pthreads implementation is purely userland so it's > > likely that it is difficult to differentiate a non-blocking read from an > > EOF. > > Correct. Can I ask a question? When writing to a character/block special file in non-blocking mode, are there any instances where 0 can be returned from the write() other than when writing to a tape device? The only way I can see to fix this in libc_r is to fstat() the descriptor when the threads library initializes it (uthread_fd.c) and save st_mode for that fd. Then in write() check to see if it is S_ISCHR() or S_ISBLK() and 0 was returned. It could break out of write() if that was the case and return 0 to the caller. But this doesn't work if you can get 0 back from a write to other devices. -- Dan Eischen From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 11:19:36 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CDE0F16A4BF for ; Thu, 28 Aug 2003 11:19:36 -0700 (PDT) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by mx1.FreeBSD.org (Postfix) with ESMTP id 38DA143FA3 for ; Thu, 28 Aug 2003 11:19:34 -0700 (PDT) (envelope-from freebsd-scsi@m.gmane.org) Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 19sRNY-0004di-00 for ; Thu, 28 Aug 2003 20:20:16 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-scsi@freebsd.org Received: from sea.gmane.org ([80.91.224.252]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 19sRFg-0004XA-00 for ; Thu, 28 Aug 2003 20:12:08 +0200 Received: from news by sea.gmane.org with local (Exim 3.35 #1 (Debian)) id 19sREy-0006IX-00 for ; Thu, 28 Aug 2003 20:11:24 +0200 From: Dan Nelson Date: Thu, 28 Aug 2003 13:11:19 -0500 Lines: 21 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5b) Gecko/20030824 X-Accept-Language: en-us, en In-Reply-To: Sender: news Subject: Re: (Fwd) Re: SCSI tape data loss X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 18:19:36 -0000 Daniel Eischen wrote: > Can I ask a question? When writing to a character/block special file > in non-blocking mode, are there any instances where 0 can be returned > from the write() other than when writing to a tape device? > > The only way I can see to fix this in libc_r is to fstat() the > descriptor when the threads library initializes it (uthread_fd.c) > and save st_mode for that fd. Then in write() check to see if > it is S_ISCHR() or S_ISBLK() and 0 was returned. It could > break out of write() if that was the case and return 0 to > the caller. But this doesn't work if you can get 0 back > from a write to other devices. I would be inclined to always pass a zero return from read or write back to the application; doesn't a read/write on a nonblocking device return EAGAIN if there's nothing to do? -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 11:51:32 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 085A016A4C0; Thu, 28 Aug 2003 11:51:32 -0700 (PDT) Received: from mail.sandvine.com (sandvine.com [199.243.201.138]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5DC1A43F85; Thu, 28 Aug 2003 11:51:28 -0700 (PDT) (envelope-from don@sandvine.com) Received: by mail.sandvine.com with Internet Mail Service (5.5.2653.19) id ; Thu, 28 Aug 2003 14:51:27 -0400 Message-ID: From: Don Bowman To: "'freebsd-scsi@freebsd.org'" , "'aic7xxx@freebsd.org'" Date: Thu, 28 Aug 2003 14:51:20 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Subject: Infinite interrupt loop, INTSTAT = 0 in ahd driver? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 18:51:32 -0000 In aic79xx.c, in ahd_pause_and_flushwork() there is a heuristic to prevent looping more than 1000 times. If this happens a message like "Infinite interrupt loop, INTSTAT = 0" is emitted. I am hitting this case. System has a aic7902. If i set the clock to 20MHz, disable wide negotiation, disable packetisation and qas, the system will come up. There appears to be no trouble for the bios to access the drive, it is only the driver that hits this case. The output is as below. The question is... what do i look for? The driver is from RELENG_4. This is very repeatable. Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0x0 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x80]:(SWTMINTMASK) SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x31]:(CURRFIFO_1|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 22 qinfifonext = 22 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0xf SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0xb0 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 2 qinfifonext = 2 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0x1 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 2 qinfifonext = 2 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0x2 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 2 qinfifonext = 2 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0x0 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 2 qinfifonext = 2 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Infinite interrupt loop, INTSTAT = 0(probe0:ahd0:0:0:1): SCB 0xf - timed out >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< ahd0: Dumping Card State at program address 0x1 Mode 0x33 Card was paused HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x0] SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x40]:(SELDO) SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x8]:(AIPERR) SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x2]:(LQOBUSFREE) LQOSTAT2[0x0] SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xf CURRSCB 0xf NEXTSCB 0x0 qinstart = 2 qinfifonext = 2 QINFIFO: WAITING_TID_QUEUES: 0 ( 0xf ) Pending list: 15 FIFO_USE[0x0] SCB_CONTROL[0x40]:(DISCENB) SCB_SCSIID[0x7] Total 1 Kernel Free SCB list: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 Sequencer Complete DMA-inprog list: Sequencer Complete list: Sequencer DMA-Up and Complete list: ahd0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x0 SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENS AVEPTRS) SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 SIMODE0[0xc]:(ENOVERRUN|ENIOERR) CCSCBCTL[0x4]:(CCSCBDIR) ahd0: REG0 == 0xf, SINDEX = 0x133, DINDEX = 0x102 ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xff38 CDB 12 20 0 0 24 0 STACK: 0x125 0x0 0x0 0x0 0x0 0x0 0x29 0x1 <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 12:49:20 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5D56316A4BF; Thu, 28 Aug 2003 12:49:20 -0700 (PDT) Received: from magic.adaptec.com (magic-mail.adaptec.com [216.52.22.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 999FF43FE0; Thu, 28 Aug 2003 12:49:19 -0700 (PDT) (envelope-from gibbs@scsiguy.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h7SJnJo27663; Thu, 28 Aug 2003 12:49:19 -0700 Received: from [10.100.253.70] (aslan.btc.adaptec.com [10.100.253.70]) by redfish.adaptec.com (8.8.8p2+Sun/8.8.8) with ESMTP id MAA11323; Thu, 28 Aug 2003 12:49:18 -0700 (PDT) Date: Thu, 28 Aug 2003 13:51:03 -0600 From: "Justin T. Gibbs" To: Don Bowman , "'freebsd-scsi@freebsd.org'" , "'aic7xxx@freebsd.org'" Message-ID: <1509578112.1062100263@aslan.btc.adaptec.com> In-Reply-To: References: X-Mailer: Mulberry/3.1.0b5 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline Subject: Re: Infinite interrupt loop, INTSTAT = 0 in ahd driver? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: "Justin T. Gibbs" List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 19:49:20 -0000 > In aic79xx.c, in ahd_pause_and_flushwork() there > is a heuristic to prevent looping more than 1000 times. > If this happens a message like > "Infinite interrupt loop, INTSTAT = 0" is emitted. > > I am hitting this case. > System has a aic7902. If i set the clock to 20MHz, > disable wide negotiation, disable packetisation and > qas, the system will come up. There appears to > be no trouble for the bios to access the drive, it > is only the driver that hits this case. The BIOS does not operate in packetized mode. It also only sends one trasaction at a time. This roughly equivalent to the behavior you've setup for the driver with your settings in SCSI-Select. The ahd_pause_and_flushwork() routine is only called from timeouts. While there may be a bug in this routine, it is not the root cause of your failure. What drives are you using? Is the controller operating in PCI or PCI-X mode? Are there any other busmasters on the same PCI(-X) segment? What chipset is on your MB (include revision numbers if your system is using the P64H2 PCI-X hub)? You might avoid the loop problem with this change: do { struct scb *waiting_scb; + /* + * Give the sequencer some time to service + * any active selections. + */ ahd_unpause(ahd); + ahd_delay(200); + ahd_intr(ahd); ahd_pause(ahd); But you should continue to look into the root cause of your failure. -- Justin From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 12:57:11 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B9FEE16A4BF; Thu, 28 Aug 2003 12:57:11 -0700 (PDT) Received: from mail.sandvine.com (sandvine.com [199.243.201.138]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9208843FB1; Thu, 28 Aug 2003 12:57:10 -0700 (PDT) (envelope-from don@sandvine.com) Received: by mail.sandvine.com with Internet Mail Service (5.5.2653.19) id ; Thu, 28 Aug 2003 15:57:09 -0400 Message-ID: From: Don Bowman To: "'Justin T. Gibbs'" , Don Bowman , "'freebsd-scsi@freebsd.org'" , "'aic7xxx@freebsd.org'" Date: Thu, 28 Aug 2003 15:57:01 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Subject: RE: Infinite interrupt loop, INTSTAT = 0 in ahd driver? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 19:57:11 -0000 > From: Justin T. Gibbs [mailto:gibbs@scsiguy.com] > > In aic79xx.c, in ahd_pause_and_flushwork() there > > is a heuristic to prevent looping more than 1000 times. > > If this happens a message like > > "Infinite interrupt loop, INTSTAT = 0" is emitted. > > > > I am hitting this case. > > System has a aic7902. If i set the clock to 20MHz, > > disable wide negotiation, disable packetisation and > > qas, the system will come up. There appears to > > be no trouble for the bios to access the drive, it > > is only the driver that hits this case. > > The BIOS does not operate in packetized mode. It also only > sends one trasaction at a time. This roughly equivalent > to the behavior you've setup for the driver with your > settings in SCSI-Select. > > The ahd_pause_and_flushwork() routine is only called from > timeouts. While there may be a bug in this routine, it > is not the root cause of your failure. What drives are > you using? Is the controller operating in PCI or PCI-X > mode? Are there any other busmasters on the same PCI(-X) > segment? What chipset is on your MB (include revision numbers > if your system is using the P64H2 PCI-X hub)? > P64H2 is B1 rev [rev 4]. Its e7501 chipset. its pci-x. its the only master on the bus. chip0@pci0:0:0: class=0x060000 card=0x358015d9 chip=0x254c8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'E7501 Host Controller' class = bridge subclass = HOST-PCI none0@pci0:0:1: class=0xff0000 card=0x358015d9 chip=0x25418086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'E7500 System Controller (MCH, Hub Interface A) Error Reporter' pcib1@pci0:2:0: class=0x060400 card=0x00000000 chip=0x25438086 rev=0x01 hdr=0x01 vendor = 'Intel Corporation' device = 'E7500/E7501 HI_B Virtual PCI-to-PCI Bridge' class = bridge subclass = PCI-PCI none1@pci0:29:0: class=0x0c0300 card=0x358015d9 chip=0x24828086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA/CAM (ICH3-S/ICH3-M) USB Controller #1' class = serial bus subclass = USB none2@pci0:29:1: class=0x0c0300 card=0x358015d9 chip=0x24848086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA/CAM (ICH3-S/ICH3-M) USB Controller #2' class = serial bus subclass = USB none3@pci0:29:2: class=0x0c0300 card=0x358015d9 chip=0x24878086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA/CAM (ICH3-S/ICH3-M) USB Controller #3' class = serial bus subclass = USB pcib4@pci0:30:0: class=0x060400 card=0x00000000 chip=0x244e8086 rev=0x42 hdr=0x01 vendor = 'Intel Corporation' device = '82801BA/CA/DB (ICH2/3/4) Hub Interface to PCI Bridge (244E)' class = bridge subclass = PCI-PCI isab0@pci0:31:0: class=0x060100 card=0x00000000 chip=0x24808086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA/CAM (ICH3-S/ICH3-M) LPC Interface' class = bridge subclass = PCI-ISA atapci0@pci0:31:1: class=0x01018a card=0x358015d9 chip=0x248b8086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA (ICH3) UltraATA/100 EIDE Controller' class = mass storage subclass = ATA ichsmb0@pci0:31:3: class=0x0c0500 card=0x358015d9 chip=0x24838086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801CA/CAM (ICH3-S/ICH3-M) SMBus Controller' class = serial bus subclass = SMBus none4@pci1:28:0: class=0x080020 card=0x358015d9 chip=0x14618086 rev=0x04 hdr=0x00 vendor = 'Intel Corporation' device = '82870P2 I/OxAPIC Interrupt Controller' class = base peripheral subclass = interrupt controller pcib2@pci1:29:0: class=0x060400 card=0x00000050 chip=0x14608086 rev=0x04 hdr=0x01 vendor = 'Intel Corporation' device = '82870P2 P64H2 PCI/PCI-X Hub Controller' class = bridge subclass = PCI-PCI none5@pci1:30:0: class=0x080020 card=0x358015d9 chip=0x14618086 rev=0x04 hdr=0x00 vendor = 'Intel Corporation' device = '82870P2 I/OxAPIC Interrupt Controller' class = base peripheral subclass = interrupt controller pcib3@pci1:31:0: class=0x060400 card=0x00000050 chip=0x14608086 rev=0x04 hdr=0x01 vendor = 'Intel Corporation' device = '82870P2 P64H2 PCI/PCI-X Hub Controller' class = bridge subclass = PCI-PCI em0@pci2:1:0: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82546EB Gigabit Ethernet Controller (copper)' class = network subclass = ethernet em1@pci2:1:1: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82546EB Gigabit Ethernet Controller (copper)' class = network subclass = ethernet em2@pci2:3:0: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82546EB Gigabit Ethernet Controller (copper)' class = network subclass = ethernet em3@pci2:3:1: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82546EB Gigabit Ethernet Controller (copper)' class = network subclass = ethernet ahd0@pci3:2:0: class=0x010000 card=0x005f9005 chip=0x801f9005 rev=0x03 hdr=0x00 vendor = 'Adaptec' device = 'AIC-7902 Ultra320 SCSI Host Adapter' class = mass storage subclass = SCSI ahd1@pci3:2:1: class=0x010000 card=0x005f9005 chip=0x801f9005 rev=0x03 hdr=0x00 vendor = 'Adaptec' device = 'AIC-7902 Ultra320 SCSI Host Adapter' class = mass storage subclass = SCSI none6@pci4:1:0: class=0x030000 card=0x00081002 chip=0x47521002 rev=0x27 hdr=0x00 vendor = 'ATI Technologies' device = 'Rage XL PCI' class = display subclass = VGA From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 13:11:57 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 62E9716A4BF; Thu, 28 Aug 2003 13:11:57 -0700 (PDT) Received: from magic.adaptec.com (magic-mail.adaptec.com [216.52.22.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id BACED43FEA; Thu, 28 Aug 2003 13:11:56 -0700 (PDT) (envelope-from gibbs@scsiguy.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h7SKBuo24097; Thu, 28 Aug 2003 13:11:56 -0700 Received: from [10.100.253.70] (aslan.btc.adaptec.com [10.100.253.70]) by redfish.adaptec.com (8.8.8p2+Sun/8.8.8) with ESMTP id NAA00796; Thu, 28 Aug 2003 13:11:55 -0700 (PDT) Date: Thu, 28 Aug 2003 14:13:07 -0600 From: "Justin T. Gibbs" To: Don Bowman , "'freebsd-scsi@freebsd.org'" , "'aic7xxx@freebsd.org'" Message-ID: <1522128112.1062101587@aslan.btc.adaptec.com> In-Reply-To: References: X-Mailer: Mulberry/3.1.0b5 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline Subject: RE: Infinite interrupt loop, INTSTAT = 0 in ahd driver? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: "Justin T. Gibbs" List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 20:11:57 -0000 >> The ahd_pause_and_flushwork() routine is only called from >> timeouts. While there may be a bug in this routine, it >> is not the root cause of your failure. What drives are >> you using? Is the controller operating in PCI or PCI-X >> mode? Are there any other busmasters on the same PCI(-X) >> segment? What chipset is on your MB (include revision numbers >> if your system is using the P64H2 PCI-X hub)? >> > > P64H2 is B1 rev [rev 4]. > Its e7501 chipset. > its pci-x. > its the only master on the bus. And the drives? -- Justin From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 13:41:11 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BEFBA16A4BF; Thu, 28 Aug 2003 13:41:11 -0700 (PDT) Received: from mail.sandvine.com (sandvine.com [199.243.201.138]) by mx1.FreeBSD.org (Postfix) with ESMTP id C9E5043FE9; Thu, 28 Aug 2003 13:41:10 -0700 (PDT) (envelope-from don@sandvine.com) Received: by mail.sandvine.com with Internet Mail Service (5.5.2653.19) id ; Thu, 28 Aug 2003 16:41:10 -0400 Message-ID: From: Don Bowman To: "'Justin T. Gibbs'" , Don Bowman , "'freebsd-scsi@freebsd.org'" , "'aic7xxx@freebsd.org'" Date: Thu, 28 Aug 2003 16:41:08 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Subject: RE: Infinite interrupt loop, INTSTAT = 0 in ahd driver? X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Aug 2003 20:41:11 -0000 > From: Justin T. Gibbs [mailto:gibbs@scsiguy.com] > >> The ahd_pause_and_flushwork() routine is only called from > >> timeouts. While there may be a bug in this routine, it > >> is not the root cause of your failure. What drives are > >> you using? Is the controller operating in PCI or PCI-X > >> mode? Are there any other busmasters on the same PCI(-X) > >> segment? What chipset is on your MB (include revision numbers > >> if your system is using the P64H2 PCI-X hub)? > >> > > > > P64H2 is B1 rev [rev 4]. > > Its e7501 chipset. > > its pci-x. > > its the only master on the bus. > > And the drives? SEAGATE ST318453LW U320 15KRPM CHEETAH 18GB with rev 5 rom. From owner-freebsd-scsi@FreeBSD.ORG Thu Aug 28 21:10:24 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0E9D116A4BF for ; Thu, 28 Aug 2003 21:10:24 -0700 (PDT) Received: from axl.seasidesoftware.co.za (axl.seasidesoftware.co.za [196.31.7.201]) by mx1.FreeBSD.org (Postfix) with ESMTP id BCED744003 for ; Thu, 28 Aug 2003 21:10:21 -0700 (PDT) (envelope-from sheldonh@starjuice.net) Received: from sheldonh by axl.seasidesoftware.co.za with local (Exim 4.22) id 19saaX-0000Im-MX for freebsd-scsi@FreeBSD.org; Fri, 29 Aug 2003 06:10:17 +0200 Date: Fri, 29 Aug 2003 06:10:17 +0200 From: Sheldon Hearn To: freebsd-scsi@FreeBSD.org Message-ID: <20030829041017.GL93028@starjuice.net> Mail-Followup-To: freebsd-scsi@FreeBSD.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.4i Sender: Sheldon Hearn Subject: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Aug 2003 04:10:24 -0000 Hi there, I've just installed a fresh 4.8-RELEASE on a dual-Xeon (2.6GHz) with 4GB of RAM. I'm using the aac driver to support an Adaptec 2120S: aac0: mem 0xd0000000-0xd3ffffff irq 2 at device 2.0 on pci4 aac0: i960RX 100MHz, 48MB cache memory, optional battery present aac0: Kernel 4.0-0, Build 6003, S/N b7e76e When I try to boot an SMP kernel (with or without HTT enabled in the BIOS and kernel), the system fails to boot. On serial console, I see messages that look like this: aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds I've googled around, and the advice I've seen is: 1) Make sure your drives have enough power, and 2) Flash up the firmware of your drives. I'll get the guys at our colo to flash the drive firmware and confirm that the PSU is a 350W, but I'm under serious time pressure and wanted to ask in advance whether this is likely to solve my problem, or whether there are other likely candidates I should investigate. On the plus side, this box as a FreeBSD installation is faring much better than it did as a Windows 2000 Advanced Server. I couldn't even get Apache2 to start up more than 200 threads. Try downgrading Windows 2000 from multiprocessor to uniprocessor over serial console. ;-) TIA, Sheldon. From owner-freebsd-scsi@FreeBSD.ORG Fri Aug 29 01:31:15 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DD63416A4BF for ; Fri, 29 Aug 2003 01:31:14 -0700 (PDT) Received: from mail.messagingengine.com (out1.smtp.messagingengine.com [66.111.4.25]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1046D43F3F for ; Fri, 29 Aug 2003 01:31:14 -0700 (PDT) (envelope-from freebsd@soith.com) Received: from www.fastmail.fm (localhost [127.0.0.1]) by localhost.localdomain (Postfix) with ESMTP id 8727614D9A2; Fri, 29 Aug 2003 04:30:51 -0400 (EDT) Received: from 10.202.2.132 ([10.202.2.132] helo=www.fastmail.fm) by messagingengine.com with SMTP; Fri, 29 Aug 2003 04:30:51 -0400 Received: by www.fastmail.fm (Postfix, from userid 99) id 4ACFC3A1A5; Fri, 29 Aug 2003 04:30:51 -0400 (EDT) Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MIME::Lite 1.2 (F2.71; T1.001; A1.51; B2.12; Q2.03) From: "Aaron Wohl" To: "Sheldon Hearn" , freebsd-scsi@FreeBSD.org Date: Fri, 29 Aug 2003 02:30:51 -0600 X-Epoch: 1062145851 X-Sasl-enc: wKhRn0HlDx90lDVYkZmwFg References: <20030829041017.GL93028@starjuice.net> In-Reply-To: <20030829041017.GL93028@starjuice.net> Message-Id: <20030829083051.4ACFC3A1A5@www.fastmail.fm> Subject: Re: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Aug 2003 08:31:15 -0000 Yeah im getting 2-3 aac driver related crashes a day now with -current on a 5400s. I was seeing that "aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds" as well. I did a cvsup and rebuild/install yesterday. Im not getting that now but still geting "command not in queue" panics. from an adaptic 5400S. AAC0> controller details Executing: controller details Controller Information ---------------------- Remote Computer: S Device Name: S Controller Type: Adaptec 5400S Access Mode: READ-WRITE Controller Serial Number: Last Six Digits = 6B1825 Number of Buses: 4 Devices per Bus: 15 Controller CPU: Strong Arm 110 Controller CPU Speed: 233 Mhz Controller Memory: 144 Mbytes Battery State: Ok Component Revisions ------------------- CLI: 1.0-0 (Build #5263) API: 1.0-0 (Build #5263) Miniport Driver: 1.0-0 (Build #5262) Controller Software: 1.0-0 (Build #5262) Controller BIOS: 1.0-0 (Build #5262) Controller Firmware: (Build #5262) Controller Hardware: 3.3 AAC0> uname -a (hostname edited) FreeBSD xxx 5.1-CURRENT FreeBSD 5.1-CURRENT #34: Wed Aug 27 17:26:58 EDT 2003 xxx:/usr/obj/usr/src/sys/PASODOBLE i386 *** email I sent this morning to the vendor we got our SMP hardware from ** We are getting 2-3 crashes a day in the aac driver on the machine we thought about replacing the processor on. Ive read all the goings on on the -current lists etc and trieed asking there about it. The crashes happen when doing heavy scsi io. Either with disk intensive mysql jobs or using the tape drive (amanda). Each time the panic is "panic: command not in queue" from the aac driver. The other server we got from you is not having these crashes. But we havent updated the OS on it since Fri Aug 1 19:50:58 EDT 2003. Its interesting that the stack backtrace for this crash ALWAYS has fork_exit in the stack backtrace. Its trying to remove a command from the response queue, but the item in the response queue has a santity check that says which queue its on and its not listed as being on the that queue. I think you mentioned you where shipping 5.x on your server now? Do you get -current or is there a specific date/time for cvs checkout of the operating system sources. Id read the stuff on the -current list about having INVARIANTs on pissing off the scsi driver due to new restrictions on doing INVARIANT checks from drivers. I tried building a kernel with INVARIANT off but it didnt help. panic: command not in queue panic messages: --- dmesg: kvm_read: --- Reading symbols from /usr/obj/usr/src/sys/PASODOBLE/modules/usr/src/sys/modules/acpi/acpi.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/PASODOBLE/modules/usr/src/sys/modules/acpi/acpi.ko.debug Reading symbols from /boot/kernel/green_saver.ko...done. Loaded symbols for /boot/kernel/green_saver.ko #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 240 dumping++; (kgdb) where #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 #1 0xc0332b41 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372 #2 0xc0332f98 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #3 0xc01676b4 in aac_complete (context=0xcb918000, pending=1) at /usr/src/sys/dev/aac/aacvar.h:535 #4 0xc03599ed in taskqueue_run (queue=0xc6768780) at /usr/src/sys/kern/subr_taskqueue.c:205 #5 0xc0359ac3 in taskqueue_swi_run (dummy=0x0) at /usr/src/sys/kern/subr_taskqueue.c:221 #6 0xc031c8d8 in ithread_loop (arg=0xc6768700) at /usr/src/sys/kern/kern_intr.c:534 #7 0xc031b511 in fork_exit (callout=0xc031c700 , arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 (kgdb) From owner-freebsd-scsi@FreeBSD.ORG Fri Aug 29 04:31:38 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8AD8816A4C0 for ; Fri, 29 Aug 2003 04:31:38 -0700 (PDT) Received: from axl.seasidesoftware.co.za (axl.seasidesoftware.co.za [196.31.7.201]) by mx1.FreeBSD.org (Postfix) with ESMTP id 844D743FDF for ; Fri, 29 Aug 2003 04:31:37 -0700 (PDT) (envelope-from sheldonh@starjuice.net) Received: from sheldonh by axl.seasidesoftware.co.za with local (Exim 4.22) id 19shTa-0001WR-7B; Fri, 29 Aug 2003 13:31:34 +0200 Date: Fri, 29 Aug 2003 13:31:34 +0200 From: Sheldon Hearn To: Aaron Wohl Message-ID: <20030829113134.GB5234@starjuice.net> Mail-Followup-To: Aaron Wohl , freebsd-scsi@FreeBSD.org References: <20030829041017.GL93028@starjuice.net> <20030829083051.4ACFC3A1A5@www.fastmail.fm> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030829083051.4ACFC3A1A5@www.fastmail.fm> User-Agent: Mutt/1.5.4i Sender: Sheldon Hearn cc: freebsd-scsi@FreeBSD.org Subject: Re: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Aug 2003 11:31:38 -0000 On (2003/08/29 02:30), Aaron Wohl wrote: > Yeah im getting 2-3 aac driver related crashes a day now with -current on > a 5400s. > > I was seeing that "aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds" > as well. I did a cvsup and rebuild/install yesterday. Im not getting > that now but still geting "command not in queue" panics. from an adaptic > 5400S. Sorry, I should have mentioned that I'm using 4.8-RELEASE (soon to be 4.8-STABLE). > AAC0> controller details > Executing: controller details > Controller Information What tool are you using to get this? Ciao, Sheldon. From owner-freebsd-scsi@FreeBSD.ORG Fri Aug 29 07:15:36 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0E16716A4BF for ; Fri, 29 Aug 2003 07:15:36 -0700 (PDT) Received: from smtp.mho.com (smtp.mho.net [64.58.4.6]) by mx1.FreeBSD.org (Postfix) with SMTP id 0E23A43FBD for ; Fri, 29 Aug 2003 07:15:35 -0700 (PDT) (envelope-from scottl@freebsd.org) Received: (qmail 43970 invoked by uid 1002); 29 Aug 2003 14:15:34 -0000 Received: from unknown (HELO freebsd.org) (64.58.1.252) by smtp.mho.net with SMTP; 29 Aug 2003 14:15:34 -0000 Message-ID: <3F4F6009.90809@freebsd.org> Date: Fri, 29 Aug 2003 08:15:37 -0600 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Aaron Wohl References: <20030829041017.GL93028@starjuice.net> <20030829083051.4ACFC3A1A5@www.fastmail.fm> In-Reply-To: <20030829083051.4ACFC3A1A5@www.fastmail.fm> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-scsi@freebsd.org cc: Sheldon Hearn Subject: Re: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Aug 2003 14:15:36 -0000 Aaron Wohl wrote: > Yeah im getting 2-3 aac driver related crashes a day now with -current on > a 5400s. > > I was seeing that "aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds" > as well. I did a cvsup and rebuild/install yesterday. Im not getting > that now but still geting "command not in queue" panics. from an adaptic > 5400S. This is all quite serious. Did the driver ever work for you? Is this an SMP machine? Is there a reproducable test case that I could use to debug it locally? Scott From owner-freebsd-scsi@FreeBSD.ORG Fri Aug 29 08:43:48 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5C5BF16A4C0; Fri, 29 Aug 2003 08:43:48 -0700 (PDT) Received: from mail.messagingengine.com (out1.smtp.messagingengine.com [66.111.4.25]) by mx1.FreeBSD.org (Postfix) with ESMTP id B948143FE3; Fri, 29 Aug 2003 08:43:44 -0700 (PDT) (envelope-from freebsd@soith.com) Received: from www.fastmail.fm (localhost [127.0.0.1]) by localhost.localdomain (Postfix) with ESMTP id 2E6B6143C64; Fri, 29 Aug 2003 11:43:40 -0400 (EDT) Received: from 10.202.2.132 ([10.202.2.132] helo=www.fastmail.fm) by messagingengine.com with SMTP; Fri, 29 Aug 2003 11:43:40 -0400 Received: by www.fastmail.fm (Postfix, from userid 99) id E839C3A1DD; Fri, 29 Aug 2003 11:43:38 -0400 (EDT) Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MIME::Lite 1.2 (F2.71; T1.001; A1.51; B2.12; Q2.03) From: "Aaron Wohl" To: "Scott Long" Date: Fri, 29 Aug 2003 09:43:38 -0600 X-Epoch: 1062171820 X-Sasl-enc: TlaULoTEpR84wVQ4787VCQ References: <20030829041017.GL93028@starjuice.net> <20030829083051.4ACFC3A1A5@www.fastmail.fm> <3F4F6009.90809@freebsd.org> In-Reply-To: <3F4F6009.90809@freebsd.org> Message-Id: <20030829154338.E839C3A1DD@www.fastmail.fm> cc: freebsd-scsi@freebsd.org cc: Sheldon Hearn Subject: Re: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Aug 2003 15:43:48 -0000 Yes it worked ok. We got two servers both 2U SMP Xeon 2800mhz with adaptec 5400S controllers from Lanny @ freebsdsystems.com. The one thats working ok was built from -current at Fri Aug 1 19:50:58 EDT 2003, its in production hard to get any time on it to tinker. The 2nd system (the one thats been crashing) is the same hardware plus a scsi tape plugged into the scsi controller on the motherboard. Its been crashing lately with "command not in queue" but ive seeh the aac0 COMMAND ... TIMEOUT as well. As for repeating it... the machine thats crashing tends to crash if I dump and load a 1.5 gbyte mysql table, and or do amanda tape backups. Im sorry thats not much to go on. Ive been try to get thru the weekly amanda dump to tape each day this week but its crashing with "command not in queue". Here is a stack backtrace: gdb -k /usr/obj/usr/src/sys/PASODOBLE/kernel.debug vmcore.9 ... dmesg: kvm_read: --- Reading symbols from /usr/obj/usr/src/sys/PASODOBLE/modules/usr/src/sys/modules/acpi/acpi.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/PASODOBLE/modules/usr/src/sys/modules/acpi/acpi.ko.debug Reading symbols from /boot/kernel/green_saver.ko...done. Loaded symbols for /boot/kernel/green_saver.ko #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 240 dumping++; (kgdb) where #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 #1 0xc0332b41 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372 #2 0xc0332f98 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #3 0xc01676b4 in aac_complete (context=0xcb918000, pending=1) at /usr/src/sys/dev/aac/aacvar.h:535 #4 0xc03599ed in taskqueue_run (queue=0xc6768780) at /usr/src/sys/kern/subr_taskqueue.c:205 #5 0xc0359ac3 in taskqueue_swi_run (dummy=0x0) at /usr/src/sys/kern/subr_taskqueue.c:221 #6 0xc031c8d8 in ithread_loop (arg=0xc6768700) at /usr/src/sys/kern/kern_intr.c:534 #7 0xc031b511 in fork_exit (callout=0xc031c700 , arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796 (kgdb) On Fri, 29 Aug 2003 08:15:37 -0600, "Scott Long" said: > Aaron Wohl wrote: > > > Yeah im getting 2-3 aac driver related crashes a day now with -current on > > a 5400s. > > > > I was seeing that "aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds" > > as well. I did a cvsup and rebuild/install yesterday. Im not getting > > that now but still geting "command not in queue" panics. from an adaptic > > 5400S. > > This is all quite serious. Did the driver ever work for you? Is this > an SMP machine? Is there a reproducable test case that I could use to > debug it locally? > > Scott > > From owner-freebsd-scsi@FreeBSD.ORG Sat Aug 30 07:59:51 2003 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3A8D216A4BF for ; Sat, 30 Aug 2003 07:59:51 -0700 (PDT) Received: from mail.webjockey.net (mail.webjockey.net [208.141.46.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2ED0B43FDF for ; Sat, 30 Aug 2003 07:59:50 -0700 (PDT) (envelope-from gary@outloud.org) Received: from nebula-evojtguv.outloud.org (wv-mrtnbrg-cmts1a-a-246.shphwv.adelphia.net [68.67.224.246]) by mail.webjockey.net (8.12.9/8.12.8) with ESMTP id h7UExkCO098264; Sat, 30 Aug 2003 10:59:47 -0400 (EDT) (envelope-from gary@outloud.org) Message-Id: <6.0.0.14.0.20030830105516.02042c78@localhost> X-Sender: ancient/208.141.46.254@localhost X-Mailer: QUALCOMM Windows Eudora Version 6.0.0.14 (Beta) Date: Sat, 30 Aug 2003 10:59:42 -0400 To: Sheldon Hearn From: Gary Stanley In-Reply-To: <20030829041017.GL93028@starjuice.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed cc: freebsd-scsi@freebsd.org Subject: Re: SMP, the aac driver and command timeouts X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 30 Aug 2003 14:59:51 -0000 I used to get this problem on our 2120's/2200 cards. After doing so much to try and fix the problem, my employers were forced to switch to other brands to get rid of the "command timeout" errors. We've tried various Ultra320 SCSI drives (Maxtor, Seagate) and both of them had the same problem, after swapping cables/backplanes/machines the problem was traced to the cards (even with updated controller firmware, the problem was still there.) All machines were supermicro computers with 1 or 2 Xeons and 2GB of ram (some had 2GB, some had 4GB) The only thing that really bothered me was FreeBSD seem to really dislike (or maybe the aac driver) being installed onto a RAID5 array that was building with many members (I would get panic's and command timeout errors during the boot after the install.. ) YMMV, tho. At 12:10 AM 8/29/2003, you wrote: >Hi there, > >I've just installed a fresh 4.8-RELEASE on a dual-Xeon (2.6GHz) with 4GB >of RAM. > >I'm using the aac driver to support an Adaptec 2120S: > >aac0: mem 0xd0000000-0xd3ffffff irq 2 at > device 2.0 on pci4 >aac0: i960RX 100MHz, 48MB cache memory, optional battery present >aac0: Kernel 4.0-0, Build 6003, S/N b7e76e > >When I try to boot an SMP kernel (with or without HTT enabled in the >BIOS and kernel), the system fails to boot. > >On serial console, I see messages that look like this: > >aac0 ... COMMAND 0x...... TIMEOUT AFTER ... seconds > >I've googled around, and the advice I've seen is: > >1) Make sure your drives have enough power, and >2) Flash up the firmware of your drives. > >I'll get the guys at our colo to flash the drive firmware and confirm >that the PSU is a 350W, but I'm under serious time pressure and wanted >to ask in advance whether this is likely to solve my problem, or whether >there are other likely candidates I should investigate. > >On the plus side, this box as a FreeBSD installation is faring much >better than it did as a Windows 2000 Advanced Server. I couldn't even >get Apache2 to start up more than 200 threads. Try downgrading Windows >2000 from multiprocessor to uniprocessor over serial console. ;-) > >TIA, >Sheldon. >_______________________________________________ >freebsd-scsi@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-scsi >To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"