Date: Sun, 10 Aug 2008 16:41:59 -0700
From: Jeremy Chadwick
To: Larry Rosenman
Cc: freebsd-stable@FreeBSD.org
Subject: Re: ICRC's

On Sun, Aug 10, 2008 at 06:01:34PM -0500, Larry Rosenman wrote:
> I'm getting the following on a zpool scrub:
>
> ad8: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=54817587
>
> I replaced the drive at ad8 because the original one would get an ICRC and then hang the bus.
>
> Model Family:     Seagate Barracuda 7200.10 family
> Device Model:     ST3500630AS
>
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   105   100   006    Pre-fail  Always       -       9366477
>   7 Seek_Error_Rate         0x000f   063   060   030    Pre-fail  Always       -       2364626
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       41
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       7
> 190 Airflow_Temperature_Cel 0x0022   064   061   045    Old_age   Always       -       36 (Lifetime Min/Max 35/39)
> 194 Temperature_Celsius     0x0022   036   040   000    Old_age   Always       -       36 (0 32 0 0)
> 195 Hardware_ECC_Recovered  0x001a   068   064   000    Old_age   Always       -       207627383
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       94
>
> Error 110 occurred at disk power-on lifetime: 41 hours (1 days + 17 hours)
>   When the command that caused the error occurred, the device was active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   84 51 0f fe e7 36 49  Error: ICRC, ABRT 15 sectors at LBA = 0x0936e7fe = 154593278
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 00 0d e7 36 49 00      01:23:46.872  READ DMA
>   c8 00 00 0d e6 36 49 00      01:23:46.871  READ DMA
>   c8 00 00 0d e5 36 49 00      01:23:46.871  READ DMA
>   c8 00 00 0d e4 36 49 00      01:23:46.870  READ DMA
>   c8 00 00 0d e3 36 49 00      01:23:46.853  READ DMA
>
> Ideas?
>
> This is on a SuperMicro SYS-7045-TR+

You have one or more of the following:

1. Faulty ATA cable
2. Faulty ATA port
3. Faulty ATA controller (doubtful, unless the errors are specific to
   one role (e.g. master or slave))
4. A second disk which is equally bad (came from the same manufacturing
   batch, which is very likely if the drive is of the same vendor,
   model type, and manufacturing date (within a month or two))

The disk's SMART error log even confirms the DMA errors, which proves
there is in fact a problem with one of the above.  In this particular
case, it's not FreeBSD.  :-)

My recommendation:

* Try another disk from a different manufacturer (not Seagate)
* If similar errors appear using that disk, the problem is one of
  items 1, 2, or 3.
* If no errors appear, it's item 4, in which case send the disk to
  Seagate for RMA; their SeaTools utility, on a full scan, should
  definitely return an error code which you can give to Support when
  filing for the RMA.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
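
For anyone reading the quoted SMART error log by hand, the register dump
(ER ST SC SN CL CH DH) can be decoded with a few lines of Python.  This
is only a sketch, and it assumes 28-bit LBA addressing (SN = LBA bits
0-7, CL = bits 8-15, CH = bits 16-23, low nibble of DH = bits 24-27) and
the ATA error-register bit positions for ICRC (bit 7) and ABRT (bit 2).
The function names are made up for illustration; they are not part of
smartctl or any other tool.

```python
# Decode the ATA register dump from the quoted SMART error log entry.
# Assumption: 28-bit LBA addressing, as used by the READ DMA (0xc8) command.

# Error-register bits relevant to this log entry (only these two are mapped).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error during an Ultra DMA transfer
    0x04: "ABRT",  # command aborted
}


def decode_lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the SN, CL, CH and DH registers."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def decode_error(er):
    """Return the names of the mapped error bits set in ER, highest bit first."""
    return [name for bit, name in sorted(ERROR_BITS.items(), reverse=True)
            if er & bit]


if __name__ == "__main__":
    # Values copied from the log line: 84 51 0f fe e7 36 49
    er, st, sc, sn, cl, ch, dh = 0x84, 0x51, 0x0F, 0xFE, 0xE7, 0x36, 0x49
    lba = decode_lba28(sn, cl, ch, dh)
    # The LBA comes out as 154593278, the same value smartctl printed.
    print(decode_error(er), lba)
```

The decoded flags match what smartctl reported.  ICRC is raised when the
CRC check on the data transferred over the interface fails, which is why
the cable, port, and controller are on the list of suspects alongside
the disk itself.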