From: "Rick C. Petty"
To: Miroslav Lachman <000.fbsd@quip.cz>
Cc: freebsd-geom@freebsd.org
Date: Thu, 3 Aug 2006 12:10:25 -0500
Subject: Re: gmirror Cannot add disk ad5 to gm0 (error=22)

On Thu, Aug 03, 2006 at 11:12:43AM +0200, Miroslav Lachman wrote:
> Rick C. Petty wrote:
>
> >What other activity is happening on the box? Are you in the middle of a
> >background fsck?
>
> Almost no other activities, system has installed apache, mysql, postfix
> etc., but not serving any requests. Fsck was not running.

But was any other process hitting the disk? You could try doing the
synchronization in single-user mode and see if the throughput jumps up.

> Now it seems that it is disk problem this time.
[snip]
> After six hours I got message from smartd
> Device: /dev/ad5, FAILED SMART self-check. BACK UP DATA NOW!
> Device: /dev/ad5, 52 Currently unreadable (pending) sectors
> Device: /dev/ad5, 52 Offline uncorrectable sectors
[snip]
> smartd[506]: Device: /dev/ad5, failed to read SMART Attribute Data
>
> In MRTG graphs I got disk temperature (38°C) and Reallocated Sector
> Count which is increasing from time of synchronization start and after 5
> hours the number of reallocated sectors goes above 2000! (out of range
> of the graph)

This certainly sounds like a disk-related problem. Your previous failures
were likely due to the same problem. Time to send the disks back to the
manufacturer for replacement. :-/

> After manual reboot, there is no ad5 device. I hope new drive helps, but
> I am still nervous, because I have similar troubles with 2 machines
> (both replaced with new one - so I played with 4 machines)...

Chance of one "new" disk being bad: pretty low. Chance of two new disks
being bad: even lower. Chance of three or more disks going bad around the
same time because they share a cause: much higher. I've noticed this type
of behavior before. Someone correct me if I'm wrong, but it appears that
you probably got a bad batch of disks. Try throwing a different set of
disks on the boxes (preferably from a different manufacturer). I would
also try swapping in brand new high-quality cables (just because they're
cheaper than new disks).

-- 
Rick C. Petty
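
A rough sketch of the arithmetic behind the bad-batch guess above. The 2%
per-disk failure rate and the count of four disks are hypothetical values
chosen for illustration, not figures from this thread, and the function
name is made up. It only shows how quickly the odds of several
independent failures shrink.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n disks are bad, assuming each disk
    fails independently with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs, for illustration only.
P_FAIL = 0.02  # assumed chance that any single new disk is bad
N_DISKS = 4    # assumed: one suspect disk in each of four machines

if __name__ == "__main__":
    for k in (1, 2, 3):
        chance = prob_at_least(k, N_DISKS, P_FAIL)
        print(f"at least {k} of {N_DISKS} disks bad: {chance:.6f}")
```

If failures really were independent, three or more bad disks would be far
less likely than one, so a shared cause (same batch, same cables) becomes
the more plausible explanation.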