Message-ID: <44D10D1D.9040700@quip.cz>
Date: Wed, 02 Aug 2006 22:37:49 +0200
From: Miroslav Lachman <000.fbsd@quip.cz>
To: rick-freebsd@kiwi-computer.com
Cc: freebsd-geom@freebsd.org
References: <44D06650.1030803@quip.cz> <20060802183001.GA14279@megan.kiwi-computer.com>
In-Reply-To: <20060802183001.GA14279@megan.kiwi-computer.com>
Subject: Re: gmirror Cannot add disk ad5 to gm0 (error=22)
List-Id: GEOM-specific discussions and implementations

Rick C. Petty wrote:
> On Wed, Aug 02, 2006 at 10:46:08AM +0200, Miroslav Lachman wrote:
>
>>Aug 1 00:03:42 track kernel: ad5: TIMEOUT - WRITE_DMA48 retrying (1
>>retry left) LBA=290279525
>
> Out of curiosity-- what's the dmesg output of your ATA controllers?

atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0xe800-0xe807,0xe480-0xe483,0xe400-0xe407,0xe080-0xe083,0xe000-0xe00f mem 0xfebff800-0xfebffbff irq 19 at device 31.2 on pci0
ata2: on atapci1
ata3: on atapci1

The full dmesg output is included in
http://www.quip.cz/1/freebsd/asus_rs120-e3/track_messages.txt

>>I tried smartctl -a /dev/ad4 and smartctl -a /dev/ad5, but does not see
>>any errors.
>
> Did you have SMART enabled in the BIOS?

Yes (as far as I remember; I only have remote access now), and I have
smartd_enable="YES" in /etc/rc.conf. smartd.conf has these lines:

/dev/ad4 -a -o on -S on -m root -M test -s (S/../.././04|L/../../6/05) -t -I 194
/dev/ad5 -a -o on -S on -m root -M test -s (S/../.././04|L/../../6/05) -t -I 194

Full output of smartctl -a /dev/adX:
http://www.quip.cz/1/freebsd/asus_rs120-e3/track_SMART_ad4.txt
http://www.quip.cz/1/freebsd/asus_rs120-e3/track_SMART_ad5.txt

>>If I use gmirror activate -v gm0 ad5 I got
>>Aug 2 10:24:03 track kernel: GEOM_MIRROR: Component ad5 (device gm0)
>>broken, skipping.
>>Aug 2 10:24:03 track kernel: GEOM_MIRROR: Cannot add disk ad5 to gm0
>>(error=22).
>
> It's already activated, so you can't add it again (as the message states).

But how can I force gmirror to re-use this disk? I don't know what
"broken, skipping" or "error=22" really means.

>>I can successfuly mount partitions from drive ad5 like this
>>mount /dev/ad5s2d /mnt
>>
>>(Aug 2 10:35:21 track kernel: WARNING: /vol0 was not properly dismounted)
>>
>>And read any files from this drive.
>
> That shouldn't be a surprise-- the disks themselves didn't fail, only
> writing to them (possibly under heavy load?) failed-- and gmirror dropped
> the disks. The first disk drop was ok-- the mirror should still work in
> DEGRADED state. The second drop was critical which is why your system
> broke. Mounting the disks individually will work of course.

This error occurred after 5 days of periodically copying /usr/ports to
another partition. (I used this to test the disks and filesystem before
deploying to production.) Before this test, the server had other problems
with its disks and the whole server was replaced with a new one; only the
first drive (ad4) is from the original machine. (Originally discussed on
freebsd-stable@: a disk disappeared from the ATA channel and was not
listed by the atacontrol list command.)

>>Can anybody tell me, where is the problem / how can I found what is wrong?
>
> What's the output of "gmirror status" ?? I suspect on a reboot, gmirror
> will try to synchronize ad4 to ad5 (since ad5 was the first to drop). Once
> that is complete, gmirror won't be DEGRADED anymore.

# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad4

The gmirror list and atacontrol list output can be found at
http://www.quip.cz/1/freebsd/asus_rs120-e3/track_gmirror_list.txt

The mirror was not synchronized after the reboot:

Aug 1 09:14:50 track kernel: acd0: DVDROM at ata0-slave UDMA100
Aug 1 09:14:50 track kernel: ad4: 238475MB at ata2-master SATA150
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0 created (id=565164480).
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0: provider ad4 detected.
Aug 1 09:14:50 track kernel: ad5: 238475MB at ata2-slave SATA150
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0: provider ad5 detected.
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Component ad5 (device gm0) broken, skipping.
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0: provider ad4 activated.
Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
Aug 1 09:14:50 track kernel: Trying to mount root from ufs:/dev/mirror/gm0s1a

(also included in
http://www.quip.cz/1/freebsd/asus_rs120-e3/track_messages.txt)

So the disk is OK, but gmirror refuses to use it? If the disks are OK, what
is wrong? What caused the READ / WRITE timeouts? A broken SATA controller?
The FreeBSD ATA driver?

Miroslav Lachman
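
P.S. On the "error=22" in the GEOM_MIRROR message: assuming the number is an ordinary errno value (not confirmed here against the gmirror source), it can be decoded with a few lines of Python. This is only a rough sketch; the function name decode_kernel_error and the regular expression are made up for illustration and are not part of any FreeBSD tool.

```python
import errno
import os
import re

# Hypothetical pattern for the "(error=N)" suffix seen in the log line above.
ERROR_RE = re.compile(r"\(error=(\d+)\)")


def decode_kernel_error(line):
    """Return (number, errno name, description) for a log line, or None.

    Assumes the number after "error=" is a standard errno value.
    """
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


line = ("Aug 2 10:24:03 track kernel: GEOM_MIRROR: "
        "Cannot add disk ad5 to gm0 (error=22).")
print(decode_kernel_error(line))
```

Under that assumption, 22 maps to EINVAL (invalid argument). That would only say the request was rejected as invalid; it does not by itself explain why the component is marked broken.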