Date:      Wed, 27 Aug 2008 01:50:16 -0700
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        Antony Mawer <fbsd-stable@mawer.org>
Cc:        stable@freebsd.org
Subject:   Re: Finding which GEOM provider is generating errors in a graid3
Message-ID:  <20080827085016.GA75552@icarus.home.lan>
In-Reply-To: <48B51003.4060207@mawer.org>
References:  <48B51003.4060207@mawer.org>

On Wed, Aug 27, 2008 at 06:27:47PM +1000, Antony Mawer wrote:
> I have a FreeBSD 6.2-based server running a 1.2TB graid3 volume, which  
> consists of 5x 320GB SATA hard drives. I've been getting errors in
> /var/log/messages from the graid3 volume, which I suspect means an  
> underlying fault with one of the disks, but is there any way to decipher  
> which one of these drives is throwing errors?
>
> I've checked smartctl -a /dev/adXX but nothing shows up there..

When you say "nothing shows up there", what exactly do you mean?  A lot
of people don't know how to read SMART statistics.  I hope by "nothing
shows up there" you mean "nothing stands out".

> I'm wondering if this is the infamous ata driver bug(s) that may be
> rearing its ugly head..

The bugs in question only apply when there are kernel messages coming
from the *disks themselves*, and not from a GEOM provider.  The log
output you quote below doesn't show any ATA errors, just GEOM errors.
If the disks were failing, you *would* be getting errors from the ATA
subsystem, but you're not.

I'm not familiar with GEOM "stuff", so I can't really comment on what
all is going on here.
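
To make the ATA-versus-GEOM distinction above concrete, here is a rough
sketch that sorts log lines into the two groups.  The regular
expressions are assumptions about what the messages look like (the ATA
pattern in particular is my guess at the usual "adN: FAILURE - ..."
shape and is not taken from your logs), so adjust them to whatever your
system really prints.

```python
import re
from collections import Counter

# ASSUMPTION: ATA-level complaints start with the disk name, e.g.
# "ad12: FAILURE - ...", "ad12: TIMEOUT - ...", "ad12: WARNING - ...".
ATA_RE = re.compile(r"\bad\d+: (?:FAILURE|TIMEOUT|WARNING) - ")

# GEOM-level complaints: the g_vfs_done() lines quoted in this thread,
# plus (ASSUMPTION) anything prefixed with GEOM_RAID3.
GEOM_RE = re.compile(r"g_vfs_done\(\):|\bGEOM_RAID3\b")


def classify(line):
    """Return 'ata', 'geom' or 'other' for a single log line."""
    if ATA_RE.search(line):
        return "ata"
    if GEOM_RE.search(line):
        return "geom"
    return "other"


def tally(lines):
    """Count how many lines fall into each class."""
    return Counter(classify(line) for line in lines)


if __name__ == "__main__":
    # Point this at the log file you want to check.
    with open("/var/log/messages", errors="replace") as fh:
        print(tally(fh))
```

If the "ata" count stays at zero while "geom" keeps growing, that
matches what I described above.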

> Also, does anyone know what "ZoneXXFailed" items in the graid3 list
> output mean?
>
> Relevant output:
>
> $ graid3 status
>        Name    Status  Components
> raid3/data1  COMPLETE  ad12
>                        ad14
>                        ad16
>                        ad18
>                        ad20
>
> $ graid3 list
> Geom name: data1
> State: COMPLETE
> Components: 5
> Flags: VERIFY
> GenID: 0
> SyncID: 1
> ID: 3700500186
> Zone64kFailed: 791239
> Zone64kRequested: 49197268
> Zone16kFailed: 40204
> Zone16kRequested: 1283738
> Zone4kFailed: 12005939
> Zone4kRequested: 2445799003
> Providers:
> 1. Name: raid3/data1
>    Mediasize: 1280291731456 (1.2T)
>    Sectorsize: 2048
>    Mode: r1w1e1
> ...
>
> $ atacontrol list
> ...
> ATA channel 6: Master: ad12 <ST3320620AS/3.AAK> Serial ATA v1.0
> ATA channel 7: Master: ad14 <ST3320620AS/3.AAK> Serial ATA v1.0
> ATA channel 8: Master: ad16 <ST3320620AS/3.AAK> Serial ATA v1.0
> ATA channel 9: Master: ad18 <ST3320620AS/3.AAK> Serial ATA v1.0
> ATA channel 10: Master: ad20 <ST3320620AS/3.AAK> Serial ATA v1.0
>
>
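
I can't tell you what the Zone counters mean, but the numbers above can
at least be put side by side.  This is only a sketch: it pulls the
"ZoneNNkFailed" / "ZoneNNkRequested" pairs out of the graid3 list text
and prints failed/requested per zone, without assuming anything about
what a "failure" is.

```python
import re

# Counter values copied from the "graid3 list" output quoted above.
LIST_OUTPUT = """
Zone64kFailed: 791239
Zone64kRequested: 49197268
Zone16kFailed: 40204
Zone16kRequested: 1283738
Zone4kFailed: 12005939
Zone4kRequested: 2445799003
"""

ZONE_RE = re.compile(r"Zone(\d+)k(Failed|Requested):\s*(\d+)")


def zone_ratios(text):
    """Map zone size in KB -> failed/requested ratio."""
    stats = {}
    for size, kind, value in ZONE_RE.findall(text):
        stats.setdefault(int(size), {})[kind] = int(value)
    ratios = {}
    for size, counters in stats.items():
        requested = counters.get("Requested", 0)
        # Skip zones with a missing counter or no requests at all.
        if "Failed" in counters and requested:
            ratios[size] = counters["Failed"] / requested
    return ratios


if __name__ == "__main__":
    for size, ratio in sorted(zone_ratios(LIST_OUTPUT).items()):
        print(f"Zone{size}k: {ratio:.2%} of requests failed")
```
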
> Output in /var/log/messages:
>
>> Aug 27 17:17:27 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320159744, length=16384)]error = 5
>> Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320159744, length=16384)]error = 5
>> Aug 27 17:25:45 backup last message repeated 7 times
>> Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320176128, length=16384)]error = 5
>> Aug 27 17:25:45 backup last message repeated 22 times
>> Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320192512, length=16384)]error = 5
>> Aug 27 17:25:45 backup last message repeated 21 times
>> Aug 27 17:38:24 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320176128, length=16384)]error = 5
>> Aug 27 17:38:26 backup last message repeated 4 times
>> Aug 27 17:46:02 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320159744, length=16384)]error = 5
>> Aug 27 17:53:48 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320159744, length=16384)]error = 5
>> Aug 27 17:53:48 backup last message repeated 7 times
>> Aug 27 17:53:48 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320176128, length=16384)]error = 5
>> Aug 27 17:53:48 backup last message repeated 22 times
>> Aug 27 17:53:48 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320192512, length=16384)]error = 5
>> Aug 27 17:53:49 backup last message repeated 21 times
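
The messages above keep hitting the same few offsets.  As a rough
sketch (the function names are mine, and the sample is just the
17:25:45 burst from your log), something like this counts the failures
per offset, folding in the "last message repeated N times" lines, and
reports the byte range on raid3/data1 that they cover:

```python
import re
from collections import Counter

GVFS_RE = re.compile(
    r"g_vfs_done\(\):(?P<provider>[^\[]+)\[(?P<op>READ|WRITE)"
    r"\(offset=(?P<offset>\d+),\s*length=(?P<length>\d+)\)\]"
    r"\s*error = (?P<errno>\d+)"
)
REPEAT_RE = re.compile(r"last message repeated (\d+) times")

# The 17:25:45 burst from the log quoted above.
SAMPLE = [
    "Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320159744, length=16384)]error = 5",
    "Aug 27 17:25:45 backup last message repeated 7 times",
    "Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320176128, length=16384)]error = 5",
    "Aug 27 17:25:45 backup last message repeated 22 times",
    "Aug 27 17:25:45 backup kernel: g_vfs_done():raid3/data1[READ(offset=160320192512, length=16384)]error = 5",
    "Aug 27 17:25:45 backup last message repeated 21 times",
]


def summarize(lines):
    """Count failures per (provider, offset, length)."""
    counts = Counter()
    last_key = None
    for line in lines:
        m = GVFS_RE.search(line)
        if m:
            last_key = (m["provider"], int(m["offset"]), int(m["length"]))
            counts[last_key] += 1
            continue
        r = REPEAT_RE.search(line)
        if r and last_key is not None:
            # syslog collapsed N more copies of the previous message.
            counts[last_key] += int(r.group(1))
    return counts


def byte_span(counts):
    """Return (first failing byte, one past the last failing byte)."""
    start = min(offset for _, offset, _ in counts)
    end = max(offset + length for _, offset, length in counts)
    return start, end


if __name__ == "__main__":
    counts = summarize(SAMPLE)
    for (provider, offset, length), n in sorted(counts.items()):
        print(f"{provider} offset={offset} length={length}: {n} errors")
    start, end = byte_span(counts)
    print(f"affected range: {start}..{end} ({end - start} bytes)")
```

Note that these are offsets into the raid3/data1 provider, not into any
one of the adXX disks, so on their own they don't say which component is
involved.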

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



