Date: Mon, 19 Jul 2010 13:33:20 -0700
From: Jeremy Chadwick
To: Mike Tancsa
Cc: freebsd-stable@freebsd.org
Message-ID: <20100719203320.GB21088@icarus.home.lan>
In-Reply-To: <201007191237.o6JCbmj7049339@lava.sentex.ca>
Subject: Re: deadlock or bad disk ? RELENG_8
On Mon, Jul 19, 2010 at 08:37:50AM -0400, Mike Tancsa wrote:
> At 11:34 PM 7/18/2010, Jeremy Chadwick wrote:
> >> yes, da0 is a RAID volume with 4 disks behind the scenes.
> >
> >Okay, so can you get full SMART statistics for all 4 of those disks?
> >The adjusted/calculated values for SMART thresholds won't be helpful
> >here, one will need the actual raw SMART data.  I hope the Areca CLI
> >can provide that.
>
> I thought there was, but I can't seem to get the current smartctl to
> work with the card.
>
>   -d TYPE, --device=TYPE
>        Specifies the type of the device.  The valid arguments to this
>        option are ata, scsi, sat, marvell, 3ware,N, areca,N,
>        usbcypress, usbjmicron, usbsunplus, cciss,N, hpt,L/M (or
>        hpt,L/M/N), and test.
>
> # smartctl -a -d areca,0 /dev/arcmsr0
> smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-PRERELEASE amd64] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> /dev/arcmsr0: Unknown device type 'areca,0'
> =======> VALID ARGUMENTS ARE: ata, scsi, sat[,N][+TYPE],
> usbcypress[,X], usbjmicron[,x][,N], usbsunplus, 3ware,N, hpt,L/M/N,
> cciss,N, atacam, test <=======
>
> Use smartctl -h to get a usage summary

According to the official smartctl documentation and man page, the
"areca,N" argument is only supported on Linux.  Bummer.

  Areca SATA RAID controllers are currently supported under Linux only.
  To look at SATA disks behind Areca RAID controllers, use syntax such
  as:

    smartctl -a -d areca,2 /dev/sg2
    smartctl -a -d areca,3 /dev/sg3

> The latest CLI tool only gives this info
>
> CLI> disk info drv=1
> Drive Information
> ===============================================================
> IDE Channel                : 1
> Model Name                 : ST31000340AS
> Serial Number              : 3QJ07F1N
> Firmware Rev.              : SD15
> Disk Capacity              : 1000.2GB
> Device State               : NORMAL
> Timeout Count              : 0
> Media Error Count          : 0
> Device Temperature         : 29 C
> SMART Read Error Rate      : 108(6)
> SMART Spinup Time          : 91(0)
> SMART Reallocation Count   : 100(36)
> SMART Seek Error Rate      : 81(30)
> SMART Spinup Retries       : 100(97)
> SMART Calibration Retries  : N.A.(N.A.)
> ===============================================================
> GuiErrMsg<0x00>: Success.
>
> CLI> disk smart drv=1
> S.M.A.R.T Information For Drive[#01]
>   # Attribute Items                            Flag   Value  Thres  State
> ===============================================================================
>   1 Raw Read Error Rate                        0x0f   108    6      OK
>   3 Spin Up Time                               0x03   91     0      OK
>   4 Start/Stop Count                           0x32   100    20     OK
>   5 Reallocated Sector Count                   0x33   100    36     OK
>   7 Seek Error Rate                            0x0f   81     30     OK
>   9 Power-on Hours Count                       0x32   79     0      OK
>  10 Spin Retry Count                           0x13   100    97     OK
>  12 Device Power Cycle Count                   0x32   100    20     OK
> 194 Temperature                                0x22   29     0      OK
> 197 Current Pending Sector Count               0x12   100    0      OK
> 198 Off-line Scan Uncorrectable Sector Count   0x10   100    0      OK
> 199 Ultra DMA CRC Error Count                  0x3e   200    0      OK
> ===============================================================================
> GuiErrMsg<0x00>: Success.

Yeah, this isn't going to help much.  The raw SMART data isn't being
shown.  I downloaded the Areca CLI manual dated 2010/07, which doesn't
state anything other than what you've already shown.  Bummer.

> >If so, think about what would happen if heavy I/O happened on
> >both da0 and da1 at the same time.  I talk about this a bit more
> >below.
>
> No different than any other single disk being heavily worked.
> Again, this particular hardware configuration has been beaten about
> for a couple of years.  So I am not sure why all of a sudden it
> would not be possible to do.

That's a very good question, and I don't have an answer to it.  I also
would have a hard time believing that, suddenly out of nowhere, heavy
I/O would exhibit this problem.  I'm just going over possibilities.
For example, I see that the da1 RAID volume is labelled "backup1", so
if you were storing backups there, possibly the I/O degrades over time
as a result of there being more data/files, etc.  You wouldn't have
seen it a year ago, but might see it now.  Just thinking out loud.

> >situation (since you'd then be dedicating an entire disk to just
> >swap).  Others may have other advice.  You mention in a later mail
> >that the ada[0-3] disks make up a ZFS pool of some sort.  You might
> >try splitting ada0 into two slices, one for swap and the other used
> >as a pool member.
>
> That seems like it would just move the problem you are trying to get
> me to avoid to a different set of disks.  If putting swap on a RAID
> array is a bad thing, I am not sure how moving it to a ZFS RAID
> array will help.

The idea wasn't to move swap to ZFS (that's a bad idea from what I
remember; something about crash dumps not working in that situation).
My idea was to move swap to a dedicated partition on a disk that
happens to also be used for ZFS.  E.g.:

  ada0
    ada0s1a =   20GB = swap
    ada0s1b =  980GB = ZFS pool
  ada1      = 1000GB = ZFS pool
  ada2      = 1000GB = ZFS pool
  ada3      = 1000GB = ZFS pool

Again, this isn't a solution for the problem.  I'm in no way trying to
dissuade anyone from figuring out the root cause.  But quite often on
the list, if someone can't get an answer to "why", they want to know
what they can do as a workaround.
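A rough sketch of that split, using gpart(8) with GPT partitions and
labels instead of the MBR slice names in the example above (device
names, sizes, pool name, and labels here are hypothetical; these
commands are destructive, so this is illustration only, not something
to paste in as-is):

```shell
# Hypothetical layout: carve ~20GB of swap off the first disk, give the
# remainder to ZFS, then build a raidz pool across all four disks.
gpart create -s gpt ada0                        # new GPT scheme on ada0 (wipes it)
gpart add -t freebsd-swap -s 20G -l swap0 ada0  # ~20GB swap partition, labeled swap0
gpart add -t freebsd-zfs -l disk0 ada0          # rest of the disk for ZFS
swapon /dev/gpt/swap0                           # enable the swap partition now
# To make swap persistent across reboots, an /etc/fstab line like:
#   /dev/gpt/swap0  none  swap  sw  0  0
zpool create tank raidz gpt/disk0 ada1 ada2 ada3
```

The point being that swap lives on a plain dedicated partition, not on
a zvol, so it sidesteps the swap-on-ZFS concerns while still letting
ada0 contribute most of its capacity to the pool.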
There just happen to be reports of this problem going all the way back
to RELENG_6, and all the posts I've read so far have been from people
who had swap backed by some sort of RAID.

> >Again: I don't think this is necessarily a bad disk problem.  The
> >only way you'd be able to determine that would be to monitor, on a
> >per-disk basis, the I/O response time of each disk member on the
> >Areca.  If the CLI tools provide this, awesome.  Otherwise you'll
> >probably need to involve Areca Support.
>
> In the past when I have had bad disks on the areca, it did catch and
> flag device timeouts.  There were no such alerts leading up to this
> situation.

Yeah, which makes it sound more like a driver issue or something.  I
really don't know what to say.  Areca does officially support FreeBSD,
so they might have some ideas.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |