From owner-freebsd-fs@FreeBSD.ORG  Fri Apr 12 22:03:51 2013
Date: Fri, 12 Apr 2013 15:03:50 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Radio =?UTF-8?B?bcWCb2R5Y2ggYmFuZHl0w7N3?=
 <radiomlodychbandytow@o2.pl>
Subject: Re: A failed drive causes system to hang
Message-ID: <20130412220350.GA82467@icarus.home.lan>
References: <mailman.11.1365681601.78138.freebsd-fs@freebsd.org>
 <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan>
 <5168821F.5020502@o2.pl>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5168821F.5020502@o2.pl>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org

On Fri, Apr 12, 2013 at 11:52:31PM +0200, Radio młodych bandytów wrote:
> On 11/04/2013 23:24, Jeremy Chadwick wrote:
> >On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio młodych bandytów wrote:
> >>Seeing a ZFS thread, I decided to write about a similar problem that
> >>I experience.
> >>I have a failing drive in my array. I need to RMA it, but don't have
> >>time and it fails rarely enough to be just another annoyance.
> >>The failure is simple: it fails to respond.
> >>When it happens, the only thing I found I can do is switch consoles.
> >>Any command fails, login fails, apps hang.
> >>
> >>On the 1st console I see a series of messages like:
> >>
> >>(ada0:ahcich0:0:0:0): CAM status: Command timeout
> >>(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> >>(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED
> >>
> >>I use RAIDZ1 and I'd expect that no single failure would cause the
> >>system to fail...
> >
> >You need to provide full output from "dmesg", and you need to define
> >what the word "fails" means (re: "any command fails", "login fails").
> Fails = hangs. When trying to log in, I can type my user name, but
> after I press enter the prompt for password never appears.
> As to dmesg, tough luck. I have 2 photos on my phone and their
> transcripts are all I can give until the problem reappears (which
> should take up to 2 weeks). Photos are blurry and in many cases I'm
> not sure what exactly is there.
> 
> Screen1:
> (ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?)
> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut)
> 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut)
> 00
> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut)
> 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> 
> 
> Screen 2:
> ahcich0: Timeout on slot 29 port 0
> ahcich0: (unreadable, lots of numbers, some text)
> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
> ahcich0: Timeout on slot 29 port 0
> ahcich0: (unreadable, lots of numbers, some text)
> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
> ahcich0: Timeout on slot 30 port 0
> ahcich0: (unreadable, lots of numbers, some text)
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
> 
> Both are from the same event. In general, messages:
> 
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED.
> 
> are the most common.
> 
> I've waited for more than 1/2 hour once and the system didn't return
> to a working state, the messages kept flowing and pretty much
> nothing was working. What's interesting, I remember that it happened
> to me even when I was using an installer (PC-BSD one), before the
> actual installation began, so the disk stored no program data. And I
> *think* there was no ZFS yet anyway.
> 
> >
> >I've already demonstrated that loss of a disk in raidz1 (or even 2 disks
> >in raidz2) does not cause "the system to fail" on stable/9.  However,
> >if you lose enough members or vdevs to cause catastrophic failure, there
> >may be anomalies depending on how your system is set up:
> >
> >http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html
> >
> >If the pool has failmode=wait, any I/O to that pool will block (wait)
> >indefinitely.  This is the default.
> >
> >If the pool has failmode=continue, existing write I/O operations will
> >fail with EIO (I/O error) (and hopefully applications/daemons will
> >handle that gracefully -- if not, that's their fault) but any subsequent
> >I/O (read or write) to that pool will block (wait) indefinitely.
> >
> >If the pool has failmode=panic, the kernel will immediately panic.
> >
> >If the CAM layer is what's wedged, that may be a different issue (and
> >not related to ZFS).  I would suggest running stable/9 as many
> >improvements in this regard have been committed recently (some related
> >to CAM, others related to ZFS and its new "deadman" watcher).
> 
> Yeah, because of the installer failure, I don't think it's related to ZFS.
> Even if it is, for now I won't set any ZFS properties in hope it
> repeats and I can get better data.
> >
> >Bottom line: terse output of the problem does not help.  Be verbose,
> >provide all output (commands you type, everything!), as well as any
> >physical actions you take.
> >
> Yep. In fact having little data was what made me hesitate to write
> about it; since I did already, I'll do my best to get more info,
> though for now I can only wait for a repetition.
> 
> 
> On 12/04/2013 00:08, Quartz wrote:
> >> Seeing a ZFS thread, I decided to write about a similar problem that I
> >> experience.
> >
> > I'm assuming you're referring to my "Failed pool causes system to hang"
> > thread. I wonder if there's some common issue with zfs where it locks up
> > if it can't write to disks how it wants to.
> >
> > I'm not sure how similar your problem is to mine. What's your pool setup
> > look like? Redundancy options? Are you booting from a pool? I'd be
> > interested to know if you can just yank the cable to the drive and see
> > if the system recovers.
> >
> > You seem to be worse off than me- I can still login and run at least a
> > couple commands. I'm booting from a straight ufs drive though.
> >
> > ______________________________________
> > it has a certain smooth-brained appeal
> >
> Like I said, I don't think it's ZFS-specific, but just in case...:
> RAIDZ1, root on ZFS. I should reduce severity of a pool loss before
> pulling cables, so no tests for now.

Key points:

1. We now know why "commands hang" and anything I/O-related blocks
(waits) for you: your root filesystem is on ZFS.  If the ZFS layer is
waiting on CAM, and CAM is waiting on your hardware, then those I/O
requests are going to block indefinitely, and so will anything that
needs the root filesystem (login, nearly every command).

2. I agree that the problem is probably not in ZFS, but rather in CAM,
the AHCI implementation used, or the hardware (either the disk or the
storage controller).

3. Your lack of "dmesg" is going to make this virtually impossible to
solve.  We really, ***really*** need that.  I cannot stress this enough.
This will tell us a lot of information about your system.  We're also
going to need to see "zpool status" output, as well as "zpool get all"
and "zfs get all".  "pciconf -lvbc" would also be useful.

There are some known "gotchas" with certain models of hard disks and
AHCI controllers (which of the two is responsible is unknown at this
time), but I don't want to jump to conclusions until full details have
been provided.

I would recommend formatting a USB flash drive as FAT/FAT32, booting
into single-user mode, mounting the USB flash drive, running the above
commands with their output written to files on the flash drive, and
then posting those files here.

We really need this information.
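
A rough sketch of the collection step for /bin/sh, in case it helps.
The function names (save, collect), the output file names, and the
device/mount point shown in the comments (/dev/da0s1, /mnt) are my own
assumptions for illustration; check the console messages printed when
you plug the flash drive in to find the real device name.  If your
version of "zpool get all" insists on a pool name, append it.

```shell
#!/bin/sh
# Sketch only: write the requested diagnostics to files in one directory.
# Assumed (not verified on your system): the flash drive is already
# mounted, for example with:  mount_msdosfs /dev/da0s1 /mnt

# save DIR NAME COMMAND [ARGS...]
# Run COMMAND and store its stdout and stderr in DIR/NAME.txt.
save() {
    dir="$1"
    name="$2"
    shift 2
    "$@" > "$dir/$name.txt" 2>&1 || echo "warning: '$*' failed" >&2
}

# collect DIR
# Gather everything asked for above into DIR, then flush to the device.
collect() {
    dest="$1"
    save "$dest" dmesg         dmesg
    save "$dest" zpool-status  zpool status
    save "$dest" zpool-get-all zpool get all
    save "$dest" zfs-get-all   zfs get all
    save "$dest" pciconf       pciconf -lvbc
    sync
}

# Usage (after mounting the flash drive on /mnt):
#   collect /mnt
#   umount /mnt
```

Error output is captured in the same files on purpose, so a command
that fails still leaves something useful to look at.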

4. Please involve the PC-BSD folks in this discussion.  They need to be
made aware of issues like this so they (and iXsystems, potentially) can
investigate from their side.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |