Date:      Sun, 14 Apr 2013 12:34:46 +0200
From:      Radio młodych bandytów <radiomlodychbandytow@o2.pl>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-fs@freebsd.org, support@lists.pcbsd.org
Subject:   Re: A failed drive causes system to hang
Message-ID:  <516A8646.4000101@o2.pl>
In-Reply-To: <20130413000731.GA84309@icarus.home.lan>
References:  <mailman.11.1365681601.78138.freebsd-fs@freebsd.org> <51672164.1090908@o2.pl> <20130411212408.GA60159@icarus.home.lan> <5168821F.5020502@o2.pl> <20130412220350.GA82467@icarus.home.lan> <51688BA6.1000507@o2.pl> <20130413000731.GA84309@icarus.home.lan>

On 13/04/2013 02:07, Jeremy Chadwick wrote:
>
>
> On Sat, Apr 13, 2013 at 12:33:10AM +0200, Radio młodych bandytów wrote:
>> On 13/04/2013 00:03, Jeremy Chadwick wrote:
>>> On Fri, Apr 12, 2013 at 11:52:31PM +0200, Radio młodych bandytów wrote:
>>>> On 11/04/2013 23:24, Jeremy Chadwick wrote:
>>>>> On Thu, Apr 11, 2013 at 10:47:32PM +0200, Radio młodych bandytów wrote:
>>>>>> Seeing a ZFS thread, I decided to write about a similar problem that
>>>>>> I experience.
>>>>>> I have a failing drive in my array. I need to RMA it, but I don't have
>>>>>> time, and it fails rarely enough to be just another annoyance.
>>>>>> The failure is simple: it fails to respond.
>>>>>> When it happens, the only thing I found I can do is switch consoles.
>>>>>> Any command fails, login fails, apps hang.
>>>>>>
>>>>>> On the 1st console I see a series of messages like:
>>>>>>
>>>>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout
>>>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED
>>>>>>
>>>>>> I use RAIDZ1 and I'd expect that no single failure would cause the
>>>>>> system to fail...
>>>>>
>>>>> You need to provide full output from "dmesg", and you need to define
>>>>> what the word "fails" means (re: "any command fails", "login fails").
>>>> Fails = hangs. When trying to log in, I can type my user name, but
>>>> after I press enter the prompt for the password never appears.
>>>> As to dmesg, tough luck. I have 2 photos on my phone and their
>>>> transcripts are all I can give until the problem reappears (which
>>>> should take up to 2 weeks). Photos are blurry and in many cases I'm
>>>> not sure what exactly is there.
>>>>
>>>> Screen1:
>>>> (ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?)
>>>> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut)
>>>> 00
>>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut)
>>>> 00
>>>> (ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut)
>>>> 00
>>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>>
>>>>
>>>> Screen 2:
>>>> ahcich0: Timeout on slot 29 port 0
>>>> ahcich0: (unreadable, lots of numbers, some text)
>>>> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
>>>> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
>>>> ahcich0: Timeout on slot 29 port 0
>>>> ahcich0: (unreadable, lots of numbers, some text)
>>>> (aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
>>>> (aprobe0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
>>>> ahcich0: Timeout on slot 30 port 0
>>>> ahcich0: (unreadable, lots of numbers, some text)
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
>>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
>>>>
>>>> Both are from the same event. In general, messages:
>>>>
>>>> (ada0:ahcich0:0:0:0): CAM status: Command timeout
>>>> (ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
>>>> (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED.
>>>>
>>>> are the most common.
>>>>
>>>> I've waited for more than 1/2 hour once and the system didn't return
>>>> to a working state, the messages kept flowing and pretty much
>>>> nothing was working. What's interesting, I remember that it happened
>>>> to me even when I was using an installer (PC-BSD one), before the
>>>> actual installation began, so the disk stored no program data. And I
>>>> *think* there was no ZFS yet anyway.
>>>>
>>>>>
>>>>> I've already demonstrated that loss of a disk in raidz1 (or even 2 disks
>>>>> in raidz2) does not cause "the system to fail" on stable/9.  However,
>>>>> if you lose enough members or vdevs to cause catastrophic failure, there
>>>>> may be anomalies depending on how your system is set up:
>>>>>
>>>>> http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html
>>>>>
>>>>> If the pool has failmode=wait, any I/O to that pool will block (wait)
>>>>> indefinitely.  This is the default.
>>>>>
>>>>> If the pool has failmode=continue, existing write I/O operations will
>>>>> fail with EIO (I/O error) (and hopefully applications/daemons will
>>>>> handle that gracefully -- if not, that's their fault) but any subsequent
>>>>> I/O (read or write) to that pool will block (wait) indefinitely.
>>>>>
>>>>> If the pool has failmode=panic, the kernel will immediately panic.
>>>>>
>>>>> If the CAM layer is what's wedged, that may be a different issue (and
>>>>> not related to ZFS).  I would suggest running stable/9 as many
>>>>> improvements in this regard have been committed recently (some related
>>>>> to CAM, others related to ZFS and its new "deadman" watcher).
>>>>
>>>> Yeah, because of the installer failure, I don't think it's related to ZFS.
>>>> Even if it is, for now I won't set any ZFS properties in hope it
>>>> repeats and I can get better data.
>>>>>
>>>>> Bottom line: terse output of the problem does not help.  Be verbose,
>>>>> provide all output (commands you type, everything!), as well as any
>>>>> physical actions you take.
>>>>>
>>>> Yep. In fact having little data was what made me hesitate to write
>>>> about it; since I did already, I'll do my best to get more info,
>>>> though for now I can only wait for a repetition.
>>>>
>>>>
>>>> On 12/04/2013 00:08, Quartz wrote:
>>>>>> Seeing a ZFS thread, I decided to write about a similar problem that I
>>>>>> experience.
>>>>>
>>>>> I'm assuming you're referring to my "Failed pool causes system to hang"
>>>>> thread. I wonder if there's some common issue with zfs where it locks up
>>>>> if it can't write to disks how it wants to.
>>>>>
>>>>> I'm not sure how similar your problem is to mine. What's your pool setup
>>>>> look like? Redundancy options? Are you booting from a pool? I'd be
>>>>> interested to know if you can just yank the cable to the drive and see
>>>>> if the system recovers.
>>>>>
>>>>> You seem to be worse off than me- I can still login and run at least a
>>>>> couple commands. I'm booting from a straight ufs drive though.
>>>>>
>>>>> ______________________________________
>>>>> it has a certain smooth-brained appeal
>>>>>
>>>> Like I said, I don't think it's ZFS-specific, but just in case...:
>>>> RAIDZ1, root on ZFS. I should reduce severity of a pool loss before
>>>> pulling cables, so no tests for now.
>>>
>>> Key points:
>>>
>>> 1. We now know why "commands hang" and anything I/O-related blocks
>>> (waits) for you: because your root filesystem is ZFS.  If the ZFS layer
>>> is waiting on CAM, and CAM is waiting on your hardware, then those I/O
>>> requests are going to block indefinitely.  So now you know the answer to
>>> why that happens.
>>>
>>> 2. I agree that the problem is not likely in ZFS, but rather either with
>>> CAM, the AHCI implementation used, or hardware (either disk or storage
>>> controller).
>>>
>>> 3. Your lack of "dmesg" is going to make this virtually impossible to
>>> solve.  We really, ***really*** need that.  I cannot stress this enough.
>>> This will tell us a lot of information about your system.  We're also
>>> going to need to see "zpool status" output, as well as "zpool get all"
>>> and "zfs get all".  "pciconf -lvbc" would also be useful.
>>>
>>> There are some known "gotchas" with certain models of hard disks or AHCI
>>> controllers (which is responsible is unknown at this time), but I don't
>>> want to start jumping to conclusions until full details can be provided
>>> first.
>>>
>>> I would recommend formatting a USB flash drive as FAT/FAT32, booting
>>> into single-user mode, then mounting the USB flash drive and issuing
>>> the above commands + writing the output to files on the flash drive,
>>> then provide those here.
>>>
>>> We really need this information.
>>>
>>> 4. Please involve the PC-BSD folks in this discussion.  They need to be
>>> made aware of issues like this so they (and iXSystems, potentially) can
>>> investigate from their side.
>>>
>> OK, thanks for the info.
>> Since dmesg is so important, I'd say the best thing is to wait for
>> the problem to happen again. When it does, I'll restart the thread
>> with all the information that you requested here and with a PC-BSD
>> cross-post.
>>
>> However, I got a different hang a short while ago. This time it was
>> temporary (I don't know for how long). I switched to console0 after
>> ~10 seconds and there were 2 errors. Nothing more appeared for ~1
>> minute, so I switched back and the system was OK. It was a different
>> drive, one I haven't seen problems with before. Also, I think the
>> earlier errors came through ahci, while this one is ata.
>>
>> dmesg:
>>
>> fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.19
>> (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 82 46 b8 40 25 00 00 00 01 00
>> (ada1:ata0:0:0:0): CAM status: Command timeout
>> (ada1:ata0:0:0:0): Retrying command
>> vboxdrv: fAsync=0 offMin=0x53d offMax=0x52b9
>> linux: pid 17170 (npviewer.bin): syscall pipe2 not implemented
>> (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 87 1a c7 40 1a 00 00 00 01 00
>> (ada1:ata0:0:0:0): CAM status: Command timeout
>> (ada1:ata0:0:0:0): Retrying command
>>
>> {another 150KBytes of data snipped}
>
> The above output indicates that there was a timeout when trying to issue
> a 48-bit DMA request to the disk.  The disk did not respond to the
> request within 30 seconds.
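To make the ACB bytes in lines like the READ_DMA48 ones above easier to read, here is a minimal Python sketch that pulls out the opcode, the 48-bit LBA and the sector count. The byte order used (command, features, lba_low, lba_mid, lba_high, device, the three lba_*_exp bytes, features_exp, sector_count, sector_count_exp) is an assumption about how CAM prints the ACB, and `decode_acb_line` and `AtaCommand` are names made up for this example. The sector count reading only applies to non-queued commands such as READ_DMA48.

```python
import re
from typing import NamedTuple, Optional

# Matches kernel messages such as:
#   (ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 82 46 b8 40 25 00 00 00 01 00
# Truncated lines (fewer than 12 bytes) are deliberately not matched.
ACB_LINE = re.compile(
    r"\((?P<periph>\w+):(?P<bus>\w+):\d+:\d+:\d+\): "
    r"(?P<name>\w+)\. ACB: (?P<acb>(?:[0-9a-f]{2} ){11}[0-9a-f]{2})\s*$"
)


class AtaCommand(NamedTuple):
    periph: str   # e.g. "ada1"
    name: str     # command name as printed, e.g. "READ_DMA48"
    opcode: int   # first ACB byte
    lba: int      # 48-bit logical block address
    sectors: int  # sector count (non-queued commands only)


def decode_acb_line(line: str) -> Optional[AtaCommand]:
    """Decode one 'ACB:' message, or return None if the line is not one."""
    m = ACB_LINE.search(line)
    if m is None:
        return None
    b = [int(x, 16) for x in m.group("acb").split()]
    # ASSUMED byte order of the 12 printed bytes (see the text above).
    (command, _features, lba_low, lba_mid, lba_high, _device,
     lba_low_exp, lba_mid_exp, lba_high_exp, _features_exp,
     count, count_exp) = b
    lba = (lba_high_exp << 40 | lba_mid_exp << 32 | lba_low_exp << 24
           | lba_high << 16 | lba_mid << 8 | lba_low)
    sectors = count_exp << 8 | count
    return AtaCommand(m.group("periph"), m.group("name"), command, lba, sectors)
```

Under that assumed layout, the first READ_DMA48 line above decodes to LBA 0x25b84682 with a count of 1 sector. Comparing the LBAs of repeated timeouts may hint at whether the drive stalls on one region or all over the disk.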
>
> If you were using AHCI, we'd be able to see if the AHCI layer was
> reporting signalling problems or other anomalies that could explain the
> behaviour.  With ATA, such information is significantly limited.  It's
> worse if you're hiding/not showing us all of the information.
>
> The classic FreeBSD ATA driver does not provide command queueing (NCQ),
> while AHCI via CAM does.  The difference is that command queueing causes
> xxx_FPDMA_QUEUED CDBs to be issued to the disk.
>
> I'm going to repeat myself -- for the last time: CAN YOU PLEASE JUST
> PROVIDE "DMESG" FROM THE SYSTEM?  Like after a fresh reboot?  If you're
> able to provide all of the above, I don't know why you can't provide
> dmesg.  It is the most important information that there is.  I am sick
> and tired of stressing this point.
Sorry. I thought just the error was important. So here you are:
dmesg.boot:
http://pastebin.com/LFXPusMX
>
> Furthermore, please stop changing ATA vs. AHCI interface drivers.
> The more you change/screw around with, the less likely people are going
> to help.  CHANGE NOTHING ON THE SYSTEM.  Leave it how it is.  Do not
> fiddle with things or start flipping switches/changing settings/etc. to
> "try and relieve the problem".  You're asking other people for help,
> which means you need to be patient and follow what we ask.
I haven't changed one bit myself. It may have been a change of defaults 
in PC-BSD. I just asked them about it.
Or maybe different drives use different drivers.

>
> Thank you for the rest of the output, however.  It looks like this is
> another system with an ATI-based controller (which is usually the kind
> involved in my aforementioned "gotchas"), but there still isn't enough
> information that can help.  I have a gut feeling of what's about to
> come, but I need to see dmesg output before I can determine that.
>
> Furthermore, can you please provide this information with its formatting
> intact?  Your Email client is screwing up "long lines" and causing
> unnecessary wrapping.
>
> The mailing list will nuke attachments, so please use pastebin or some
> similar service + provide URLs.
pciconf -lvbc:
http://pastebin.com/vvCKAWm1
zpool status:
http://pastebin.com/D3Av7x9X
zfs get all:
http://pastebin.com/4sT37VqZ
zpool get all tank1:
http://pastebin.com/HZJTJPa2
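
For the next time the problem shows up, here is a minimal sketch of collecting the requested outputs into files in one go, for example onto a mounted USB flash drive. It assumes Python 3 is installed (it is not part of the FreeBSD base system), and the file names, the `collect` function and its `runner` parameter are made up for this example. Only the commands themselves come from this thread.

```python
import subprocess
from pathlib import Path

# Output file name -> command to run (commands as requested in this thread).
COMMANDS = {
    "dmesg.txt": ["dmesg"],
    "pciconf.txt": ["pciconf", "-lvbc"],
    "zpool-status.txt": ["zpool", "status"],
    "zpool-get-all.txt": ["zpool", "get", "all"],
    "zfs-get-all.txt": ["zfs", "get", "all"],
}


def run_command(argv):
    """Run one command and return stdout and stderr combined as text."""
    result = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        text=True, check=False,
    )
    return result.stdout


def collect(outdir, commands=COMMANDS, runner=run_command):
    """Write the output of every command to its own file under outdir."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, argv in commands.items():
        try:
            output = runner(argv)
        except OSError as exc:
            # Command missing or not executable: record that instead.
            output = f"could not run {' '.join(argv)}: {exc}\n"
        path = outdir / filename
        path.write_text(output)
        written.append(path)
    return written
```

Calling `collect("/mnt/usb")` (a hypothetical mount point) would leave one unwrapped text file per command, which also avoids the mail client breaking long lines.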
-- 
Twoje radio


