Date:      Sun, 24 Mar 2013 14:54:14 -0400
From:      Quartz <quartz@sneakertech.com>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS: Failed pool causes system to hang
Message-ID:  <514F4BD6.1060807@sneakertech.com>
In-Reply-To: <20130324155448.GA4122@icarus.home.lan>
References:  <20130321044557.GA15977@icarus.home.lan> <514AA192.2090006@sneakertech.com> <20130321085304.GB16997@icarus.home.lan> <20130324153342.GA3687@icarus.home.lan> <20130324155448.GA4122@icarus.home.lan>

>> However, commands like "zpool status"
>
> ...and seems a typo I made in vim caused the rest of my sentence to get
> deleted before I sent it out.  This should have read:
>
>> However, commands like "zpool status" work just fine, but things like
>> "zpool destroy" and so on indefinitely block ("mount drain"), which to
>> me makes some degree of sense.

I'll have to double check this. I *know* I've run status and had it 
hang, but I'm not 100% certain if I've done it fast enough to guarantee 
that something else didn't hit the pool first.
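One way to pin this down would be to time the command right after pulling the disks, before anything else touches the pool. A rough sketch in Python, where the function name `probe_command` and the 10 second timeout are my own placeholders and not anything from ZFS or FreeBSD: it starts the command, waits a bounded time, and reports whether it came back. It deliberately does not try to kill a command that timed out, since a process blocked inside the kernel may not respond to signals and waiting on it would just hang the probe too.

```python
import subprocess
import time


def probe_command(cmd, timeout_s=10.0):
    """Run cmd and report whether it finished within timeout_s seconds.

    Returns (finished, elapsed_seconds, returncode_or_None).
    Hypothetical helper for illustration; not part of any ZFS tooling.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout_s)
        return True, time.monotonic() - start, rc
    except subprocess.TimeoutExpired:
        # Leave the child alone: if it is stuck in the kernel, kill()
        # followed by wait() could block this script as well.
        return False, time.monotonic() - start, None


# Example call (assumes zpool is on PATH):
#   finished, elapsed, rc = probe_command(["zpool", "status"])
```

If `finished` comes back false on the very first call after the failure, that would at least rule out some other command having hit the pool first.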


> Yes, you will need to reboot for the ZFS layer to effectively "un-wedge"
> itself from whatever catatonic state it's in.  No argument: this is a bug
> somewhere, and my guess is that it relates to the confused state of the
> devices in CAM-land.  But regardless, I think if you were to lose 3 of 4
> disks on a raidz2 pool you'd have much more serious things to be worried
> about than "well crap I have to issue a reboot".

My concern is proper investigation and damage control. "It stopped
working, guess I should reboot" is the Windows way of administration. In
the case of serious hardware failure, rebooting or otherwise continuing
to provide power to the affected devices can be a very BAD thing. I'd
like to have some idea of what the heck happened before I blindly
power-cycle something.


> And yes, I did test a reboot in the scenario I described -- the system
> did reboot without physically pressing the button.

It *never* does for me. Ever.


> People who run servers remotely yet lack this capability are
> intentionally choosing [snip]

Before you get up on a high horse and preach at me, consider a couple 
things:

1) Yes I can set that up, but this is a test box on my desk right now.

2) A hard reset is a hard reset is a hard reset. I'm not bitching that I
have to physically walk over to the machine, I'm bitching *THAT I HAVE
TO RESET IT*. Being able to reset it remotely is NOT an acceptable
solution or workaround, and has no bearing on my problem.

______________________________________
it has a certain smooth-brained appeal


