Date: Fri, 22 Jul 2011 19:22:46 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: lev@FreeBSD.org
Cc: freebsd-hardware@freebsd.org
Subject: Re: ahci.ko / geom_mirror / zfs hangs up system when one of HDDs fauilts.
Message-ID: <4E29A3D6.1080609@FreeBSD.org>
In-Reply-To: <1981757790.20110720013856@serebryakov.spb.ru>
References: <1981757790.20110720013856@serebryakov.spb.ru>

Lev Serebryakov wrote:
> I've had two identical livelocks when an HDD failed on an
> 8.2-STABLE system with two SATA HDDs with gmirror and ZFS on them.
>
> It is a Hetzner-based server, so the only access I have is the LARA
> console, but the symptoms are identical in both cases: an HDD goes bad,
> ahci.ko complains about timeouts, and after that the server stops
> responding to high-level access attempts (ssh/HTTP/SMTP), but can
> still be pinged at both its IPv4 and IPv6 addresses.
>
> The HDDs are identical, and they are split into several (BSD) partitions.
> Some partitions are mirrored with geom_mirror, and one pair of
> partitions is added to a (mirrored) ZFS pool like this (I provide the
> output from the rebooted one-HDD-only system, but I think it is clear
> how it looks when both HDDs are OK):
>
> A screenshot of the LARA console in such a case is attached.

The kernel messages look as if the controller or the device is stuck: it
is unable to complete some command and cannot recover from that condition
even after a device hard reset. I don't see what the driver can do about
it, except be more aggressive in dropping a faulty device after several
consecutive timeouts. If that is not the outcome you want, start by
updating the card BIOS and the device firmware.

-- 
Alexander Motin
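
The "drop a faulty device after several consecutive timeouts" idea from the reply can be sketched roughly as follows. This is only an illustration of the policy, not code from ahci(4) or CAM; the class name `TimeoutTracker`, the `record` method, and the threshold of 3 are all hypothetical.

```python
class TimeoutTracker:
    """Hypothetical per-device policy; NOT the actual ahci(4) driver logic."""

    def __init__(self, max_consecutive_timeouts=3):
        # Threshold is an assumption chosen for illustration only.
        self.max_consecutive_timeouts = max_consecutive_timeouts
        self.consecutive_timeouts = 0
        self.dropped = False

    def record(self, timed_out):
        """Record one command result; return True once the device is dropped."""
        if self.dropped:
            # A dropped device stays dropped until it is re-attached.
            return True
        if timed_out:
            self.consecutive_timeouts += 1
            if self.consecutive_timeouts >= self.max_consecutive_timeouts:
                self.dropped = True
        else:
            # Any successful command breaks the run of timeouts.
            self.consecutive_timeouts = 0
        return self.dropped
```

Counting only consecutive timeouts means an occasional slow command does not evict a healthy disk, while a device that never completes anything is given up on quickly, so that upper layers such as a mirror could in principle continue on the remaining disk.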
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4E29A3D6.1080609>