Date:      Fri, 22 May 2009 11:39:29 -0700
From:      Kip Macy <kmacy@freebsd.org>
To:        Joe Karthauser <joe@freebsd.org>
Cc:        Alexander Motin <mav@freebsd.org>, freebsd-stable@freebsd.org
Subject:   Re: ZFS hanging at kernel boot now, but didn't before... (Re: ZFS MFC heads up)
Message-ID:  <3c1674c90905221139i6f335062k74d641b7c91c188c@mail.gmail.com>
In-Reply-To: <4A16E37C.2030005@freebsd.org>
References:  <3c1674c90905201459k19776d53n309b2abeab0f8d0a@mail.gmail.com> <200905202209.n4KM9Bcg094853@lava.sentex.ca> <3c1674c90905201541n65f997e6jaa20d93bf566fb98@mail.gmail.com> <68BDAD74-021A-4169-B003-21A2BCF2AD5C@transsys.com> <4A156AD7.8000003@icyb.net.ua> <4A159482.9080903@freebsd.org> <3c1674c90905211128n45814519o903ee2b6eb3cf195@mail.gmail.com> <4A16E37C.2030005@freebsd.org>

Motin is your best bet in tracking down ATA problems.

Cheers,
Kip


On Fri, May 22, 2009 at 10:40 AM, Joe Karthauser <joe@freebsd.org> wrote:
> Hi Kip,
>
> I seriously don't understand what has happened. If I boot kernel.old I still
> get the same problem. Very confusing. :(.
>
> Joe
>
> on 21/05/2009 19:28 Kip Macy said the following:
>>
>> I have no idea what is happening. I think our best bet is having
>> someone with insight into ATA provide us with help in adding
>> diagnostics.
>>
>> Sorry for the trouble. Perhaps you can just roll back to 7.2 for now.
>>
>> Cheers,
>> Kip
>>
>>
>> On Thu, May 21, 2009 at 10:50 AM, Joe Karthauser <joe@freebsd.org> wrote:
>>>
>>> Hmm, I've had a bit of a miserable afternoon trying to fight my RELENG_7
>>> server, which now doesn't boot. :(.
>>>
>>> So, it's a RAIDZ2 pool with a UFS/gmirror root partition split over 5
>>> disks (gmirror on a 500 MB partition on each of the five disks, and
>>> RAIDZ2 over the rest of each drive).
>>>
>>> What I did was to update the userland, and then reboot. I didn't upgrade
>>> the
>>> kernel (but I've subsequently done that and have the same problem).
>>>
>>> What happens is that the kernel hangs booting just after displaying a
>>> LABEL
>>> message or ZFS pool/spool message. I _can_ get it to boot if I boot
>>> single
>>> user with acpi switched off. When I do that I can manually start zfs, and
>>> mount all the partitions. However, one of the disks is missing.... more
>>> on
>>> that next.
>>>
>>> The machine is running a Gigabyte motherboard (domestic gamer P35 board,
>>> similar to this
>>>
>>> http://www.gigabyte.com.tw/Products/Motherboard/Products_Overview.aspx?ProductID=2533,
>>> although it might be a DS4 variant). I've got 5 of the 6 sata ports
>>> wired
>>> to a 5 unit SATA hot swap bay (5 drives vertically mounted into 3 5-1/4"
>>> bays
>>> kind of thing).
>>>
>>> Now, because of the gmirror I can boot the system on any disk, or
>>> combination of plugged in disks. I should be able to succeed with the
>>> kernel probe up to the attempt to mount the root filesystem irrespective
>>> of
>>> any zfs pool, etc. And, indeed, this has been working fine for about two
>>> years.
>>>
>>> But, now it hangs in the same place no matter what disk I boot on (I've
>>> tried every bay).
>>>
>>> But, without ACPI enabled it does appear to boot ok... what's going on
>>> here?
>>> Is it possible that the machine has developed a hardware fault?
>>>
>>> Ok, finally, if I boot with ACPI disabled then one of the disks is
>>> missing.
>>> If I unplug it I get a disconnect message from the ata device, and a
>>> reconnect and reinit attempt when I plug it back in, but no device
>>> appears
>>> on the bus. Usually I can do a 'atacontrol detach sata4; sleep 1;
>>> atacontrol
>>> attach sata4' and the device reappears. This happens on the other buses,
>>> but
>>> not on the last one. It's not the disk, because if I swap it into another
>>> bay, it comes up and appears on the bus. On the other hand it doesn't
>>> appear
>>> to be that controller or slot in the drive bay because if I unplug all
>>> the
>>> other disks the system will boot that disk and get as far as the hang....
>>> hmm.
>>>
>>> Is this a consequence of disabling the ACPI?
>>>
>>> Does anyone have a clue what might be going on?
>>>
>>> Joe
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>>
>>
>>
>>
>
>



-- 
When bad men combine, the good must associate; else they will fall one
by one, an unpitied sacrifice in a contemptible struggle.

    Edmund Burke


