Date:      Wed, 13 Apr 2011 06:51:41 +0200
From:      David Naylor <naylor.b.david@gmail.com>
To:        Garrett Cooper <yanegomi@gmail.com>
Cc:        pyunyh@gmail.com, Alexander Motin <mav@freebsd.org>, FreeBSD-Current <freebsd-current@freebsd.org>
Subject:   Re: [regression] unable to boot: no GEOM devices found.
Message-ID:  <201104130651.45408.naylor.b.david@gmail.com>
In-Reply-To: <BANLkTikfxNRvyL+c5JCbix+QoaS+1V2wtw@mail.gmail.com>
References:  <mailpost.1302585106.8448174.20731.mailing.freebsd.current@FreeBSD.cs.nctu.edu.tw> <4DA4BF6A.7010806@FreeBSD.org> <BANLkTikfxNRvyL+c5JCbix+QoaS+1V2wtw@mail.gmail.com>

On Tuesday 12 April 2011 23:39:30 Garrett Cooper wrote:
> On Tue, Apr 12, 2011 at 2:08 PM, Alexander Motin <mav@freebsd.org> wrote:
> > YongHyeon PYUN wrote:
> >> On Tue, Apr 12, 2011 at 11:12:55PM +0300, Alexander Motin wrote:
> >>> David Naylor wrote:
> >>>> On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
> >>>>> David Naylor wrote:
> >>>>>> I am running -current and since a few days ago (at least 2011/04/11)
> >>>>>> I am unable to boot.
> >>>>>>
> >>>>>> The boot process stops when it looks to find a bootable device.  The
> >>>>>> prompt (when pressing '?') does not display any device and yielding
> >>>>>> one second (or more) to the kernel (by pressing '.') does not
> >>>>>> improve the situation.
> >>>>>>
> >>>>>> A known working date is 2011/02/20.
> >>>>>>
> >>>>>> I am running amd64 on a nVidia MCP51 chipset.
> >>>>>
> >>>>> MCP51... again...
> >>>>>
> >>>>>> I am willing to help any way I can.
> >>>>>
> >>>>> You could start from capturing and showing verbose dmesg. Full or at
> >>>>> least in parts related to disks.
> >>>>
> >>>> I captured the dmesg output for both the old (working) kernel and the
> >>>> new (bad) kernel.  See attached for the difference between the two.
> >>>>  If you need the full dmesg please let me know.
> >>>>
> >>>> One thing I found is that the old kernel would not boot if I simply
> >>>> rebooted from the bad kernel.  I had to do a hard power off before
> >>>> the old kernel would work again.  Is some device state surviving
> >>>> between reboots?
> >>>
> >>> +ata2: reiniting channel ..
> >>> +ata2: SATA connect time=0ms status=00000113
> >>> +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
> >>> +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> >>> +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
> >>> +ata2: reinit done ..
> >>> +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
> >>>
> >>> As soon as all devices detected but not responding to commands, I would
> >>> suppose that there is something wrong with ATA interrupts. There is a
> >>> long chain of interrupt problems in this chipset. I have already tried
> >>> to debug one case where ATA wasn't generating interrupts at all.
> >>> Unfortunately, without success -- requests were executing, but not
> >>> generating interrupts, it wasn't looked like ATA driver problem.
> >>>
> >>> What's about possible candidate to revision triggering your problem, I
> >>> would look on this message:
> >>> +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
> >>>
> >>> At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb) and
> >>> it is interrupt related.
> >>
> >> Does the driver disable MSI for MCP51?
> >
> > ata(4) doesn't uses MSI by default and I doubt this controller supports
> > them any way. But if I am not mixing something, there were very strange
> > situations with MSI on that chipset, when enabling them one one device
> > caused interrupt problems on another.
> >
> >> I think jhb's patch fixed one MSI issue of all MCP chipset.
> >
> > I am not telling it is wrong. It could just trigger something.
>
> Could the OP try disabling MSI[X] to see whether or not the issue
> still occurs then?
> -Garrett

I added:
hw.pci.enable_msi=0
hw.pci.enable_msix=0
to loader.conf but the problem persisted.

@mav: I will revert r219737 and r219740 and try again but this will be in +10
hours...

Thanks
