Date:      Sat, 25 Jun 2011 07:27:20 -0400
From:      "Justin T. Gibbs" <gibbs@FreeBSD.org>
To:        Andrey Chernov <ache@FreeBSD.org>, Scott Long <scottl@samsco.org>, Kostik Belousov <kostikbel@gmail.com>, Eir Nym <eirnym@gmail.com>, "Kenneth D. Merry" <ken@FreeBSD.org>, current@FreeBSD.org, will@FreeBSD.org
Subject:   Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)
Message-ID:  <4E05C618.8070703@FreeBSD.org>
In-Reply-To: <20110625043916.GA78847@vniz.net>
References:  <20110621204934.GB9877@vniz.net> <20110622035404.GA38834@nargothrond.kdm.org> <20110622041325.GA13754@vniz.net> <20110622200919.GA72504@nargothrond.kdm.org> <4E03FDFD.70203@FreeBSD.org> <55FDA4B1-CA5E-4304-9239-3AAF0FC6FF5F@samsco.org> <C7D47D8B-5E29-4066-892A-F547F6DB9E8B@samsco.org> <4E04F188.9030105@FreeBSD.org> <20110624222645.GA75222@vniz.net> <4E053534.4080205@FreeBSD.org> <20110625043916.GA78847@vniz.net>

On 6/25/11 12:39 AM, Andrey Chernov wrote:
>  On Fri, Jun 24, 2011 at 09:09:08PM -0400, Justin T. Gibbs wrote:
> >> No problem. I just set kern.geom.debugflags=4 in loader.conf and here is
> >> new photo (with recent kernel, no patches):
> >> http://img803.imageshack.us/img803/4679/25062011006.jpg
> >> I skip all noisy parts related to ada0 and ada1 partitions probes.
> >> As you can see, only 3 cd0-related geom call issued, right before cd1
> >> probe shown. Strange thing is that I see no single cd1-related geom
> >> call, but it may be because of hang.
> >
> > The GEOM processing is serialized, so that is not unexpected. What your
> > logs are telling me is that the probe for CD0 is hanging. I don't know
> > why.
>
>  Could you just postpone GEOM calls after any probe will be completed? It
>  seems GEOM goes here even before probe and waits for probe forever. What
>  probe waits in the same time is unclear for me (ccb_scan), but CD devices
>  are slow and may not survive such multisleeping, missing some
>  responses in the middle.

The problem is not GEOM.  It's not the thread waiting in ccb_scan - that
thread is designed to wait there until an asynchronous device
arrival/departure event occurs which is not the case here.  The problem
is in or below CAM, and that problem is causing the probe to never
complete.

> > Are you positive it is this specific SVN revision that prevents cd0
> > from probing properly and not one of my previous CAM commits?

>  I use splitting by half method to find exact date which boots, then see
>  the next commit above that date. Pre-commit kernel goes to multiuser and
>  network is alive. I don't test CDs are working, I'll do that later and
>  report it.

So you know that revisions 223081, 223084, 223085, and 223089 all boot
just fine?  I committed five revisions on that date.  223099 just happens
to be the last one for that day.
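
To narrow this down by revision rather than by date, the same
split-in-half search can be run over just those five revisions.  What
follows is only a minimal sketch: boots() is a hypothetical callback
standing in for "build and boot a kernel at that revision and see
whether it reaches multiuser", and it assumes that once a revision
hangs, every later one hangs too.

```python
# Revisions committed on the date in question, oldest first
# (taken from this message).
REVISIONS = [223081, 223084, 223085, 223089, 223099]


def first_bad(revisions, boots):
    """Return the first revision for which boots(rev) is False, or None.

    Assumptions (not from the thread): `revisions` is sorted oldest to
    newest, and once a revision fails to boot, all later ones fail too.
    `boots` is a hypothetical callback; in practice it is a manual
    build-and-boot test of the kernel at that revision.
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if boots(revisions[mid]):
            lo = mid + 1      # this one is good, look at later revisions
        else:
            hi = mid          # this one is bad, it or an earlier one is first
    return revisions[lo] if lo < len(revisions) else None
```

With five candidates this takes at most three test boots.  If it
returns 223099, the earlier four are confirmed good; if it returns
anything else, the date-based search pointed at the wrong commit.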

--
Justin



