Message-ID: <4247AFDB.1060307@optusnet.com.au>
Date: Mon, 28 Mar 2005 17:18:51 +1000
From: Graham Menhennitt
To: Doug White
cc: freebsd-stable@freebsd.org
References: <42436771.3060006@optusnet.com.au> <20050325133558.U16071@carver.gumbysoft.com> <42449BCE.7010600@optusnet.com.au> <20050327130409.F35584@carver.gumbysoft.com>
In-Reply-To: <20050327130409.F35584@carver.gumbysoft.com>
Subject: Re: "ffs_mountroot: can't find rootvp" after cvsup and making world

Doug White wrote:

>On Sat, 26 Mar 2005, Graham Menhennitt wrote:
>
>>>>I just cvsupped to the latest RELENG_5 (as of yesterday) and built and
>>>>installed the world and a new kernel. When I boot the new kernel, I get
>>>>an error "ffs_mountroot: can't find rootvp". At the "mountroot>" prompt,
>>>>whatever I type (even '?') causes a crash and reboot. I can still boot
>>>>my old kernel without a problem.
>>>>The dmesg from the old kernel and a
>>>>capture of the boot of the new kernel are below. Noticeably absent from
>>>>the new one is the line "ad0: 76319MB [155061/16/63] at
>>>>ata0-master UDMA100" which is my only disk drive.
>>>>
>
>Hm .. from the -v output it looks like the first ATA channel is not
>resetting properly for the probe. We'll need to get a better idea of when
>the problem appears since a lot has changed between January and now.
>

Doug,

I compared the output of "boot -v" for the working and broken kernels. It
seems that the broken one does fewer loops around the disk probe and
hence has fewer lines of
    ata0-master: stat=0x90 err=0x90 lsb=0x90 msb=0x90
than the one that works. Since that line comes from ata-lowlevel.c, I
cvs'ed versions of that file going back to around when I built the
working kernel. The following seems to be the change that broke it.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
--- ata-lowlevel.c      Mon Mar 28 15:59:57 2005
+++ ata-lowlevel.c_orig Wed Mar 23 19:17:46 2005
@@ -605,19 +605,26 @@
            }
        }
        if (mask == 0x01)       /* wait for master only */
-           if (!(stat0 & ATA_S_BUSY) || (stat0 == 0xff && timeout > 5))
+           if (!(stat0 & ATA_S_BUSY) || (stat0 == 0xff && timeout > 5) ||
+               (stat0 == err && lsb == err && msb == err && timeout > 5))
                break;
        if (mask == 0x02)       /* wait for slave only */
-           if (!(stat1 & ATA_S_BUSY) || (stat1 == 0xff && timeout > 5))
+           if (!(stat1 & ATA_S_BUSY) || (stat1 == 0xff && timeout > 5) ||
+               (stat1 == err && lsb == err && msb == err && timeout > 5))
                break;
        if (mask == 0x03) {     /* wait for both master & slave */
            if (!(stat0 & ATA_S_BUSY) && !(stat1 & ATA_S_BUSY))
                break;
-           if (stat0 == 0xff && timeout > 5)
+           if ((stat0 == 0xff && timeout > 5) ||
+               (stat0 == err && lsb == err && msb == err && timeout > 5))
                mask &= ~0x01;
-           if (stat1 == 0xff && timeout > 5)
+           if ((stat1 == 0xff && timeout > 5) ||
+               (stat1 == err && lsb == err && msb == err && timeout > 5))
                mask &= ~0x02;
        }
+       if (mask == 0 && !(stat0 & ATA_S_BUSY) && !(stat1 & ATA_S_BUSY))
+           break;
+
        ata_udelay(100000);
     }
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Anyway, I now have a working kernel. I presume that I should file a PR
on this.

Thanks again for your help.

    Graham