Date:      Sat, 20 Sep 2008 13:50:24 -0500
From:      Bob Willcox <bob@immure.com>
To:        Jeremy Chadwick <koitsu@FreeBSD.org>
Cc:        PYUN Yong-Hyeon <pyunyh@gmail.com>, freebsd-stable@FreeBSD.org
Subject:   Re: RELENG_7 hangs on boot w/Gigabyte MA78GM-S2H MB
Message-ID:  <20080920185024.GC15275@rancor.immure.com>
In-Reply-To: <20080920150533.GA75785@icarus.home.lan>
References:  <20080920042418.GB1322@rancor.immure.com> <20080920123914.GA72833@icarus.home.lan> <20080920132429.GA15275@rancor.immure.com> <20080920140456.GA74663@icarus.home.lan> <20080920144510.GB15275@rancor.immure.com> <20080920150533.GA75785@icarus.home.lan>

On Sat, Sep 20, 2008 at 08:05:33AM -0700, Jeremy Chadwick wrote:
> On Sat, Sep 20, 2008 at 09:45:10AM -0500, Bob Willcox wrote:
> > On Sat, Sep 20, 2008 at 07:04:56AM -0700, Jeremy Chadwick wrote:
> > > On Sat, Sep 20, 2008 at 08:24:29AM -0500, Bob Willcox wrote:
> > > > > 1) It would be helpful to know if you installed i386 or amd64 FreeBSD,
> > > > 
> > > > This is amd64 on this particular machine.
> > > > 
> > > > > 2) With regards to the lock-up after "mount root", if you press NumLock
> > > > > or CapsLock, do the keyboard LEDs turn on/off?
> > > > 
> > > > Nope, no keys do anything. You must either push reset or pull the plug.
> > > 
> > > Is it possible to get the output when booting in verbose mode?  If not,
> > > what are the last few lines before the machine locks up when booting
> > > verbosely?
> > 
> > Yep, just did that. The last things printed right before hang are:
> > 
> > ioapic0: Assigning ISA IRQ 1 to local APIC 0
> > ioapic0: Assigning ISA IRQ 4 to local APIC 1
> > ioapic0: Assigning ISA IRQ 6 to local APIC 2
> > ioapic0: Assigning ISA IRQ 7 to local APIC 0
> > ioapic0: Assigning ISA IRQ 9 to local APIC 1
> > ioapic0: Assigning ISA IRQ 12 to local APIC 2
> > ioapic0: Assigning ISA IRQ 14 to local APIC 0
> > ioapic0: Assigning ISA PCI 16 to local APIC 1
> > ioapic0: Assigning ISA PCI 17 to local APIC 2
> > ioapic0: Assigning ISA PCI 18 to local APIC 0
> > ioapic0: Assigning ISA PCI 19 to local APIC 1
> > ioapic0: Assigning ISA PCI 22 to local APIC 2
> > trying to mount root from ufs:/dev/ad4s1a
> > start_init: trying /sbin/init
> > [hung at this point]
> > 
> > > > > 3) Many others have seen the hanging/lock-up after "mount root".  I
> > > > > believe one found a workaround by setting ATA_STATIC_ID in their kernel
> > > > > configuration.  I realise this is a problem when you can't get the
> > > > > system up to a point of building a kernel; chicken-and-egg problem,
> > > > 
> > > > Well, I can build a kernel if I run the 7.0-release kernel. That's how I
> > > > got to 7-stable on the machine in the first place. I used "sneaker net"
> > > > to copy it to this one via a CD (as I mentioned, the 7.0 kernel boots
> > > > but the Realtek ethernet device is not recognized).
> > > 
> > > So the problem is that 7.0-RELEASE works fine for you, but after
> > > upgrading your RELENG_7 source (to what is now 7.1-BETA), the machine
> > > hangs after printing the mount root message.  Is this correct?
> > 
> > Yes, that is pretty much it. The Realtek ethernet isn't working in
> > 7.0-RELEASE either, but I'm guessing that that is a different (and less
> > serious) problem related to changes in that device.
> > 
> > > Here's another question: does booting into single-user exhibit the same
> > > problem as multi-user?
> > 
> > It looks like when I try a single-user mode (and verbose) boot the only
> > difference is that the last line shown above (the start_init line) isn't
> > printed. Otherwise, the hang is the same.
> > 
> > > > > 4) The Realtek NIC on that motherboard is probably too new to be
> > > > > supported under RELENG_7.  Realtek has a history of releasing different
> > > > > sub-revisions of the same NIC/PHY, and the internal changes are severe
> > > > > enough to cause the NIC to not work correctly (under any OS) without
> > > > > full driver support for that specific sub-revision.
> > > > 
> > > > That's what I suspected. The values displayed when doing a "pciconf -lv"
> > > > are similar to those for the system I'm using to type this, but now that
> > > > I look closer and make a direct comparison, the failing device has a
> > > > rev=0x02 vs. rev=0x01 for the working one. The pciconf -lv output for
> > > > the failing mb is:
> > > > 
> > > > none3@pci0:2:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x02 hdr=0x00
> > > >     vendor     = 'Realtek Semiconductor'
> > > >     device     = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > 
> > > Regarding the Realtek issue: I've CC'd PYUN Yong-Hyeon (surname in
> > > caps), who maintains the re(4) driver for FreeBSD.  He might have a
> > > patch available for you to try, or help determine how to get this NIC
> > > working on FreeBSD.  He'll probably need more than just pciconf -lv
> > > output, but should be able to work with you.
> > 
> > Ok, that'd be great. I must say that I'm close to simply returning this
> > MB and going with something not quite so new that is more likely to
> > work. I was hoping to get this system up and running this weekend. :(
> 
> I wish I knew what was causing the lock-up for you.  I'm truly baffled,
> especially given that the system is able to boot + find the kernel +
> load kernel modules.  Debugging this problem is out of my field; jhb@ might
> have some ideas, as I'm not sure what magic happens immediately before
> the root filesystem is mounted.
> 
> Those debugging/helping may want "disklabel -r -A ad4s1" output.  At
> least you can boot 7.0-RELEASE to get that information.
> 
> Regarding hardware:
> 
> I myself purchased an Asus P5Q SE board, with an Intel Q9550 CPU earlier
> this week.  The board was affordable (barely US$100).  One of the
> reasons I went with this board is because it lacks a) Realtek NICs, b)
> Broadcom NICs, c) JMicron SATA controllers, and d) Silicon Image SATA
> controllers.  All of those are devices I stay away from.
> 
> The Atheros/Attansic L1E NIC is known to have issues under Vista (not
> sure if the issues are with Vista or the actual driver itself), but I
> use XP.  FreeBSD supports this NIC under the age(4) driver, also
> maintained by Yong-Hyeon.  Of course I haven't tested it yet.
> 
> I'll be building the above system today, and will post the results of
> booting/installing FreeBSD on it as a test case.

Well, I swapped motherboards to a slightly older but similar board with
pretty much identical results. This one's a Gigabyte GA-MA74GM-S2. I
was trying to stick with a MB that has onboard graphics (this is in a
2u rack case and I don't have a riser card for it so there's no way to
install a video card currently).

With this board it doesn't even get quite as far. It prints out all the
APIC messages as above, then a GEOM message:

GEOM: new disk ad4

And then hangs. Also, the ethernet NIC on this board is the same as the
previous one so it's not found either.

Bob

> 
> -- 
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |

-- 
Bob Willcox              All the evidence concerning the universe
bob@immure.com           has not yet been collected, so there's still hope.
Austin, TX


