Date:      Mon, 01 Dec 2008 14:30:22 -0500
From:      Ken Smith <kensmith@cse.Buffalo.EDU>
To:        Jo Rhett <jrhett@netconsonance.com>
Cc:        freebsd-stable Stable <freebsd-stable@freebsd.org>
Subject:   Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
Message-ID:  <1228159822.15856.45.camel@bauer.cse.buffalo.edu>
In-Reply-To: <EC872352-4A50-404E-A93E-DBA5FCAA1431@netconsonance.com>
References:  <A5A9A4D4-CD16-45FA-A2AC-62C4B5AE976D@netconsonance.com> <BEBF7B15-DECE-4872-9687-4AD4BE65DB05@netconsonance.com> <84E1EC10-5323-4A8C-AD60-31142621DB32@netconsonance.com> <200810271151.47366.jhb@freebsd.org> <C6DC3DB1-40FF-4896-81DB-EF37874428AF@netconsonance.com> <280616DD-A58F-4AE5-AB03-92C5F2C244EC@netconsonance.com> <1227733967.83059.1.camel@neo.cse.buffalo.edu> <EC872352-4A50-404E-A93E-DBA5FCAA1431@netconsonance.com>

On Mon, 2008-12-01 at 10:20 -0800, Jo Rhett wrote:
> On Nov 26, 2008, at 1:12 PM, Ken Smith wrote:
> > Unfortunately no.  As John indicated in the earlier thread BIOS
> > issues tend to be extremely hard to diagnose and so far it seems
> > like it's specific to this one motherboard.
> >
> > Given this problem does cause issues with installs I'd be willing
> > to provide ISOs built at the point we've done the Errata Notice that
> > fixes the problem.  But it's too nebulous an issue to hold up the
> > release itself for.
>
> It does *not* cause an issue with installs.  Installs work fine.  It
> prevents booting an installed operating system.  This appears to
> affect *ALL* of the Intel multi-cpu motherboards, including 3
> generations of Rackable systems.

Understood, I guess I wasn't quite specific enough.  The machine not
being able to boot what got installed on its disk I consider an install
problem.

To date this is the first mention I've seen of it affecting more than
one specific machine type.  I might have missed it but I can't recall
you mentioning this affected more than one particular machine.  And it
does not seem to affect *ALL* of the Intel multi-cpu motherboards.

> The only reason it is nebulous is because absolutely nobody bothered
> to investigate the issue.  I've been asking for what information would
> help.  I've offered to set up serial consoles, or even ship systems, to
> anyone who would work on this problem.

Both John and Xin Li have chimed in on the two threads I've seen that
are related to this specific topic.  John diagnosed it as an issue with
the BIOS.  That's what makes it a nebulous problem.  When working on
those sorts of things most people liken it to "Whack-a-mole".

> This is a very big problem that will affect thousands of FreeBSD servers.

It's still not clear it will affect thousands of servers.  The same set
of changes got made to stable/7 as were done to stable/6, and the test
builds for the 7.1 release have been seeing much more testing than the
test builds for the 6.4 release.  If the problem were as widespread as
you're suggesting, we'd likely have seen a lot more reports, and that
factored into the decision about whether to go ahead or not.

This all left me with a decision.  My choices were to back out the BTX
changes that were known to fix boot issues with certain motherboards and
enabled booting from USB devices, or to leave things as they are.  The
motherboards that didn't boot with the older code had no work-around.
The motherboards that did boot with the older code but not the newer
code do have a work-around (use the old loader).  Decisions like that
suck; no matter which choice I make, it's wrong.  Holding the release
until all BIOS issues get resolved isn't a viable option because of the
"Whack-a-mole" thing mentioned above.  Fix it for one and two others break.  It
takes a lot of time/work to settle into what seems to work for the
widest set of machines.

> Ken, the complete lack of action taken by FreeBSD to even CONSIDER
> investigating a significant bug reported during the testing process is
> shocking.  And it truly puts a lie to those who continue to claim that
> we should be more active in the testing process.  Every time I have
> done this, I've found significant issues that affect a significant
> portion of the user base and COMPLETELY prevent deployment of a given
> release, and absolutely nothing has been done to even investigate the
> reports, never mind address them.
>
> Congratulations.  Good Job.  If you aren't going to accept bug
> reports, why exactly do you release testing candidates at all?

So you're saying John and Xin Li's responses (Xin Li's questions still
unanswered) to you show a complete failure to even consider investigating
it?  I know from past email threads your preference is for 6.X right now,
but as a test point, if you aren't totally fried over this whole thing, it
would still be useful to know for sure if the issue exists with 7.1 test
builds.  If yes, it eliminates a variety of possibilities and helps focus
on the exact problem.

-- 
                                                Ken Smith
- From there to here, from here to      |       kensmith@cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |





