Date:      Thu, 21 Jul 2005 13:20:49 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Marc Olzheim <marcolz@stack.nl>
Cc:        Alexey Yakimovich <aiy@ferens.net>, freebsd-stable@freebsd.org
Subject:   Re: Quality of FreeBSD
Message-ID:  <20050721125632.F97888@fledge.watson.org>
In-Reply-To: <20050721113737.GB52753@stack.nl>
References:  <1121917413.4895.47.camel@localhost.localdomain> <20050721113927.T97888@fledge.watson.org> <20050721113737.GB52753@stack.nl>

On Thu, 21 Jul 2005, Marc Olzheim wrote:

> Indeed. That's why my company started taking FreeBSD 5.3 in use for
> production servers when it was out. Since then numerous bugs were fixed,
> some of which were reported by us. Now that we're X bug fixes later in
> time and started to get a good feeling about the number of open problems,
> it is extremely annoying to hear the "This will (probably) not be fixed in
> 5.x" statements. That conflicts with 'gradually get resolved'. What do
> you recommend larger consumers to do? Keep using FreeBSD 4 and start
> testing FreeBSD 6.x, dropping 5.x altogether?
>
> I know FreeBSD 5 was a strange exception in the release scheduling and
> that a lot has been learned from it for the future and I'm certainly not
> unthankful for all the work that's done, but I'd like a clear answer on
> what to do now in regard to taking FreeBSD 5 into 'real' production...

Marc,

I should start out by saying I appreciate your clear and concise bug
reports, and the list of your company's show-stopper 5.x bugs has made the
rounds among FreeBSD developers.  I'm happy that at least one of the
issues on the list was fixed by me. :-)  As you probably saw yesterday,
I've started bugging Poul-Henning to look at the pty problem you're
experiencing, and will get that on our 6.0 release show-stopper list.  I
haven't yet had a chance to reproduce it locally, but it sounds like that
should be straightforward.

FreeBSD 5 has been an exception -- "normally", inasmuch as major
releases have a "normal", the set of new features is a lot less aggressive,
and it has been our goal with 6.x to restore the expectation of a more
rapid release cycle with a less aggressive feature set.  This should reduce
the number of problems by virtue of reducing the level of change.  It
should also make it easier for users to pick which version to run, as
the amount of adaptation they have to do to slide forward a version will
be greatly reduced.  For example, right now it's relatively easy to move
back and forward between 5.x and 6.x.

With respect to 5.x vs 6.x upgrades: I've seen companies take two
different strategies.  Most of them have been at least experimenting with
deploying 5.x, and are very interested in its feature set: support for
large file systems, 64-bit support on newer AMD and Intel hardware,
improved PAM support, etc.  Some of my customers are specifically
interested in the support for mandatory access control, but that's
obviously a less common feature request :-).  The biggest determining
factor for companies today comes from their own product schedule, since
most big consumers of FreeBSD treat it as a component in a "product" they
deliver for others.

For example, my understanding is that Yahoo! is now deploying 6.0 betas
across their server environment with great success, but was actually
unable to seriously deploy 5.x because their goal was to support full
32-bit compatibility on 64-bit AMD/Intel hardware, which has only recently
reached the level of maturity they require.  In fact, you'll notice if you
follow FreeBSD commit logs that much of that support has come from Yahoo!.
Since 6.x is maturing in pretty good sync with their deployment timeline
for 5.x, they are actually deploying 6.x.  Of course, Yahoo! has a team of
in-house OS developers who adapt FreeBSD for their needs, and is quite
capable of debugging a kernel or two if they run into problems.

The ATA driver issue is a sticky one for many users -- we hope to get the
6.x ATA code back into 5.x in the next 5.x release.  However, hard-earned
experience tells us that ATA driver code is notoriously difficult to get
right across the broad range of available hardware.  Soren has been
lobbying to get it merged to 5.x, but given the level of testing performed
so far, we can't yet justify the merge.  My hope is that with 6.0 out the
door and a lot of testing of that code, we can get it merged back to 5.x
before 5.5.  Many other fixes have gone into 5.x, correcting many of the
most significant issues.  If you compare 5.4 with 5.3, you'll find that in
most cases, it's both faster and more stable.

The tty issue is a sticky one also.  The tty code in 6.x has been
substantially rewritten to better support the SMPng environment.  Because
the tty code "plugs in" to a number of device drivers, T1 adapter drivers,
etc., changing the tty interfaces is a fairly big event, and will affect
third party vendors like Cronyx.  This code has also not yet seen as wide
deployment as I'd like, so it's also something that really isn't
appropriate for an MFC immediately.  However, once it has seen significant
6.0 deployment, it may well be.  A question then will be whether it's
better to simply say "you're better off making the jump to 6.x, which is
minor" than to backport, and it's something we can't really answer until
we're comfortable that it's seen sufficient deployment.  My hope is that
we can identify a workaround for 5.x that will avoid the code upheaval a
full backport would require.  It's not as ideal as having the "right" fix,
but it would stop the panics.  I need to ping phk and some of the other
tty-centric folk to look at this some more.

In terms of advice:

If you have a "product" due out more than 3 months from now, I think 6.x
is the obvious way to go: you want to be ahead of the curve so that you
can have the foundation for your product in sync with the FreeBSD
production release cycle, and avoid jumping major releases early in the
product life cycle.  6.x has significant performance and stability
improvements -- performance especially in the area of file system
performance on SMP, preemption, network stack, and memory management, and
stability especially in the area of tty support.  By "product", I mean a
range of things: the OS foundation of an embedded product such as a
firewall or storage appliance, or deployment of an internal product, such
as a virtual server product at an ISP.

On the other hand, if you're deploying today, I think that unless you're
prepared to deal with the 6.0 bug fix cycle (both the BETA/RC cycle, and
the inevitable post-release fixes for a .0 release), 5.4+patches or
5-STABLE is the right place to sit.  At least two of the critical bugs on
your list were fixed in 5-STABLE after the release of 5.4, so for some,
5-STABLE is the best place to be.  We've opted not to do a patch/errata
update for 5.4 for the socket error you were receiving on the basis that
it doesn't affect a wide audience and doesn't correct a "Critical" failure
-- i.e., a crash or the like, unlike some of the NFS server fixes, for
which we did do an errata fix.

From the perspective of the FreeBSD developers, if you can tolerate the
6.x release process, we encourage you to jump on that bandwagon.  It will
help us release a better 6.0, and that's where the future lies.  Our goal
is to make 6.x a pretty seamless upgrade from 5.x, as it has a less
aggressive feature set, and far fewer user-visible changes (i.e., no
conversion to OpenPAM, devfs, UFS2, large compiler version upgrade, ... as
in 5.x).  When I upgraded my personal web/shell server to 6.x from 5.x
last week, I didn't have to change any configuration in /etc at all, other
than a painless pass through mergemaster to merge the _dhcp user and
group.  As always, we look to the freebsd-stable users to help us test new
features ahead of the release.

Thanks,

Robert N M Watson


