Date:      Fri, 17 Nov 2000 22:13:08 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        nate@yogotech.com
Cc:        tlambert@primenet.com (Terry Lambert), chat@FreeBSD.ORG
Subject:   Re: Turning on debugging in GENERIC
Message-ID:  <200011172213.PAA04059@usr08.primenet.com>
In-Reply-To: <14869.36557.693564.613415@nomad.yogotech.com> from "Nate Williams" at Nov 17, 2000 01:02:21 PM

[ ... core team stuff ... ]

> You can speculate all you want, but it didn't happen, and whether or not
> it would have happened is *pure* speculation.  The folks who became core
> members were folks who were actively involved with the day-to-day
> operations of FreeBSD, and you had *never* been involved with the
> day-to-day operations of FreeBSD.

I was invited.  It may have been just a hat tip, or it may not
have, but I was invited.


> > Had I been on the core team, or even contributed anything in the post
> > acquisition phase, while still a Novell employee, my access to the USL
> > source code would have provided a ready claim to the contamination of
> > all the FreeBSD code.
> 
> No more so than David, Rod, or Jordan's access to old source code, such
> as BSD 4.[123], and SysVR3 sources.

You should have been in Mike DeFazio's office with me, arguing in
my favor, then.  I suspect that your speculation is incorrect.


> > I further stayed at Novell six months longer than I would have
> > liked to, and had to forego other offers, in order to ensure the
> > project got the same deal as BSDi, instead of having to live
> > with the cease and desist order that Jordan and others received
> > via certified mail.
> 
> Again, you can choose to believe that you staying at Novell made a
> difference, when in fact circumstantial evidence would show otherwise
> (NetBSD, plus a number of other folks involvement).

I got 386BSD, NetBSD, and FreeBSD all the same deal.  Everyone
had already gotten cease and desist letters, except BSDi, at
the time I went into his office.

A Novell VP (not DeFazio) ended up fired, based on Noorda wanting
the suit dropped.  I lobbied for 3 months straight, and had over
9 very long meetings with Noorda and DeFazio over the issue.  Caldera
Linux was formed by my coworkers over the issue of not being able to
do or participate in or promote anything that competed with UnixWare,
which was a reversal of Ray's long time "coopetition" directive by
executives like Mike DeFazio.  Ray was an initial investor in
Caldera over this.


> > I believe that I was instrumental in the lawsuit finally getting
> > dropped, after personally lobbying Ray Noorda.
> 
> Possibly, there's no way to know.

Ask Ray or Mike.


> *I* handed the FAQ off to Dave.  I owned the FAQ *AND* the patchkit for
> about 6 months (I suspect the FAQ was the reason you handed the patchkit
> off to me), and then handed it off to Dave, where it mostly languished
> because it became irrelevant almost immediately.  For what it's worth, I
> still have a copy of the 'Unofficial 386BSD FAQ' sitting on my box.

I got rid of the FAQ after I started on the patchkit.  It is much
easier to maintain something than it is to create it in the first
place.  I've been involved in the creation of at least 6 large scale
Open Source projects.  Ask Andrew Tridgell who pestered him until
he released his DEC LANMan compatibility code that he had mentioned
casually on Usenet 2 years prior, for one example.


> I also had a patchkit as well, although my patchkit wasn't nearly as
> well organized as yours (it was wrapped up in the FAQ).  However, you
> had patches I didn't, and vice-versa.  My first step in the
> patchkit was to combine all of the outstanding patches and make a
> 'unified' patchkit, which I did as my first task upon receiving the
> tools from you.
> 
> Yes, what you did was important.  But, as they say in the industry,
> 'what have you done for me lately'?  You can't live in the past.

I've actually done a huge amount of stuff.  I generally use one
of several pseudonyms or submissions through others to do it, so
people like you can't bring baggage to the party.  I actually
have a not insignificant amount of bug fixes of mine in Linux,
among other places.  I'm not the type to blow my own horn; I'm
sure most people are hearing about all of this for the first time
in this thread.


> > These, more than any single thing, save the Net/2 release and William
> > Jolitz's original 386BSD 0.1 release, formed the basis of FreeBSD 1.0.
> 
> FreeBSD owes most of its existence to the obnoxiousness of Lynn Jolitz,
> who is as annoying an individual as I've ever had the chance to deal
> with.  If you wouldn't have done the patchkit, it would have been done
> differently, but it still would have (and was) been done, in a different
> format.

Lynn had one Usenet diatribe, written in anger, during a family
crisis.  I personally have no problem with Lynn; perhaps you and
I are just cut from different cloth.

I think the overwhelming number of personal attacks on Lynn by
people on Usenet, as a result of that one posting, and the grudges
these people were willing to hold for (apparently) forever, were
the reason that Bill Jolitz revoked his permission for a 0.1.5
386BSD release.  FreeBSD exists because Bill Jolitz refused to let
the patchkit people use the name "386BSD" after many of them
attacked his wife in a public forum.  If permission to use the name
had not been withdrawn, the name "FreeBSD" may never have been
coined.


> So, having said that, I disagree with the above statement, having been
> greatly involved in it.  But, if it makes you feel any better, you can
> believe whatever you want.

Thanks for your permission.


> > > > As a single point rebuttal, my home connection has been and
> > > > remains ~28k
> > > 
> > > That's more than adequate for the task.  I ran with a 28.8K link until
> > > recently, and at times I'm using a 33.6K connection for syncing the CVS
> > > tree.  (I got a new modem when lightning blew out my 28.8K USR.)
> > 
> > How often do you perform this process?
> 
> Daily.

Daily on a 28.8k line?  If so, I would hope you would share your
secret with the rest of us.


> > I maintain that doing this six times in one day to catch the CVS tree
> > in a buildable state is not practical.
> 
> You're the one arguing that it has to be done 6X/day.  I do it once a
> day, and have found in 8+ years that the tree being completely
> 'broken' (in a manner that having CVS locks would fix) has happened less
> than two dozen times.

You should note that I am not currently suggesting CVS locks,
even though any true computer scientist knows that eliminating
race conditions is a Good Thing(tm).


> The tree has been broken *MUCH* more than that, but that's mostly
> related to 'personal' issues, not technical issues.  If someone checks
> in code that doesn't compile CVS can't fix it.  If someone checks in
> code that compiles in their tree, but not in a 'virgin' tree, CVS can't
> check that either.

These are the issues which result in snapshots being a better
choice for testing than local cvsup + build world's.  You are
agreeing with my premise.

Now if you would only agree that a way that would eliminate a
"try it again, it works for me" sent to the bug reporter is a
good idea, we will have found common ground.

All I'm suggesting is that (1) snapshots be the preferred thing
to test -- not cvsup'ed, locally built GENERIC's with an obscene
amount of console messages -- and, that (2) snapshots have some
half-life, so that a developer can grab the snapshot that is
having the problem and see if it _really_ "works for me", to be
able to determine immediately if the problem is that the code
used to build the snapshot is bad for all hardware, or if it
only breaks the failing test hardware.

Without this, the diagnostic process results in iterative and
time wasting finger pointing: "your code breaks!" -- "no, you're
using inconsistent code!" -- "am not!" -- "are too!" -- "am not!"
-- "are t... oh, I guess there is a bug".


> Those kinds of breakage happens *ALL* the time in every project, and
> FreeBSD is no exception to that.

No, They Do Not.  They are much more frequent in software projects,
true, but they are an artifact of process and tools permitting such
mistakes, not the laws of probability making them inevitable.

I have never worked for an engineering organization where it was
permissible to check in code before compiling and at least unit
testing it, until very, very recently.

Taking 40 engineers and making them sit on their thumbs until Bob
gets back from windsurfing and fixes his mistake is unacceptable.

In a commercial organization, let's say that an average engineer
salary was $100k a year.  That makes my overhead for employing
that engineer, including facilities charges, insurance, employer
tax contributions, etc., ~$200k a year.  Out of that year, there
are 2080 hours (40hrs/wk x 52 wks) - 80hrs (2wks vacation).  We
can call that ~2000hrs.  So my outlay is $100/hr/engineer.

A one day breakage in the tree is 8hrs; with 40 engineers, this
is a cost of $32k per such breakage.
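
As a rough sketch of the arithmetic above, using only the figures quoted
in this message (the variable and function names are illustrative, not
taken from any real tool):

```python
# Back-of-the-envelope cost of a one-day tree breakage.
# Every figure comes from the paragraphs above; the names are made up
# for illustration only.

SALARY = 100_000                 # average engineer salary, $/year
OVERHEAD_FACTOR = 2              # loaded cost is roughly 2x salary (~$200k)
HOURS_PER_YEAR = 40 * 52 - 80    # 2080 hours less 2 weeks vacation
ENGINEERS = 40                   # people blocked by the breakage
BREAKAGE_HOURS = 8               # one working day


def hourly_cost(salary, overhead_factor, hours_per_year):
    """Loaded cost of one engineer for one hour, in dollars."""
    return salary * overhead_factor / hours_per_year


def breakage_cost(rate, engineers, hours):
    """Cost of idling `engineers` people for `hours` at `rate` $/hr."""
    return rate * engineers * hours


rate = hourly_cost(SALARY, OVERHEAD_FACTOR, HOURS_PER_YEAR)
total = breakage_cost(rate, ENGINEERS, BREAKAGE_HOURS)

print(f"hours/year:      {HOURS_PER_YEAR}")
print(f"loaded rate:     ${rate:.0f}/hr")
print(f"one-day outage:  ${total:,.0f}")
```

The same two functions can be fed a larger head count and a smaller
number of hours per person to estimate the volunteer case raised below.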


How many man hours are available to the FreeBSD project, and
how many are lost to such "breakage"?  The project certainly has
many more engineers, at greatly reduced per diem time commitment.


These people are donating the equivalent of $100 per one of
their work hours to the project.  How much of this capital is
being lost or squandered through poor process?  How many more
productive contributed work hours would exist, were there process
in place to prevent such problems?


I guarantee you that in any _real world_ engineering organization,
management acts to institute process to minimize these events: it
is their fiduciary responsibility to do so, and they will be fired
for shirking it; I have personally moved someone under me in a
lateral move (they were still a useful resource) to get them out
from under responsibilities which they were unable to handle.  If
they hadn't been a useful resource laterally, I would have had no
compunction about letting them go, for the good of the overall
organization: _I_ had a fiduciary responsibility to the 15 people
who reported to me to ensure that they had jobs the next week,
month, and year.


> > I further maintain that, even if you were to do this, the
> > repeatability of your experience is low, and the first thing any
> > "works for me here" developer is going to do is blame the
> > unknown-to-them version of the bits you are running, until you waste
> > another day repeating the process.
> 
> Maybe a developer like yourself, but there are *thousands* of other
> developers throughout the entire world who have actively contributed
> real code to real problems on the project who would show otherwise.

You are assuming, incorrectly, that your testers are developers,
rather than people trying to contribute where they can, instead
of where they can't.

It's a nice bar to hold up as an ideal, but it's not a reasonable
one to hold up, if you want to speak to practicality and the best
utilization of the resources which are available to you.

I think that the FreeBSD project has been long unable to attract
the technical writers, marketing people, and other people who are
_not_ developers, to a large extent as a result of holding out a
uniform standard for all contributors in a very narrow area.

This, regardless of their ability to contribute at a very high
level of quality elsewhere.

A number of College technical writing courses actually assign
coursework that amounts to documenting various aspects of Linux;
yet I see no similar inroads into "non nerd-dom" by the FreeBSD
project, proper.


> Yes, there are those that agree with you (Richard Wackerbath comes to
> mind), but IMNSHO, people like that won't be happy with *ANYTHING*, so
> trying to please them is nearly impossible.

As you gave your permission, I give my permission for you to hold
this opinion, even though I vehemently disagree with it.  8-).


> Even folks as difficult to work with as Matt Dillon somehow get past all
> of the problems and have made *significant* personal contributions to
> FreeBSD.

Yet there are others who have failed to leap this hurdle, not
being the track star Matt Dillon is, but being good or even
excellent athletes in their own right.  It is a mistake to make
the environment unpleasant for all but Jesse Owens.  He may win
his event and set a world record, but it is the _team_ who wins
the track meet, not one individual set above the others.


> If you don't want to do something, you can *always* find excuses not to
> do them, and you seem to have been doing this for well over 8 years.
> 
> Since 1995, what visible contributions have you made to the project,
> short of lots of email?
> 
> (Since 1997, I have made little, short of lots of email, so I'm willing
> to consider that I'm of little use to the project, for what it's worth.)

I saved Jack Vogel's SMP work, and in June of 1996 did some
updating of it, which became the basis of the initial FreeBSD
SMP project.

I participated in many architectural discussions, online and
offline, including the recent threads design, as of last year.

I have provided (admittedly minor) patches for things like newfs
and the VM cache coherency problem with sub-block fragments in
NFS (PHK reworked these patches, but the problem identification
and some ugly patches that worked were mine).  I've done many
other small patches.  I've done 4 man pages that I can count
easily; I think I've submitted more.

And yes, I've sent a lot of email; much of it educational to
new people who were given terse, short, and largely useless
answers.  I have examples as recently as last week on this,
where someone was being bullied instead of informed in -hackers,
and the bullies got into a shouting match with Jesus Monroy,
instead of answering the original question in any meaningful
manner.

You can also see a number of contributions at the page
http://www.freebsd.org/~terry/ --though I have not updated
these since the stupid SSL stuff went into effect, making
updating them nearly impossible, so these are somewhat dated.

I've done a lot of work on making the jail stuff usable, though
I have only offered that code to a few people so far, rather
than giving it to the project (see also "SSL").

I've given FreeBSD patches to MySQL, Sendmail, Bind, and about
a dozen other Open Source projects.

Jeremy Allison and I were the people who brought the FreeBSD
pthreads code into full compliance with the Draft 4 standard,
back in 1998; before that, it was unusable.  It's my patches
to the mutex initialization code that made the Moscow Center
for Supercomputing Activities STL work with FreeBSD's Draft 4
pthreads implementation (before that, the STL code required
full compliance).  This is why Cyrus ACAP ended up working
on FreeBSD (it was Jeremy Allison's patches to ACAP to make it
work with GCC at all that revived the Cyrus ACAP project; I had
a minor hand in some of those, as well).

I was the initial and most vocal proponent of getting Soft
Updates code into FreeBSD; it was intended as a solution for
reducing InterJet hardware costs, but if I hadn't worked on
an FS that had Soft Updates in it before (I was a tiny, tiny
contributor to the Soft Updates code itself, in that case:
almost all of the real work was done by Matt Day), the issue
would never have been raised at all.

What do you think of OpenLDAP's continued support for FreeBSD?
Ever wonder why?  Go look at their CVS repository and see what
code they first imported on top of the UMICH sources; FreeBSD
is credited.  Who did that 120k of "FreeBSD" patches come from?
It's easy to maintain something, once someone makes the thing
work in the first place... but Open Source projects do not
start without working code.

I've evangelized FreeBSD significantly to commercial and other
organizations alike.

And I've studied Open Source social organizations, and modelled
mathematically what makes them tick.  Without this understanding,
there's a lot of stuff that people try to do which is inherently
ineffective, and there's no obvious reason why.  I've greased
the gears; I don't know anyone else who could have done this
_intentionally_.


> > Insisting on bug reports _only_ from clueful users is a mistake,
> > and one I hope the project is not ready to make.
> 
> It's a decision that's made for *ALL* software.  I'm not aware of *any*
> software that doesn't require clueful users for useful bug reports.
> Even netscape's 'automatic bug reporting' software requires that the
> user have a properly configured system (for sending email), and that the
> user be able to write coherent descriptions of what triggered the bug.

Netscape is not a good example of good engineering practice.  The
"TalkBack" option consistently fails on my Windows 98 box (it
crashes, just like Netscape did).


> XI-Graphics has the same sort of problems in their 'automatic' bug
> reporting that happens when it crashes.  You've got to write
> explanations of what's going on when the crash occurs, else the bug
> report is sent to /dev/null.

I have never really believed in the "TalkBack" model of support,
and don't intend to start believing in it until the technology
improves.  If what you say about XI is true, then they clearly
have not implemented their code well enough, since it should be
capable of providing sufficient information.  If it's not capable
of doing that, then it's useless.  The _only_ piece of information
external to it (which it should be able to get without pestering
a user) should be the email address of the user for them to
contact when they have a fixed version of the software for the
user to try ("try" assumes that they were unable to repeat it
locally, which should be their first choice: they should send a
fixed piece of software, not merely an attempt at one).


> > To me, the addition of a lot of kernel diagnostic messages seems
> > clearly intended as a crutch: it is to substitute information volume
> > for clue, recognizing that an insistence on clue has not been a valid
> > success strategy so far.
> 
> Now you're contradicting yourself.  Adding kernel diagnostics means the
> users must have 'less clue', and you're complaining about it yet above
> you don't want users to have to have a clue?  Would you make up 'yer mind
> already?

I am not contradicting myself.  You assume, incorrectly, that a
notoriously lossy data channel (a human) is somehow a better
communication medium than a priori control of initial conditions
(e.g. everyone uses the same code for testing, reporting bugs,
verifying whether a bug is universal or machine specific).  I
maintain that it is not.


> > You are presuming a premise for me; do not do that: it is an
> > invalid tautology.  Do not put words in my mouth.
> 
> If you're not willing to bring the -current bits into your tree, then
> you're obviously not willing to test -current bits.  How is this
> invalid?

This should be obvious to you.

I can test them without them in my tree, using a snapshot built
from them from a centralized snapshot tree.


> See above.  You can't test something if you aren't willing to even bring
> them in...

Yes, you can.  People test programs without source code all the
time.


> Also, you've also complained about not having the 'resources' to test
> them.  However, people with *less* resources than you are testing the
> bits, therefore logically speaking your statement is false, and your
> logic flawed.  Therefore, I conclude that it's not an issue of ability,
> but an issue of 'willingness'.

You are still crediting me with resources I do not have, and you
are crediting these tests with a utility which, unless the testers
are developers, the tests themselves do not have.


> > I think the problem is better solved by ensuring repeatability
> > using identical code at all test locations.
> 
> The cure is worse than the solution.  An engineer spends his time
> optimizing for the 'standard path', and you want the project to spend
> all of its time optimizing for a 'rare' path.  This is not the best use
> of resources, and would only cause the project to become that much more
> engendered in politics and finger-pointing than it already is, and get
> that much less done.
> 
> This is a 'political' solution that has no engineering basis, and as
> such will be thrown out today, just like it has been for the last 8+
> years.

This is incorrect.

If the only thing that changed was that instead of being told to
resup, and retry, the user was told that the first thing they
should do is try a snapshot and report results, the developer and
the user would immediately be on the same page with regard to what
is being tested, and the developer would have a known source base
from which to mobilize effort (assuming the problem was not locally
repeatable for them).

In addition, if debugging versions of the kernel had been built
at the same time, a crash-dump of the failing system could be
post-mortem'ed by the developer, since the dump and debugging
images could be guaranteed to be correlated.


> People can't be forced to become 'clueful'.  You know it as much as I
> do.

I fully agree.  But there is more than one way to resolve this
problem: putting a bunch of diagnostics in the kernel is not the
only possible solution to the problem, and I submit that it is
certainly not the best available, for little effort.  There is
other low hanging fruit.


> > My non-scratch boxes are a laptop (the fastest machine I own; it is
> > frequently booted into Windows these days) and a Dual P90 machine I
> > bought from Rod Grimes in 1996, which can't run -current.  I have
> > other machines, but FreeBSD doesn't run on any of their architectures,
> > except a 166MHz Multia.
> 
> So you're still in better shape than me, with more hardware.  (Your dual
> P90 should still be able to run -current, just not with both CPUs.)

It is not a scratch box, it is a production SMB, NFS, DHCP, and
bootp server.


> > My scratch box is a 486/50 EISA box with an AHC1742 and 2G of
> > disk.  You have me beat by 8MHz on bus speed and 16MHz on
> > processor speed.
> 
> Whee, you've got a faster/wider 'bus' than me, which more than makes up
> for the little bit of CPU speed I have.

Your memory bus is faster than mine.  I/O to disk is not the
bottleneck.  The memory bus on both systems is the same width.


> > My connection speed as I type this (fighting with an FTP session
> > at the same time) is 26.4k; yours is faster by 7.2k, and you state
> > that you only use that speed "at times".
> 
> So upgrade it.  DSL is dirt cheap, and if you're really willing to help
> out, you can sacrifice a bit more $$ and get a faster connection.  For a
> guy that makes as much as you do, an extra $50/month shouldn't kill you.

I do not live next door to a LATE.  I am as rural as rural Utah
or Montana, with regards to DSL.


> If you can't stand the heat, stay out of the kitchen.  If you aren't
> willing to upgrade it, then quit bitchin and live with it.  It works for
> me, and it works for a *heck* of a lot of European users who also have to
> pay for the amount of bytes they download, and you don't see them
> complaining about it.

Find someone to sell it to me, with the California PUC tariffs
imposing severe penalties for any tariffed speed being measured as
less than the tariff rates (which is why DSL is zoned by distance
from a LATE in California, instead of available as "best possible"),
and I will buy it from them.

Next, I suppose you will tell me that I should move next door to
a LATE, and that "If you aren't willing to move, quit bitchin".

I hear that goes over real well in Afghanistan, and that people are
taking your advice in droves and moving to within 1 mile of a US
LATE to help the FreeBSD project, despite the H1B visa situation...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

