Date:      Mon, 24 Jun 1996 11:57:23 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        nate@sri.MT.net (Nate Williams)
Cc:        terry@lambert.org, hackers@freefall.freebsd.org
Subject:   Re: cvs commit: src/etc/mtree BSD.usr.dist
Message-ID:  <199606241857.LAA28712@phaeton.artisoft.com>
In-Reply-To: <199606241517.JAA20540@rocky.sri.MT.net> from "Nate Williams" at Jun 24, 96 09:17:01 am

> In case you hadn't noticed, the 'fsck' patch has been in current for
> almost a month now.  The reason it wasn't put in was because I don't
> think anyone took enough time to understand the problem well enough, so
> therefore didn't want to 'break' the system if the patch didn't work.  I
> gave up on trying to understand the problem (not enough time) and simply
> went with it hoping my testing was adequate.  It *appears* to work,
> though if you asked me if I was sure it's the correct fix I couldn't
> answer with assurance.

Probably need an FS expert -- do we have one of those on the list?

-- hmmmmm... we do: me.  I stated that I had regression tested the
fix.  I did, including power-off destructive testing during latent
metadata update operations (see the debug sysctl in the source file
/sys/ufs/ffs/ffs_alloc.c).


> > patches accepted.  I am *not* the only person in this boat.  It
> > takes nearly Herculean effort to get some types of patches accepted;
> > several groups have had to go so far as to offer their own boot
> > disks and patch kits (most notably Hosokawa-san's PCCARD code).
> > Even with Nate's patronage, there is still the need for building
> > separate-from-snap boot disks to address a number of issues, because
> > the patches are not *truly* integrated.
> 
> And won't be.  The Nomad code is *full* of hacks and kludges (admitted
> by Hosokawa).  My goal is to remove as many of those from the base
> system code and *then* add them back 'as necessary', rather than
> importing them wholesale and trying to work around them.  My recent
> 'fix' to allow you to use 'generic' IRQ's in the code made the user-land
> code smaller, while the Nomad code adds 4-5K of completely un-necessary
> code which also makes the code more difficult to understand.  It's taken
> me weeks to understand small portions of the code (not all due to them),
> and rather than continuing on in this manner I feel that it's better to
> make what we have more understandable and maintainable than chock-full
> of features that may/may not be correctly implemented.

Question: did they submit the code as if it were integratable as is,
or like my SYSINIT code, was it a prototype?  From what I can see
of their announcements, it was prototype code.  Integration in this
case would have been accepting the necessary architectural adjustments,
but not the actual code.  It's obvious that the framework in which
the prototype lives would need to be changed little to support a
production version of the code.


> The 'boot-disk' code they use would be much *simpler* if they added 2
> functions to the code and relegated all of the conditional code to it.
> (You should be able to use 1 additional function, but I'll grant 2).
> Instead of doing that there are modifications to *every* single file
> which makes the code bloat excessively, plus trying to understand it
> becomes more problematic.  I've almost considered re-writing the Nomad
> code to remove these particular kludges, but I don't agree with how it's
> being done in the first place so it would only encourage using it.

The same class of architectural arguments were the basis of some of the
patches I supplied; ironic, isn't it?  8-).


> What I've done recently is spend *A LOT* of time trying to sync up our
> source trees.  I've been sending patches back to Hosokawa with a set of
> diffs that can be applied to our -current tree to bring the Nomad code
> into the *exact* same functionality as their code, but with white-space
> and formatting changes removed.  This way they can review their changes
> again to make sure they are valid (I question some of them), and allow
> us to at least have more commonality than before.  Right now the code
> has diverged so much that integrating is near impossible, and if they
> diverge much more then I'm going to give up and roll everything myself
> from scratch.

Look, this was not an attempt to denigrate the work you were putting
in, nor the cooperation.  It *was* intended to point out the cycle
time involved -- nothing else.

> This sounds harsh, but it's starting to become *more* work
> to integrate their code than it would be for me to write it myself.

It should be obvious that this would be a natural result of the
integration method used... this is perhaps more obvious to me,
because this is exactly what I suggested I was willing to do with
my own code to make it acceptable.

I think it is unreasonable to believe that you must understand
everything in order to use work from third party providers; this
is not to say that you should accept prototypes as if they were
production code; you should not.  Probably, you should approach
them about high level issues, and leave the prototype to production
conversion to them.  This is more difficult for you because of the
APM code being your baby, and integral to the use of their work.

Look on it as a collaborative effort, not a filtering effort, and
expect them to collaborate.


Use of "champions" to operate as filters is ineffective; if the
expertise (or interest) does not exist on the core team, then any
code which relies on this method for integration will fail to be
integrated.


> And, this is peanuts compared to your FS patches, your kernel locking
> patches, and the like.  Also, the Nomad kernel code is already mostly
> broken up into functional chunks because of my previous integration
> efforts, so it's easy for me to separate out function from style.

I wish that it were possible to change a VFS interface without impacting
everything that uses VFS interfaces, but it is not.

I wish it were possible to ensure the state in and out of functions
without going to a style technique like single-entry/single-exit,
but it isn't -- without seriously damaging readability.

I wish that NFS locking were simple and obvious; but it isn't -- which
explains the lack of a single public implementation of it, anywhere.

I wish I could build castles in the air and not have to drag my
foundations with me -- but I can't.  Castles require foundations,
or they collapse (your own experience with the APM code should have
told you that -- one rough cut foundation stone damages the entire
structure).


> However, this has taken close to 6 months of my time working over 20
> hours/week.  For you to expect Poul (or any other developer) to commit
> to this much time is too much.  I'm doing it because I got paid for part
> of it at work, and the last 2.5 months I've done it because I want to
> finish what I started.

I do *not* expect them to commit that much time.  I do not expect them
to have to fully understand the code.  Clearly, there are some people
who fully understand the ramifications of stacking FS architectures
sufficiently to make them operate; but to make them sing?  You'll
notice that, other than BSDI, which hasn't done much with it, there
are *no* commercial VFS stacking interfaces.  The technology has existed
and been talked about for *years*... but there are few people who fully
comprehend all of the subtleties and issues.  I know of maybe 8 people,
period, including myself, and I'd still class Rosenthal and Heidemann at
the top of the list as not having some of the blind spots I do.

Again, not to denigrate your efforts, but the idea that all code must
be vetted by a core member expert in the code is ridiculous (and,
unless John H. has joined the core team without telling me, probably
an impossibly high standard to ever meet for some FS issues).


> > I'm sure I could also halve or quarter my production, providing
> > rationale for things which are, to me at least, bloody obvious.  I'm
> > already willing to spend a large amount of time parceling up my
> > work, but I have only so much time I'm willing to spend; please do
> > not bankrupt me.
> 
> Please don't bankrupt the committers.  For a committer to understand
> your code, they must become at least passingly familiar with both the
> problem, and the fix.  So, it takes almost as long to 'commit' a fix as
> it does to create it.  So, what may be 'bloody obvious' to you isn't so
> obvious to a committer.

I am willing to explain issues in less than strictly technically
accurate terms to educate people to the level of passing familiarity;
but a number of my fixes require a full understanding of both the
existing code and John Heidemann's Master's Thesis to realize that
the deltas are designed to move the code as integrated by CSRG into
line with John's intended design.  There are fixes which add *nothing*
to existing functionality, which seem like tangential and gratuitous
code rewrites, when they are, in fact, necessary preparatory work for
architectural next steps.

Some of the stuff people submit qualifies as PhD level work; I'd
say that most of the stuff John Dyson does falls into that category,
and there are others, which I won't enumerate for fear of omitting
someone from the list.  Unless these PhD's are already on the core
team, you are screwed: you can't expect to be able to incorporate
their work, ever.


> Remember, none of us are paid FS hackers in our day jobs, and *some*
> of the code that has been submitted in the past has been full of
> 'stylistic' and other misc. changes that are considered unacceptable.

The only "stylistic" changes I've made in recent history are
the single-entry/single-exit changes in vfs_syscalls.c to make it
clear when state is being inverted for a given operation.  This
was initially put in for the "exclusive" lookup (which moved some
8 duplicate code segments into vfs_lookup.c) and the path buffer
allocation/deallocation at the vfs_syscalls.c layer instead of relying
on each FS implementor to implement the same operations the same way
(an impossible task for, for instance, VFAT, which has two names
associated with each file and therefore must deal with short name
collision resolution).
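
A minimal user-space sketch of the single-entry/single-exit pattern
described above.  Every name in it (sketch_syscall_lookup,
sketch_lookup_fn, SKETCH_MAXPATH, sketch_live_buffers and the two toy
hooks) is hypothetical and is not taken from vfs_syscalls.c or
vfs_lookup.c.  The only point it tries to make is that the path buffer
is allocated and released in one place, behind one exit, so the per-FS
hook never has to manage it:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAXPATH 1024  /* hypothetical limit, not the kernel's */

/* Hypothetical per-FS hook: borrows the path buffer, never frees it. */
typedef int (*sketch_lookup_fn)(const char *path);

/* Number of path buffers currently allocated; 0 means nothing leaked. */
int sketch_live_buffers = 0;

/*
 * Stand-in for a syscall-layer entry point.  There is exactly one
 * return statement, so the buffer state on the way out is the same
 * on every path: no buffer is held.
 */
int sketch_syscall_lookup(const char *upath, sketch_lookup_fn fs_lookup)
{
    char *pathbuf = NULL;
    int error = 0;

    assert(fs_lookup != NULL);

    if (upath == NULL) {
        error = EINVAL;
        goto out;
    }
    if (strlen(upath) >= SKETCH_MAXPATH) {
        error = ENAMETOOLONG;
        goto out;
    }
    pathbuf = malloc(SKETCH_MAXPATH);
    if (pathbuf == NULL) {
        error = ENOMEM;
        goto out;
    }
    sketch_live_buffers++;
    strcpy(pathbuf, upath);      /* safe: length was checked above */

    error = fs_lookup(pathbuf);  /* the FS only sees a ready buffer */

out:
    /* Single exit: undo whatever was set up, whichever path got here. */
    if (pathbuf != NULL) {
        free(pathbuf);
        sketch_live_buffers--;
    }
    return error;
}

/* Two toy "file systems": one succeeds, one reports a missing name. */
int sketch_fs_ok(const char *path)      { (void)path; return 0; }
int sketch_fs_missing(const char *path) { (void)path; return ENOENT; }
```

Calling sketch_syscall_lookup() with either toy hook, or with a NULL
path, leaves sketch_live_buffers at 0 afterwards.  With early returns
scattered through the function, each of those paths would need its own
copy of the cleanup.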

That I did this consistently, instead of only hitting the name lookup
functions, simply saves the time of doing it later for other reasons,
for the remaining functions.  You may label the change gratuitous
in your belief that fine grained SMP locking is not a win; however,
since the change does not otherwise impact you, you should be willing
to accept it at face value, in that you always have the choice of not
accepting the reentrancy locking later as a compile time option, yet
would not actively interfere *now* with the research of other computer
scientists.  There is no reason to be a proverbial "dog in the manger".


> Until you've proven yourself to the responsible committer, you must
> *help* that person understand your code, which means putting up with
> his/her idiosyncrasies necessary to get code integrated.

I can buy this... to an extent.  I refuse, however, to educate them
to the point where they could be me.  It is far too much effort, and,
I suspect, the primary reason John Dyson has yet to architecturally
document the VM system, except in broad strokes.  To do so would
take a large effort, better turned toward coding, and a level of
detail which would significantly constrain the future directions he
would be allowed (by his peers) to take:  anything outside the plan
would tend to be shot down.


> And, once you've proven yourself over a period of time that you can be
> trusted to commit 'functional' code that doesn't contain 'stylistic'
> changes that *may* have function down the road, you become a committer
> on your own, able to break the tree at will like the rest of us.

What is fundamentally wrong with taking the long view?  What is wrong
with changes enabling future functionality?  I simply do not understand
this.  Escaping a 3 month (quarterly report) or 6 month (middle management
review cycle) or 12 month (employee review cycle) horizon is what
participating in a volunteer effort is all about.

> But this responsibility has to be earned with trust, not with words and
> code.

Pericles, how doest thou thine love approve?

It would help if things were operated on an initial basis of trust rather
than one of distrust, wouldn't it?  Then we would not have to engage a
"prove yourself to me" protocol before being able to trust anyone.

Have you heard of the prisoner's dilemma?  The computationally perfect
method of playing is called "modified tit-for-tat with forgiveness".  I
highly recommend the book "The Evolution of Cooperation", and Dawkins'
book "The Selfish Gene".
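
A small sketch of tit-for-tat with forgiveness in an iterated
prisoner's dilemma, under stated assumptions.  The payoffs are the
ones from Axelrod's tournaments (5, 3, 1, 0).  The forgiveness rule
used here, "let every Nth defection pass", is a deterministic
simplification chosen for illustration; forgiveness is more commonly
described as probabilistic, and this is not claimed to be the exact
"modified" variant mentioned above.  All identifiers are hypothetical:

```c
#include <assert.h>

enum move { COOPERATE = 0, DEFECT = 1 };

/* Payoff to the player making `mine` against `theirs`. */
int payoff(enum move mine, enum move theirs)
{
    /* rows: my move, columns: their move */
    static const int table[2][2] = { { 3, 0 }, { 5, 1 } };
    return table[mine][theirs];
}

struct tft_state {
    enum move last_seen;  /* opponent's previous move */
    int grudges;          /* defections seen so far */
    int forgive_every;    /* let every Nth defection pass; 0 = never */
};

/* Copy the opponent's last move, except when forgiving a defection. */
enum move tft_next(struct tft_state *s)
{
    if (s->last_seen == COOPERATE)
        return COOPERATE;
    s->grudges++;
    if (s->forgive_every > 0 && s->grudges % s->forgive_every == 0)
        return COOPERATE;
    return DEFECT;
}

/*
 * Two identical players meet for `rounds` rounds.  In round `slip`
 * (counting from 0) player A defects by mistake; pass -1 for no
 * mistake.  Returns the combined score of both players.
 */
int play(int rounds, int slip, int forgive_every)
{
    struct tft_state a = { COOPERATE, 0, forgive_every };
    struct tft_state b = { COOPERATE, 0, forgive_every };
    int total = 0;

    assert(rounds >= 0);
    for (int r = 0; r < rounds; r++) {
        enum move ma = tft_next(&a);
        enum move mb = tft_next(&b);
        if (r == slip)
            ma = DEFECT;  /* the one accidental defection */
        total += payoff(ma, mb) + payoff(mb, ma);
        a.last_seen = mb;
        b.last_seen = ma;
    }
    return total;
}
```

With no mistake, play(10, -1, 0) returns 60.  After a single slip in
round 2, strict tit-for-tat players echo the defection back and forth
for the rest of the game and play(10, 2, 0) returns 52; with
forgive_every set to 2 the echo dies out after a few rounds and
play(10, 2, 2) returns 57.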


> > Can we compromise?  Can you define how small is palatable so that I
> > can pre-ensure palatability before sending something, and if, when
> > I send something, it is not sufficiently self-explanatory (with a minimum
> > of accompanying text), tell me *that* so I can correct it?
> 
> Finally, let me say that I hope you appreciate the work Poul is doing.
> I wouldn't even *begin* to start trying to integrate the code you're
> submitting.  I looked at it *once*, and I was unable to separate out the
> functionality from the rest, and even if I was I wouldn't know if the
> changes were valid or not.  Both the project and you win if we can get
> your *fixes* in the tree.  But both you and the project lose if you
> continue to send in 'mega-commits' which are continually rejected, since
> the likelihood of integration becomes less each time for both personal and
> technical reasons.

I am unaware of exactly what Poul is doing; he has not communicated it
to me.  We have had offline discussions of me providing access to my
combined source tree, which I would like to do, despite the knowledge
that it means much of my work could be taken out of context.  If you
refer to a more taut integration effort of some kind, I have better
patches against -current, synchronized on a weekly basis, and directly
applicable (as Jeffrey Hsu has proven on several occasions by applying
them to local source trees on his machine).  My initial tools problems
have been resolved, though the problems of foundation-using code vs.
vendor branching have not.  There are technical issues with providing
tree access, which I am working on.

Any lessening of likelihood for personal reasons would be grossly
unprofessional ...as would me identifying so heavily with my code
that a comment on the code was taken as a personal comment on me; the
blade must cut both ways, and I realize this.

I refuse to set the bar lower than consummately professional interaction;
you do not have to *like* someone to be a beneficiary of their work.
Look at Richard Stallman; he is an egocentric marxist, but the world
would be a poorer place without him.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


