Date: Mon, 24 Jun 1996 11:57:23 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: nate@sri.MT.net (Nate Williams)
Cc: terry@lambert.org, hackers@freefall.freebsd.org
Subject: Re: cvs commit: src/etc/mtree BSD.usr.dist
Message-ID: <199606241857.LAA28712@phaeton.artisoft.com>
In-Reply-To: <199606241517.JAA20540@rocky.sri.MT.net> from "Nate Williams" at Jun 24, 96 09:17:01 am

> In case you hadn't noticed, the 'fsck' patch has been in current for
> almost a month now. The reason it wasn't put in was because I don't
> think anyone took enough time to understand the problem well enough, so
> therefore didn't want to 'break' the system if the patch didn't work. I
> gave up on trying to understand the problem (not enough time) and simply
> went with it hoping my testing was adequate. It *appears* to work,
> though if you asked me if I was sure it's the correct fix I couldn't
> answer with assurance.

Probably need an FS expert -- do we have one of those on the list? --
hmmmmm... we do: me.

I stated that I had regression tested the fix. I did, including
power-off destructive testing during latent metadata update operations
(see the debug sysctl in the source file /sys/ufs/ffs/ffs_alloc.c).

> > patches accepted. I am *not* the only person in this boat. It
> > takes nearly Herculean effort to get some types of patches accepted;
> > several groups have had to go so far as to offer their own boot
> > disks and patch kits (most notably Hosokawa-san's PCCARD code).
> > Even with Nate's patronage, there is still the need for building
> > separate-from-snap boot disks to address a number of issues, because
> > the patches are not *truly* integrated.
>
> And won't be. The Nomad code is *full* of hacks and kludges (admitted
> by Hosokawa). My goal is to remove as many of those from the base
> system code and *then* add them back 'as necessary', rather than
> importing them wholesale and trying to work around them. My recent
> 'fix' to allow you to use 'generic' IRQ's in the code made the user-land
> code smaller, while the Nomad code adds 4-5K of completely un-necessary
> code which also makes the code more difficult to understand.
> It's taken me weeks to understand small portions of the code (not all
> due to them), and rather than continuing on in this manner I feel that
> it's better to make what we have more understandable and maintainable
> than chock-full of features that may/may not be correctly implemented.

Question: did they submit the code as if it were integratable as is,
or like my SYSINIT code, was it a prototype?

From what I can see of their announcements, it was prototype code.
Integration in this case would have been accepting the necessary
architectural adjustments, but not the actual code. It's obvious that
the framework in which the prototype lives would need to be changed
little to support a production version of the code.

> The 'boot-disk' code they use would be much *simpler* if they added 2
> functions to the code and relegated all of the conditional code to it.
> (You should be able to use 1 additional function, but I'll grant 2).
> Instead of doing that there are modifications to *every* single file
> which makes the code bloat excessively, plus trying to understand it
> becomes more problematic. I've almost considered re-writing the Nomad
> code to remove these particular kludges, but I don't agree with how it's
> being done in the first place so it would only encourage using it.

The same class of architectural arguments was the basis of some of
the patches I supplied; ironic, isn't it? 8-).

> What I've done recently is spend *A LOT* of time trying to sync. up our
> source trees. I've been sending patches back to Hosokawa with a set of
> diffs that can be applied to our -current tree to bring the Nomad code
> into the *exact* same functionality as their code, but with white-space
> and formatting changes removed. This way they can review their changes
> again to make sure they are valid (I question some of them), and allow
> us to at least have more commonality than before.
> Right now the code has diverged so much that integrating is near
> impossible, and if they diverge much more then I'm going to give up
> and roll everything myself from scratch.

Look, this was not an attempt to denigrate the work you were putting
in, nor the cooperation. It *was* intended to point out the cycle
time involved -- nothing else.

> This sounds harsh, but it's starting to become *more* work
> to integrate their code than it would be for me to write it myself.

It should be obvious that this would be a natural result of the
integration method used... this is perhaps more obvious to me,
because this is exactly what I suggested I was willing to do with
my own code to make it acceptable.

I think it is unreasonable to believe that you must understand
everything in order to use work from third party providers; this
is not to say that you should accept prototypes as if they were
production code; you should not.

Probably, you should approach them about high level issues, and
leave the prototype to production conversion to them. This is more
difficult for you because of the APM code being your baby, and
integral to the use of their work. Look on it as a collaborative
effort, not a filtering effort, and expect them to collaborate.

Use of "champions" to operate as filters is ineffective; if the
expertise (or interest) does not exist on the core team, then any
code which relies on this method for integration will fail to be
integrated.

> And, this is peanuts compared to your FS patches, your kernel locking
> patches, and the like. Also, the Nomad kernel code is already mostly
> broken up into functional chunks because of my previous integration
> efforts, so it's easy for me to separate out function from style.

I wish that it were possible to change a VFS interface without
impacting everything that uses VFS interfaces, but it is not.
I wish it were possible to ensure the state in and out of functions
without going to a style technique like single-entry/single-exit,
but it isn't -- without seriously damaging readability.

I wish that NFS locking were simple and obvious; but it isn't --
which explains the lack of a single public implementation of it,
anywhere.

I wish I could build castles in the air and not have to drag my
foundations with me -- but I can't. Castles require foundations,
or they collapse (your own experience with the APM code should have
told you that -- one rough cut foundation stone damages the entire
structure).

> However, this has taken me close to 6 months of my time working over 20
> hours/week. For you to expect Poul (or any other developer) to commit
> to this much time is too much. I'm doing it because I got paid for part
> of it at work, and the last 2.5 months I've done it because I want to
> finish what I started.

I do *not* expect them to commit that much time. I do not expect
them to have to fully understand the code. Clearly, there are some
people who fully understand the ramifications of stacking FS
architectures sufficiently to make them operate; but to make them
sing? You'll notice that, other than BSDI, which hasn't done much
with it, there are *no* commercial VFS stacking interfaces. The
technology has existed and been talked about for *years*... but
there are few people who fully comprehend all of the subtleties
and issues. I know of maybe 8 people, period, including myself,
and I'd still class Rosenthal and Heidemann at the top of the list
as not having some of the blind spots I do.

Again, not to denigrate your efforts, but the idea that all code
must be vetted by a core member expert in the code is ridiculous
(and, unless John H. has joined the core team without telling me,
probably an impossibly high standard to ever meet for some FS
issues).
> > I'm sure I could also halve or quarter my production, providing
> > rationale for things which are, to me at least, bloody obvious. I'm
> > already willing to spend a large amount of time parceling up my
> > work, but I have only so much time I'm willing to spend; please do
> > not bankrupt me.
>
> Please don't bankrupt the committers. For a committer to understand
> your code, they must become at least passingly familiar with both the
> problem, and the fix. So, it takes almost as long to 'commit' a fix as
> it does to create it. So, what may be 'bloody obvious' to you isn't so
> obvious to a committer.

I am willing to explain issues in less than strictly technically
accurate terms to educate people to the level of passing familiarity;
but a number of my fixes require a full understanding of both the
existing code and John Heidemann's Master's Thesis to realize that
the deltas are designed to move the code as integrated by CSRG into
line with John's intended design.

There are fixes which add *nothing* to existing functionality, which
seem like tangential and gratuitous code rewrites, when they are,
in fact, necessary preparatory work for architectural next steps.

Some of the stuff people submit qualifies as PhD level work; I'd
say that most of the stuff John Dyson does falls into that category,
and there are others, which I won't enumerate for fear of omitting
someone from the list. Unless these PhDs are already on the core
team, you are screwed: you can't expect to be able to incorporate
their work, ever.

> Remember, none of us are paid FS hackers in our day jobs, and *some*
> of the code that has been submitted in the past has been full of
> 'stylistic' and other misc. changes that are considered unacceptable.

The only "stylistic" changes I've engaged in in recent history are
the single-entry/single-exit changes in vfs_syscalls.c to make it
clear when state is being inverted for a given operation.
This was initially put in for the "exclusive" lookup (which moved
some 8 duplicate code segments into vfs_lookup.c) and the path
buffer allocation/deallocation at the vfs_syscalls.c layer instead
of relying on each FS implementor to implement the same operations
the same way (an impossible task for, for instance, VFAT, which has
two names associated with each file and therefore must deal with
short name collision resolution).

That I did this consistently, instead of only hitting the name
lookup functions, simply saves the time of doing it later for other
reasons, for the remaining functions. You may label the change
gratuitous in your belief that fine grained SMP locking is not a
win; however, since the change does not otherwise impact you, you
should be willing to accept it at face value, in that you always
have the choice of not accepting the reentrancy locking later as
a compile time option, yet would not actively interfere *now* with
the research of other computer scientists. There is no reason to
be a proverbial "dog in the manger".

> Until you've proven yourself to the responsible committer, you must
> *help* that person understand your code, which means putting up with
> his/her idiosyncrasies necessary to get code integrated.

I can buy this... to an extent. I refuse, however, to educate them
to the point where they could be me. It is far too much effort,
and, I suspect, the primary reason John Dyson has yet to
architecturally document the VM system, except in broad strokes.
To do so would take a large effort, better turned toward coding,
and a level of detail which would significantly constrain the future
directions he would be allowed (by his peers) to take: anything
outside the plan would tend to be shot down.
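
The single-entry/single-exit style described above for vfs_syscalls.c
can be sketched in user-space C. This is only an illustration under
stated assumptions, not the kernel code: every name here (the sketch_*
functions, struct sketch_lock, SKETCH_PATH_MAX) is hypothetical, and
malloc/free stand in for whatever path buffer allocator the syscall
layer would actually use. The point it tries to show is that the caller
owns the path buffer and the lock, and that one exit point makes it
easy to check that state on the way out matches state on the way in.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PATH_MAX 1024	/* hypothetical limit, not MAXPATHLEN */

/* Hypothetical stand-in for a lock; counts how many times it is held. */
struct sketch_lock {
	int held;
};

static void sketch_lock_acquire(struct sketch_lock *l) { l->held++; }
static void sketch_lock_release(struct sketch_lock *l) { l->held--; }

/*
 * Hypothetical per-filesystem lookup.  It only reads the buffer;
 * allocation and release stay with the caller.
 */
static int
sketch_fs_lookup(const char *path)
{
	return (path[0] == '/') ? 0 : ENOENT;
}

/*
 * Single entry, single exit: every path leaves through "out", so the
 * buffer is freed and the lock is dropped in exactly one place.
 * The lock pointer is assumed to be non-NULL.
 */
int
sketch_lookup(struct sketch_lock *l, const char *user_path)
{
	int held_on_entry = l->held;	/* state on the way in */
	char *buf = NULL;
	int locked = 0;
	int error = 0;
	size_t len;

	if (user_path == NULL) {
		error = EINVAL;
		goto out;
	}
	len = strlen(user_path);
	if (len >= SKETCH_PATH_MAX) {
		error = ENAMETOOLONG;
		goto out;
	}

	/* Path buffer is allocated here, not by each filesystem. */
	buf = malloc(SKETCH_PATH_MAX);
	if (buf == NULL) {
		error = ENOMEM;
		goto out;
	}
	memcpy(buf, user_path, len + 1);

	sketch_lock_acquire(l);
	locked = 1;

	error = sketch_fs_lookup(buf);

out:
	if (locked)
		sketch_lock_release(l);
	free(buf);			/* free(NULL) is a no-op */
	assert(l->held == held_on_entry); /* state out matches state in */
	return (error);
}
```

A caller would pass a zero-initialized struct sketch_lock and a path
string; on every return, successful or not, the lock count is back
where it started and no buffer is left allocated.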
> And, once you've proven yourself over a period of time that you can be
> trusted to commit 'functional' code that doesn't contain 'stylistic'
> changes that *may* have function down the road, you become a committer
> on your own, able to break the tree at will like the rest of us.

What is fundamentally wrong with taking the long view? What is
wrong with changes enabling future functionality? I simply do not
understand this. Escaping a 3 month (quarterly report) or 6 month
(middle management review cycle) or 12 month (employee review cycle)
horizon is what participating in a volunteer effort is all about.

> But this responsibility has to be earned with trust, not with words and
> code.

Pericles, how doest thou thine love approve?

It would help if things were operated on an initial basis of trust
rather than one of distrust, wouldn't it? Then we would not have
to engage a "prove yourself to me" protocol before being able to
trust anyone.

Have you heard of the prisoner's dilemma? The computationally
perfect method of playing is called "modified tit-for-tat with
forgiveness". I highly recommend the book "The Evolution of
Cooperation", and Dawkins' book "The Selfish Gene".

> > Can we compromise? Can you define how small is palatable so that I
> > can pre-ensure palatability before sending something, and if, when
> > I send something, it is not sufficiently self-explanatory (with a minimum
> > of accompanying text), tell me *that* so I can correct it?
>
> Finally, let me say that I hope you appreciate the work Poul is doing.
> I wouldn't even *begin* to start trying to integrate the code you're
> submitting. I looked at it *once*, and I was unable to separate out the
> functionality from the rest, and even if I was I wouldn't know if the
> changes were valid or not. Both the project and you win if we can get
> your *fixes* in the tree.
> But both you and the project lose if you
> continue to send in 'mega-commits' which are continually rejected, since
> the likelihood of integration becomes less each time for both personal and
> technical reasons.

I am unaware of exactly what Poul is doing; he has not communicated
it to me. We have had offline discussions of me providing access
to my combined source tree, which I would like to do, despite the
knowledge that it means much of my work could be taken out of
context.

If you refer to a more taut integration effort of some kind, I have
better patches against -current, synchronized on a weekly basis,
and directly applicable (as Jeffrey Hsu has proven on several
occasions by applying them to local source trees on his machine).
My initial tools problems have been resolved, though the problems
of foundation-using code vs. vendor branching have not.

There are technical issues with providing tree access, which I am
working on.

Any lessening of likelihood for personal reasons would be grossly
unprofessional ...as would me identifying so heavily with my code
that a comment on the code was taken as a personal comment on me;
the blade must cut both ways, and I realize this.

I refuse to set the bar lower than consummately professional
interaction; you do not have to *like* someone to be a beneficiary
of their work. Look at Richard Stallman; he is an egocentric
marxist, but the world would be a poorer place without him.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.