FreeBSD Mail Archives

Date:      Thu, 16 Nov 2000 09:23:08 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        will@physics.purdue.edu
Cc:        tlambert@primenet.com (Terry Lambert), arch@FreeBSD.org
Subject:   Re: Turning on debugging in GENERIC
Message-ID:  <200011160923.CAA01986@usr02.primenet.com>
In-Reply-To: <20001115180257.B26516@puck.firepipe.net> from "Will Andrews" at Nov 15, 2000 06:02:57 PM

The real point of this thread should be that a 386 taking forever
to start up because it's not fast at generating pseudo-randomness
is not an acceptable state of affairs.

There are plenty of laptop and other so-called "green" processors
today which will downgrade CPU power to the level of an old 386,
merely to extend battery life or to be considered "eco-friendly",
so throwing out the old systems will not solve the problem.

The answer to this slow-start "dilemma" is _not_ to throw out
the slow processors which can't run the hulking, slow-running
new "improved" code, just for the sake of not making that code
run efficiently.

It's obvious to me that this has the same fix as if someone had
put a big "for" loop in the idle proc: tolerate it for a while,
if it does something useful, and if it doesn't get fixed after a
while, take it out and shoot it, like PHK did to Julian's slice
code, even if it _does_ something useful.

---

I will answer your points after the ^L below, even though they are
now wildly off-subject, as build engineering _is_ at least on topic
for the -arch list; those not interested can stop the remainder of
the message in their mail client now:


If you reply to this, please change the Subject: line.

> > Many people "try" -current on small scratch disks that they
> > install from snapshots, rather than polluting their local
> > trees with -current bits, particularly since the answer to
> > their bug reports is pretty much to ignore them and tell the
> > reporter to "resup" or ask "have you tried the snapshot?".
> 
> Um, Terry, are you even on bugs@ ?  The fact of life is, many folks who
> "try" -current that report bugs do not give enough details, so they in
> return get vague suggestions like these.

No, I'm not.  But I run -current on scratch disks from snapshots
because I can't afford the bandwidth to cvsup all the time, and
because when you use cvsup, the lack of an interlock means that
the result is often unbuildable, particularly when it comes to
-current.  If I can't afford one, then I can't afford the second
one that would be necessary, were I to have a failure.

Unfortunately, the cvsup date is not useful information for use
in a bug report, either.  I would have to change my strategy to
doing a cvsup, and then backing off by date in GMT, until I got
something that compiled (not a winning strategy on a 386, in any
case).  That would give me a baseline from which I (or others)
could then report bugs that would be repeatable, even if not
bleeding-edge -current.  It takes a prohibitive amount of disk
cycles to do something like this, and hosted cross-builds are
still not that easy, unless you want to dirty your main source
tree and the /sys link, or unless your scratch disks are really
massive suckers.
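
The back-off-by-date strategy described above can be sketched roughly
as follows.  This is only an illustration, not an existing tool: the
function name, the six-hour step, and the `builds` callback are all
assumptions.  In practice the callback would have to check out the
tree as of the given GMT date and attempt a build, which is exactly
the expensive part.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def latest_buildable(
    start: datetime,
    builds: Callable[[datetime], bool],
    step: timedelta = timedelta(hours=6),
    max_steps: int = 40,
) -> Optional[datetime]:
    """Walk backwards from `start` in fixed steps and return the first
    date for which `builds` reports success, or None if none is found.

    `start` should be timezone-aware; it is normalized to UTC (GMT) so
    the resulting date means the same thing to every reader of a bug
    report.  `builds` is a hypothetical, caller-supplied predicate.
    """
    candidate = start.astimezone(timezone.utc)
    for _ in range(max_steps):
        if builds(candidate):
            return candidate
        candidate -= step
    return None


# Demonstration with a stand-in predicate: pretend that every tree
# dated after noon GMT on 15 Nov 2000 fails to build.
broken_after = datetime(2000, 11, 15, 12, 0, tzinfo=timezone.utc)
start = datetime(2000, 11, 16, 9, 0, tzinfo=timezone.utc)
baseline = latest_buildable(start, lambda when: when <= broken_after)
print(baseline.strftime("%Y-%m-%d %H:%M GMT"))
```

Each failed probe costs a full checkout and build, so the step size
trades the precision of the baseline against machine time; on a slow
machine even a handful of probes is prohibitive.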

Using snapshots avoids the CPU cycles problem and the cvsup data
synchronization problem: a snapshot is not made available until
it can at least successfully compile.

So you could amend my statement, I suppose, to ``clued people
"try" snapshots to reduce the number of useless answers to their
bug reports''.

So in summary, I don't need to be on "bugs@" today, since I'm
already well aware of the dynamics involved, and nothing has
changed from the past about them that would change the dynamics;
and it's the dynamics which lead to the lack of details and the
vague responses.


> Besides, people are told to resup when the newer -current has
> fixed the problem, and using a snapshot is an easy way to
> determine points of infection.

Agreed; I think all reports should be made against snapshots, to
have a clear demarcation between development and testing, if
nothing else.  But even though I advocate using them, in my post,
which you quoted above, they are still susceptible to the problems
I noted.

Snapshots have a relatively short archival life expectancy, and
so they aren't really useful for developers for repeating a user
reported problem.

Even with an exact date, a developer is probably not going to be
able to rebuild a system against which they can run gdb with a
user supplied crash dump.


So where are we left?  With a bunch of developers who would like
help testing their code against a lot of different hardware, and
a much larger group of users who would like to help them out, but
have huge impediments in the communications channel between them
and the developers.  How can we resolve this impasse?

There are a few simple procedural fixes, actually.  It would help if:

1)	The snapshot was rebuildable from sources, such that a
	kernel debugging session would work.  This could be done
	by build-tagging the repository sources (the tags could
	be removed as the snapshots were removed), or by using
	explicit dates to check out the snapshot trees for the
	build.

2)	Snapshots could be trusted to hang around for long enough
	for a developer and a bug reporter to be able to rendezvous
	on one, and fix a problem.

3)	Kernels for snapshots were built with full debugging
	symbols (-g), and only stripped for the snapshot, and the
	unstripped version kept around with the snapshot for use
	by a developer wanting to debug a crash (this would mostly
	eliminate the need for #1 for debugging -- but not for bug
	fixing, since you would want to rebuild with more error
	diagnostics and retry the failure, etc., until the fault
	was isolated).

Even Whistle, which was about as ad-hoc as anyone about using the local
source repository to communicate between developers in adjacent cubicles
(a practice which can result in frequently unbuildable source trees),
knew enough, institutionally, about build engineering to at least make
all successful builds (not just releases) rebuildable.
A separate build engineer role, and a willingness to tag builds in
the repository after reverting changes which prevented builds, was
helpful in making this a nightly (or more frequent) occurrence, but
FreeBSD doesn't have that strong an "it works" requirement, nor as
formal testing vs. requirements and a regression of closed but not
verified resolved bugs, that it could not afford some delay between
snapshot instances.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200011160923.CAA01986>