Date:      Thu, 15 Oct 1998 23:21:51 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        info@highwind.com (HighWind Software Information)
Cc:        lists@tar.com, current@FreeBSD.ORG
Subject:   Re: Recent 3.0's are Depressing
Message-ID:  <199810152321.QAA25846@usr04.primenet.com>
In-Reply-To: <199810151614.MAA25672@highwind.com> from "HighWind Software Information" at Oct 15, 98 12:14:53 pm

> This makes me worry:
> 
> Both of us are on the "latest" libc_r and we see different results.
> Statically linking an old libc_r into the application didn't fix the problem.
> 
> This makes me think it isn't "libc_r".
> 
> Any of the kernel folks or more knowledgeable folks get a chance to try that
> program on the latest/greatest kernel + latest/greatest libc_r?

This thread is going retrograde too fast.

We *knew* several days ago, when you told us it's a statically linked
binary that's having problems, that it wasn't libc_r changes that
were biting you because there *were no libc_r changes* (software
does not mutate, and a statically linked libc_r is invariant).

Clearly, it's a problem caused by a kernel change.

It's going to take grunt work to fix this problem; you will either
have to attack it from user space by instrumenting to identify
the broken call, or you will have to attack it from kernel space
by finding "the day the universe changed": checking out the kernel
tree as of various dates and binary searching the date space.  If it
happened in the last month, the kernel attack will take you a
maximum of 5 reboots to find the exact day, and then you can tell
us what code did the deed by:

	cd /sys
	cvs diff -D date1 -D date2 > here_it_is

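To make the "maximum of 5 reboots" arithmetic concrete, here is a
minimal sketch of the date bisection, not a tool anyone on this list
ships.  The names first_bad_day and is_bad are hypothetical; is_bad
stands in for the manual step of checking out the tree as of a date,
building, rebooting, and running the test program.  It assumes one
known-good date, one known-bad date, and a single breaking change
between them.

```python
from datetime import date, timedelta


def first_bad_day(good, bad, is_bad):
    """Bisect the date space between a known-good and a known-bad kernel.

    good   -- a date whose kernel is known to work
    bad    -- a later date whose kernel is known to fail
    is_bad -- hypothetical callback: True if the kernel built from the
              tree as of the given date shows the problem

    Returns (earliest failing date, number of kernels tested).
    """
    tests = 0
    while (bad - good).days > 1:
        # Pick the midpoint of the remaining window.
        mid = good + timedelta(days=(bad - good).days // 2)
        tests += 1
        if is_bad(mid):
            bad = mid    # breakage is at or before mid
        else:
            good = mid   # breakage is after mid
    return bad, tests
```

A 30-day window halves to 15, 8, 4, 2, 1 in the worst case, which is
where the five comes from.  The two dates bracketing the result are
the ones to hand to cvs diff.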

Here are some guesses as to the probable identity:

Did you revert the recent vfs_bio.c change (October 15th) and see
if this fixed the problem for you?

How about if you revert the recent cleanup David Greenman did to
some of the VM code?

But they are only guesses.

Unfortunately, there is too much code between your example and the
kernel for us to be able to easily tell what code path in the
kernel your code is exercising just by seeing the code, and what
system call is returning something bogus when used in exactly the
way you are using it.

It's possible to track this down, but it's going to require someone
instrumenting a local copy of libc_r for your test program or doing
a number of kernel builds and reboots, and since your system can
switch the problem on and off by selecting which kernel will be booted,
you're elected.

Once you identify the problem, I'm positive that you can blackmail
the culprit into fixing it.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
