Date:      Sun, 7 Jan 2007 18:05:39 +0000 (GMT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Ceri Davies <ceri@submonkey.net>
Cc:        stable@FreeBSD.org
Subject:   Re: (audit?) Panic in 6.2-PRERELEASE
Message-ID:  <20070107180257.I41371@fledge.watson.org>
In-Reply-To: <20070107170014.GL7088@submonkey.net>
References:  <20070105111954.GA51511@submonkey.net> <20070105120539.H46119@fledge.watson.org> <20070105131528.GB7088@submonkey.net> <20070105133028.F98541@fledge.watson.org> <20070105150857.GC7088@submonkey.net> <20070106120040.N46119@fledge.watson.org> <20070106132540.GG7088@submonkey.net> <20070107114243.K41371@fledge.watson.org> <20070107170014.GL7088@submonkey.net>

On Sun, 7 Jan 2007, Ceri Davies wrote:

>> Could you try printing *td->td_ar?  Maybe this will give us a clue as to 
>> how far it got.  In particular, this may be able to more reliably give us 
>> the file descriptor number, which is audited early in the system call. 
>> You might find that 'td' is corrupted in many layers of the stack, keep 
>> going up until you find one where it's good.  It may well be that 
>> td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is 
>> correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or 
>> ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some 
>> more about the file descriptor even though it appears to have vanished.
>
> *td->td_ar is null (0x0) in both cases...
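
For reference, the check asked for in the quoted text above (is a given ARG_* bit set in k_ar.ar_valid_arg?) is an ordinary bitmask test. A minimal userland sketch follows. The EX_ARG_* values and both function names are hypothetical, invented for illustration; the real flag definitions live in the kernel's audit headers and will differ, so substitute them when decoding a value read out of a dump.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * HYPOTHETICAL bit values, for illustration only.  They are NOT the
 * kernel's definitions; take the real ones from the audit headers.
 */
#define EX_ARG_FD       (1ULL << 0)
#define EX_ARG_VNODE1   (1ULL << 1)
#define EX_ARG_VNODE2   (1ULL << 2)
#define EX_ARG_SOCKINFO (1ULL << 3)

/* Return nonzero if every bit in 'flag' is set in the mask 'valid'. */
static int
ex_arg_is_valid(uint64_t valid, uint64_t flag)
{
	return ((valid & flag) == flag);
}

/*
 * Print, for one stack frame's copy of the mask, which of the three
 * flags of interest are set.  Returns how many of them were set.
 */
static int
ex_report_valid_args(const char *label, uint64_t valid)
{
	static const struct {
		uint64_t	flag;
		const char	*name;
	} wanted[] = {
		{ EX_ARG_SOCKINFO, "ARG_SOCKINFO" },
		{ EX_ARG_VNODE1,   "ARG_VNODE1" },
		{ EX_ARG_VNODE2,   "ARG_VNODE2" },
	};
	size_t i;
	int nset = 0;

	assert(label != NULL);
	for (i = 0; i < sizeof(wanted) / sizeof(wanted[0]); i++) {
		int set = ex_arg_is_valid(valid, wanted[i].flag);

		printf("%s: %s %s\n", label, wanted[i].name,
		    set ? "set" : "clear");
		nset += set;
	}
	return (nset);
}
```

Calling ex_report_valid_args() with a frame label and the mask printed by the debugger would list the three flags as set or clear. None of this applies while td_ar itself is null, since there is then no record to read the mask from.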

I'm beginning to wonder whether this is audit-related at all.  Something is 
clearly not right, and the audit code should not have been entered at all 
there.  Perhaps we're being misled by the stack trace corruption into 
thinking audit is involved.

>> I'm quite worried by the fact that the file descriptor seems not to be 
>> present any more -- this suggests a file descriptor related race of the 
>> sort that is both quite difficult to figure out and also quite a risk. It's 
>> strange that it would only trigger with audit, however--perhaps audit 
>> stretches out the race.  Is this an SMP box?
>
> It's certainly looking quite nasty.  This system is UP hardware without 
> options SMP.
>
> ...
>
> If it's at all useful, I can provide access to this system and the dumps.

Yeah, I think at this point that would probably be the most helpful thing.

Could you confirm that the kernel.debug you're using definitely matches the 
version of the kernel in the core dump?

Robert N M Watson
Computer Laboratory
University of Cambridge
