Date:      Wed, 24 Oct 2012 23:38:12 +1300
From:      Andrew Turner <andrew@fubar.geek.nz>
To:        Tim Kientzle <kientzle@freebsd.org>
Cc:        arm@freebsd.org
Subject:   Re: Trashed registers returning from kernel?
Message-ID:  <20121024233812.0eefd07f@fubar.geek.nz>
In-Reply-To: <2B1CF099-50F0-46BE-8B02-61309DF93D5F@freebsd.org>
References:  <2B1CF099-50F0-46BE-8B02-61309DF93D5F@freebsd.org>

On Sun, 21 Oct 2012 18:40:08 -0700
Tim Kientzle <kientzle@freebsd.org> wrote:

> On the BeagleBone, I'm seeing a similar crash in several different
> user land programs.  I suspect it's a kernel bug.
> 
> Symptom: program is killed with SIGSEGV.  Most of the registers
> contain values above 0xc0000000 (pointing into kernel space).
> 
> Theory:
>  * Registers are not always getting correctly restored on a
> kernel->user transition.
>  * SEGV is a consequence.
> 
> I can reproduce it semi-consistently by running "emacs existing-file"
> just after a reboot.  (But I'm pretty sure this is the same symptoms
> I've seen with several other programs, so I don't think it's a bug in
> emacs.)
> 
> Has anyone else seen this on an armv6 system?
> 
> Does anyone have suggestions for how to go about debugging this?
> 
> Suggestions appreciated.

Can you tell whether the crash happens after one particular syscall, or
after many different syscalls? How consistent are the register values
and the instruction that causes the SEGV? Have you identified any other
programs that have the same issue?
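
To compare crashes, something like the following might help. It is only
a rough sketch: the function names are made up for illustration, the
register values have to be copied by hand from each crash, and the
0xc0000000 boundary is simply the one quoted in the report above.

```python
# Boundary taken from the report above; values at or above it are
# treated as kernel-space pointers. Adjust if your layout differs.
KERNBASE = 0xC0000000


def kernel_looking(regs):
    """Return the names of registers whose value is >= KERNBASE.

    `regs` maps a register name (e.g. "r4") to its integer value.
    """
    return sorted(name for name, value in regs.items() if value >= KERNBASE)


def common_suspects(crashes):
    """Return registers that look like kernel pointers in every crash.

    `crashes` is a list of register dicts, one per crash.
    """
    sets = [set(kernel_looking(regs)) for regs in crashes]
    if not sets:
        return []
    return sorted(set.intersection(*sets))
```

If the same registers show up in every crash, that points at one code
path; if the set changes each time, it looks more like random
corruption.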

The code that saves the registers on a system call is in
sys/arm/arm/exception.S and sys/arm/include/asmacros.h.

In exception.S there is the function swi_entry. It:
 - Saves the registers to the stack
 - Stores sp in r0 to be passed in as the argument to swi_handler()
 - Stores sp in r6 to allow us to restore it later
 - Aligns the stack
 - Calls swi_handler() to perform the system call
 - Restores the stack pointer from r6
 - Performs any asynchronous software trap (calls ast() if required)
 - Restores the registers from the stack
 - Returns to userland
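
The steps above can be modelled roughly as follows. This is a toy model
in Python, not the real assembly: the dict stands in for the trap
frame, and the handler arguments are placeholders for swi_handler() and
ast(). It only illustrates that whatever is in the saved frame when we
return is what userland ends up with.

```python
def swi_entry(user_regs, swi_handler, ast=None):
    """Toy model of the swi_entry sequence.

    Returns the register values userland would see afterwards.
    """
    stack = []
    stack.append(dict(user_regs))  # save the registers to the stack
    frame = stack[-1]              # sp copied to r0 (argument) and r6
    swi_handler(frame)             # perform the system call
    # The real code restores sp from r6 here; the model just keeps
    # using the same `frame` reference.
    if ast is not None:
        ast(frame)                 # asynchronous software trap, if required
    return stack.pop()             # restore the registers, back to userland
```

Any code that writes to the frame between the save and the restore,
on purpose or by accident, changes the registers the program resumes
with.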

Assuming a syscall is causing this, I can think of three possible causes:
1. Someone is clobbering the stack.
2. Someone is clobbering the trap frame.
3. There is a cache issue causing old data to be written to the stack.

Checking 1 should be easy. In exception.S add the instruction "sub sp,
sp, #32" before the bic instruction that aligns the stack. This will add
padding to the stack. You may need to increase the #32 if it is not
large enough. This won't help if the issue is in ast(), since the stack
pointer has been restored from r6 by then.
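
To show why the padding helps, here is a small model of the stack in
Python. It is a sketch under assumptions: the frame and buffer sizes
are invented, and the "buggy handler" is a stand-in for whatever might
be writing past the end of its own stack space. With 4-byte words, #32
corresponds to 8 words of padding.

```python
FRAME_WORDS = 4    # toy trap frame size, not the real struct trapframe
BUF_WORDS = 4      # toy local buffer used by the handler
JUNK = 0xC0DEC0DE  # stands in for a stray kernel value


def trapframe_survives(overrun_words, pad_words=0):
    """Return True if the saved registers are intact after the overrun.

    The stack is laid out low address first: handler buffer, padding,
    then the trap frame. The handler writes `overrun_words` words past
    the end of its buffer, toward higher addresses.
    """
    saved = [0x1000 + i for i in range(FRAME_WORDS)]
    stack = [0] * BUF_WORDS + [0] * pad_words + list(saved)
    # Clamp so a very large overrun stays inside the modelled stack.
    end = min(BUF_WORDS + overrun_words, len(stack))
    for i in range(end):
        stack[i] = JUNK
    return stack[BUF_WORDS + pad_words:] == saved
```

If the crashes stop once the padding is in place, the overrun was
smaller than the padding and cause 1 is the likely culprit. If they
continue, either the padding is too small or the problem is elsewhere.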

Andrew



