Date:      Mon, 22 Apr 2013 21:43:42 +0100
From:      "Joe Holden" <joe@rewt.org.uk>
To:        "'Warner Losh'" <imp@bsdimp.com>, "'Juli Mallett'" <jmallett@FreeBSD.org>
Cc:        "'freebsd-mips@FreeBSD.org'" <freebsd-mips@freebsd.org>
Subject:   RE: kern/177876: [mips] kernel stack overflow panic on mips64, EdgeRouter Lite
Message-ID:  <007c01ce3f9a$15044d40$3f0ce7c0$@rewt.org.uk>
In-Reply-To: <EBE52100-4C0F-4B61-B872-CA30B99E2940@bsdimp.com>
References:  <201304220300.r3M301iY093070@freefall.freebsd.org> <CAJ-Vmok7m9%2B3sky1swEP6ZTnZNLpkmwTC2tOqzGNaSFwY7WmFA@mail.gmail.com> <51753506.3070901@rewt.org.uk> <CAJ-VmomKi%2BpmZ6GAjds-=RXRET=aW65dsmxe3H4m%2BfdbxoecGw@mail.gmail.com> <CACVs6=8XdAgccufabeoXEXCFGGVZ_EWJ8c-KdRz4xr9SvBxrrw@mail.gmail.com> <EBE52100-4C0F-4B61-B872-CA30B99E2940@bsdimp.com>

On Apr 22, 2013, at 11:59 AM, Juli Mallett wrote:

> On Mon, Apr 22, 2013 at 10:35 AM, Adrian Chadd <adrian@freebsd.org> wrote:
>> Do an svn log in sys/mips/ or sys/vm/ and look at the changes.
>> 
>> I don't know how far you can go back before you don't have the 
>> edgerouter lite support, but maybe you can try going back to when 
>> Juli initially committed it, and then just work your way forward.
>> 
>> I think Juli did the initial work, so she knows when it came in.
>> 
>> juli - I don't suppose you could spin up FreeBSD-HEAD on the 
>> edgerouter lite and take a look? It's highly likely someone messed up 
>> since you did your port. :(
> 
> I can't quite imagine why EdgeRouter Lite (or Octeon more generally) 
> could be a special case here; I'd be more inclined to think it was 
> generally 64-bit MIPS that would be broken.  (A too-conservative 
> definition or something.)  Except I was pretty sure I'd run -CURRENT 
> more recently than those changes.
> 
> The only change that is suspect in mips/ since I made my changes is 
> Warner's change to include/regnum.h, which looks like there's the slim 
> possibility that it could screw up register saving in N64 builds.
> That would mean that it wasn't tested with a 64-bit build, though, 
> which I'm sure Warner wouldn't be so sloppy as to do.
> 
> Joe, can you try reverting 249523 and seeing if that fixes things for 
> you?  It seems like this breaks the order of registers saved to the 
> PCB, which would break syscalls with more than 4 arguments, like mmap.
> Even just looking at how the macros expand in the N64 case makes it 
> pretty clear that this change was made clumsily, e.g. from
> exception.S:
> 
> SAVE_REG($12, 8, $29)
> SAVE_REG($13, 9, $29)
> SAVE_REG($14, 10, $29)
> SAVE_REG($15, 11, $29)
> SAVE_REG($8, 12, $29)
> SAVE_REG($9, 13, $29)
> SAVE_REG($10, 14, $29)
> SAVE_REG($11, 15, $29)
> 
> For this to not break syscalls, struct trapframe would need to be 
> updated,

Looking at the trapframe, you are right. <doh>.  I did test boot a kernel
with the change, but after-the-fact software forensics suggest I built the
new kernel and tested the old one. I found the new one installed as
kenrel.oct rather than as kernel.oct, which is the one I test booted...
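
The mismatch described above can be sketched in a small user-space model.
This is not the FreeBSD code: NREGS, FRAME_SLOTS, SLOT_A4, save_regs() and
fifth_arg() are hypothetical names made up for illustration, and the sketch
assumes that on N64 the syscall path expects a4 (hardware register $8) in
frame slot 8. The quoted_order table copies the (register, slot) pairs from
the SAVE_REG expansion quoted in the thread.

```c
#include <stdint.h>

/* Hypothetical, simplified model; not the FreeBSD sources. */
enum { NREGS = 32, FRAME_SLOTS = 16 };

/* ASSUMPTION: on N64 the syscall code reads a4 ($8) from frame slot 8. */
enum { SLOT_A4 = 8 };

/* One SAVE_REG(reg, slot, base) step: store hardware register into slot. */
struct save_op {
	int hw_reg;
	int slot;
};

/* The (register, slot) pairs from the expansion quoted in the thread. */
static const struct save_op quoted_order[8] = {
	{ 12, 8 }, { 13, 9 }, { 14, 10 }, { 15, 11 },
	{ 8, 12 }, { 9, 13 }, { 10, 14 }, { 11, 15 },
};

/* A save order that matches the assumed frame layout ($8 goes to slot 8). */
static const struct save_op matching_order[8] = {
	{ 8, 8 }, { 9, 9 }, { 10, 10 }, { 11, 11 },
	{ 12, 12 }, { 13, 13 }, { 14, 14 }, { 15, 15 },
};

/* Apply a list of save steps to a frame. */
static void
save_regs(const uint64_t regs[NREGS], uint64_t frame[FRAME_SLOTS],
    const struct save_op *ops, int n)
{
	for (int i = 0; i < n; i++)
		frame[ops[i].slot] = regs[ops[i].hw_reg];
}

/* Fetch the fifth syscall argument the way the model assumes it is read. */
static uint64_t
fifth_arg(const uint64_t frame[FRAME_SLOTS])
{
	return frame[SLOT_A4];
}
```

In this model, saving with quoted_order leaves the contents of $12 in the
slot that fifth_arg() reads, so any call that passes more than four
arguments (mmap, for example) would pick up a temporary register instead of
its fifth argument. Saving with matching_order returns the contents of $8.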

> or the syscall handling code.  Joe, can you confirm that backing out 
> 249523 fixes things for you?  If it does, Adrian, would you be willing 
> to handle a backout?  I can't imagine finding the time for a couple of 
> days, and if this is really so badly, unnecessarily broken, that 
> should be fixed immediately.  I hope I'm wrong.  Nobody should be 
> making incomplete changes on the basis of a half-baked reading of 
> purportedly-conflicting documentation, and without testing.
> Yikes!

<snip>

I am just building a pre-commit kernel, but if you guys already know what the
cause is, I'll wait for a fix :)

Will this also fix the trapframe issue when the box is under heavy CPU load,
or is that a different issue?



