Date: Mon, 22 Apr 2013 14:27:42 -0600
From: Warner Losh <imp@bsdimp.com>
To: Juli Mallett <jmallett@FreeBSD.org>
Cc: Joe Holden <joe@rewt.org.uk>, "freebsd-mips@FreeBSD.org" <freebsd-mips@freebsd.org>
Subject: Re: kern/177876: [mips] kernel stack overflow panic on mips64, EdgeRouter Lite
Message-ID: <EBE52100-4C0F-4B61-B872-CA30B99E2940@bsdimp.com>
In-Reply-To: <CACVs6=8XdAgccufabeoXEXCFGGVZ_EWJ8c-KdRz4xr9SvBxrrw@mail.gmail.com>
References: <201304220300.r3M301iY093070@freefall.freebsd.org>
    <CAJ-Vmok7m9%2B3sky1swEP6ZTnZNLpkmwTC2tOqzGNaSFwY7WmFA@mail.gmail.com>
    <51753506.3070901@rewt.org.uk>
    <CAJ-VmomKi%2BpmZ6GAjds-=RXRET=aW65dsmxe3H4m%2BfdbxoecGw@mail.gmail.com>
    <CACVs6=8XdAgccufabeoXEXCFGGVZ_EWJ8c-KdRz4xr9SvBxrrw@mail.gmail.com>

On Apr 22, 2013, at 11:59 AM, Juli Mallett wrote:

> On Mon, Apr 22, 2013 at 10:35 AM, Adrian Chadd <adrian@freebsd.org> wrote:
>> Do an svn log in sys/mips/ or sys/vm/ and look at the changes.
>>
>> I don't know how far you can go back before you don't have the
>> edgerouter lite support, but maybe you can try going back to when Juli
>> initially committed it, and then just work your way forward.
>>
>> I think Juli did the initial work, so she knows when it came in.
>>
>> juli - I don't suppose you could spin up FreeBSD-HEAD on the
>> edgerouter lite and take a look? It's highly likely someone messed up
>> since you did your port. :(
>
> I can't quite imagine why EdgeRouter Lite (or Octeon more generally)
> could be a special case here; I'd be more inclined to think it was
> generally 64-bit MIPS that would be broken. (A too-conservative
> definition or something.) Except I was pretty sure I'd run -CURRENT
> more recently than those changes.
>
> The only change that is suspect in mips/ since I made my changes is
> Warner's change to include/regnum.h, which looks like there's the slim
> possibility that it could screw up register saving in N64 builds.
> That would mean that it wasn't tested with a 64-bit build, though,
> which I'm sure Warner wouldn't be so sloppy as to do.
>
> Joe, can you try reverting 249523 and seeing if that fixes things for
> you? It seems like this breaks the order of registers saved to the
> PCB, which would break syscalls with more than 4 arguments, like mmap.
> Even just looking at how the macros expand in the N64 case makes it
> pretty clear that this change was made clumsily, e.g. from
> exception.S:
>
>     SAVE_REG($12, 8, $29)
>     SAVE_REG($13, 9, $29)
>     SAVE_REG($14, 10, $29)
>     SAVE_REG($15, 11, $29)
>     SAVE_REG($8, 12, $29)
>     SAVE_REG($9, 13, $29)
>     SAVE_REG($10, 14, $29)
>     SAVE_REG($11, 15, $29)
>
> For this to not break syscalls, struct trapframe would need to be
> updated,

Looking at the trapframe, you are right. <doh>. I did test boot a
kernel with the change, but after-the-fact software forensics suggest I
built the new kernel and tested the old one. I found the new one
installed as kenrel.oct rather than kernel.oct which I test booted...

> or the syscall handling code. Joe, can you confirm that
> backing out 249523 fixes things for you? If it does, Adrian, would
> you be willing to handle a backout? I can't imagine finding the time
> for a couple of days, and if this is really so badly, unnecessarily
> broken, that should be fixed immediately. I hope I'm wrong. Nobody
> should be making incomplete changes on the basis of a half-baked
> reading of purportedly-conflicting documentation, and without testing.
> Yikes!

Yes, the changes are incomplete, but look easy to fix.

> If, as I really, really hope, that change isn't the problem, it's not
> clear to me that would be the culprit.

It sure looks like you are right... I have a full new tree building
just to make sure... :(

Then again, it would be useful to document where these dependencies lie
to help prevent others from tripping over this in the future :(

Warner
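
[Illustration, not part of the original message.] The failure mode discussed
in this thread can be modelled in a few lines of C. This is only a sketch
under stated assumptions: frame_model, save_reg, save_expected, save_swapped
and fifth_arg are hypothetical names, not FreeBSD definitions, and the slot
layout (slots 8..11 read back as a4..a7) is inferred from the remark that
struct trapframe "would need to be updated". The one outside fact relied on
is the N64 register naming, where $8..$11 are a4..a7 and $12..$15 are t0..t3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved-register area; NOT struct trapframe. */
enum { NSLOTS = 16 };
struct frame_model {
    uint64_t slot[NSLOTS];
};

/* Models SAVE_REG(reg, offs, base): store hardware register hwreg in slot idx. */
static void save_reg(struct frame_model *f, const uint64_t hw[32],
                     int hwreg, int idx)
{
    f->slot[idx] = hw[hwreg];
}

/* Layout the reader is assumed to expect: $8..$15 land in slots 8..15. */
static void save_expected(struct frame_model *f, const uint64_t hw[32])
{
    for (int r = 8; r <= 15; r++)
        save_reg(f, hw, r, r);
}

/* Layout from the quoted expansion: $12..$15 in slots 8..11, $8..$11 in 12..15. */
static void save_swapped(struct frame_model *f, const uint64_t hw[32])
{
    for (int r = 8; r <= 11; r++) {
        save_reg(f, hw, r + 4, r);  /* e.g. SAVE_REG($12, 8, $29) */
        save_reg(f, hw, r, r + 4);  /* e.g. SAVE_REG($8, 12, $29) */
    }
}

/* Fifth syscall argument (a4 = $8 on N64), read from the slot assumed for a4. */
static uint64_t fifth_arg(const struct frame_model *f)
{
    return f->slot[8];
}

/* Checks that the swapped save order hands the reader t0 instead of a4. */
static void check_model(void)
{
    uint64_t hw[32] = {0};
    struct frame_model ok, bad;

    hw[8] = 0xA4;   /* pretend a4 holds the fifth argument */
    hw[12] = 0x70;  /* pretend t0 holds unrelated scratch data */

    save_expected(&ok, hw);
    save_swapped(&bad, hw);

    assert(fifth_arg(&ok) == 0xA4);   /* reader sees the real argument */
    assert(fifth_arg(&bad) == 0x70);  /* reader sees t0's contents instead */
}
```

In this model the first four arguments are unaffected because only the
$8..$15 slots move, which matches the observation that syscalls with more
than 4 arguments, like mmap, are the ones that would break.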