From owner-freebsd-ia64@FreeBSD.ORG Wed Apr 23 17:10:26 2008
Return-Path:
Delivered-To: freebsd-ia64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5026A106566B
	for ; Wed, 23 Apr 2008 17:10:26 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from smtpoutm.mac.com (smtpoutm.mac.com [17.148.16.83])
	by mx1.freebsd.org (Postfix) with ESMTP id 1AEBC8FC0A
	for ; Wed, 23 Apr 2008 17:10:26 +0000 (UTC)
	(envelope-from xcllnt@mac.com)
Received: from mac.com (asmtp008-s [10.150.69.71])
	by smtpoutm.mac.com (Xserve/smtpout020/MantshX 4.0) with ESMTP id m3NHAPRa007838;
	Wed, 23 Apr 2008 10:10:25 -0700 (PDT)
Received: from macbook-pro.jnpr.net (natint3.juniper.net [66.129.224.36])
	(authenticated bits=0)
	by mac.com (Xserve/asmtp008/MantshX 4.0) with ESMTP id m3NHAMGZ019287
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Wed, 23 Apr 2008 10:10:22 -0700 (PDT)
Message-Id: <3DFF6AF6-943F-4066-A424-3F6C73399C7B@mac.com>
From: Marcel Moolenaar
To: Christian Kandeler
In-Reply-To: <200804231422.57226.christian.kandeler@hob.de>
Content-Type: multipart/mixed; boundary=Apple-Mail-2-882670974
Mime-Version: 1.0 (Apple Message framework v919.2)
Date: Wed, 23 Apr 2008 10:10:21 -0700
References: <200804231422.57226.christian.kandeler@hob.de>
X-Mailer: Apple Mail (2.919.2)
Cc: freebsd-ia64@freebsd.org
Subject: Re: syscalls & mcontext
X-BeenThere: freebsd-ia64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the IA-64
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 23 Apr 2008 17:10:26 -0000

--Apple-Mail-2-882670974
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit

On Apr 23, 2008, at 5:22 AM, Christian Kandeler wrote:

> during testing of a FreeBSD/IA64 application I had written I noticed
> that it kept getting a SIGILL signal seemingly out of nowhere.
> On closer inspection, I found out that the following happens:
> - The library calls the kse_switchin syscall.
> - The kernel's kse_switchin() function is called with the second
> argument == address of trapframe + 0xe8, as set up by epc_syscall.

Yes. The 8 possible syscall arguments are put in the trapframe,
starting at tf_scratch.gr16. tf_scratch.gr15 holds the syscall
number.

> - The kse_switchin() function calls set_mcontext(), which, among
> other things, sets tf->tf_scratch = mc->mc_scratch. But
> tf->tf_scratch overlaps the second argument of kse_switchin(), so now
> uap->tmbx in kse_switchin() is no longer a pointer to the thread
> mailbox, but some random value (whatever was in mc_scratch.gr16).

Ah yes. We don't pass the actual arguments around. We pass a pointer
to the arguments around. This pointer is typically called uap and as
per above, it points to tf_scratch.gr16 in the trapframe.

> - After set_mcontext() has returned, kse_switchin() sets
> td->td_mailbox = uap->tmbx, i.e. the bogus value is now copied into
> the thread structure.

Oops...

The solution could be as simple as putting uap->tmbx and uap->flags
in local variables in kse_switchin(). This also has the added
advantage of having slightly more optimal code, because the compiler
will know that the fields are constant and will not forcibly reload
them after function calls.

> Any idea of what is going wrong here? My first, uneducated guess would
> be that we shouldn't set tf_scratch (because why does a synchronous
> interruption need to restore the scratch registers), but my insight
> into the syscall mechanism is rather superficial and I assume the
> problem is more complex than that.

The complexity is in having 2 distinct kernel entry paths. The
exception and the EPC syscall. The kse_switchin() syscall can be
called with an asynchronous context (i.e. exception-based). It
must switch the scratch registers in those cases.
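
To make the uap aliasing discussed earlier in this reply concrete,
here is a minimal user-space sketch in C. It is not kernel code: the
structures are simplified, hypothetical stand-ins (scratch_regs,
set_mcontext_model(), switchin_buggy() and switchin_fixed() are
names made up for illustration), and the argument slots are modeled
as a plain array instead of the real gr16 onward register fields. It
only shows why reading an argument through a pointer into the
trapframe after the scratch area has been overwritten yields the
value from the new context, and why copying the argument into a
local first avoids that.

```c
#include <stddef.h>

/* Hypothetical, simplified model of the scratch register area. */
struct scratch_regs {
	unsigned long gr15;	/* models the syscall number slot */
	unsigned long args[8];	/* models gr16 onward: syscall arguments */
};

struct trapframe {
	struct scratch_regs tf_scratch;
};

struct mcontext {
	struct scratch_regs mc_scratch;
};

/* Models the part of set_mcontext() that replaces the scratch area. */
void
set_mcontext_model(struct trapframe *tf, const struct mcontext *mc)
{
	tf->tf_scratch = mc->mc_scratch;
}

/*
 * Buggy pattern: uap points into tf, so the first argument is read
 * only after the trapframe has been clobbered.
 */
unsigned long
switchin_buggy(struct trapframe *tf, const unsigned long *uap,
    const struct mcontext *mc)
{
	set_mcontext_model(tf, mc);
	return (uap[0]);	/* now holds mc->mc_scratch.args[0] */
}

/*
 * Fixed pattern: copy the argument into a local variable before the
 * trapframe is clobbered.
 */
unsigned long
switchin_fixed(struct trapframe *tf, const unsigned long *uap,
    const struct mcontext *mc)
{
	unsigned long tmbx = uap[0];

	set_mcontext_model(tf, mc);
	return (tmbx);		/* still the caller's original argument */
}
```

In this model, calling either function with uap set to
tf.tf_scratch.args reproduces the situation described above: the
buggy variant returns whatever the new context carried in its first
argument slot, while the fixed variant returns the value the caller
actually passed.
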

Note that the consequence of the above is that we can enter the
kernel using the EPC syscall path, but that we need to leave the
kernel using the exception path. This is handled in the EPC
syscall code by checking the tf_flags field in the trapframe.
A similar test is in the exception path.

Could you try the attached patch? It's against 7-STABLE, but
should be easy to make work for 6.1 (if it doesn't apply).
I'll give it a spin myself too...

Thanks,

BTW: Good catch! This probably accounts for a lot of
threading-related core dumps and may be the root cause
of PR 86218...

-- 
Marcel Moolenaar
xcllnt@mac.com

--Apple-Mail-2-882670974
Content-Disposition: attachment; filename=kse_switchin.diff
Content-Type: application/octet-stream; x-unix-mode=0644; name="kse_switchin.diff"
Content-Transfer-Encoding: 7bit

Index: kern_kse.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/Attic/kern_kse.c,v
retrieving revision 1.235.2.1
diff -u -r1.235.2.1 kern_kse.c
--- kern_kse.c	18 Jan 2008 10:02:51 -0000	1.235.2.1
+++ kern_kse.c	23 Apr 2008 16:51:03 -0000
@@ -145,9 +145,17 @@
 kse_switchin(struct thread *td, struct kse_switchin_args *uap)
 {
 #ifdef KSE
-	struct kse_thr_mailbox tmbx;
+	struct kse_thr_mailbox tmbx, *tmbxp;
 	struct kse_upcall *ku;
-	int error;
+	int error, flags;
+
+	/*
+	 * Put the arguments in local variables, to allow uap to
+	 * point into the trapframe. We clobber the trapframe as
+	 * part of setting a new context.
+	 */
+	tmbxp = uap->tmbx;
+	flags = uap->flags;
 
 	thread_lock(td);
 	if ((ku = td->td_upcall) == NULL || TD_CAN_UNBIND(td)) {
@@ -155,18 +163,18 @@
 		return (EINVAL);
 	}
 	thread_unlock(td);
-	error = (uap->tmbx == NULL) ? EINVAL : 0;
+	error = (tmbxp == NULL) ? EINVAL : 0;
 	if (!error)
-		error = copyin(uap->tmbx, &tmbx, sizeof(tmbx));
-	if (!error && (uap->flags & KSE_SWITCHIN_SETTMBX))
+		error = copyin(tmbxp, &tmbx, sizeof(tmbx));
+	if (!error && (flags & KSE_SWITCHIN_SETTMBX))
 		error = (suword(&ku->ku_mailbox->km_curthread,
-		    (long)uap->tmbx) != 0 ? EINVAL : 0);
+		    (long)tmbxp) != 0 ? EINVAL : 0);
 	if (!error)
 		error = set_mcontext(td, &tmbx.tm_context.uc_mcontext);
 	if (!error) {
-		suword32(&uap->tmbx->tm_lwp, td->td_tid);
-		if (uap->flags & KSE_SWITCHIN_SETTMBX) {
-			td->td_mailbox = uap->tmbx;
+		suword32(&tmbxp->tm_lwp, td->td_tid);
+		if (flags & KSE_SWITCHIN_SETTMBX) {
+			td->td_mailbox = tmbxp;
 			td->td_pflags |= TDP_CAN_UNBIND;
 		}
 		PROC_LOCK(td->td_proc);

--Apple-Mail-2-882670974
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit


--Apple-Mail-2-882670974--