Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 23 Apr 2008 10:10:21 -0700
From:      Marcel Moolenaar <xcllnt@mac.com>
To:        Christian Kandeler <christian.kandeler@hob.de>
Cc:        freebsd-ia64@freebsd.org
Subject:   Re: syscalls & mcontext
Message-ID:  <3DFF6AF6-943F-4066-A424-3F6C73399C7B@mac.com>
In-Reply-To: <200804231422.57226.christian.kandeler@hob.de>
References:  <200804231422.57226.christian.kandeler@hob.de>

next in thread | previous in thread | raw e-mail | index | archive | help

--Apple-Mail-2-882670974
Content-Type: text/plain;
	charset=US-ASCII;
	format=flowed
Content-Transfer-Encoding: 7bit


On Apr 23, 2008, at 5:22 AM, Christian Kandeler wrote:

> during testing of a FreeBSD/IA64 application I had written I noticed
> that it kept getting a SIGILL signal seemingly out of nowhere. On
> closer inspection, I found out that the following happens:
> 	- The library calls the kse_switchin syscall.
> 	- The kernel's kse_switchin() function is called with the second
> argument == address of trapframe + 0xe8, as set up by epc_syscall.

Yes, The 8 possible syscall arguments are put in the trapframe,
starting at tf_scratch.gr16. tf_scratch.gr15 holds the syscall
number.

> 	- The kse_switchin() function calls set_mcontext(), which, among
> other things, sets tf->tf_scratch = mc->mc_scratch. But
> tf->tf_scratch overlaps the second argument of kse_switchin(), so now
> uap->tmbx in kse_switchin() is no longer a pointer to the thread
> mailbox, but some random value (whatever was in mc_scratch.gr16).

Ah yes. We don't pass the actual arguments around. We pass a
pointer to the arguments around. This pointer is typically
called uap and as per above, it points to tf_scratch.gr16 in
the trapframe.

> 	- After set_mcontext() has returned, kse_switchin() sets
> td->td_mailbox = uap->tmbx, i.e. the bogus value is now copied into
> the thread structure.

Oops...

The solution could be as simple as putting uap->tmbx and
uap->flags in local variables in kse_switchin(). This also has
the added advantage of having slightly more optimal code,
because the compiler will know that the fields are constant
and will not forcibly reload them after function calls.

> Any idea of what is going wrong here? My first, uneducated guess would
> be that we shouldn't set tf_scratch (because why does a synchronous
> interruption need to restore the scratch registers), but my insight
> into the syscall mechanism is rather superficial and I assume the
> problem is more complex than that.

The complexity is in having 2 distinct kernel entry paths. The
exception and the EPC syscall. The kse_switchin() syscall can
be called with an asynchronous context (i.e. exception-based).
It must switch the scratch registers in those cases.

Note that the consequence of the above is that we can enter the
kernel using the EPC syscall path, but that we need to leave
the kernel using the exception path. This is handled in the EPC
syscall code by checking the tf_flags field in the trapframe.
A similar test is in the exception path.

Could you try the attached patch. It's against 7-STABLE, but
should be easy to make work for 6.1 (if it doesn't apply).
I'll give it a spin myself too...

Thanks,

BTW: Good catch! This probably accounts for a lot of threading
related core dumps and may be the root cause of PR 86218...

-- 
Marcel Moolenaar
xcllnt@mac.com


--Apple-Mail-2-882670974
Content-Disposition: attachment;
	filename=kse_switchin.diff
Content-Type: application/octet-stream; x-unix-mode=0644;
	name="kse_switchin.diff"
Content-Transfer-Encoding: 7bit

Index: kern_kse.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/Attic/kern_kse.c,v
retrieving revision 1.235.2.1
diff -u -r1.235.2.1 kern_kse.c
--- kern_kse.c	18 Jan 2008 10:02:51 -0000	1.235.2.1
+++ kern_kse.c	23 Apr 2008 16:51:03 -0000
@@ -145,9 +145,17 @@
 kse_switchin(struct thread *td, struct kse_switchin_args *uap)
 {
 #ifdef KSE
-	struct kse_thr_mailbox tmbx;
+	struct kse_thr_mailbox tmbx, *tmbxp;
 	struct kse_upcall *ku;
-	int error;
+	int error, flags;
+
+	/*
+	 * Put the arguments in local variables, to allow uap to
+	 * point into the trapframe. We clobber the trapframe as
+	 * part of setting a new context.
+	 */
+	tmbxp = uap->tmbx;
+	flags = uap->flags;
 
 	thread_lock(td);
 	if ((ku = td->td_upcall) == NULL || TD_CAN_UNBIND(td)) {
@@ -155,18 +163,18 @@
 		return (EINVAL);
 	}
 	thread_unlock(td);
-	error = (uap->tmbx == NULL) ? EINVAL : 0;
+	error = (tmbxp == NULL) ? EINVAL : 0;
 	if (!error)
-		error = copyin(uap->tmbx, &tmbx, sizeof(tmbx));
-	if (!error && (uap->flags & KSE_SWITCHIN_SETTMBX))
+		error = copyin(tmbxp, &tmbx, sizeof(tmbx));
+	if (!error && (flags & KSE_SWITCHIN_SETTMBX))
 		error = (suword(&ku->ku_mailbox->km_curthread,
-			 (long)uap->tmbx) != 0 ? EINVAL : 0);
+			 (long)tmbxp) != 0 ? EINVAL : 0);
 	if (!error)
 		error = set_mcontext(td, &tmbx.tm_context.uc_mcontext);
 	if (!error) {
-		suword32(&uap->tmbx->tm_lwp, td->td_tid);
-		if (uap->flags & KSE_SWITCHIN_SETTMBX) {
-			td->td_mailbox = uap->tmbx;
+		suword32(&tmbxp->tm_lwp, td->td_tid);
+		if (flags & KSE_SWITCHIN_SETTMBX) {
+			td->td_mailbox = tmbxp;
 			td->td_pflags |= TDP_CAN_UNBIND;
 		}
 		PROC_LOCK(td->td_proc);

--Apple-Mail-2-882670974
Content-Type: text/plain;
	charset=US-ASCII;
	format=flowed
Content-Transfer-Encoding: 7bit



--Apple-Mail-2-882670974--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3DFF6AF6-943F-4066-A424-3F6C73399C7B>