Date:      Sun, 17 Nov 2002 10:53:44 -0800
From:      Marcel Moolenaar <marcel@xcllnt.net>
To:        Doug Rabson <dfr@nlsystems.com>
Cc:        ia64@FreeBSD.ORG
Subject:   Re: libc_r: syscalls, epc and unwinding [was: Re: setjmp/longjmp and libc_r...]
Message-ID:  <20021117185344.GA603@athlon.pn.xcllnt.net>
In-Reply-To: <200211171017.30008.dfr@nlsystems.com>
References:  <200211140640.gAE6eNq9016231@repoman.freebsd.org> <200211161101.38075.dfr@nlsystems.com> <20021116172102.GA618@dhcp01.pn.xcllnt.net> <200211171017.30008.dfr@nlsystems.com>

On Sun, Nov 17, 2002 at 10:17:29AM +0000, Doug Rabson wrote:
> >
> > We always save and drop the state in the SMP case. That is, if the
> > CPU holds the high FP state of the thread being switched.
> 
> Of course, I forgot about that part. The only alternative to that I can 
> see is cpu-locking a thread which owns the fp state.

Yes.  I've been thinking about a scheme that involves IPIs, based on the
assumptions that:
1. We do have processor affinity
2. The number of processes that use high FP is very small relative
   to the number of processes that don't use high FP.
3. The chance that there are more processes that use high FP than
   there are CPUs is small.

The idea is to lazily keep the high FP state on a CPU (and keep track
of which CPU that is) and use an IPI to force that CPU to drop the
state. The CPU that runs the process can then grab it from the PCB.
When a process needs the high FP on a CPU that holds the high FP of
another process, then you only have to drop that state.
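
A minimal sketch of that bookkeeping in C, written as a user-space model
and not as kernel code. All names here (fp_owner, high_fp_fault,
drop_high_fp, ipi_count) are hypothetical, the IPI is modelled as a plain
function call plus a counter, and it assumes that "dropping" the state
means saving it to the PCB before releasing the registers:

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 2                  /* hypothetical CPU count for the model */

struct thread {
	int fp_cpu;             /* CPU holding our live high FP, or -1 */
	int pcb_valid;          /* nonzero if the PCB copy is current */
};

struct thread *fp_owner[NCPU];  /* per CPU: whose high FP is loaded */
int ipi_count;                  /* number of (simulated) IPIs sent */

/* Save the current owner's high FP to its PCB and release the CPU. */
void
drop_high_fp(int cpu)
{
	struct thread *td = fp_owner[cpu];

	if (td != NULL) {
		td->pcb_valid = 1;
		td->fp_cpu = -1;
		fp_owner[cpu] = NULL;
	}
}

/* Thread td touches the high FP registers while running on cpu. */
void
high_fp_fault(struct thread *td, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);

	if (fp_owner[cpu] == td)
		return;         /* state is still live here, nothing to do */

	if (td->fp_cpu != -1) {
		/* Live on another CPU: "IPI" it so it drops the state. */
		ipi_count++;
		drop_high_fp(td->fp_cpu);
	}

	drop_high_fp(cpu);      /* evict another thread's state, if any */

	/* Reload from the PCB; the PCB copy is stale from here on. */
	td->pcb_valid = 0;
	td->fp_cpu = cpu;
	fp_owner[cpu] = td;
}
```

In this model a thread that stays on its CPU never pays for a save or
restore, and an IPI is only counted when a thread migrates while its
high FP is still live elsewhere, which is the case assumptions 1 to 3
are meant to keep rare.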

Note that it doesn't exclude CPU locking or saving/restoring high FP
all the time. You want a backup scheme in case someone does have the
perceived uncommon case as the common case.

> > 1. Bite the bullet and implement EPC for syscalls. This is an ABI
> >    breaker and best be done before release, but has high impact. As
> >    a side-effect, I might be able to save the preserved registers as
> >    a special case to avoid having to depend on unwinding, which we
> >    don't have reliably yet for this purpose. Is almost a single-
> >    commit change, but very very attractive...
> > 2. Finish the unwinding job I started and use it for the *context
> >    syscalls. High impact, but can be spread over many commits;
> > 3. more?
> 
> I had forgotten that not all of the state for getcontext and setcontext 
> was available in the trapframe. Another possible (3) would be a hybrid 
> user-syscall version which flushes the register stack and accesses the 
> floating point state in user mode. You could even recognise the 
> *context syscalls in exception.s and write them as special cases of 
> do_syscall.

See below.

> > I talked to David Mosberger yesterday. He told me of a case where
> > unwinding is being used to implement longjmp. It would be very
> > interesting to find out in how many steps one can unwind to the
> > context of setjmp and how much performance gain it gives.
> 
> It probably depends on how many levels need unwinding. The setjmp would 
> be very fast though.

Yes, I forgot to qualify the sentence with "in the common case" :-)

> > In short: I'm looking into ways to avoid that we have to flesh out
> > our own unwinder in the kernel, provided we can find code that is
> > suitable for our use. In this light, doing EPC now helps because
> > otherwise I would end up fleshing out our unwinder anyway.
> >
> > I'll give this some thought over coffee...
> 
> Why would EPC syscalls make this easier? Obviously they would be faster 
> but I can't quite see how they would make it easier to switch the 
> inaccessible parts of the register state.

It allows the hybrid user-syscall solution.

This is what I have now:

1.  USRSTACK = VM_MAX_ADDRESS = VM_MAXUSER_ADDRESS-(1024*1024)
	I currently deal with page faults in the 1M area by special
	casing them in trap.c. Probably not right, but I claim
	ignorance :-)

2.  Syscalls look like:
	#define SYSCALLNUM(name)        SYS_ ## name
	#define GATEWAY_PAGE            ((5 << 61) - 1048576)

	#define CALLSYS_NOERROR(name)                                   \
	        mov             r8 = SYSCALLNUM(name) ;                 \
	        movl            r14 = GATEWAY_PAGE ;;                   \
	        nop.m           0 ;                                     \
	        mov             b7 = r14 ;                              \
	        br.call.sptk    b6 = b7

3.  The "gateway" page as I call it will have the epc, the stackframe
    setup and such, but other than that I can do anything I want in
    user space.
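
For reference, the address arithmetic behind items 1 and 2 can be
checked with a small C sketch. It assumes that VM_MAXUSER_ADDRESS is the
same (5 << 61) that appears in GATEWAY_PAGE; REGION5_BASE, GATEWAY_SIZE
and in_gateway_area() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the user address space ends at 5 << 61, as in GATEWAY_PAGE. */
#define REGION5_BASE	((uint64_t)5 << 61)
#define GATEWAY_SIZE	((uint64_t)1024 * 1024)
#define GATEWAY_PAGE	(REGION5_BASE - GATEWAY_SIZE)

/* 0xA000000000000000 - 0x100000 */
static_assert(GATEWAY_PAGE == UINT64_C(0x9FFFFFFFFFF00000),
    "gateway area starts 1M below 5 << 61");

/* Nonzero if addr falls inside the 1M gateway area. */
int
in_gateway_area(uint64_t addr)
{
	return (addr >= GATEWAY_PAGE && addr < REGION5_BASE);
}
```

Under that assumption USRSTACK and GATEWAY_PAGE are the same address, so
the gateway area is exactly the 1M between the top of the stack and
5 << 61, and a range check like in_gateway_area() is the kind of test
the special case in trap.c would need.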

This is what I'm thinking about:

1.  Provide a limited *context in the gateway pages for use by the
    syscalls and the signal trampoline/sigreturn. This probably deals
    with the RSE and the preserved registers (no sigmask I think).
2.  In the signal trampoline, save the context in the ucontext given
    to us.
3.  For sigreturn, restore the context from the ucontext given to us.
    Some limitations may exist due to protection and security.
4.  The *context syscalls are probably mostly or completely user space.
5.  For the normal syscall path, don't save the preserved registers.
    Only do the minimal setup required.

Something like that...

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net




