Date:      Tue, 23 Jan 2007 15:55:54 -0500
From:      Jung-uk Kim <jkim@FreeBSD.org>
To:        Tijl Coosemans <tijl@ulyssis.org>
Cc:        freebsd-emulation@FreeBSD.org
Subject:   Re: linuxolator: tls_test results amd64
Message-ID:  <200701231555.57521.jkim@FreeBSD.org>
In-Reply-To: <200701232113.50766.tijl@ulyssis.org>
References:  <790a9fff0701211041j1176d00gd6dd75d0989cf4ec@mail.gmail.com> <200701231400.46367.jkim@FreeBSD.org> <200701232113.50766.tijl@ulyssis.org>

On Tuesday 23 January 2007 03:13 pm, Tijl Coosemans wrote:
> On Tuesday 23 January 2007 20:00, Jung-uk Kim wrote:
> > On Monday 22 January 2007 07:01 pm, Tijl Coosemans wrote:
> > > On Monday 22 January 2007 22:26, Divacky Roman wrote:
> > > > > > > 2) why real apps (i.e. using %gs) show the very same
> > > > > > > behaviour (first program works, then it doesn't)
> > > > >
> > > > > Hmm, can you point me to the source of such a program? I
> > > > > would expect programs that use glibc to always fail. Glibc
> > > > > > expects set_thread_area to set up a GDT entry and return the
> > > > > entry number. Then glibc loads that entry number into GS
> > > > > which sets up GS.base. Because of this, I would expect
> > > > > GS.base to always end up being 0x00000000 just as FS.base
> > > > > above.
> > > > >
> > > > > Wine on Linux does the same. It calls set_thread_area and
> > > > > loads the returned entry number in FS. (On Windows, FS is
> > > > > used for tls.)
> > > > >
> > > > > The reason setting GS.base directly with a wrmsr works on
> > > > > FreeBSD is because i386 user land code doesn't write to GS.
> > > > > i386_set_gsbase
> > > >
> > > > what do you mean by "writing to GS" ?
> > >
> > > mov something, %gs
> > >
> > > > Linux glibc does this after calling set_thread_area, which
> > > > loads the base address in the GDT entry into GS.base,
> > > > overwriting the GS.base previously set up using wrmsr. FreeBSD
> > > > libc/libpthread don't do this.
> > >
> > > > > already sets up GS on i386, so the compatibility code on
> > > > > amd64 can use the wrmsr trick and leave GS itself and the
> > > > > descriptor it points to untouched. As far as I understand
> > > > > things, this won't work for linux32 compatibility on amd64.
> > > >
> > > > > looking at the code, it looks like:
> > > >
> > > > i386_set_gsbase = sysarch(I386_SET_GSBASE, &addr);
> > > >
> > > > and sysarch for that looks like:
> > > > wrmsr(MSR_KGSBASE, i386base);
> > > > pcb->pcb_gsbase = i386base;
> > > >
> > > > > where is the setting up of the GS? I don't get it...
> > >
> > > GS.base is what matters for address calculations. In the i386
> > > version, this is set by setting up a GDT entry and loading the
> > > entry's index into GS. In the amd64 version, which you gave
> > > above, GS.base is set directly.
> > > > (Actually, the code above sets a copy of GS.base. When
> > > switching between user and kernel mode, a swapgs instruction
> > > swaps kernel GS.base and user GS.base)
> > >
> > > > overall you are saying that to support linux32 tls we have to
> > > >
> > > > 1) load an unused segment with proper values
> > > > > 2) return the number of the segment from the
> > > > > set_thread_syscall
> > > > > 3) make the automatic loading/unloading of that segment
> > > > > happen on every context switch (just like it's done for
> > > > > segment 3 on i386)
> > > >
> > > > do I get it right?
> > >
> > > 1) Yes, but the amd64 code has no GDT entry reserved for this
> > > right now it seems, so you have to add one. I don't really know
> > > how that's done, but what I would try (if I had the time) is to
> > > add an entry to the gdt_segs array in
> > > sys/amd64/amd64/machdep.c, say at index 6 and then adjust the
> > > defines and NGDT in
> > > sys/amd64/include/segments.h.
> > >
> > > 2) Just as you do now. Set the entry number and do a copyout.
> > > The syscall returns 0 on success. FYI, the glibc code that uses
> > > this syscall is in
> > > glibc-2.3.6/nptl/sysdeps/i386/tls.h:185:TLS_INIT_TP
> > >
> > > 3) Yes. You'll have to add a field to the pcb to store a copy
> > > of the descriptor. And then adjust the context switch code.
> > >
> > > > After that, the amd64 version of set_thread_area becomes
> > > > virtually the same as the i386 version. Set up a descriptor and
> > > > copy it to the pcb and GDT.
> > >
> > > Most of this is copy/paste work I guess. The tricky part is to
> > > figure out what to copy and where to paste it.
> > >
> > >
> > > This should get basic tls working I think. The actual
> > > set_thread_area is a bit more complicated. It has 3 GDT entries
> > > available and when called with -1 as the entry number, it will
> > > select an unused entry. I don't know if there are programs that
> > > use all 3 (some tests maybe?). The only program I know that
> > > uses 2 is wine.
> >
> > > I was a little quiet yesterday because I wasn't sure.  But I have
> > > more evidence now.  First of all, wrmsr(MSR_KGSBASE, ...) must be
> > > protected with 'if (td == curthread)' just as cpu_set_user_tls()
> > > does, which is a trivial fix.  The second problem is that
> > > MSR_KGSBASE is scrubbed by something during context switch, i.e.,
> > > it sometimes becomes 0.
>
> You mean:
>
> *kernel sets gsbase and switches back to user mode
> *user program does things

Yes.  BTW, glibc seems to use movw instead of movl to load %gs.  I
don't know if that makes a difference, though.  It may have some
effect when glibc is built with the -mtls-direct-seg-refs flag.  This
needs confirmation.

> *back in kernel mode, save gsbase into pcb and it appears to be 0
> now?

The saved pcb_gsbase always seems correct.  MSR_KGSBASE, which is
supposedly swapped with MSR_GSBASE via swapgs, is not.  Maybe I am
confused, or maybe my CPU is too old (it's C0 stepping and I know
there are some segmentation issues with that revision), but that's
what I see.  I need more time for testing (or resting?).

Jung-uk Kim


