Date:      Sun, 7 Sep 2008 20:52:31 +0200
From:      Juergen Lock <nox@jelal.kn-bremen.de>
To:        "Sean C. Farley" <scf@freebsd.org>
Cc:        freebsd-emulation@freebsd.org
Subject:   Re: Linux applications core if running (k)qemu
Message-ID:  <20080907185231.GA72139@saturn.kn-bremen.de>
In-Reply-To: <200809062215.m86MF6NS040797@saturn.kn-bremen.de>
References:  <alpine.BSF.1.10.0808291711580.5866@thor.farley.org> <20080830113448.GA2152@dchagin.dialup.corbina.ru> <alpine.BSF.2.00.0809021552040.7934@thor.farley.org> <20080906104659.GA2113@dchagin.dialup.corbina.ru> <200809062215.m86MF6NS040797@saturn.kn-bremen.de>

On Sun, Sep 07, 2008 at 12:15:06AM +0200, I wrote:
> In article <20080906152929.GB2038@deviant.kiev.zoral.com.ua> you write:
> >-=-=-=-=-=-
> >
> >On Sat, Sep 06, 2008 at 02:46:59PM +0400, Chagin Dmitry wrote:
> >> On Tue, Sep 02, 2008 at 03:56:33PM -0500, Sean C. Farley wrote:
> >> > On Sat, 30 Aug 2008, Chagin Dmitry wrote:
> >> > 
> >> > >On Fri, Aug 29, 2008 at 05:29:09PM -0500, Sean C. Farley wrote:
> >> > >>I am having trouble with kqemu.ko and linux.ko.  If I run qemu with
> >> > >>the following command, Linux applications (chroot, acroread, ls) will
> >> > >>start core dumping:
> >> > >>    qemu-system-x86_64 -m 512 \
> >> > >>    -drive file=/usr/QEMU/WinXP/c.img,if=ide,media=disk -boot c \
> >> > >>    -std-vga -parallel none -serial none -monitor stdio \
> >> > >>    -net nic,model=e1000 -net tap,ifname=tap0,script=no -localtime
> >> > >>
> >> > >>Loading kqemu.ko does not cause the problem, but the cores start a
> >> > >>little after WinXP starts running.  Unloading kqemu.ko does not help;
> >> > >>the cores still happen but more randomly.  I even tried unloading all
> >> > >>linux modules and reloading them without luck.  It takes a reboot.
> >> > >>
> >> > >>Packages:
> >> > >>qemu-devel-0.9.1s.20080620_1
> >> > >>kqemu-kmod-devel-1.4.0.p1
> >> > >>linux_base-f8-8_4
> >> > >>
> >> > >>sysctl:
> >> > >>compat.linux.osrelease: 2.6.16
> >> > >>
> >> > >>dmesg:
> >> > >>kqemu version 0x00010400
> >> > >>kqemu: KQEMU installed, max_locked_mem=1792492kB.
> >> > >>
> >> > >>System is 7-STABLE as of r181963 with or without the patch to fix RT
> >> > >>signals from Chagin.
> >> > >
> >> > >Interestingly... Sean, can you provide ktrace/kdump log of coring
> >> > >apps?  thnx!
> >> > 
> >> > Here they are (good and bad):
> >> > http://www.farley.org/freebsd/tmp/linuxulator_vs_kqemu/
> >> > 
> >> > The good trace is after the bad trace.  I just kept running ktrace
> >> > /compat/linux/bin/date over and over until I got a good trace.  Before
> >> > loading kqemu and running qemu, there were no core dumps.  Also, I
> >> > compared two bad traces and they were basically the same except for PID
> >> > and a couple of addresses (still very close in value).
> >> > 
> >> 
> >> Most likely it is a TLS problem again; some days ago kib@ MFCed
> >> r182684, which will probably help.
> >
> >I doubt it. This seems to be an ingenious kqemu bug. As far as I remember,
> >it tries to use GDT/LDT. This probably has unwanted interaction with
> >PCB_GS32BIT.
> 
> Wow.  That corner of the code had escaped me so far, and yes this (in
> amd64/linux32) looks like it won't like kqemu's separating of the gdts
> on SMP indeed.  (it stores a pointer to &gdt[GUGS32_SEL] in pcb_gs32p and
> lets linux processes manipulate the segment pointed to by it, and when
> kqemu is (or was) running this won't be used by all cpus, see older threads
> like
> 	http://lists.freebsd.org/pipermail/freebsd-emulation/2008-May/004902.html
> for the reasons.)
> 
>  What I wonder tho is, won't this also cause problems without kqemu when
> there are linux processes running on multiple cpus that manipulate this
> segment because the gdt is then shared between the cpus?  (like, linux
> process on cpu 0 changes the segment, then linux process on cpu 1 comes
> along and changes it again and then the linux process on cpu 0 will pick
> it up from cpu 1?)  At least I must have somehow assumed the shared gdt
> wouldn't be changed later because of reasons like this...
> 
>  Anyway, fixing this will require changes to the kernel, I don't see how
> kqemu could fix it by itself alone. :(
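
A toy sketch of the interleaving asked about in the quoted text above, in
Python rather than kernel code.  Only the name GUGS32_SEL comes from the
source; the slot index, the run() helper and the "base-of-process-..."
strings are made up for illustration.  It ignores whatever the real
context-switch path does to reload the descriptor, so it only shows why a
single shared slot looks suspicious, not that the kernel actually
misbehaves this way.

```python
# Toy model only: real descriptors live in kernel memory, not Python lists.
GUGS32_SEL = 0  # arbitrary index; only the name is taken from the mail


def run(per_cpu_gdt):
    """Replay 'cpu 0 writes, cpu 1 writes, cpu 0 reads' on a model gdt."""
    shared = [None]
    # Either both cpus see the same table, or each cpu has a private one.
    gdts = [[None], [None]] if per_cpu_gdt else [shared, shared]
    gdts[0][GUGS32_SEL] = "base-of-process-A"  # linux process on cpu 0
    gdts[1][GUGS32_SEL] = "base-of-process-B"  # linux process on cpu 1
    return gdts[0][GUGS32_SEL]                 # what cpu 0 picks up later


shared_result = run(per_cpu_gdt=False)
private_result = run(per_cpu_gdt=True)
print(shared_result, private_result)
```

With one shared table cpu 0 reads back the value written on cpu 1; with a
table per cpu it reads its own value.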

There is a possible workaround tho that you can try if you are on
RELENG_7 or HEAD (and you are running ULE):
	cpuset -l 0 qemu...
(Obviously this is less than ideal if you need more than one qemu at a time,
so we still want a proper fix.)

 Btw I couldn't reproduce the crashing linux date(1) on RELENG_7_0, so I
guess it was only later commits that uncovered the problem...

 HTH, (or at least a bit :)
	Juergen


