Date:      Mon, 18 Nov 2013 15:18:30 +0200
From:      Yuriy Kohut <ykohut@onapp.com>
To:        freebsd-xen@freebsd.org
Cc:        Sergi <sergi@estrafolari.com>
Subject:   Re: fpudna: fpcurthread == curthread XXXX times
Message-ID:  <A4364DFD-77FD-44FB-8231-1D016B385E19@onapp.com>
In-Reply-To: <4DE73A39.2060609@estrafolari.com>
References:  <4DE5EDD7.20105@estrafolari.com> <20110601082156.GB48734@deviant.kiev.zoral.com.ua> <4DE60820.1080409@estrafolari.com> <4DE63803.7090504@estrafolari.com> <20110601213616.GE48734@deviant.kiev.zoral.com.ua> <4DE73A39.2060609@estrafolari.com>

Hi,

I have the same issue on 8.4-RELEASE, 9.1-RELEASE and 9.2-RELEASE guests
built from the XENHVM kernel config (amd64).

The 'uname' output in the guests looks like this:
root@my:/ # uname -a
FreeBSD my.vm 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Mon Aug 19 14:08:42 EEST 2013     root@my.vm:/usr/obj/usr/src/sys/XENHVM  amd64

I can also confirm that the issue does not affect versions 8.2, 8.3 and 9.0.

The hypervisor is "CentOS release 5.9" with Xen 3.4.3, and its details are:
# cat /proc/cpuinfo
...
processor	: 15
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 9
model name	: AMD Opteron(tm) Processor 6128
stepping	: 1
cpu MHz		: 2000.140
cache size	: 512 KB
physical id	: 15
siblings	: 1
core id		: 0
cpu cores	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu de tsc msr pae mce cx8 apic mtrr mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow constant_tsc pni cx16 popcnt lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse
bogomips	: 5002.23
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc [6] [7] [8]


# xm info
host                   : hv.build
release                : 2.6.18-308.20.1.el5.xen
version                : #1 SMP Wed Dec 5 13:30:38 GMT 2012
machine                : x86_64
nr_cpus                : 16
nr_nodes               : 1
cores_per_socket       : 8
threads_per_core       : 1
cpu_mhz                : 2000
hw_caps                : 178bf3ff:efd3fbff:00000000:00000310:00802001:00000000:000837ff:00000000
virt_caps              : hvm
total_memory           : 49150
free_memory            : 45664
node_to_cpu            : node0:0-15
node_to_memory         : node0:45664
xen_major              : 3
xen_minor              : 4
xen_extra              : .4
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)
cc_compile_by          : root
cc_compile_domain      : hv.build
cc_compile_date        : Wed Sep  5 18:01:10 EEST 2012
xend_config_format     : 4

Could anybody please help me fix this issue?

Thanks
---
Yura

On Jun 2, 2011, at 10:22 PM, Sergi <sergi@estrafolari.com> wrote:

> On 01/06/11 23:36, Kostik Belousov wrote:
>> On Wed, Jun 01, 2011 at 03:00:51PM +0200, Sergi wrote:
>>
>>> On 01/06/11 11:36, Sergi wrote:
>>>
>>>> On 01/06/11 10:21, Kostik Belousov wrote:
>>>>
>>>>> On Wed, Jun 01, 2011 at 09:44:23AM +0200, Sergi wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm working with a fully virtualized FreeBSD 8.2-RELEASE-p1 domU under
>>>>>> Debian squeeze and xen-hypervisor-4.0-amd64.
>>>>>>
>>>>>> If I configure this HVM guest with more than 4 vcpus:
>>>>>>
>>>>>>  vcpus    = 5
>>>>>>
>>>>>> these messages block the server:
>>>>>>
>>>>>>  fpudna: fpcurthread == curthread XXXX times
>>>>>>
>>>>>> The machine is pingable but I'm unable to ssh to it.
>>>>>>
>>>>>> In single-user mode it works fine (fsck and so on are OK), but when
>>>>>> switching to multiuser these fpudna messages start flooding.
>>>>>>
>>>>>> I've googled but haven't found anything, only something from 2005
>>>>>> about fpudna:
>>>>>>
>>>>>> http://lists.freebsd.org/pipermail/freebsd-amd64/2005-April/004413.html
>>>>>>
>>>>>> and this link, but I don't have the options he mentions enabled in the
>>>>>> kernel:
>>>>>>
>>>>>>  http://forums.freebsd.org/showthread.php?t=17979
>>>>>>
>>>>>> Has anyone run into this behaviour before? Is there any workaround?
>>>>>> The machine really seems to detect the available CPUs and responds to
>>>>>> the keyboard on VNC, but it's impossible to see what's written because
>>>>>> of the messages flooding the screen.
>>>>>>
>>>>> You did not specify the architecture of the domU. From the message, I
>>>>> can guess that your guest is running an amd64 kernel. There are slight
>>>>> differences in the handling of the FPU between i386 and amd64 that may
>>>>> matter here.
>>>>>
>>>>> The message you reported means that the FreeBSD kernel assumes that the
>>>>> FPU is currently loaded with the context of the current thread, but the
>>>>> CR0.TS bit is set, meaning that the FPU context is marked for a switch.
>>>>>
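
(Inline note on the explanation above: a minimal user-space sketch of the
condition being described, not kernel code. CR0.TS is bit 3 of CR0, and
while it is set the next FPU/SSE instruction raises the device-not-available
trap that fpudna() handles. The struct and function names below are made up
for illustration only.)

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_TS 0x00000008u      /* CR0 bit 3, "task switched" */

/*
 * Hypothetical snapshot of the per-CPU state that the message is about.
 * In the kernel these values live in CR0 and in per-CPU data.
 */
struct cpu_snapshot {
    uint64_t    cr0;            /* control register 0 */
    const void *fpcurthread;    /* thread the kernel believes owns the FPU */
    const void *curthread;      /* thread currently running on this CPU */
};

/* True when the next FPU instruction would trap (CR0.TS is set). */
bool
fpu_access_traps(const struct cpu_snapshot *s)
{
    return (s->cr0 & CR0_TS) != 0;
}

/*
 * True for the combination the console message reports: the kernel's
 * bookkeeping says the current thread already owns the FPU, yet the
 * hardware state still forces a trap on FPU access.
 */
bool
fpudna_state_is_contradictory(const struct cpu_snapshot *s)
{
    return fpu_access_traps(s) && s->fpcurthread == s->curthread;
}
```

With TS set and fpcurthread equal to curthread,
fpudna_state_is_contradictory() returns true; with TS clear it returns
false regardless of who owns the FPU.
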
>>>>> AFAIR, HVM means that you run a bare-metal kernel, right? Most likely,
>>>>> it is some issue with Xen itself. I am curious whether the following
>>>>> will cause any usermode-visible regression for you:
>>>>>
>>>>> diff --git a/sys/amd64/amd64/fpu.c b/sys/amd64/amd64/fpu.c
>>>>> index 08e5e57..a5ee853 100644
>>>>> --- a/sys/amd64/amd64/fpu.c
>>>>> +++ b/sys/amd64/amd64/fpu.c
>>>>> @@ -394,14 +394,8 @@ fpudna(void)
>>>>>      struct pcb *pcb;
>>>>>
>>>>>      critical_enter();
>>>>> -    if (PCPU_GET(fpcurthread) == curthread) {
>>>>> -        printf("fpudna: fpcurthread == curthread %d times\n",
>>>>> -            ++err_count);
>>>>> -        stop_emulating();
>>>>> -        critical_exit();
>>>>> -        return;
>>>>> -    }
>>>>> -    if (PCPU_GET(fpcurthread) != NULL) {
>>>>> +    if (PCPU_GET(fpcurthread) != NULL &&
>>>>> +        PCPU_GET(fpcurthread) != curthread) {
>>>>>          printf("fpudna: fpcurthread = %p (%d), curthread = %p (%d)\n",
>>>>>                 PCPU_GET(fpcurthread),
>>>>>                 PCPU_GET(fpcurthread)->td_proc->p_pid,
>>>>>
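
(Inline note: to make the effect of the quoted diff easier to see, here is
a small user-space model of just the two checks, before and after the
patch. It is only a sketch under assumptions: the enum and function names
are hypothetical, and whatever fpudna() does after the checks is not shown
in the diff, so the model simply labels that path "continue".)

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome labels for this model; they are not kernel symbols. */
enum dna_outcome {
    DNA_WARN_AND_RETURN,    /* print "fpcurthread == curthread", return early */
    DNA_REPORT_OTHER_OWNER, /* print the "fpcurthread = %p ..." diagnostic */
    DNA_CONTINUE            /* fall through to the code after the checks */
};

/* The checks as they appear on the '-' side of the quoted diff. */
enum dna_outcome
fpudna_checks_original(const void *fpcurthread, const void *curthread)
{
    assert(curthread != NULL);  /* a running CPU always has a current thread */
    if (fpcurthread == curthread)
        return DNA_WARN_AND_RETURN;
    if (fpcurthread != NULL)
        return DNA_REPORT_OTHER_OWNER;
    return DNA_CONTINUE;
}

/* The checks as they appear on the '+' side of the quoted diff. */
enum dna_outcome
fpudna_checks_patched(const void *fpcurthread, const void *curthread)
{
    assert(curthread != NULL);
    if (fpcurthread != NULL && fpcurthread != curthread)
        return DNA_REPORT_OTHER_OWNER;
    return DNA_CONTINUE;
}
```

The only input where the two versions differ is fpcurthread == curthread:
the original warns and returns early, while the patched version stays
silent and continues with the rest of the function.
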
>>>> Hello,
>>>>
>>>> yes, sorry, amd64, and yes, HVM (hardware virtual machine), not
>>>> paravirtual.
>>>>
>>>> So, you mean patching fpu.c and recompiling the kernel, right? I'm
>>>> new to modifying src files.
>>>>
>>>> Thanks for your help,
>>>> Sergi
>>>>
>>>>
>>>> _______________________________________________
>>>> freebsd-xen@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-xen
>>>> To unsubscribe, send any mail to "freebsd-xen-unsubscribe@freebsd.org"
>>>>
>>>>
>>> Hello,
>>>
>>> well, I patched fpu.c, recompiled the kernel, and booted OK with 4 vcpus.
>>> Then I tried to boot with 5 vcpus and got:
>>>
>>> kernel trap 22 with interrupts disabled
>>> ...
>>> kernel trap 22 with interrupts disabled
>>> Fatal double fault
>>> rip = 0xffffffff8067865a
>>> rsp = 0xffffff8000000000
>>> rbp = 0xffffff8000000040
>>> cpuid = 4; apic id = 08
>>> panic: double fault
>>> cpuid = 4
>>>
>>> 4 vcpus is the maximum number of vcpus I can use.
>>>
>>> How do you think I can debug this in order to provide more information?
>>>
>> At least you can add KDB/DDB to the kernel config and get a backtrace
>> at panic.
>>
>> My feeling right now is that the issue is in the hypervisor, and not in
>> the kernel.
>>
> Hello,
>
> well, I'll try to add debugging to the kernel and see if I get somewhere.
>
> I'll post on the xen-user mailing list to see if there is a known issue
> in the hypervisor.
>
> It's strange that nobody on this list has had this same issue.
>
>=20
> Thanks for your help,
> regards,
> Sergi



