Date:      Fri, 23 Mar 2001 10:56:24 +0100
From:      receiver@alize-sfl.com
To:        freebsd-smp@freebsd.org
Subject:   SMP / trap12 / heat problem.
Message-ID:  <20010323105624.C28104@pasteur.alize-sfl.com>

Hi all,

yesterday i bought an ASUS CUV4X-D with 2 PIII 800 & 4x256MB SDRAM.
(i've been waiting for an smp machine for 4 years ;-) ).

please note that i'm totally new to the SMP world, so please be indulgent :)

i've encountered different problems:

	* first, when everything is alright (temperature monitoring
	disabled in BIOS, dual-processor kernel),
	the system can eat up to 45% (avg 15-35%) of CPU when
	building world -j 4. do you think that's normal ?
	(/usr/src and /usr/obj are on the same 40MB/s SCSI
	 disk (seagate) on an AHA2940UW (not U2W))

	* second, when building world -j 4, i see some (not many;
	12 exactly) calcru errors, only with ld and as, not make or
	cc1. the build world didn't finish:
	i checked LINT,
	and sysctl'd kern.timecounter.method,
	then the console was *FLOODED* with 'microuptime went backward'
	messages. i switched to X to type more comfortably, but
	xconsole froze, then X froze, then the keyboard too, so i
	powered off and went to bed. ;-(

	* third, a bios problem : when all hardware monitors in BIOS are on,
	with a uniprocessor kernel, everything is fine.
	when booting my SMP kernel, the machine starts *beeping* near
	the 'waiting 5 seconds for scsi devices to settle' message.
	if i disable the CPU#0 temperature watch in BIOS, everything is fine.

	* independently (sorry, i don't know if this word exists), healthd:
		* does not find CPU temp / fan properties in ISA mode.
		* cannot find smb0, even though my kernel is compiled with
		support for it (it worked with the same options on my
		old ABIT P2something).
		* reports 6.86V for 5V, 14.xx for 12V, and 4.x for
		3.3VCORE (the bios reports 5.02, 12.8 and 3.01).

	* when i try AUTO_EOI_1 and AUTO_EOI_2 and NTIMECOUNTER=20 in my
	SMP kernel, nmbd (at boot) kills the kernel, which says :
		abort trap 12 : page fault while in kernel mode,
		and so on.
	i didn't have time this morning to test which of the 3 options
	is faultive (does this exist too ?).
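
	one way to narrow this down would be to rebuild the SMP kernel
	with only one of the three options enabled at a time. a rough
	sketch of the config lines for a first round (the one-at-a-time
	order and the comments are only an assumption about how to
	proceed, not something i have tested yet):

```
# round 1: only AUTO_EOI_1 enabled; rebuild, reboot, watch for trap 12
options 	AUTO_EOI_1
#options 	AUTO_EOI_2
#options 	NTIMECOUNTER=20
```

	then the same again with only AUTO_EOI_2, and then with only
	NTIMECOUNTER=20, noting which kernel panics when nmbd starts.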

I suspect my power supply is not good enough. It's a 300W, but not made for
SMP motherboards: it has no EAUX pin. i will try to change it at noon
(i've got two 250W ones to test, only to see if healthd reports better voltages).
I will try to swap the CPUs too (CPU#0 is at ~170K (75°C) and CPU#1 is
at ~100K (49°C), according to the bios hardware monitor). Generally, #0
is *much* hotter than #1.
	
I think i've said everything. All the hardware is brand new. if you want
my KERNEL file or my dmesg, i will send them in the afternoon (it's
currently 10:36AM in France, and my box is at home).

note that for the moment, i don't know how to debug the kernel and catch
the page fault message and all the magic kdb things... i have only
a little knowledge of that sort of thing. but i can learn (and i want to).

thanks for your comments and your help, keep up the good work, i love freebsd.

PS: does anyone know why at BIOS boot i get :

------------------------------
blahblah ACPI rev blah

CPU1: Intel blahblah
CPU2:*Intel blahblah
     ^
 this star ?
------------------------------
could this be a problem on CPU2 (#1 for us) ? or is the mb just happy to see
a second CPU ?

thanks again,

Olivier Cortes
Free Software Admin

PS2: i checked the mailing list archives (for -questions, -hackers, and -smp),
and have read smp(4), mptables(1), sync(*), fsync(*), syncer(*). i didn't find
anything relevant to my problems. overall, i rebooted ~10 times in 4 hours,
and i love vinum (fscking a 2x30 striped IDE storage is quick :) ).

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message