From: "Oliver Hartmann"
Date: Sat, 20 Nov 2004 22:59:18 +0100
Message-ID: <1C8D2CA2FBAD5F42B366A87718010B0C854312@EXCHANGE03.zdv.Uni-Mainz.DE>
cc: freebsd-stable@freebsd.org
Subject: FreeBSD 5.3-[RELEASE-p1|STABLE] SMP crashes
List-Id: Production branch of FreeBSD source code

Dear Sirs,

First, please do not reply to this address; your reply will never reach me. Please contact me at ohartman@web.de.
I cannot post to this newsgroup via web.de because several web.de hosts are excluded as spam sources.

As I have reported many times in the past, I still have massive problems with SMP enabled on a FreeBSD 5.3-RELEASE-p1 box and on a FreeBSD 5.3-STABLE box. The crash is always of the same type: I can watch the machine freeze, and in some lucky moments I am able to switch to the console before the box dies for good and see which error message comes up.

This machine has an ASUS CUR-DLS mainboard with the RCC ServerWorks chipset, version 3, for Pentium 3 CPUs. At the moment I use two Intel 1 GHz CPUs of the same stepping. Before this error report I used two 866 MHz CPUs of different steppings, but it seems to make no difference.

I also tried many kernel options, in particular switching off those that are supposed to be critical, and I ran a GENERIC kernel for a while, but it makes no difference. The crash occurs while using a graphical console: Xorg X11 (version 4.7.0, compiled from ports) with fvwm2 (development version; the crash also occurs with windowmaker, so the GUI does not seem to be the issue). I also tried to work around the problem by using the built-in fxp NIC instead of the 64-bit Intel Gigabit LAN adapter (em0), but the result is always the same.

I will append the output of mptable -verbose -dmesg for your information, and I will add the error message I receive.

Most of the time when the crash occurred I was putting a lot of graphical load on the machine (working on several TIFF files of 200 MB each, or with Mozilla/Firefox), but this may simply trigger the problem or make it appear sooner.

Sometimes I cannot get 'systat -vmstat 1' output; calling vmstat in systat results in 'Alternate system clock has died. Reverting to ''pigs'' ...'. This happens very often in SMP, but not in UP.
I will add that the UP system (SMP disabled via kern.smp.disable='1' in loader.conf) was up for nearly 13 days under the same conditions under which an SMP box crashes after several minutes to several hours.

This is the last console error I received:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 00
fault virtual address   = 0x1c
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc062ac76
stack pointer           = 0x10:0x4e2d7ac
frame pointer           = 0x10:0xe4e2d7c4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 44 (swi5: clock sio)
[thread 100042]
Stopped at vref +0x16: lock cmpxchgl %edx, 0x1c(%edx)

I am neither a technical expert nor a kernel programmer. I tried to find out which function contains that address using the recommended nm -n kernel | grep c062ac76, and it results in 'T vref'.

What is 'swi5: clock sio'? Is this problem hardware related? Why only in SMP?
Others do not seem to have problems with 5.3 and SMP, so maybe this is very specific to my setup because of the RCC-based mainboard I use (in the past I had a lot of problems with a TYAN 2500 board, also based on a ServerWorks chipset, in conjunction with FreeBSD 4/5).

This is my mptable output:

===============================================================================

MPTable, version 2.0.15

 looking for EBDA pointer @ 0x040e, found, searching EBDA @ 0x0009f000
 searching CMOS 'top of mem' @ 0x0009ec00 (635K)
 searching default 'top of mem' @ 0x0009fc00 (639K)
 searching BIOS @ 0x000f0000

 MP FPS found in BIOS @ physical addr: 0x000f5270

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f5270
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xe3
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f4e60
  signature:                    'PCMP'
  base table length:            276
  version:                      1.4
  checksum:                     0x0d
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  26
  local APIC address:           0xfee00000
  extended table length:        124
  extended table checksum:      198

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID  Version  State        Family  Model  Step  Flags
                3        0x11     BSP, usable  6       8      6     0x387fbff
                0        0x11     AP, usable   6       8      6     0x387fbff
--
Bus:            Bus ID  Type
                0       PCI
                1       PCI
                2       ISA
--
I/O APICs:      APIC ID  Version  State   Address
                2        0x11     usable  0xfec00000
                3        0x11     usable  0xfec01000
--
I/O Ints:       Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  conforms   conforms  2       0     2        0
                INT     conforms   conforms  2       1     2        1
                INT     conforms   conforms  2       0     2        2
                INT     conforms   conforms  2       3     2        3
                INT     conforms   conforms  2       4     2        4
                INT     conforms   conforms  2       6     2        6
                INT     conforms   conforms  2       7     2        7
                INT     conforms   conforms  2       8     2        8
                INT     conforms   conforms  2       12    2        12
                INT     conforms   conforms  2       13    2        13
                INT     conforms   conforms  2       14    2        14
                INT     conforms   conforms  2       15    2        15
                INT     active-lo  level     0       15:A  3        14
                INT     active-lo  level     2       9     2        9
                INT     active-lo  level     1       3:A   3        6
                INT     active-lo  level     1       5:A   3        8
                INT     active-lo  level     1       5:B   3        9
--
Local Ints:     Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  active-hi  edge      2       0     255      0
                NMI     active-hi  edge      2       0     255      1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
 bus ID: 0 address type: I/O address
 address base: 0x0
 address range: 0x10000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0x40000000
 address range: 0xbebe0000
--
System Address Space
 bus ID: 0 address type: prefetch address
 address base: 0xfebe0000
 address range: 0xe9420000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xe8000000
 address range: 0x18000000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xa0000
 address range: 0x20000
--
Bus Heirarchy
 bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000000
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000001

-------------------------------------------------------------------------------

dmesg output:

WARNING: /compat was not properly dismounted
WARNING: /homes was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /usr/data was not properly dismounted
WARNING: /usr/local was not properly dismounted
WARNING: /usr/obj was not properly dismounted
/usr/obj: mount pending error: blocks 21296 files 928
/usr/obj: superblock summary recomputed
WARNING: /usr/scratch was not properly dismounted
WARNING: /usr/src was not properly dismounted
WARNING: /var was not properly dismounted
pflog0: promiscuous mode enabled
em0: Link is up 100 Mbps Full Duplex
em0: promiscuous mode enabled
em0: promiscuous mode disabled

===============================================================================
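
The address lookup mentioned in the message (searching 'nm -n kernel' output for the faulting instruction pointer) can be sketched as a small script. This is only an illustration under stated assumptions, not part of the original report: it assumes 'nm -n' style lines of the form "address type name". The only values taken from the report are the instruction pointer 0xc062ac76 and the location 'vref +0x16', which together imply that vref starts at 0xc062ac60. The neighbouring symbols hypothetical_prev and hypothetical_next and their addresses are invented for the example.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n` style lines ("address type name") into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the last symbol at or below `address`.

    Returns None when the address lies below every known symbol.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, address - base


# Sample symbol table: only the vref line is derived from the report
# (0xc062ac76 - 0x16); the other two entries are hypothetical.
SAMPLE_NM = """\
c062ac00 T hypothetical_prev
c062ac60 T vref
c062aca0 T hypothetical_next
"""

symbols = parse_nm(SAMPLE_NM)
name, offset = resolve(symbols, 0xC062AC76)
print(f"{name}+{offset:#x}")  # prints "vref+0x16"
```

A plain grep for the exact address only matches when the address happens to be the start of a symbol, so the sketch searches for the nearest preceding symbol instead. Separately, the fault virtual address 0x1c equals the displacement in '0x1c(%edx)', which would be consistent with %edx holding zero at the time of the fault.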