From owner-freebsd-smp  Sun Mar 18 12:13:12 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from gratis.grondar.za (grouter.grondar.za [196.7.18.65])
	by hub.freebsd.org (Postfix) with ESMTP id 7D1EB37B718
	for <smp@freebsd.org>; Sun, 18 Mar 2001 12:13:04 -0800 (PST)
	(envelope-from mark@grondar.za)
Received: from grondar.za (root@gratis.grondar.za [196.7.18.133])
	by gratis.grondar.za (8.11.1/8.11.1) with ESMTP id f2IKCvf28535
	for <smp@freebsd.org>; Sun, 18 Mar 2001 22:12:59 +0200 (SAST)
	(envelope-from mark@grondar.za)
Message-Id: <200103182012.f2IKCvf28535@gratis.grondar.za>
To: smp@freebsd.org
Subject: Reliable KASSERT panic in CURRENT
Date: Sun, 18 Mar 2001 22:13:43 +0200
From: Mark Murray <mark@grondar.za>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi

I'm getting a reliable panic (KASSERT actually) in kern_mutex.c
line 215.

It happens when I ^Z in vi on an intel SMP box running (very) CURRENT.

M
-- 
Mark Murray
Warning: this .sig is umop ap!sdn

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message


From owner-freebsd-smp  Sun Mar 18 17: 7:48 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from moby.geekhouse.net (moby.geekhouse.net [64.81.6.36])
	by hub.freebsd.org (Postfix) with ESMTP id D077E37B718
	for <smp@FreeBSD.org>; Sun, 18 Mar 2001 17:07:45 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@dhcp152.geekhouse.net [192.168.1.152])
	by moby.geekhouse.net (8.11.0/8.9.3) with ESMTP id f2J19j195919;
	Sun, 18 Mar 2001 17:09:47 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010318170704.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200103182012.f2IKCvf28535@gratis.grondar.za>
Date: Sun, 18 Mar 2001 17:07:04 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Mark Murray <mark@grondar.za>
Subject: RE: Reliable KASSERT panic in CURRENT
Cc: smp@FreeBSD.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 18-Mar-01 Mark Murray wrote:
> Hi
> 
> I'm getting a reliable panic (KASSERT actually) in kern_mutex.c
> line 215.
> 
> It happens when I ^Z in vi on an intel SMP box running (very) CURRENT.

It's a bogus assertion now (like I said on IRC :-P).  We can actually run for a
very little bit while we are in SSTOP just before we go to sleep, so on an SMP
system during priority propagation we might hit a running process that's not in
SZOMB or SRUN.  You can either add SSTOP to the MPASS() there or just remove
the assertion entirely.  I'm leaning towards removing the assertion but don't
feel too strongly about it either way.
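
For illustration, the "add SSTOP" option might look roughly like the sketch
below.  It is not the actual code at kern_mutex.c line 215, which is not
quoted in this thread: struct proc_sketch, MPASS_SKETCH, the helper names and
the numeric state values are stand-ins made up for the example.  Only the
state names come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's MPASS(); here it simply wraps assert(). */
#define MPASS_SKETCH(ex) assert(ex)

/* State names as used in the thread; the numeric values are arbitrary. */
enum proc_state { SRUN = 1, SSLEEP, SSTOP, SZOMB };

/* Hypothetical, stripped-down stand-in for struct proc. */
struct proc_sketch {
	enum proc_state p_stat;
};

/* The check as described: a running process must be SRUN or SZOMB. */
static bool
running_state_ok_old(const struct proc_sketch *p)
{
	return (p->p_stat == SRUN || p->p_stat == SZOMB);
}

/*
 * The relaxed check: SSTOP is also accepted, since a stopping process
 * can still run briefly before it goes to sleep.
 */
static bool
running_state_ok_new(const struct proc_sketch *p)
{
	return (running_state_ok_old(p) || p->p_stat == SSTOP);
}

/* What a call site would do when it finds p running on another CPU. */
static void
check_running_proc(const struct proc_sketch *p)
{
	MPASS_SKETCH(running_state_ok_new(p));
}
```

With the old predicate a process caught in SSTOP trips the assertion; with
the new one it passes.  Removing the assertion entirely amounts to dropping
check_running_proc().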

> M
> -- 
> Mark Murray
> Warning: this .sig is umop ap!sdn

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Sun Mar 18 21:52: 5 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from gratis.grondar.za (grouter.grondar.za [196.7.18.65])
	by hub.freebsd.org (Postfix) with ESMTP
	id 53E5037B719; Sun, 18 Mar 2001 21:51:59 -0800 (PST)
	(envelope-from mark@grondar.za)
Received: from grondar.za (root@gratis.grondar.za [196.7.18.133])
	by gratis.grondar.za (8.11.1/8.11.1) with ESMTP id f2J5psf30575;
	Mon, 19 Mar 2001 07:51:54 +0200 (SAST)
	(envelope-from mark@grondar.za)
Message-Id: <200103190551.f2J5psf30575@gratis.grondar.za>
To: John Baldwin <jhb@FreeBSD.org>
Cc: smp@FreeBSD.org
Subject: Re: Reliable KASSERT panic in CURRENT 
References: <XFMail.010318170704.jhb@FreeBSD.org> 
In-Reply-To: <XFMail.010318170704.jhb@FreeBSD.org> ; from John Baldwin <jhb@FreeBSD.org>  "Sun, 18 Mar 2001 17:07:04 PST."
Date: Mon, 19 Mar 2001 07:52:37 +0200
From: Mark Murray <mark@grondar.za>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> > I'm getting a reliable panic (KASSERT actually) in kern_mutex.c line
> > 215.
> >
> > It happens when I ^Z in vi on an intel SMP box running (very)
> > CURRENT.
>
> It's a bogus assertion now (like I said on IRC :-P).

I kept missing you on IRC :-) :-(

>                                                       We can actually
> run for a very little bit while we are in SSTOP just before we go to
> sleep, so on an SMP system during priority propagation we might hit
> a running process that's not in SZOMB or SRUN.  You can either add
> SSTOP to the MPASS() there or just remove the assertion entirely.  I'm
> leaning towards removing the assertion but don't feel too strongly
> about it either way.

Cool! Is this commitworthy (if it works?)

M
-- 
Mark Murray
Warning: this .sig is umop ap!sdn



From owner-freebsd-smp  Mon Mar 19  9:30: 7 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from moby.geekhouse.net (moby.geekhouse.net [64.81.6.36])
	by hub.freebsd.org (Postfix) with ESMTP id DB66137B719
	for <smp@FreeBSD.org>; Mon, 19 Mar 2001 09:30:01 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@dhcp152.geekhouse.net [192.168.1.152])
	by moby.geekhouse.net (8.11.0/8.9.3) with ESMTP id f2JHW8199533;
	Mon, 19 Mar 2001 09:32:09 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010319092923.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200103190551.f2J5psf30575@gratis.grondar.za>
Date: Mon, 19 Mar 2001 09:29:23 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Mark Murray <mark@grondar.za>
Subject: Re: Reliable KASSERT panic in CURRENT
Cc: smp@FreeBSD.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 19-Mar-01 Mark Murray wrote:
>> > I'm getting a reliable panic (KASSERT actually) in kern_mutex.c line
>> > 215.
>> >
>> > It happens when I ^Z in vi on an intel SMP box running (very)
>> > CURRENT.
>>
>> It's a bogus assertion now (like I said on IRC :-P).
> 
> I kept missing you on IRC :-) :-(
> 
>>                                                       We can actually
>> run for a very little bit while we are in SSTOP just before we go to
>> sleep, so on an SMP system during priority propagation we might hit
>> a running process that's not in SZOMB or SRUN.  You can either add
>> SSTOP to the MPASS() there or just remove the assertion entirely.  I'm
>> leaning towards removing the assertion but don't feel too strongly
>> about it either way.
> 
> Cool! Is this commitworthy (if it works?)

Yes, and it will work. :)

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Wed Mar 21  2:38:32 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from unit11.support.nl (unit11.support.nl [195.114.229.252])
	by hub.freebsd.org (Postfix) with ESMTP
	id C99D637B719; Wed, 21 Mar 2001 02:38:19 -0800 (PST)
	(envelope-from marcel@support.nl)
Received: from localhost (marcel@localhost)
	by unit11.support.nl (8.9.3/8.9.3/Debian 8.9.3-21) with ESMTP id LAA15979;
	Wed, 21 Mar 2001 11:38:28 +0100
Date: Wed, 21 Mar 2001 11:38:28 +0100 (CET)
From: Marcel Lemmen <marcel@support.nl>
To: freebsd-scsi@freebsd.org, freebsd-smp@freebsd.org
Subject: High load with FreeBSD 4.2-REL
Message-ID: <Pine.LNX.4.21.0103211133260.13344-100000@unit11.support.nl>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hello,

I'm running FreeBSD 4.2-REL on a box dedicated for newsfeeding. The server
specs are PIII733/1024MB Ram/Adaptec 160/7x Seagate 18GB disks. The
problem is the load. The load averages are between 20-30!!! I've started a
discussion at the Diablo list, but they couldn't find anything strange.

The machine is working, currently at 30Mbit/s input and 80Mbit/s output.
That's a lot, but it shouldn't affect the load that much. On a previous
server (PIII450/256MB RAM/Adaptec 2940) the load was around 2, which should
be normal (that one was also a bandwidth eater).

I think the problem is the Adaptec 29160 (PCI64) adapter in combination
with FreeBSD 4.2-REL. Or a kernel option I've forgotten ;)

The server mainboard is a Micro with a ServerWorks chipset and 2
processor slots (only 1 used). I've disabled as much as possible in the
kernel, even the SMP options (should these be enabled?). I've also
enabled softupdates on the spool and set maxusers to 512 in the
kernel. An "iostat -d 1" looks good.

Below I've attached the dmesg and a top.

Please let me know if I've forgotten something or if you have any other
options!

Cheers,

Marcel Lemmen

 --------------------------------------------------------------
| Marcel Lemmen     | Support Net BV            |              |
| System Engineer   | beheer@support.nl         |         \|/  |
|                   |                           | ___.oO___|_  |
| Jobs@SupportNet   | http://jobs.supportnet.nl |              |
 --------------------------------------------------------------
     (It's a snowman in the desert next to a saguaro)


Top:
last pid:  3184;  load averages: 30.96, 31.26, 31.62    up 0+02:23:06  11:30:25
213 processes: 16 running, 197 sleeping
CPU states: 26.4% user,  0.0% nice, 24.8% system, 36.0% interrupt, 12.8% idle
Mem: 816M Active, 13M Inact, 134M Wired, 40M Cache, 112M Buf, 1820K Free
Swap: 512M Total, 17M Used, 495M Free, 3% Inuse

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  168 news      62   0  2652K  1100K RUN      6:00  2.69%  2.69% diablo
 3097 news      52   0  1188K   368K RUN      0:10  2.05%  2.05% dnewslink
 2876 news      55   0  1032K   360K RUN      0:27  1.95%  1.95% dnewslink
<snip>


DMESG:
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.2-RELEASE #12: Mon Mar 19 15:32:31 CET 2001
    support@news-x2.support.nl:/usr/src/sys/compile/NEWS-X2
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 732985260 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (732.99-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x683  Stepping = 3
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 1073676288 (1048512K bytes)
avail memory = 1042333696 (1017904K bytes)
Preloaded elf kernel "kernel" at 0xc02a1000.
ccd0-3: Concatenated disk drivers
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pci0: <ATI Mach64-GP graphics accelerator> at 2.0 irq 11
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0xd080-0xd0ff mem 0xfeafdf00-0xfeafdf7f irq 5 at device 3.0 on pci0
xl0: Ethernet address: 00:50:04:35:0a:49
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xdc00-0xdc7f mem 0xfeafdf80-0xfeafdfff irq 15 at device 4.0 on pci0
xl1: Ethernet address: 00:50:04:35:0a:39
miibus1: <MII bus> on xl1
xlphy1: <3Com internal media interface> on miibus1
xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: <unknown card> (vendor=0x8086, dev=0x1229) at 6.0 irq 9
isab0: <ServerWorks IB6566 PCI to ISA bridge> at device 15.0 on pci0
isa0: <ISA bus> on isab0
pci0: <Unknown PCI ATA controller> at 15.1
pci0: <OHCI USB controller> at 15.2 irq 0
pcib1: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
pci1: <PCI bus> on pcib1
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 1.0 on pci1
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
Waiting 8 seconds for SCSI devices to settle
Mounting root from ufs:/dev/da0s1a
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da4 at ahc0 bus 0 target 4 lun 0
da4: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da4: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da4: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da3 at ahc0 bus 0 target 3 lun 0
da3: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da3: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da3: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at ahc0 bus 0 target 2 lun 0
da2: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da1: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da6 at ahc0 bus 0 target 6 lun 0
da6: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da6: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da6: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da5 at ahc0 bus 0 target 5 lun 0
da5: <SEAGATE ST318404LW 0006> Fixed Direct Access SCSI-3 device
da5: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da5: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)




From owner-freebsd-smp  Wed Mar 21  9:33:43 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id E254A37B719; Wed, 21 Mar 2001 09:33:39 -0800 (PST)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id f2LHV2226396;
	Wed, 21 Mar 2001 09:31:02 -0800 (PST)
Date: Wed, 21 Mar 2001 09:31:02 -0800
From: Alfred Perlstein <bright@wintelcom.net>
To: Marcel Lemmen <marcel@support.nl>
Cc: freebsd-scsi@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: High load with FreeBSD 4.2-REL
Message-ID: <20010321093102.C12319@fw.wintelcom.net>
References: <Pine.LNX.4.21.0103211133260.13344-100000@unit11.support.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.LNX.4.21.0103211133260.13344-100000@unit11.support.nl>; from marcel@support.nl on Wed, Mar 21, 2001 at 11:38:28AM +0100
X-all-your-base: are belong to us.
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Marcel Lemmen <marcel@support.nl> [010321 02:38] wrote:
> Hello,
> 
> I'm running FreeBSD 4.2-REL on a box dedicated for newsfeeding. The server
> specs are PIII733/1024MB Ram/Adaptec 160/7x Seagate 18GB disks. The
> problem is the load. The load averages are between 20-30!!! I've started a
> discussion at the Diablo list, but they couldn't find anything strange.

A high load average is fine, come back when you have a problem other
than disliking a number. :)

-Alfred



From owner-freebsd-smp  Thu Mar 22  7:24: 4 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from hand.dotat.at (inch.demon.co.uk [194.222.223.128])
	by hub.freebsd.org (Postfix) with ESMTP id 33E0E37B71D
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 07:24:01 -0800 (PST)
	(envelope-from fanf@dotat.at)
Received: from fanf by hand.dotat.at with local (Exim 3.20 #3)
	id 14g6vW-00077j-00
	for freebsd-smp@freebsd.org; Thu, 22 Mar 2001 15:23:02 +0000
From: Tony Finch <dot@dotat.at>
To: freebsd-smp@freebsd.org
Subject: Locked data-structures and delayed writes.
Message-Id: <E14g6vW-00077j-00@hand.dotat.at>
Date: Thu, 22 Mar 2001 15:23:02 +0000
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


I've been having an interesting discussion elsewhere with someone
about the problems caused by delayed writes within the CPU. He's of
the general opinion that everything is broken and can be very
enlightening when explaining why he thinks this but he can also be
frustratingly vague.

Anyway, the question at hand is what happens if two threads on
different CPUs are accessing the same locked data structure when the
CPU delays writes to RAM, i.e.

	acquire_lock(s);
	modify(s);
	release_lock(s);

Things are very broken if the write can be delayed until after the
lock is released. What prevents that?

A related question, but perhaps more implausible, is what happens if a
page is unmapped from underneath a delayed write. This is particularly
pathological if the destination page was mmapped and the program is
exiting: the write may be lost.

Tony.
-- 
f.a.n.finch      fanf@covalent.net      dot@dotat.at




From owner-freebsd-smp  Thu Mar 22  8:43:49 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from prism.flugsvamp.com (cb58709-a.mdsn1.wi.home.com [24.17.241.9])
	by hub.freebsd.org (Postfix) with ESMTP id 5DC0237B71B
	for <smp@freebsd.org>; Thu, 22 Mar 2001 08:43:46 -0800 (PST)
	(envelope-from jlemon@flugsvamp.com)
Received: (from jlemon@localhost)
	by prism.flugsvamp.com (8.11.0/8.11.0) id f2MGdgW18437;
	Thu, 22 Mar 2001 10:39:42 -0600 (CST)
	(envelope-from jlemon)
Date: Thu, 22 Mar 2001 10:39:42 -0600 (CST)
From: Jonathan Lemon <jlemon@flugsvamp.com>
Message-Id: <200103221639.f2MGdgW18437@prism.flugsvamp.com>
To: dot@dotat.at, smp@freebsd.org
Subject: Re: Locked data-structures and delayed writes.
X-Newsgroups: local.mail.freebsd-smp
In-Reply-To: <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at>
Organization: 
Cc: 
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

In article <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at> you write:
>
>I've been having an interesting discussion elsewhere with someone
>about the problems caused by delayed writes within the CPU. He's of
>the general opinion that everything is broken and can be very
>enlightening when explaining why he thinks this but he can also be
>frustratingly vague.
>
>Anyway, the question at hand is what happens if two threads on
>different CPUs are accessing the same locked data structure when the
>CPU delays writes to RAM, i.e.
>
>	acquire_lock(s);
>	modify(s);
>	release_lock(s);
>
>Things are very broken if the write can be delayed until after the
>lock is released. What prevents that?

Uhm.  Why would things be broken simply because the write is delayed?
This isn't a trick answer; if you don't subsequently read the location,
then why would it matter if the write is delayed?   (modulo writes to
device memory; in these cases, you probably want to mark the writes as
uncacheable)

Now, if your modify involves doing a read of the location, then that 
is a different question.

The ia-32 architecture uses a strong cache coherence model, so that
writes to the same location appear to be seen in the same order by all
processors (write serialization).  So even if the memory write in 
modify() above is delayed until after the release_lock() call, any
reads from that location will return the new data.

On other architectures, you may need to explicitly manage the cache
coherence yourself.


>A related question, but perhaps more implausible, is what happens if a
>page is unmapped from underneath a delayed write. This is particularly
>pathological if the destination page was mmapped and the program is
>exiting: the write may be lost.

On the alpha, the write buffers are physically addressed, so if
the virtual address mapping is removed, it will not affect the 
delayed write, since the write does not require the page tables.
I'm not sure what Intel does; I would guess probably the same thing.

I would assume that if some architecture uses virtually addressed
blocks in a write buffer (!!), then part of the task of a TLB flush
would be to complete the delayed write before removing the mapping.
--
Jonathan



From owner-freebsd-smp  Thu Mar 22  9:29:50 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id DB49137B71C
	for <freebsd-smp@FreeBSD.ORG>; Thu, 22 Mar 2001 09:29:47 -0800 (PST)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id f2MHTkw05202;
	Thu, 22 Mar 2001 09:29:46 -0800 (PST)
Date: Thu, 22 Mar 2001 09:29:46 -0800
From: Alfred Perlstein <bright@wintelcom.net>
To: Tony Finch <dot@dotat.at>
Cc: freebsd-smp@FreeBSD.ORG
Subject: Re: Locked data-structures and delayed writes.
Message-ID: <20010322092946.M9431@fw.wintelcom.net>
References: <E14g6vW-00077j-00@hand.dotat.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <E14g6vW-00077j-00@hand.dotat.at>; from dot@dotat.at on Thu, Mar 22, 2001 at 03:23:02PM +0000
X-all-your-base: are belong to us.
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Tony Finch <dot@dotat.at> [010322 07:24] wrote:
> 
> I've been having an interesting discussion elsewhere with someone
> about the problems caused by delayed writes within the CPU. He's of
> the general opinion that everything is broken and can be very
> enlightening when explaining why he thinks this but he can also be
> frustratingly vague.

Ah, the glass is not only half empty, but it will most likely
shatter and slice your lips off kinda guy...  Yes, I know a
couple of people like that. :)

> Anyway, the question at hand is what happens if two threads on
> different CPUs are accessing the same locked data structure when the
> CPU delays writes to RAM, i.e.
> 
> 	acquire_lock(s);
> 	modify(s);
> 	release_lock(s);
> 
> Things are very broken if the write can be delayed until after the
> lock is released. What prevents that?

Usually one of two things:

1) any locked op forces the CPU to flush all writes before it completes
2) there are explicit write/read barrier opcodes that people who design
   lock primitives are expected to use.
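
Both options can be sketched in portable C11; this is an illustration under
stated assumptions, not how any particular kernel implements its locks.
struct spin_sketch and the function names are invented for the example, and
<stdatomic.h> stands in for whatever opcodes the architecture provides.  The
acquire side relies on the atomic read-modify-write carrying the ordering
(option 1); the release side issues an explicit fence before a relaxed clear
(option 2).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct spin_sketch {
	atomic_flag held;	/* set while some thread holds the lock */
	int data;		/* the state the lock protects */
};

/* Option 1: the atomic test-and-set itself carries acquire ordering. */
static bool
spin_try_acquire(struct spin_sketch *s)
{
	return (!atomic_flag_test_and_set_explicit(&s->held,
	    memory_order_acquire));
}

static void
spin_acquire(struct spin_sketch *s)
{
	while (!spin_try_acquire(s))
		;	/* spin until the holder clears the flag */
}

/* Option 2: an explicit barrier orders earlier writes before the clear. */
static void
spin_release(struct spin_sketch *s)
{
	atomic_thread_fence(memory_order_release);
	atomic_flag_clear_explicit(&s->held, memory_order_relaxed);
}

/* acquire_lock(s); modify(s); release_lock(s); as in the question. */
static int
locked_add(struct spin_sketch *s, int n)
{
	int result;

	spin_acquire(s);
	/* Sanity check: while we hold the lock, a second attempt must fail. */
	assert(!spin_try_acquire(s));
	s->data += n;
	result = s->data;
	spin_release(s);
	return (result);
}
```

A lock starts out free when initialised as
`struct spin_sketch s = { ATOMIC_FLAG_INIT, 0 };`.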

> A related question, but perhaps more implausible, is what happens if a
> page is unmapped from underneath a delayed write. This is particularly
> pathological if the destination page was mmapped and the program is
> exiting: the write may be lost.

Yes, this is why there's such a thing as a TLB shootdown, which
I think requires an IPI (interprocessor interrupt) to notify the other
processors that the system page tables are being changed.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.



From owner-freebsd-smp  Thu Mar 22  9:32: 2 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id A23AA37B718
	for <smp@FreeBSD.ORG>; Thu, 22 Mar 2001 09:32:00 -0800 (PST)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id f2MHVwF05355;
	Thu, 22 Mar 2001 09:31:58 -0800 (PST)
Date: Thu, 22 Mar 2001 09:31:58 -0800
From: Alfred Perlstein <bright@wintelcom.net>
To: Jonathan Lemon <jlemon@flugsvamp.com>
Cc: dot@dotat.at, smp@FreeBSD.ORG
Subject: Re: Locked data-structures and delayed writes.
Message-ID: <20010322093158.N9431@fw.wintelcom.net>
References: <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at> <200103221639.f2MGdgW18437@prism.flugsvamp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200103221639.f2MGdgW18437@prism.flugsvamp.com>; from jlemon@flugsvamp.com on Thu, Mar 22, 2001 at 10:39:42AM -0600
X-all-your-base: are belong to us.
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Jonathan Lemon <jlemon@flugsvamp.com> [010322 08:44] wrote:
> In article <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at> you write:
> >
> >I've been having an interesting discussion elsewhere with someone
> >about the problems caused by delayed writes within the CPU. He's of
> >the general opinion that everything is broken and can be very
> >enlightening when explaining why he thinks this but he can also be
> >frustratingly vague.
> >
> >Anyway, the question at hand is what happens if two threads on
> >different CPUs are accessing the same locked data structure when the
> >CPU delays writes to RAM, i.e.
> >
> >	acquire_lock(s);
> >	modify(s);
> >	release_lock(s);
> >
> >Things are very broken if the write can be delayed until after the
> >lock is released. What prevents that?
> 
> Uhm.  Why would things be broken simply because the write is delayed?
> This isn't a trick answer; if you don't subsequently read the location,
> then why would it matter if the write is delayed?   (modulo writes to
> device memory; in these cases, you probably want to mark the writes as
> uncacheable)
> 
> Now, if your modify involves doing a read of the location, then that 
> is a different question.

Heh, what if your write to 's' happens after lock release and without
it 's' is not consistent?  You need a write barrier.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Represent yourself, show up at BABUG http://www.babug.org/



From owner-freebsd-smp  Thu Mar 22  9:41:54 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from prism.flugsvamp.com (cb58709-a.mdsn1.wi.home.com [24.17.241.9])
	by hub.freebsd.org (Postfix) with ESMTP id 7F20937B71A
	for <smp@FreeBSD.ORG>; Thu, 22 Mar 2001 09:41:51 -0800 (PST)
	(envelope-from jlemon@flugsvamp.com)
Received: (from jlemon@localhost)
	by prism.flugsvamp.com (8.11.0/8.11.0) id f2MHbiG20456;
	Thu, 22 Mar 2001 11:37:44 -0600 (CST)
	(envelope-from jlemon)
Date: Thu, 22 Mar 2001 11:37:44 -0600
From: Jonathan Lemon <jlemon@flugsvamp.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: Jonathan Lemon <jlemon@flugsvamp.com>, dot@dotat.at,
	smp@FreeBSD.ORG
Subject: Re: Locked data-structures and delayed writes.
Message-ID: <20010322113744.T82645@prism.flugsvamp.com>
References: <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at> <200103221639.f2MGdgW18437@prism.flugsvamp.com> <20010322093158.N9431@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0pre2i
In-Reply-To: <20010322093158.N9431@fw.wintelcom.net>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, Mar 22, 2001 at 09:31:58AM -0800, Alfred Perlstein wrote:
> * Jonathan Lemon <jlemon@flugsvamp.com> [010322 08:44] wrote:
> > In article <local.mail.freebsd-smp/E14g6vW-00077j-00@hand.dotat.at> you write:
> > >
> > >I've been having an interesting discussion elsewhere with someone
> > >about the problems caused by delayed writes within the CPU. He's of
> > >the general opinion that everything is broken and can be very
> > >enlightening when explaining why he thinks this but he can also be
> > >frustratingly vague.
> > >
> > >Anyway, the question at hand is what happens if two threads on
> > >different CPUs are accessing the same locked data structure when the
> > >CPU delays writes to RAM, i.e.
> > >
> > >	acquire_lock(s);
> > >	modify(s);
> > >	release_lock(s);
> > >
> > >Things are very broken if the write can be delayed until after the
> > >lock is released. What prevents that?
> > 
> > Uhm.  Why would things be broken simply because the write is delayed?
> > This isn't a trick answer; if you don't subsequently read the location,
> > then why would it matter if the write is delayed?   (modulo writes to
> > device memory; in these cases, you probably want to mark the writes as
> > uncacheable)
> > 
> > Now, if your modify involves doing a read of the location, then that 
> > is a different question.
> 
> Heh, what if your write to 's' happens after lock release and without
> it 's' is not consistent?  You need a write barrier.

Well, cache coherency and memory ordering are two different things.
If the architecture has a relaxed memory ordering (say, release consistency)
then you will need a write barrier to enforce ordering of the lock
with respect to modify().  But that is a different question from just
a delayed write.
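
As a rough illustration of the ordering half of that distinction, here is a
minimal sketch in C11 with POSIX threads.  The names payload, ready, producer
and run_handoff are invented for the example.  The write to payload may sit in
a buffer for as long as the hardware likes; what matters is that the release
store to ready cannot become visible before it, and that the acquire load
cannot be passed by the later read of payload.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;		/* plain data, published via the flag */
static atomic_int ready;	/* 0 = not yet published, 1 = published */

static void *
producer(void *arg)
{
	payload = *(int *)arg;	/* ordinary store, may be delayed */
	/* Release: the payload store is ordered before the flag store. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return (NULL);
}

/* Start a producer, wait for the flag, and return the payload we saw. */
static int
run_handoff(int value)
{
	pthread_t t;
	int rc, seen;

	payload = 0;
	atomic_store(&ready, 0);
	rc = pthread_create(&t, NULL, producer, &value);
	assert(rc == 0);

	/* Acquire: the payload read below cannot move above this load. */
	while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		;
	seen = payload;

	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return (seen);
}
```

If both orderings were weakened to memory_order_relaxed, the language would no
longer guarantee that the consumer sees the new payload, even on hardware with
coherent caches.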
--
Jonathan



From owner-freebsd-smp  Thu Mar 22 10: 4:16 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from gratis.grondar.za (grouter.grondar.za [196.7.18.65])
	by hub.freebsd.org (Postfix) with ESMTP id BE7FB37B71F
	for <freebsd-smp@FreeBSD.ORG>; Thu, 22 Mar 2001 10:04:11 -0800 (PST)
	(envelope-from mark@grondar.za)
Received: from grondar.za (root@gratis.grondar.za [196.7.18.133])
	by gratis.grondar.za (8.11.1/8.11.1) with ESMTP id f2MI3tf50897;
	Thu, 22 Mar 2001 20:03:55 +0200 (SAST)
	(envelope-from mark@grondar.za)
Message-Id: <200103221803.f2MI3tf50897@gratis.grondar.za>
To: Tony Finch <dot@dotat.at>
Cc: freebsd-smp@FreeBSD.ORG
Subject: Re: Locked data-structures and delayed writes. 
References: <E14g6vW-00077j-00@hand.dotat.at> 
In-Reply-To: <E14g6vW-00077j-00@hand.dotat.at> ; from Tony Finch <dot@dotat.at>  "Thu, 22 Mar 2001 15:23:02 GMT."
Date: Thu, 22 Mar 2001 20:05:04 +0200
From: Mark Murray <mark@grondar.za>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> Anyway, the question at hand is what happens if two threads on
> different CPUs are accessing the same locked data structure when the
> CPU delays writes to RAM, i.e.
> 
> 	acquire_lock(s);
> 	modify(s);
> 	release_lock(s);
> 
> Things are very broken if the write can be delayed until after the
> lock is released. What prevents that?

"man atomic", and look at the "acquire" and "release" memory barriers.

M
-- 
Mark Murray
Warning: this .sig is umop ap!sdn



From owner-freebsd-smp  Thu Mar 22 11: 1:29 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from meow.osd.bsdi.com (meow.osd.bsdi.com [204.216.28.88])
	by hub.freebsd.org (Postfix) with ESMTP id 627EC37B720
	for <freebsd-smp@FreeBSD.org>; Thu, 22 Mar 2001 11:01:26 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@jhb-laptop.osd.bsdi.com [204.216.28.241])
	by meow.osd.bsdi.com (8.11.2/8.11.2) with ESMTP id f2MIxsG80547;
	Thu, 22 Mar 2001 10:59:54 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010322105942.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <E14g6vW-00077j-00@hand.dotat.at>
Date: Thu, 22 Mar 2001 10:59:42 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Tony Finch <dot@dotat.at>
Subject: RE: Locked data-structures and delayed writes.
Cc: freebsd-smp@FreeBSD.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 22-Mar-01 Tony Finch wrote:
> 
> I've been having an interesting discussion elsewhere with someone
> about the problems caused by delayed writes within the CPU. He's of
> the general opinion that everything is broken and can be very
> enlightening when explaining why he thinks this but he can also be
> frustratingly vague.
> 
> Anyway, the question at hand is what happens if two threads on
> different CPUs are accessing the same locked data structure when the
> CPU delays writes to RAM, i.e.
> 
>       acquire_lock(s);
>       modify(s);
>       release_lock(s);
> 
> Things are very broken if the write can be delayed until after the
> lock is released. What prevents that?

Memory barriers.  When we acquire a lock, we enforce a memory barrier to ensure
that the data accesses needed to actually obtain the lock complete before we
perform any 'sensitive' operations.  Secondly, we use another memory barrier
during the release to ensure that all 'sensitive' operations are finished
before the lock is released.
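
As a rough illustration of the acquire/release pairing, here is a minimal
sketch using C11 atomics.  It is not the kernel's mutex code: the names
sketch_lock, sketch_acquire, sketch_release and the counter structure are
made up for this example, and a real kernel would use its own atomic
primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration only; not the kernel's mutex. */
struct sketch_lock {
	atomic_flag held;
};

/* Hypothetical data structure protected by the lock. */
struct counter {
	struct sketch_lock lock;
	int value;		/* protected by lock */
};

void
counter_init(struct counter *c)
{
	atomic_flag_clear(&c->lock.held);
	c->value = 0;
}

void
sketch_acquire(struct sketch_lock *l)
{
	/*
	 * Acquire ordering: loads and stores that follow in program
	 * order cannot be moved ahead of the successful test-and-set.
	 */
	while (atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

void
sketch_release(struct sketch_lock *l)
{
	/*
	 * Release ordering: every store made while holding the lock is
	 * visible to the next acquirer before the flag reads as clear.
	 */
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

int
counter_increment(struct counter *c)
{
	int v;

	sketch_acquire(&c->lock);
	v = ++c->value;		/* the 'sensitive' operation */
	sketch_release(&c->lock);
	return (v);
}
```

With this pairing, the store to c->value cannot become visible after the
lock is seen as free, which is the situation the question worries about.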

> A related question, but perhaps more implausible, is what happens if a
> page is unmapped from underneath a delayed write. This is particularly
> pathological if the destination page was mmapped and the program is
> exiting: the write may be lost.

If the program doesn't need the data, who cares if it is lost?  If the data is
not program specific (e.g. kernel data structures) then the page won't be
unmapped. :)  However, it is actually a concern to make sure that if the data
is still in the cache, it doesn't get written out later when some other program
is using this page.  This can be handled in various ways depending on what
cache architecture is being used.  For an excellent treatment of this topic,
see "Unix Systems for Modern Architectures: Symmetric Multiprocessing and
Caching for Kernel Programmers" by Curt Schimmel ISBN 0-201-63338-8.

> Tony.
> -- 
> f.a.n.finch      fanf@covalent.net      dot@dotat.at

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Thu Mar 22 12: 2:14 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from linuxpower.p00t.net (mke-24-167-255-186.wi.rr.com [24.167.255.186])
	by hub.freebsd.org (Postfix) with ESMTP id E080837B719
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 12:02:07 -0800 (PST)
	(envelope-from tduffey@wi.rr.com)
Received: from localhost (trout@localhost)
	by linuxpower.p00t.net (8.11.3/8.11.3) with ESMTP id f2MK27F11149
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 14:02:07 -0600
Date: Thu, 22 Mar 2001 14:02:07 -0600 (CST)
From: Tom Duffey <tduffey@wi.rr.com>
To: freebsd-smp@freebsd.org
Subject: IBM Netfinity 3500 SMP problem
Message-ID: <Pine.LNX.4.21.0103221357420.11141-100000@linuxpower.p00t.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Dear Fellow FreeBSD Users and Developers,

I'm attempting to make FreeBSD 4.2-RELEASE run with SMP support on the
aforementioned Netfinity machine.  The kernel boots fine w/o SMP, but
hangs somewhere after "Waiting 15 seconds for SCSI devices to settle
down" appears.  I've checked the archives and see many mentions of similar
trouble but no solutions or conclusions.  Any help is appreciated.

The kernel works w/o SMP and I'm using the defaults when attempting to
boot with SMP.  Here is the mptable output:

===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     EBDA
  physical address:             0x0009e1d0
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xd6
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x0009e1e0
  signature:                    'PCMP'
  base table length:            244
  version:                      1.4
  checksum:                     0xef
  OEM ID:                       'IBM ENSW'
  Product ID:                   'NF 6000R SMP'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  22
  local APIC address:           0xfee00000
  extended table length:        168
  extended table checksum:      117

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID Version State           Family  Model   Step
Flags
                 3       0x11    BSP, usable     6       8       3
0x0301
                 0       0x11    AP, usable      6       8       3
0x0301
--
Bus:            Bus ID  Type
                 0       PCI
                 1       PCI
                 2       ISA
--
I/O APICs:      APIC ID Version State           Address
                14       0x11    usable          0xfec00000
                13       0x11    usable          0xfec01000
--
I/O Ints:       Type    Polarity    Trigger     Bus ID   IRQ    APIC ID
PIN#
                INT      conforms    conforms        2     1         14
1
                INT      conforms    conforms        2     0         14
2
                INT      conforms    conforms        2     3         14
3
                INT      conforms    conforms        2     4         14
4
                INT      conforms    conforms        2     6         14
6
                INT      conforms    conforms        2     7         14
7
                INT      conforms    conforms        2     8         14
8
                INT      conforms    conforms        2    12         14
12
                INT      conforms    conforms        2    13         14
13
                INT      conforms    conforms        2    14         14
14
                INT      conforms    conforms        0   2:A         13
11
                INT      conforms    conforms        0  15:A         14
10
                INT      conforms    conforms        1   3:A         13
12
--
Local Ints:     Type    Polarity    Trigger     Bus ID   IRQ    APIC ID
PIN#
                NMI      conforms    conforms        2     0        255
1
                ExtINT   conforms    conforms        2     0        255
0

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xa0000
 address range: 0x20000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xd0000
 address range: 0x10000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xfd000000
 address range: 0x3000000
--
System Address Space
 bus ID: 0 address type: prefetch address
 address base: 0xf0000000
 address range: 0xd000000
--
System Address Space
 bus ID: 1 address type: memory address
 address base: 0xee000000
 address range: 0x2000000
--
System Address Space
 bus ID: 1 address type: prefetch address
 address base: 0x8000000
 address range: 0xe6000000
--
System Address Space
 bus ID: 0 address type: I/O address
 address base: 0x0
 address range: 0x2040
--
System Address Space
 bus ID: 1 address type: I/O address
 address base: 0x2040
 address range: 0xdfc0
--
Bus Heirarchy
 bus ID: 2 bus info: 0x01 parent bus ID: 0

-------------------------------------------------------------------------------

# SMP kernel config file options:


# Required:
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O

# Optional (built-in defaults will work in most cases):
#options                NCPU=2                  # number of CPUs
#options                NBUS=3                  # number of busses
#options                NAPIC=2                 # number of IO APICs
#options                NINTR=24                # number of INTs

===============================================================================

Best Regards,

Tom Duffey




From owner-freebsd-smp  Thu Mar 22 12:55:31 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from meow.osd.bsdi.com (meow.osd.bsdi.com [204.216.28.88])
	by hub.freebsd.org (Postfix) with ESMTP id 6CFB237B719
	for <smp@FreeBSD.org>; Thu, 22 Mar 2001 12:55:29 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@jhb-laptop.osd.bsdi.com [204.216.28.241])
	by meow.osd.bsdi.com (8.11.2/8.11.2) with ESMTP id f2MKtBG84070
	for <smp@FreeBSD.org>; Thu, 22 Mar 2001 12:55:12 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010322125500.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Date: Thu, 22 Mar 2001 12:55:00 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: smp@FreeBSD.org
Subject: SMPng Status Report
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Until such time as a new SMPng Project Manager is appointed or whatever, I
figured that someone needs to at least send out an occasional status report, so
here goes:

- Bosko Milekic is presently changing the msleep()/wakeup() code in the mbuf
  subsystem to make use of condition variables instead.  He is also looking
  into some optimizations in the mbuf subsystem in terms of mutex locks in
  conjunction with some potential internal changes along with Alfred Perlstein.
- I have just finished overhauling the witness code to not be mutex specific,
  but to instead use abstract lock objects.  Each lock object has a lock class
  that specifies properties of all locks of a certain type.  Individual lock
  objects also have additional properties that can override and/or add to the
  class properties.  I haven't updated the sx locks yet, but that should be
  a 15 minute job.  Once this is done sx locks can safely be used throughout
  the system.  The first ones in widespread use will replace the lockmgr locks
  currently backing the allproc and proctree locks.  I've also implemented a
  small critical_enter/exit API that will be used to replace the
  restore/save_intr() functions that came in with the original SMPng commit.
  With this, disable/enable_intr() will go back to being trivial one instruction
  functions that are i386 and ia64 specific.
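
To make the class/object split a little more concrete, here is a small sketch
of how per-class properties and per-object overrides could be combined.  This
is only an assumption about the general shape: the sk_ names, the two flag
bits, and the set/clear representation are hypothetical and are not taken
from the witness code.

```c
/* Hypothetical property bits, for illustration only. */
#define	SK_SLEEPABLE	0x1u	/* holder may sleep */
#define	SK_RECURSABLE	0x2u	/* lock may be acquired recursively */

/* Properties shared by every lock of one type (e.g. all sx locks). */
struct sk_lock_class {
	const char	*name;
	unsigned	 flags;
};

/* One individual lock, with its own additions and overrides. */
struct sk_lock_object {
	const char			*name;
	const struct sk_lock_class	*cls;
	unsigned			 set_flags;	/* added to class flags */
	unsigned			 clear_flags;	/* removed from class flags */
};

/* Class properties first, then the object's additions and overrides. */
unsigned
sk_effective_flags(const struct sk_lock_object *lo)
{
	return ((lo->cls->flags | lo->set_flags) & ~lo->clear_flags);
}
```

A checker written against sk_effective_flags() never needs to know whether a
given lock is a mutex or an sx lock, which is the point of making the code
lock-class agnostic.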

The todo list still resides at its old location:
http://www.FreeBSD.org/~jasone/smp/

Some of the notable items on the todo list for those who would like to help out
but are not sure where to start include removing the syscall MP safe flag in
favor of explicit mtx_lock/unlock's of Giant for all syscalls and removing
nested includes of <sys/mutex.h> in other kernel headers.  Some of the big
projects that should be coming very soon include:

- Removing or at least ignoring all the priorities passed in to msleep/tsleep
  now that priority propagation works.
- Convert lockmgr locks over to using mutexes and sx locks.

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Thu Mar 22 17: 4: 0 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from linuxpower.p00t.net (mke-24-167-255-186.wi.rr.com [24.167.255.186])
	by hub.freebsd.org (Postfix) with ESMTP id B951837B71D
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 17:03:55 -0800 (PST)
	(envelope-from tduffey@wi.rr.com)
Received: from localhost (trout@localhost)
	by linuxpower.p00t.net (8.11.3/8.11.3) with ESMTP id f2N13sK11477
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 19:03:54 -0600
Date: Thu, 22 Mar 2001 19:03:54 -0600 (CST)
From: Tom Duffey <tduffey@wi.rr.com>
To: freebsd-smp@freebsd.org
Subject: More IBM Netfinity 3500 SMP problem
In-Reply-To: <Pine.LNX.4.21.0103221357420.11141-100000@linuxpower.p00t.net>
Message-ID: <Pine.LNX.4.21.0103221900050.11475-100000@linuxpower.p00t.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

The output of mptable shows that this system has three busses, but FreeBSD
defaults to 4.  So, I attempted to recompile a kernel using more specific
SMP options, namely:

options		SMP
options		APIC_IOB
options		NBUS=3

But 'config' complains:

smp:61: unknown option "NBUS".

Are the optional SMP parameters no longer available?  Does it
matter?  Please let me know if there's anything I can do to help make
FreeBSD's SMP support work with this hardware.

Best Regards,

Tom Duffey




From owner-freebsd-smp  Thu Mar 22 17:23:47 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from femail12.sdc1.sfba.home.com (femail12.sdc1.sfba.home.com [24.0.95.108])
	by hub.freebsd.org (Postfix) with ESMTP id 1F7AF37B71F
	for <freebsd-smp@FreeBSD.ORG>; Thu, 22 Mar 2001 17:23:42 -0800 (PST)
	(envelope-from jgowdy@home.com)
Received: from cx443070b ([24.0.36.170]) by femail12.sdc1.sfba.home.com
          (InterMail vM.4.01.03.20 201-229-121-120-20010223) with SMTP
          id <20010323012341.HEMK7377.femail12.sdc1.sfba.home.com@cx443070b>;
          Thu, 22 Mar 2001 17:23:41 -0800
Message-ID: <000a01c0b338$4bc3d680$aa240018@cx443070b>
From: "Jeremiah Gowdy" <jgowdy@home.com>
To: "Tom Duffey" <tduffey@wi.rr.com>, <freebsd-smp@FreeBSD.ORG>
References: <Pine.LNX.4.21.0103221900050.11475-100000@linuxpower.p00t.net>
Subject: Re: More IBM Netfinity 3500 SMP problem
Date: Thu, 22 Mar 2001 17:26:32 -0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


----- Original Message -----
From: "Tom Duffey" <tduffey@wi.rr.com>
To: <freebsd-smp@FreeBSD.ORG>
Sent: Thursday, March 22, 2001 5:03 PM
Subject: More IBM Netfinity 3500 SMP problem


> The output of mptable shows that this system has three busses, but FreeBSD
> defaults to 4.  So, I attempted to recompile a kernel using more specific
> SMP options, namely:
>
> options SMP
> options APIC_IOB
> options NBUS=3
>
> But 'config' complains:
>
> smp:61: unknown option "NBUS".
>
> Are the optional SMP parameters no longer available?  Does it
> matter?  Please let me know if there's anything I can do to help make
> FreeBSD's SMP support work with this hardware.

You shouldn't need those options.  I use the 3500 with Dual PIII 500s, and
it works fine, just options SMP




From owner-freebsd-smp  Thu Mar 22 17:36:45 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from meow.osd.bsdi.com (meow.osd.bsdi.com [204.216.28.88])
	by hub.freebsd.org (Postfix) with ESMTP id 6438437B71E
	for <freebsd-smp@FreeBSD.org>; Thu, 22 Mar 2001 17:36:42 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@jhb-laptop.osd.bsdi.com [204.216.28.241])
	by meow.osd.bsdi.com (8.11.2/8.11.2) with ESMTP id f2N1aIG93081;
	Thu, 22 Mar 2001 17:36:18 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010322173610.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <000a01c0b338$4bc3d680$aa240018@cx443070b>
Date: Thu, 22 Mar 2001 17:36:10 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Jeremiah Gowdy <jgowdy@home.com>
Subject: Re: More IBM Netfinity 3500 SMP problem
Cc: freebsd-smp@FreeBSD.org, Tom Duffey <tduffey@wi.rr.com>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 23-Mar-01 Jeremiah Gowdy wrote:
> 
> ----- Original Message -----
> From: "Tom Duffey" <tduffey@wi.rr.com>
> To: <freebsd-smp@FreeBSD.ORG>
> Sent: Thursday, March 22, 2001 5:03 PM
> Subject: More IBM Netfinity 3500 SMP problem
> 
> 
>> The output of mptable shows that this system has three busses, but FreeBSD
>> defaults to 4.  So, I attempted to recompile a kernel using more specific
>> SMP options, namely:
>>
>> options SMP
>> options APIC_IOB
>> options NBUS=3
>>
>> But 'config' complains:
>>
>> smp:61: unknown option "NBUS".
>>
>> Are the optional SMP parameters no longer available?  Does it
>> matter?  Please let me know if there's anything I can do to help make
>> FreeBSD's SMP support work with this hardware.
> 
> You shouldn't need those options.  I use the 3500 with Dual PIII 500s, and
> it works fine, just options SMP

Err, and APIC_IO I hope.  The kernel now examines the MP table and dynamically
figures out how many busses, cpus, etc. to deal with on the fly, so it should
work fine without needing NBUS tweaked.
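
For the curious, counting what is in the MP table is not much work.  The
sketch below is not the kernel's code; the function and type names are made
up for illustration.  It only relies on the base table layout from the Intel
MP specification 1.4: each entry starts with a one-byte type, processor
entries (type 0) are 20 bytes, and bus, I/O APIC, I/O interrupt and local
interrupt entries (types 1 to 4) are 8 bytes each.

```c
#include <stddef.h>
#include <stdint.h>

/* Base table entry types from the MP specification. */
enum {
	MPCT_PROCESSOR = 0,
	MPCT_BUS = 1,
	MPCT_IOAPIC = 2,
	MPCT_IOINT = 3,
	MPCT_LOCALINT = 4,
	MPCT_NTYPES = 5
};

/* Hypothetical result structure: one counter per entry type. */
struct mp_counts {
	int n[MPCT_NTYPES];
};

/*
 * Walk entry_count base table entries starting at base (the first byte
 * after the 44-byte config table header) and count each type.
 * Returns 0 on success, -1 on an unknown type or a truncated table.
 */
int
mp_count_entries(const uint8_t *base, size_t len, int entry_count,
    struct mp_counts *out)
{
	size_t off = 0, size;
	int i, type;

	for (i = 0; i < MPCT_NTYPES; i++)
		out->n[i] = 0;

	for (i = 0; i < entry_count; i++) {
		if (off >= len)
			return (-1);
		type = base[off];
		if (type >= MPCT_NTYPES)
			return (-1);
		size = (type == MPCT_PROCESSOR) ? 20 : 8;
		if (len - off < size)
			return (-1);
		out->n[type]++;
		off += size;
	}
	return (0);
}
```

The mptable output posted earlier in this thread is consistent with that
layout: 2 processors, 3 buses, 2 I/O APICs, 13 I/O interrupts and 2 local
interrupts make 22 entries, and 44 + 2*20 + 20*8 = 244 bytes, matching the
reported entry count and base table length.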

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Thu Mar 22 19:55:42 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from linuxpower.p00t.net (mke-24-167-255-186.wi.rr.com [24.167.255.186])
	by hub.freebsd.org (Postfix) with ESMTP id 92E5337B71A
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 19:55:35 -0800 (PST)
	(envelope-from tduffey@wi.rr.com)
Received: from localhost (trout@localhost)
	by linuxpower.p00t.net (8.11.3/8.11.3) with ESMTP id f2N3tYG11743
	for <freebsd-smp@freebsd.org>; Thu, 22 Mar 2001 21:55:34 -0600
Date: Thu, 22 Mar 2001 21:55:34 -0600 (CST)
From: Tom Duffey <tduffey@wi.rr.com>
To: freebsd-smp@freebsd.org
Subject: Re: More IBM Netfinity 3500 SMP problem
In-Reply-To: <XFMail.010322173610.jhb@FreeBSD.org>
Message-ID: <Pine.LNX.4.21.0103222151240.11475-100000@linuxpower.p00t.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Thanks for clearing up the confusion regarding optional SMP kernel
parameters.

Unfortunately, my 3500 with dual PIII 733's still fails to boot if I
compile the kernel with "options SMP" and "options APIC_IO."  Is there
anything out of the ordinary that must be done to make SMP work with the
Netfinity 3500 M20's?

Is it possible to step through the kernel boot procedure to determine what
exactly is causing the machine to stop?  Or, is there any information I
can provide that would be useful to the SMP developers to resolve this
issue?

Thanks,

Tom Duffey

> On 23-Mar-01 Jeremiah Gowdy wrote:
> > 
> > ----- Original Message -----
> > From: "Tom Duffey" <tduffey@wi.rr.com>
> > To: <freebsd-smp@FreeBSD.ORG>
> > Sent: Thursday, March 22, 2001 5:03 PM
> > Subject: More IBM Netfinity 3500 SMP problem
> > 
> > 
> >> The output of mptable shows that this system has three busses, but FreeBSD
> >> defaults to 4.  So, I attempted to recompile a kernel using more specific
> >> SMP options, namely:
> >>
> >> options SMP
> >> options APIC_IOB
> >> options NBUS=3
> >>
> >> But 'config' complains:
> >>
> >> smp:61: unknown option "NBUS".
> >>
> >> Are the optional SMP parameters no longer available?  Does it
> >> matter?  Please let me know if there's anything I can do to help make
> >> FreeBSD's SMP support work with this hardware.
> > 
> > You shouldn't need those options.  I use the 3500 with Dual PIII 500s, and
> > it works fine, just options SMP
> 
> Err, and APIC_IO I hope.  The kernel now examines the MP table and dynamically
> figures out how many busses, cpus, etc. to deal with on the fly, so it should
> work fine without needing NBUS tweaked.
> 
> -- 
> 
> John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
> "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
> 




From owner-freebsd-smp  Thu Mar 22 20:27:15 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from ambrisko.com (adsl-216-103-208-74.dsl.snfc21.pacbell.net [216.103.208.74])
	by hub.freebsd.org (Postfix) with ESMTP id 52DF437B71A
	for <freebsd-smp@FreeBSD.ORG>; Thu, 22 Mar 2001 20:27:13 -0800 (PST)
	(envelope-from ambrisko@ambrisko.com)
Received: (from ambrisko@localhost)
	by ambrisko.com (8.11.2/8.11.2) id f2N4RBu19168;
	Thu, 22 Mar 2001 20:27:11 -0800 (PST)
	(envelope-from ambrisko)
From: Doug Ambrisko <ambrisko@ambrisko.com>
Message-Id: <200103230427.f2N4RBu19168@ambrisko.com>
Subject: Re: More IBM Netfinity 3500 SMP problem
In-Reply-To: <Pine.LNX.4.21.0103222151240.11475-100000@linuxpower.p00t.net>
 "from Tom Duffey at Mar 22, 2001 09:55:34 pm"
To: Tom Duffey <tduffey@wi.rr.com>
Date: Thu, 22 Mar 2001 20:27:11 -0800 (PST)
Cc: freebsd-smp@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL82 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Tom Duffey writes:
| Thanks for clearing up the confusion regarding optional SMP kernel
| parameters.
| 
| Unfortunately, my 3500 with dual PIII 733's still fails to boot if I
| compile the kernel with "options SMP" and "options APIC_IO."  Is there
| anything out of the ordinary that must be done to make SMP work with the
| Netfinity 3500 M20's?

FYI, the dual-CPU IBM IntelliStation M Pro 6868 needed an upgraded BIOS
to work with SMP.  The release notes for the BIOS update listed a BIOS
error that caused problems with SMP.  The IntelliStation uses the
Intel 840 chipset.  The prior 6889 used the BX chipset and worked just fine.
Since IBM rebrands the same hardware under different names, you could
be using the same board and so need the BIOS update.

Doug A.



From owner-freebsd-smp  Fri Mar 23  1:56:32 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from pasteur.alize-sfl.com (pasteur.alize-sfl.com [195.6.237.2])
	by hub.freebsd.org (Postfix) with ESMTP id 0FE9237B718
	for <freebsd-smp@freebsd.org>; Fri, 23 Mar 2001 01:56:25 -0800 (PST)
	(envelope-from receiver@alize-sfl.com)
Received: (from receiver@localhost)
	by pasteur.alize-sfl.com (8.9.3/8.9.3) id KAA29579
	for freebsd-smp@freebsd.org; Fri, 23 Mar 2001 10:56:24 +0100
Date: Fri, 23 Mar 2001 10:56:24 +0100
From: receiver@alize-sfl.com
To: freebsd-smp@freebsd.org
Subject: SMP / trap12 / heat problem.
Message-ID: <20010323105624.C28104@pasteur.alize-sfl.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.2i
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi all,

yesterday i bought an ASUS CUV4X-D with 2 PIII 800 & 4x256MB SDRAM.
(i've been expecting a smp machine for 4 years ;-) ).

please note that i'm totally new to SMP world, so be indulgent please :)

i've encountered different problems:

	* first, when everything is alright (temperature monitoring disabled
	in BIOS, dual-processor kernel),
	the system can eat up to 45% (avg 15-35%) of CPU when
	building world -j 4. do you think that's normal ?
	(/usr/src and /usr/obj are on the same 40MB/s SCSI
	 disk (seagate) on an AHA2940UW (not U2W))

	* second, when building world -j 4, i see some (not many)
	calcru errors, only with ld and as, not make or cc1. some
	= 12 exactly. the buildworld didn't finish:
	i checked LINT
	and sysctl'd kern.timecounter.method,
	then the console was *FLOODED* with 'microuptime went backward'
	messages. i switched to X to make typing a bit easier,
	xconsole froze, then X froze, the keyboard too, then i powered
	off and went to bed. ;-(

	* third, bios problem : when all hardware monitors in BIOS are on,
	with a uniprocessor kernel, everything is fine.
	when booting my SMP kernel, the machine starts *beeping* near
	the 'waiting 5 seconds for scsi devices to settle'.
	if i disable CPU#0 temperature watch in BIOS, everything is fine.

	* independently (sorry, i don't know if this word exists), healthd:
		* does not find CPU temp / fan properties in ISA mode.
		* cannot find smb0, even if my kernel is compiled with
		support for it (it worked with the same options on my
		old ABIT P2something).
		* reports 6.86V for 5V, 14.xx for 12V, and 4.x for
		3.3VCORE (the bios reports 5.02, 12.8 and 3.01).

	* when i try AUTO_EOI_1 and AUTO_EOI_2 and NTIMECOUNTER=20 in my
	SMP kernel, nmbd (at boot) kills the kernel which says :
		abort trap 12 : page fault while in kernel mode,
		and so on.
	i didn't have time this morning to test which of the 3 options
	is faultive (does this exist too ?).

I suspect my power supply is not so good. It's a 300W, but not for
SMP motherboards: it has no EAUX pin. i will try to change it at noon
(i've got two 250W to test, only to see if healthd reports better voltage).
I will try to swap the CPUs too (CPU#0 is at ~170K (75°C) and CPU#1 is
at ~100K (49°C), according to the bios hardware monitor). Generally, the #0
is *much* hotter than #1.
	
 I think i've said everything. All the hardware is brand new. if you want
my KERNEL file, my dmesg, i will give you all that in the afternoon (it's
currently 10:36AM in France), and my box is at home.

note that for the moment, i don't know how to debug the kernel and catch
the page fault message and all the magic kdb things... i have only
a little knowledge of that sort of thing, but i can learn (and i want to).

thanks for your comments and your help, keep up the good work, i love freebsd.

PS: does anyone know why at BIOS boot i get :

------------------------------
blahblah ACPI rev blah

CPU1: Intel blahblah
CPU2:*Intel blahblah
     ^
 this star ?
------------------------------
could this be a problem on CPU2 (#1 for us) ? or is the mb just happy to see
a second CPU ?

thanks again,

Olivier Cortes
Free Software Admin

PS2: i checked the mailing list archives (for -questions, -hackers, and -smp),
and have read smp(4), mptable(1), sync(*), fsync(*), syncer(*). i didn't find
anything relevant to my problems. overall, i rebooted ~10 times in 4 hours,
and i love vinum (fscking a 2x30 striped IDE storage is quick :) ).



From owner-freebsd-smp  Fri Mar 23  8:28:29 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from linuxpower.p00t.net (mke-24-167-255-186.wi.rr.com [24.167.255.186])
	by hub.freebsd.org (Postfix) with ESMTP id 037B237B71D
	for <freebsd-smp@freebsd.org>; Fri, 23 Mar 2001 08:28:26 -0800 (PST)
	(envelope-from tduffey@wi.rr.com)
Received: from localhost (trout@localhost)
	by linuxpower.p00t.net (8.11.3/8.11.3) with ESMTP id f2NGPhH00226
	for <freebsd-smp@freebsd.org>; Fri, 23 Mar 2001 10:28:24 -0600
Date: Fri, 23 Mar 2001 10:25:43 -0600 (CST)
From: Tom Duffey <tduffey@wi.rr.com>
To: freebsd-smp@freebsd.org
Subject: More detailed debugging info for Netfinity SMP lock
Message-ID: <Pine.LNX.4.21.0103231017280.187-100000@linuxpower.p00t.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Thanks to everyone for the help so far.  This morning I upgraded the
system BIOS to the latest provided by IBM (v1.06) but still cannot boot
the system with an SMP enabled kernel.  However, I enabled the kernel
debugger and came up with the following:

stopped at	atkbd_isa_intr+0x19;	ret
stopped at	Xresume1+0x35:		cli
stopped at	Xresume1+0x36:		lock andl $-0x3,iactive
stopped at	Xresume1+0x3e:		pushl $0xc02e3140
stopped at	Xresume1+0x43:		call s_lock
stopped at	s_lock:			movel 0x4(%esp),$edx
stopped at	setlock:		movel %fs:0x94,%ecx
stopped at	setlock+0x7:		incl %ecx
stopped at	setlock+0x8:		movl $0,%eax
stopped at	setlock+0xd:		lock cmpxchgl %ecx,0(%edx)
panic: rslock: cpu: 1, addr 0xc02e3140, lock: 0x01000001
mp_lock = 01000003; cpuid = 1; lapic.id = 00000000
Debugger("panic")
Stopped at	Debugger+0x34:		movb $0,in_Debugger,597
db>

I am not a kernel hacker so unfortunately I don't know where to go
next.  Is this information useful to anyone?  I realize this might be some
obscure problem since others claim to be doing SMP on similar Netfinity
machines.  So, should I continue to post debugging information to this
list or just forget about it?  The machine runs fine without SMP so I will
probably get by for the time being but it would be nice to hear whether or
not this issue is going to be looked at by someone and possibly fixed in
the future.  Once again, thanks for your time.

Best Regards,

Tom Duffey




From owner-freebsd-smp  Fri Mar 23 18:27: 5 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from meow.osd.bsdi.com (meow.osd.bsdi.com [204.216.28.88])
	by hub.freebsd.org (Postfix) with ESMTP id 8F5E737B71B
	for <freebsd-smp@FreeBSD.org>; Fri, 23 Mar 2001 18:26:42 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@jhb-laptop.osd.bsdi.com [204.216.28.241])
	by meow.osd.bsdi.com (8.11.2/8.11.2) with ESMTP id f2O2PwG30915;
	Fri, 23 Mar 2001 18:25:59 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010323182608.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.LNX.4.21.0103231017280.187-100000@linuxpower.p00t.net>
Date: Fri, 23 Mar 2001 18:26:08 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Tom Duffey <tduffey@wi.rr.com>
Subject: RE: More detailed debugging info for Netfinity SMP lock
Cc: freebsd-smp@FreeBSD.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 23-Mar-01 Tom Duffey wrote:
> Thanks to everyone for the help so far.  This morning I upgraded the
> system BIOS to the latest provided by IBM (v1.06) but still cannot boot
> the system with an SMP enabled kernel.  However, I enabled the kernel
> debugger and came up with the following:
> 
> stopped at    atkbd_isa_intr+0x19:    ret
> stopped at    Xresume1+0x35:          cli
> stopped at    Xresume1+0x36:          lock andl $-0x3,iactive
> stopped at    Xresume1+0x3e:          pushl $0xc02e3140
> stopped at    Xresume1+0x43:          call s_lock
> stopped at    s_lock:                 movl 0x4(%esp),%edx
> stopped at    setlock:                movl %fs:0x94,%ecx
> stopped at    setlock+0x7:            incl %ecx
> stopped at    setlock+0x8:            movl $0,%eax
> stopped at    setlock+0xd:            lock cmpxchgl %ecx,0(%edx)
> panic: rslock: cpu: 1, addr 0xc02e3140, lock: 0x01000001
> mp_lock = 01000003; cpuid = 1; lapic.id = 00000000
> Debugger("panic")
> Stopped at    Debugger+0x34:          movb $0,in_Debugger.597
> db>

Type 'trace' at the db> prompt to see where we are.  The panic means a CPU is
recursing on a simplelock, i.e. trying to take a lock it already holds, which
is very bad.  It is probably a kernel bug but could be something else.
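
For anyone following the single-stepped instructions in the quoted output,
here is a minimal userland sketch of what that sequence appears to do.  It is
not the kernel's actual source.  The lock-word encoding (CPU id in the top
byte, plus one while held) is an assumption inferred from the panic line,
where CPU 1 sees lock 0x01000001.  The names sl_word, sl_try_lock, sl_unlock
and the result codes are hypothetical, and where this sketch returns a code
the kernel would spin or panic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical result codes; the kernel spins or panics instead. */
enum sl_result { SL_ACQUIRED, SL_BUSY, SL_RECURSED };

/* Assumed encoding, inferred from the panic line: CPU id in the top
 * byte, plus one while held, so CPU 1 owning the lock reads 0x01000001. */
static uint32_t sl_word(uint32_t cpuid)
{
    return (cpuid << 24) + 1;
}

static enum sl_result sl_try_lock(_Atomic uint32_t *lock, uint32_t cpuid)
{
    uint32_t expected = 0;              /* movl $0,%eax */
    uint32_t mine = sl_word(cpuid);     /* load per-CPU id, incl %ecx */

    /* lock cmpxchgl %ecx,0(%edx): store our word only if the lock is 0. */
    if (atomic_compare_exchange_strong(lock, &expected, mine))
        return SL_ACQUIRED;

    /* On failure, 'expected' now holds the current owner's word.  If it
     * is our own word, this CPU is recursing on the lock. */
    return expected == mine ? SL_RECURSED : SL_BUSY;
}

static void sl_unlock(_Atomic uint32_t *lock, uint32_t cpuid)
{
    /* Only the owner may release the lock. */
    assert(atomic_load(lock) == sl_word(cpuid));
    atomic_store(lock, 0);
}
```

With this encoding, a second sl_try_lock() from CPU 1 on a lock it already
holds returns SL_RECURSED, which corresponds to the "rslock" panic above.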

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/



From owner-freebsd-smp  Sat Mar 24 10:52:20 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from linuxpower.p00t.net (mke-24-167-255-186.wi.rr.com [24.167.255.186])
	by hub.freebsd.org (Postfix) with ESMTP
	id CCC2037B71A; Sat, 24 Mar 2001 10:52:12 -0800 (PST)
	(envelope-from tduffey@wi.rr.com)
Received: from localhost (trout@localhost)
	by linuxpower.p00t.net (8.11.3/8.11.3) with ESMTP id f2OIqB601670;
	Sat, 24 Mar 2001 12:52:11 -0600
Date: Sat, 24 Mar 2001 12:52:11 -0600 (CST)
From: Tom Duffey <tduffey@wi.rr.com>
To: John Baldwin <jhb@FreeBSD.org>
Cc: freebsd-smp@FreeBSD.org
Subject: RE: More detailed debugging info for Netfinity SMP lock
In-Reply-To: <XFMail.010323182608.jhb@FreeBSD.org>
Message-ID: <Pine.LNX.4.21.0103241243490.1659-100000@linuxpower.p00t.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Here's the trace, taken directly after the panic shown below.  Hopefully
I'm doing this right ...

db> trace
Debugger(c0x798d2) at Debugger+0x34
panic(c0258f5a,1,c02e3140,10000001,c024a26b) at panic+0xa4
bsl1(0,a0,0) at bsl1
selected_apic_ipi(1,a0,0,a,ff80dee8) at selected_apic_ipi+0x3a
stop_cpus(1,0,0,1,ff80df34) at stop_cpus+0x21
kdb_trap(a,0,ff80df3c) at kdb_trap+0xe5
trap(18,10,10,ff80a864,0) at trap+0x454
calltrap() at calltrap+0x17
--- trap 0xa, eip = 0xc0258f2d, esp = 0xff80df7c, ebp = 0xff80dfdc
setlock(3a60000,0,0,1,0) at setlock+0x11
vm_page_zero_idle(c02596c4,

Fatal trap 12: page fault while in kernel mode
mp_lock = 01000007; cpuid = 1, lapic.id = 00000000
fault virtual address	= 0xff80e000
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xc0247c30
stack pointer		= 0x10:0xff80dbc8
frame pointer		= 0x10:0xff80dbcc
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, defs32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= tty <- SMP: XXX
kernel: type 12 trap, code 0

Let me know if I've done something incorrectly or missed anything.

Thanks,

Tom Duffey

On Fri, 23 Mar 2001, John Baldwin wrote:

> 
> On 23-Mar-01 Tom Duffey wrote:
> > Thanks to everyone for the help so far.  This morning I upgraded the
> > system BIOS to the latest provided by IBM (v1.06) but still cannot boot
> > the system with an SMP enabled kernel.  However, I enabled the kernel
> > debugger and came up with the following:
> > 
> > stopped at    atkbd_isa_intr+0x19:    ret
> > stopped at    Xresume1+0x35:          cli
> > stopped at    Xresume1+0x36:          lock andl $-0x3,iactive
> > stopped at    Xresume1+0x3e:          pushl $0xc02e3140
> > stopped at    Xresume1+0x43:          call s_lock
> > stopped at    s_lock:                 movl 0x4(%esp),%edx
> > stopped at    setlock:                movl %fs:0x94,%ecx
> > stopped at    setlock+0x7:            incl %ecx
> > stopped at    setlock+0x8:            movl $0,%eax
> > stopped at    setlock+0xd:            lock cmpxchgl %ecx,0(%edx)
> > panic: rslock: cpu: 1, addr 0xc02e3140, lock: 0x01000001
> > mp_lock = 01000003; cpuid = 1; lapic.id = 00000000
> > Debugger("panic")
> > Stopped at    Debugger+0x34:          movb $0,in_Debugger.597
> > db>
> 
> Type 'trace' at the db> prompt to see where we are.  The panic means a CPU is
> recursing on a simplelock, i.e. trying to take a lock it already holds, which
> is very bad.  It is probably a kernel bug but could be something else.
> 
> -- 
> 
> John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
> "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
> 




From owner-freebsd-smp  Sat Mar 24 16:49:37 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from matt.MUNICH.v-net.org (u57n248.syd.eastlink.ca [24.222.57.248])
	by hub.freebsd.org (Postfix) with ESMTP id BE1BB37B71A
	for <freebsd-smp@freebsd.org>; Sat, 24 Mar 2001 16:49:27 -0800 (PST)
	(envelope-from matt@researcher.com)
Received: from researcher.com (Windozzze [192.168.8.2])
	by matt.MUNICH.v-net.org (8.9.3/8.9.3) with ESMTP id UAA32813
	for <freebsd-smp@freebsd.org>; Sat, 24 Mar 2001 20:49:19 -0400 (AST)
	(envelope-from matt@researcher.com)
Message-ID: <3ABD405F.10802@researcher.com>
Date: Sat, 24 Mar 2001 20:48:31 -0400
From: Matt Rudderham <matt@researcher.com>
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; m18) Gecko/20010131 Netscape6/6.01
X-Accept-Language: en
MIME-Version: 1.0
To: freebsd-smp@freebsd.org
Subject: SMP Server Wont Reboot Properly
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,
I'm having a problem with a new server. It is a dual PIII-800 with 256MB,
running FreeBSD 4.1.1-RELEASE. On reboot it appears to stop both CPUs and
sync the disks as normal, but then locks up just as it is about to
restart. Does anyone have any ideas on what could be up? It runs a custom
kernel built with the standard SMP options. Thanks
- Matt




From owner-freebsd-smp  Sat Mar 24 17: 4:34 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from matt.MUNICH.v-net.org (u57n248.syd.eastlink.ca [24.222.57.248])
	by hub.freebsd.org (Postfix) with ESMTP id 7D6C637B71B
	for <freebsd-smp@freebsd.org>; Sat, 24 Mar 2001 17:04:29 -0800 (PST)
	(envelope-from matt@researcher.com)
Received: from researcher.com (Windozzze [192.168.8.2])
	by matt.MUNICH.v-net.org (8.9.3/8.9.3) with ESMTP id VAA32845;
	Sat, 24 Mar 2001 21:04:24 -0400 (AST)
	(envelope-from matt@researcher.com)
Message-ID: <3ABD43E7.3050400@researcher.com>
Date: Sat, 24 Mar 2001 21:03:35 -0400
From: Matt Rudderham <matt@researcher.com>
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; m18) Gecko/20010131 Netscape6/6.01
X-Accept-Language: en
MIME-Version: 1.0
To: Barney Wolff <barney@databus.com>, freebsd-smp@freebsd.org
Subject: Re: SMP Server Wont Reboot Properly
References: <3ABD405F.10802@researcher.com> <20010324195351.A58196@mx.databus.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

That was my first thought. I compiled a kernel with apm etc. taken out,
and then recompiled with it in. Same result either way. Any other ideas?

- Matt

Barney Wolff wrote:

> Try making sure all the power management stuff is disabled.
> I don't think it plays nice with smp.
> Barney Wolff
> 
> On Sat, Mar 24, 2001 at 08:48:31PM -0400, Matt Rudderham wrote:
> 
>> Hi,
>> I'm having a problem with a new server. It is a dual PIII-800 with 256MB,
>> running FreeBSD 4.1.1-RELEASE. On reboot it appears to stop both CPUs and
>> sync the disks as normal, but then locks up just as it is about to
>> restart. Does anyone have any ideas on what could be up? It runs a custom
>> kernel built with the standard SMP options. Thanks
>> - Matt
> 

