From owner-freebsd-smp  Sun Sep 17  3:20:35 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from mail.du.gtn.com (mail.du.gtn.com [194.77.9.57])
	by hub.freebsd.org (Postfix) with ESMTP
	id 450B337B422; Sun, 17 Sep 2000 03:20:27 -0700 (PDT)
Received: from mail.cicely.de (cicely.de [194.231.9.142])
	by mail.du.gtn.com (8.11.0.Beta3/8.11.0.Beta3) with ESMTP id e8HAK5Q11678
	(using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified OK);
	Sun, 17 Sep 2000 12:20:16 +0200 (MET DST)
Received: from cicely5.cicely.de (cicely5.cicely.de [fec0::104:200:92ff:fe9b:20e7])
	by mail.cicely.de (8.11.0.Beta1/8.11.0.Beta1) with ESMTP id e8HAK8I92606;
	Sun, 17 Sep 2000 12:20:10 +0200 (CEST)
Received: (from ticso@localhost)
	by cicely5.cicely.de (8.11.0/8.9.2) id e8HAK8267920;
	Sun, 17 Sep 2000 12:20:08 +0200 (CEST)
	(envelope-from ticso)
Date: Sun, 17 Sep 2000 12:20:08 +0200
From: Bernd Walter <ticso@cicely5.cicely.de>
To: Doug Rabson <dfr@nlsystems.com>
Cc: John Baldwin <jhb@pike.osd.bsdi.com>, alpha@FreeBSD.org,
	smp@FreeBSD.org
Subject: Re: Prelimiary interrupt thread patches for alpha
Message-ID: <20000917122007.A67895@cicely5.cicely.de>
References: <20000915075812.B60348@cicely5.cicely.de> <Pine.BSF.4.21.0009161324110.86297-100000@salmon.nlsystems.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <Pine.BSF.4.21.0009161324110.86297-100000@salmon.nlsystems.com>; from dfr@nlsystems.com on Sat, Sep 16, 2000 at 01:25:36PM +0100
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, Sep 16, 2000 at 01:25:36PM +0100, Doug Rabson wrote:
> The patch should work on all except AS4100 and AS8200. I would like to get
> some testing on tsunami, apecs and lca based machines for a sanity check
> but it ought to work (crossed fingers).

-current won't work on AXPpci systems even without any patch:
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #1: Sun Sep 17 12:02:20 CEST 2000
    ticso@cicely9.cicely.de:/var/d7/src-2000-09-17/src/sys/compile/CICELY10
DEC AXPpci
Alpha PC AXPpci33, 167MHz
8192 byte page size, 1 processor.
CPU: LCA Family major=4 minor=2
OSF PAL rev: 0x100090002012d
real memory  = 132145152 (129048K bytes)
avail memory = 122486784 (119616K bytes)
Preloaded elf kernel "kernel" at 0xfffffc0000650000.
lca0: <21066 Core Logic chipset>

fatal kernel trap:

    trap entry = 0x2 (memory management fault)
    a0         = 0xfffffbf1e0000018
    a1         = 0x1
    a2         = 0x0
    pc         = 0xfffffc000050e3b0
    ra         = 0xfffffc000050e2bc
    curproc    = 0xfffffc00005de3f0
        pid = 0, comm = swapper

Stopped at      badaddr_read+0xd0:      ldl     t0,0(a0) <0xfffffbf1e0000018>   <t0=0x1,a0=0xfffffbf1e0000018>
db> trace
badaddr_read() at badaddr_read+0xd0
badaddr() at badaddr+0x1c
lca_pcib_read_config() at lca_pcib_read_config+0x34c
pci_read_device() at pci_read_device+0xb0
pci_add_children() at pci_add_children+0xe4
pci_probe() at pci_probe+0x17c
device_probe_child() at device_probe_child+0x13c
device_probe_and_attach() at device_probe_and_attach+0x54
bus_generic_attach() at bus_generic_attach+0x28
device_probe_and_attach() at device_probe_and_attach+0xcc
bus_generic_attach() at bus_generic_attach+0x28
lca_attach() at lca_attach+0xa0
device_probe_and_attach() at device_probe_and_attach+0xcc
root_bus_configure() at root_bus_configure+0x38
configure() at configure+0x40
mi_startup() at mi_startup+0xf4
locorestart() at locorestart+0x6c

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message


From owner-freebsd-smp  Sun Sep 17  6:30:22 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8423137B422; Sun, 17 Sep 2000 06:30:15 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id PAA01098;
	Sun, 17 Sep 2000 15:29:54 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8HETkm00847;
	Sun, 17 Sep 2000 16:29:46 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Sun, 17 Sep 2000 16:29:41 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Greg Lehey <grog@lemis.com>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <20000917102824.C42114@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0009171622240.840-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,
...
> The frames above are what the system went to as the result of your
> debugger request.  I'd also be interested to see the output of the
> 'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
> 'ps' (the macro I promised above).
(kgdb) icnt
1215544*        566*    0       0*      0       0       1       0
1555964*        0*      0*      0*      0       0*      22636*  11
1       0       0       0       0       0       441031
imen: 6f0b
(kgdb) ps
  pid    proc    addr    uid  pri ppid  pgrp   flag stat comm         wchan
   37 c7874a00 c9665000    0  32     6    36  004086  3  tar          piperd c9663f20
   36 c7874bc0 c960a000    0  32     6    36  004006  3  tar          FFS node c02f4220
   35 c7874d80 c9607000    0  32     6    35  004006  3  tar          inode c1d2fa00
    6 c7874f40 c9604000    0  32     1     6  004086  3  sh           wait c7874f40
    5 c7875100 c8295000    0   4     0     0  000204  3  syncer       syncer c03236e8
    4 c78752c0 c8293000    0   4     0     0  100204  3  bufdaemon    psleep c03072f0
    3 c7875480 c8291000    0   4     0     0  000204  3  vmdaemon     psleep c0317a00
    2 c7875640 c828f000    0   4     0     0  100204  3  pagedaemon   psleep c02f5938
   21 c7875800 c78d4000    0   1*    0     0  000204  2  irq8: rtc
   20 c78759c0 c78d2000    0   1*    0     0  000204  2  irq0: clk
   19 c7875b80 c78b0000    0   7*    0     0  000204  6  irq5: pcm0
   18 c7875d40 c788e000    0   7*    0     0  000204  6  irq7: ppc0
   17 c7875f00 c788c000    0   7*    0     0  000204  6  irq12: psm0
   16 c78760c0 c788a000    0   7*    0     0  000204  2  irq1: atkbd0
   15 c7876280 c7887000    0   6*    0     0  000204  6  irq6: fdc0
   14 c7876440 c7885000    0   6*    0     0  000204  6  irq15: ata1
   13 c7876600 c7883000    0   6*    0     0  000204  2  irq14: ata0
   12 c78767c0 c7881000    0   4     0     0  000204  3  random       rndslp c0322934
   11 c7876980 c787f000    0  15*    0     0  008204  6  softinterrupt
   10 c7876b40 c787d000    0   4     0     0  008204  2  idle
    1 c7876d00 c787b000    0   4     0     1  004284  3  init         wait c7876d00
    0 c0322960 c03c0000    0   4     0     0  000204  3  swapper      sched c0322960
...
> handler.  At this point, it would be very interesting to see the value
> of p->p_comm, which is the process name at the end of the ps listing.
> 
> > (kgdb) proc 35
> 
> Why are you interested in this process?
It was one of the tar's which I grabbed by hand (without your ps macro)
...
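
For reading the imen line in the icnt output above, here is a minimal
sketch in C that lists which IRQ lines a mask value covers. It assumes
imen follows the 8259 convention in which a set bit means the line is
masked. That convention and the helper names masked_irqs and
print_masked are assumptions, not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Assumption: a set bit in imen means the IRQ line is masked
 * (8259 mask-register convention). Helper names are hypothetical.
 */
#define NIRQ 16

/* Fill irqs[] with the masked IRQ numbers; return how many were found. */
int
masked_irqs(unsigned int imen, int irqs[NIRQ])
{
	int n = 0;

	assert(irqs != NULL);
	for (int irq = 0; irq < NIRQ; irq++)
		if (imen & (1u << irq))
			irqs[n++] = irq;
	return (n);
}

/* Print the masked IRQ lines for one imen value. */
void
print_masked(unsigned int imen)
{
	int irqs[NIRQ];
	int n = masked_irqs(imen, irqs);

	printf("imen %#06x masks:", imen);
	for (int i = 0; i < n; i++)
		printf(" irq%d", irqs[i]);
	printf("\n");
}
```

Under that assumption, print_masked(0x6f0b) lists IRQs 0, 1, 3, 8, 9,
10, 11, 13 and 14. IRQs 0, 1, 8 and 14 are the ones whose interrupt
threads show stat 2 in the ps listing.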

What's next to show? :-)

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Sun Sep 17  6:31:38 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from anchor-post-34.mail.demon.net (anchor-post-34.mail.demon.net [194.217.242.92])
	by hub.freebsd.org (Postfix) with ESMTP
	id 9869F37B423; Sun, 17 Sep 2000 06:31:33 -0700 (PDT)
Received: from nlsys.demon.co.uk ([158.152.125.33] helo=herring.nlsystems.com)
	by anchor-post-34.mail.demon.net with esmtp (Exim 2.12 #1)
	id 13aeXY-000DDd-0Y; Sun, 17 Sep 2000 14:31:31 +0100
Received: from salmon.nlsystems.com (salmon.nlsystems.com [10.0.0.3])
	by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id OAA33687;
	Sun, 17 Sep 2000 14:33:51 +0100 (BST)
	(envelope-from dfr@nlsystems.com)
Date: Sun, 17 Sep 2000 14:31:49 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: Bernd Walter <ticso@cicely5.cicely.de>,
	John Baldwin <jhb@pike.osd.bsdi.com>, alpha@FreeBSD.ORG,
	smp@FreeBSD.ORG
Subject: Re: Prelimiary interrupt thread patches for alpha
In-Reply-To: <Pine.BSF.4.21.0009161152140.91234-100000@beppo.feral.com>
Message-ID: <Pine.BSF.4.21.0009171418520.86297-100000@salmon.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, 16 Sep 2000, Matthew Jacob wrote:

> 
> I can look at Rawhide and TurboLaser next week when my temperature comes down
> (flu). Doug- you have a rawhide. I have the turbolasers. Why don't y'all
> check the patch in so we have something same to work with?

I'm looking at rawhide right now, so hopefully the patch should contain
support for that when it's committed. We can leave turbolaser for you then.

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 20 8348 3944




From owner-freebsd-smp  Sun Sep 17  7:18:59 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from post.mail.nl.demon.net (post-10.mail.nl.demon.net [194.159.73.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8DE9937B422; Sun, 17 Sep 2000 07:18:55 -0700 (PDT)
Received: from [212.238.54.101] (helo=freebie.demon.nl)
	by post.mail.nl.demon.net with smtp (Exim 3.14 #2)
	id 13afHR-0005kK-00; Sun, 17 Sep 2000 14:18:53 +0000
Received: (from wkb@localhost)
	by freebie.demon.nl (8.11.0/8.11.0) id e8HEK6I49677;
	Sun, 17 Sep 2000 16:20:06 +0200 (CEST)
	(envelope-from wkb)
Date: Sun, 17 Sep 2000 16:20:06 +0200
From: Wilko Bulte <wkb@freebie.demon.nl>
To: Doug Rabson <dfr@nlsystems.com>
Cc: Matthew Jacob <mjacob@feral.com>,
	Bernd Walter <ticso@cicely5.cicely.de>,
	John Baldwin <jhb@pike.osd.bsdi.com>, alpha@freebsd.org,
	smp@freebsd.org
Subject: Re: Prelimiary interrupt thread patches for alpha
Message-ID: <20000917162006.B49643@freebie.demon.nl>
References: <Pine.BSF.4.21.0009161152140.91234-100000@beppo.feral.com> <Pine.BSF.4.21.0009171418520.86297-100000@salmon.nlsystems.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <Pine.BSF.4.21.0009171418520.86297-100000@salmon.nlsystems.com>; from dfr@nlsystems.com on Sun, Sep 17, 2000 at 02:31:49PM +0100
X-OS: FreeBSD 4.1-STABLE
X-PGP: finger wilko@freebsd.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sun, Sep 17, 2000 at 02:31:49PM +0100, Doug Rabson wrote:
> On Sat, 16 Sep 2000, Matthew Jacob wrote:
> 
> > 
> > I can look at Rawhide and TurboLaser next week when my temperature comes down
> > (flu). Doug- you have a rawhide. I have the turbolasers. Why don't y'all
> > check the patch in so we have something same to work with?
> 
> I'm looking at rawhide right now, so hopefully the patch should contain
> support for that when it's committed. We can leave turbolaser for you then.

Does this imply 'loader' is working again or am I reading something that is
not implied?

-- 
Wilko Bulte  	 					wilko@freebsd.org
							Arnhem, the Netherlands




From owner-freebsd-smp  Sun Sep 17  8:12: 4 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from anchor-post-34.mail.demon.net (anchor-post-34.mail.demon.net [194.217.242.92])
	by hub.freebsd.org (Postfix) with ESMTP
	id CA89337B422; Sun, 17 Sep 2000 08:11:58 -0700 (PDT)
Received: from nlsys.demon.co.uk ([158.152.125.33] helo=herring.nlsystems.com)
	by anchor-post-34.mail.demon.net with esmtp (Exim 2.12 #1)
	id 13ag6l-000K24-0Y; Sun, 17 Sep 2000 16:11:56 +0100
Received: from salmon.nlsystems.com (salmon.nlsystems.com [10.0.0.3])
	by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id QAA34226;
	Sun, 17 Sep 2000 16:13:56 +0100 (BST)
	(envelope-from dfr@nlsystems.com)
Date: Sun, 17 Sep 2000 16:11:52 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Wilko Bulte <wkb@freebie.demon.nl>
Cc: Matthew Jacob <mjacob@feral.com>,
	Bernd Walter <ticso@cicely5.cicely.de>,
	John Baldwin <jhb@pike.osd.bsdi.com>, alpha@freebsd.org,
	smp@freebsd.org
Subject: Re: Prelimiary interrupt thread patches for alpha
In-Reply-To: <20000917162006.B49643@freebie.demon.nl>
Message-ID: <Pine.BSF.4.21.0009171611110.86297-100000@salmon.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sun, 17 Sep 2000, Wilko Bulte wrote:

> On Sun, Sep 17, 2000 at 02:31:49PM +0100, Doug Rabson wrote:
> > On Sat, 16 Sep 2000, Matthew Jacob wrote:
> > 
> > > 
> > > I can look at Rawhide and TurboLaser next week when my temperature comes down
> > > (flu). Doug- you have a rawhide. I have the turbolasers. Why don't y'all
> > > check the patch in so we have something same to work with?
> > 
> > I'm looking at rawhide right now, so hopefully the patch should contain
> > support for that when it's committed. We can leave turbolaser for you then.
> 
> Does this imply 'loader' is working again or am I reading something that is
> not implied?

I'm not looking at loader at all right now. I'm netbooting all my test
machines and netboot isn't affected by the loader size problems.

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.			Phone: +44 20 8348 3944




From owner-freebsd-smp  Sun Sep 17  8:19:19 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7C20537B424; Sun, 17 Sep 2000 08:19:09 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id RAA01381;
	Sun, 17 Sep 2000 17:18:46 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8HGIiF01053;
	Sun, 17 Sep 2000 18:18:44 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Sun, 17 Sep 2000 18:18:39 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Greg Lehey <grog@lemis.com>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <20000917102824.C42114@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0009171817301.1035-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,
If the order of the ps macro output is correct, here are the backtraces of procs 35, 36 and 37:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sat Sep 16 19:32:53 CEST 2000
    root@nihil.plaut.de:/usr/obj/usr/src/sys/nihil
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 266615847 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (266.62-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 268369920 (262080K bytes)
config> #flags wdc0 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #flags wdc1 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #iosiz npx0 196608
Invalid command or syntax.  Type `?' for help.
config> #irq pcic0 11
Invalid command or syntax.  Type `?' for help.
config> quit
avail memory = 257589248 (251552K bytes)
Preloaded elf kernel "kernel.ko" at 0xc03ad000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03ad0ac.
Preloaded elf module "linux.ko" at 0xc03ad0fc.
Preloaded elf module "linprocfs.ko" at 0xc03ad19c.
Pentium Pro MTRR support enabled
VESA: v2.0, 2496k memory, flags:0x0, mode table:0xc031ee42 (1000022)
VESA: MagicGraph 256 AV 44K PRELIMINARY
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443BX host to PCI bridge (AGP disabled)> on motherboard
pci0: <PCI bus> on pcib0
pci0: <NeoMagic MagicMedia 256AV SVGA controller> at 4.0 irq 11
isab0: <Intel 82371AB PCI to ISA bridge> at device 5.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xfe60-0xfe6f at device 5.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 5.2 irq 11
pci0: <Intel 82371AB Power management controller> at 5.3
pci0: <Toshiba Fast Infra Red controller> at 9.0 irq 11
pcic-pci0: <Toshiba ToPIC97 PCI-CardBus Bridge> at device 11.0 on pci0
pcic-pci1: <Toshiba ToPIC97 PCI-CardBus Bridge> at device 11.1 on pci0
fdc0: <NEC 765 or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model GlidePoint, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
pps0: <Pulse per second Timing Interface> on ppbus0
pcic0: <Intel i82365> at port 0x3e0-0x3e1 on isa0
pcic0: Polling mode
pccard0: <PC Card bus -- kludge version> on pcic0
pccard1: <PC Card bus -- kludge version> on pcic0
unknown: <PNP0303> can't assign resources
unknown: <PNP0f13> can't assign resources
unknown: <PNP0700> can't assign resources
unknown: <PNP0501> can't assign resources
unknown: <PNP0401> can't assign resources
unknown: <PNP0e03> can't assign resources
pcm0: <Yamaha OPL-SAx> at port 0x220-0x233,0x530-0x537,0x388-0x38f,0x330-0x333,0x538-0x539 irq 5 drq 1,0 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
ad0: 24207MB <IBM-DARA-225000> [49184/16/63] at ata0-master using UDMA33
ad1: 6194MB <IBM-DADA-26480> [13424/15/63] at ata1-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
pccard: card inserted, slot 0
panic: from debugger

syncing disks... 
done
Uptime: 3h22m40s

dumping to dev #ad/0x20001, offset 2547840
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
475		dumppcb.pcb_cr3 = rcr3();
(kgdb) proc 37
(kgdb) bt
#0  mi_switch () at /usr/src/sys/kern/kern_synch.c:953
#1  0xc017e2f0 in msleep (ident=0xc9663f20, mtx=0x0, priority=0x110, wmesg=0xc028b101 "piperd", timo=0x0)
    at /usr/src/sys/kern/kern_synch.c:506
#2  0xc018e5bc in pipe_read (fp=0xc1258cc0, uio=0xc9666ec4, cred=0xc0ea9f80, flags=0x0, p=0xc7874a00)
    at /usr/src/sys/kern/sys_pipe.c:445
#3  0xc018d01e in dofileread (p=0xc7874a00, fp=0xc1258cc0, fd=0x0, buf=0x80ac000, nbyte=0x2800, 
    offset=0xffffffffffffffff, flags=0x0) at /usr/src/sys/sys/file.h:141
#4  0xc018cf47 in read (p=0xc7874a00, uap=0xc9666f80) at /usr/src/sys/kern/sys_generic.c:110
#5  0xc0261aec in syscall2 (frame={tf_fs = 0x2f, tf_es = 0x80b002f, tf_ds = 0xbfbf002f, tf_edi = 0xbfbffd60, 
      tf_esi = 0x1, tf_ebp = 0xbfbffc6c, tf_isp = 0xc9666fd4, tf_ebx = 0x1, tf_edx = 0x2800, tf_ecx = 0x80ae600, 
      tf_eax = 0x3, tf_trapno = 0x7, tf_err = 0x2, tf_eip = 0x8089494, tf_cs = 0x1f, tf_eflags = 0x297, 
      tf_esp = 0xbfbffc40, tf_ss = 0x2f}) at /usr/src/sys/i386/i386/trap.c:1136
#6  0xc0255b0f in Xint0x80_syscall ()
#7  0x80499e5 in ?? ()
#8  0x80481fd in ?? ()
#9  0x8050976 in ?? ()
#10 0x80505db in ?? ()
#11 0x80528f6 in ?? ()
#12 0x8048135 in ?? ()
(kgdb) proc 36
(kgdb) bt
#0  mi_switch () at /usr/src/sys/kern/kern_synch.c:953
#1  0xc017e2f0 in msleep (ident=0xc02f4220, mtx=0xc0322fc0, priority=0x2, wmesg=0xc02a2b00 "FFS node", timo=0x0)
    at /usr/src/sys/kern/kern_synch.c:506
#2  0xc01762bc in malloc (size=0x104, type=0xc02f4220, flags=0x0) at /usr/src/sys/kern/kern_malloc.c:171
#3  0xc021fecb in ffs_vget (mp=0xc1070000, ino=0x1dddba, vpp=0xc960bd40) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1045
#4  0xc0223575 in ufs_lookup (ap=0xc960bd98) at /usr/src/sys/ufs/ufs/ufs_lookup.c:551
#5  0xc0227bd1 in ufs_vnoperate (ap=0xc960bd98) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2263
#6  0xc01a6df3 in vfs_cache_lookup (ap=0xc960bdf0) at vnode_if.h:77
#7  0xc0227bd1 in ufs_vnoperate (ap=0xc960bdf0) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2263
#8  0xc01a9f44 in lookup (ndp=0xc960be6c) at vnode_if.h:52
#9  0xc01a9968 in namei (ndp=0xc960be6c) at /usr/src/sys/kern/vfs_lookup.c:153
#10 0xc01afaa9 in lstat (p=0xc7874bc0, uap=0xc960bf80) at /usr/src/sys/kern/vfs_syscalls.c:1787
#11 0xc0261aec in syscall2 (frame={tf_fs = 0x2f, tf_es = 0x80a002f, tf_ds = 0xbfbf002f, tf_edi = 0x80b8030, 
      tf_esi = 0x80b802c, tf_ebp = 0xbfbffa74, tf_isp = 0xc960bfd4, tf_ebx = 0x14, tf_edx = 0x80af200, 
      tf_ecx = 0x80af200, tf_eax = 0xbe, tf_trapno = 0x7, tf_err = 0x2, tf_eip = 0x805e354, tf_cs = 0x1f, 
      tf_eflags = 0x293, tf_esp = 0xbfbff9e8, tf_ss = 0x2f}) at /usr/src/sys/i386/i386/trap.c:1136
#12 0xc0255b0f in Xint0x80_syscall ()
#13 0x804b0cd in ?? ()
#14 0x804b0cd in ?? ()
#15 0x804b0cd in ?? ()
#16 0x804b0cd in ?? ()
#17 0x804a431 in ?? ()
#18 0x8052801 in ?? ()
#19 0x8048135 in ?? ()
(kgdb) proc 35
(kgdb) bt
#0  mi_switch () at /usr/src/sys/kern/kern_synch.c:953
#1  0xc017e2f0 in msleep (ident=0xc1d2fa00, mtx=0x0, priority=0x8, wmesg=0xc02a2c62 "inode", timo=0x0)
    at /usr/src/sys/kern/kern_synch.c:506
#2  0xc01750f2 in acquire (lkp=0xc1d2fa00, extflags=0x1000040, wanted=0x600) at /usr/src/sys/kern/kern_lock.c:147
#3  0xc017537c in lockmgr (lkp=0xc1d2fa00, flags=0x1010002, interlkp=0xc9b371ec, p=0xc7874d80)
    at /usr/src/sys/kern/kern_lock.c:354
#4  0xc01a8ba8 in vop_stdlock (ap=0xc9608d34) at /usr/src/sys/kern/vfs_default.c:243
#5  0xc0227bd1 in ufs_vnoperate (ap=0xc9608d34) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2263
#6  0xc01b35e3 in vn_lock (vp=0xc9b37180, flags=0x10002, p=0xc7874d80) at vnode_if.h:840
#7  0xc01abc8b in vget (vp=0xc9b37180, flags=0x2, p=0xc7874d80) at /usr/src/sys/kern/vfs_subr.c:1393
#8  0xc01a6d18 in vfs_cache_lookup (ap=0xc9608df0) at /usr/src/sys/kern/vfs_cache.c:470
#9  0xc0227bd1 in ufs_vnoperate (ap=0xc9608df0) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2263
#10 0xc01a9f44 in lookup (ndp=0xc9608e6c) at vnode_if.h:52
#11 0xc01a9968 in namei (ndp=0xc9608e6c) at /usr/src/sys/kern/vfs_lookup.c:153
#12 0xc01afaa9 in lstat (p=0xc7874d80, uap=0xc9608f80) at /usr/src/sys/kern/vfs_syscalls.c:1787
#13 0xc0261aec in syscall2 (frame={tf_fs = 0xc025002f, tf_es = 0x2f, tf_ds = 0x2f, tf_edi = 0x80b8030, 
      tf_esi = 0x80b802c, tf_ebp = 0xbfbffa6c, tf_isp = 0xc9608fd4, tf_ebx = 0x14, tf_edx = 0x80af200, 
      tf_ecx = 0x80af200, tf_eax = 0xbe, tf_trapno = 0x7, tf_err = 0x2, tf_eip = 0x805e354, tf_cs = 0x1f, 
      tf_eflags = 0x283, tf_esp = 0xbfbff9e0, tf_ss = 0x2f}) at /usr/src/sys/i386/i386/trap.c:1136
#14 0xc0255b0f in Xint0x80_syscall ()
#15 0x804b0cd in ?? ()
#16 0x804b0cd in ?? ()
#17 0x804b0cd in ?? ()
#18 0x804b0cd in ?? ()
#19 0x804a431 in ?? ()
#20 0x8052801 in ?? ()
#21 0x8048135 in ?? ()
(kgdb) quit

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Sun Sep 17  8:43: 4 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from post.mail.nl.demon.net (post-11.mail.nl.demon.net [194.159.73.21])
	by hub.freebsd.org (Postfix) with ESMTP
	id C578337B42C; Sun, 17 Sep 2000 08:42:58 -0700 (PDT)
Received: from [212.238.54.101] (helo=freebie.demon.nl)
	by post.mail.nl.demon.net with smtp (Exim 3.14 #4)
	id 13agan-000HZZ-00; Sun, 17 Sep 2000 15:42:58 +0000
Received: (from wkb@localhost)
	by freebie.demon.nl (8.11.0/8.11.0) id e8HFiB950228;
	Sun, 17 Sep 2000 17:44:11 +0200 (CEST)
	(envelope-from wkb)
Date: Sun, 17 Sep 2000 17:44:11 +0200
From: Wilko Bulte <wkb@freebie.demon.nl>
To: Doug Rabson <dfr@nlsystems.com>
Cc: Matthew Jacob <mjacob@feral.com>,
	Bernd Walter <ticso@cicely5.cicely.de>,
	John Baldwin <jhb@pike.osd.bsdi.com>, alpha@freebsd.org,
	smp@freebsd.org
Subject: Re: Prelimiary interrupt thread patches for alpha
Message-ID: <20000917174411.A50193@freebie.demon.nl>
References: <20000917162006.B49643@freebie.demon.nl> <Pine.BSF.4.21.0009171611110.86297-100000@salmon.nlsystems.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <Pine.BSF.4.21.0009171611110.86297-100000@salmon.nlsystems.com>; from dfr@nlsystems.com on Sun, Sep 17, 2000 at 04:11:52PM +0100
X-OS: FreeBSD 4.1-STABLE
X-PGP: finger wilko@freebsd.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sun, Sep 17, 2000 at 04:11:52PM +0100, Doug Rabson wrote:
> On Sun, 17 Sep 2000, Wilko Bulte wrote:
> 
> > On Sun, Sep 17, 2000 at 02:31:49PM +0100, Doug Rabson wrote:
> > > On Sat, 16 Sep 2000, Matthew Jacob wrote:

> > > I'm looking at rawhide right now, so hopefully the patch should contain
> > > support for that when it's committed. We can leave turbolaser for you then.
> > 
> > Does this imply 'loader' is working again or am I reading something that is
> > not implied?
> 
> I'm not looking at loader at all right now. I'm netbooting all my test
> machines and netboot isn't affected by the loader size problems.

OK, clear. Thanks

-- 
Wilko Bulte  	 					wilko@freebsd.org
							Arnhem, the Netherlands




From owner-freebsd-smp  Sun Sep 17 10:39:29 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from caspian.plutotech.com (caspian.plutotech.com [206.168.67.80])
	by hub.freebsd.org (Postfix) with ESMTP
	id E76BD37B423; Sun, 17 Sep 2000 10:39:25 -0700 (PDT)
Received: (from gibbs@localhost)
	by caspian.plutotech.com (8.9.3/8.9.1) id KAA01450;
	Thu, 24 Aug 2000 10:36:33 -0600 (MDT)
	(envelope-from gibbs)
Date: Thu, 24 Aug 2000 10:36:33 -0600 (MDT)
Message-Id: <200008241636.KAA01450@caspian.plutotech.com>
From: "Justin T. Gibbs" <gibbs@FreeBSD.org>
To: Mike Smith <msmith@FreeBSD.org>
Cc: smp@FreeBSD.org
Subject: Re: 4.0-R panic on Dell PowerEdge 2450 
X-Newsgroups: pluto.freebsd.smp
In-Reply-To: <200008221950.MAA21476@mass.osd.bsdi.com>
User-Agent: tin/1.4.2-20000205 ("Possession") (UNIX) (FreeBSD/5.0-CURRENT (i386))
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>> Can you elaborate? Is this a FreeBSD-specific problem? Does Dell know this?
> 
> It's a problem specific to interaction between this Seagate firmware and 
> the FreeBSD Adaptec driver, however it is a Seagate bug.  I believe that 
> Dell know about it; Seagate certainly do.

There is nothing to indicate that the problem is specific to either
Adaptec cards or FreeBSD.  In fact, Seagate originally performed the
fixes in response to problems under Netware.  Essentially, if you
fill up the cache on the Seagate in just the right way, it will
fall off the bus.

--
Justin




From owner-freebsd-smp  Sun Sep 17 15:10:41 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP
	id CC32F37B423; Sun, 17 Sep 2000 15:10:32 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e8HMAMk93586;
	Mon, 18 Sep 2000 07:40:22 +0930 (CST)
	(envelope-from grog)
Date: Mon, 18 Sep 2000 07:40:22 +0930
From: Greg Lehey <grog@lemis.com>
To: Michael Reifenberger <root@nihil.plaut.de>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
Message-ID: <20000918074021.E67912@wantadilla.lemis.com>
References: <20000917102824.C42114@wantadilla.lemis.com> <Pine.BSF.4.21.0009171817301.1035-100000@localhost> <20000917102824.C42114@wantadilla.lemis.com> <Pine.BSF.4.21.0009171622240.840-100000@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0009171622240.840-100000@localhost>; from root@nihil.plaut.de on Sun, Sep 17, 2000 at 04:29:41PM +0200
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sunday, 17 September 2000 at 16:29:41 +0200, Michael Reifenberger wrote:
> Hi,
> ...
>> The frames above are what the system went to as the result of your
>> debugger request.  I'd also be interested to see the output of the
>> 'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
>> 'ps' (the macro I promised above).
> (kgdb) icnt
> 1215544*        566*    0       0*      0       0       1       0
> 1555964*        0*      0*      0*      0       0*      22636*  11
> 1       0       0       0       0       0       441031
> imen: 6f0b
> (kgdb) ps
>   pid    proc    addr    uid  pri ppid  pgrp   flag stat comm         wchan
>    37 c7874a00 c9665000    0  32     6    36  004086  3  tar          piperd c9663f20
>    36 c7874bc0 c960a000    0  32     6    36  004006  3  tar          FFS node c02f4220
>    35 c7874d80 c9607000    0  32     6    35  004006  3  tar          inode c1d2fa00
>     6 c7874f40 c9604000    0  32     1     6  004086  3  sh           wait c7874f40
>     5 c7875100 c8295000    0   4     0     0  000204  3  syncer       syncer c03236e8
>     4 c78752c0 c8293000    0   4     0     0  100204  3  bufdaemon    psleep c03072f0
>     3 c7875480 c8291000    0   4     0     0  000204  3  vmdaemon     psleep c0317a00
>     2 c7875640 c828f000    0   4     0     0  100204  3  pagedaemon   psleep c02f5938
>    21 c7875800 c78d4000    0   1*    0     0  000204  2  irq8: rtc
>    20 c78759c0 c78d2000    0   1*    0     0  000204  2  irq0: clk
>    19 c7875b80 c78b0000    0   7*    0     0  000204  6  irq5: pcm0
>    18 c7875d40 c788e000    0   7*    0     0  000204  6  irq7: ppc0
>    17 c7875f00 c788c000    0   7*    0     0  000204  6  irq12: psm0
>    16 c78760c0 c788a000    0   7*    0     0  000204  2  irq1: atkbd0
>    15 c7876280 c7887000    0   6*    0     0  000204  6  irq6: fdc0
>    14 c7876440 c7885000    0   6*    0     0  000204  6  irq15: ata1
>    13 c7876600 c7883000    0   6*    0     0  000204  2  irq14: ata0
>    12 c78767c0 c7881000    0   4     0     0  000204  3  random       rndslp c0322934
>    11 c7876980 c787f000    0  15*    0     0  008204  6  softinterrupt
>    10 c7876b40 c787d000    0   4     0     0  008204  2  idle
>     1 c7876d00 c787b000    0   4     0     1  004284  3  init         wait c7876d00
>     0 c0322960 c03c0000    0   4     0     0  000204  3  swapper      sched c0322960
> ...
>> handler.  At this point, it would be very interesting to see the value
>> of p->p_comm, which is the process name at the end of the ps listing.
>>
>>> (kgdb) proc 35
>>
>> Why are you interested in this process?
> It was one of the tar's which I grabbed by hand (without your ps macro)
> ...
>
> What's next to show :-)

To quote:

>> At this point, it would be very interesting to see the value of
>> p->p_comm, which is the process name at the end of the ps listing.

You could also show the content of p->p_pid.  If you don't have a p
pointer in the frame you're looking at, use ((struct
*proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
need to know what is hanging.

I'm probably going on holiday for the rest of the week; somebody else
should pick this one up.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message


From owner-freebsd-smp  Sun Sep 17 15:24: 2 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id D06E037B422; Sun, 17 Sep 2000 15:23:56 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id AAA02625;
	Mon, 18 Sep 2000 00:23:36 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8HNNZr01576;
	Mon, 18 Sep 2000 01:23:35 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Mon, 18 Sep 2000 01:23:30 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Greg Lehey <grog@lemis.com>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <20000918074021.E67912@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0009180114302.1035-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> You could also show the content of p->p_pid.  If you don't have a p
> pointer in the frame you're looking at, use ((struct
> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
> need to know what is hanging.
Sorry doesn't seem to work:
(kgdb) p p->p_comm
No symbol "p" in current context.
(kgdb) p ((struct*proc)gd_curproc)->p_pid
A syntax error in expression, near `proc)gd_curproc)->p_pid'.
(kgdb) p ((struct *proc)gd_curproc)->p_comm
A syntax error in expression, near `proc)gd_curproc)->p_comm'.
(kgdb) p gd_curproc
$1 = 0xc78760c0


Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Sun Sep 17 15:25:33 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7A55B37B422; Sun, 17 Sep 2000 15:25:26 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e8HMPI000981;
	Mon, 18 Sep 2000 07:55:18 +0930 (CST)
	(envelope-from grog)
Date: Mon, 18 Sep 2000 07:55:18 +0930
From: Greg Lehey <grog@lemis.com>
To: Michael Reifenberger <root@nihil.plaut.de>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
Message-ID: <20000918075518.H67912@wantadilla.lemis.com>
References: <20000918074021.E67912@wantadilla.lemis.com> <Pine.BSF.4.21.0009180114302.1035-100000@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0009180114302.1035-100000@localhost>; from root@nihil.plaut.de on Mon, Sep 18, 2000 at 01:23:30AM +0200
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Monday, 18 September 2000 at  1:23:30 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> You could also show the content of p->p_pid.  If you don't have a p
>> pointer in the frame you're looking at, use ((struct
>> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
>> need to know what is hanging.
> Sorry doesn't seem to work:
> (kgdb) p p->p_comm
> No symbol "p" in current context.
> (kgdb) p ((struct*proc)gd_curproc)->p_pid
> A syntax error in expression, near `proc)gd_curproc)->p_pid'.
> (kgdb) p ((struct *proc)gd_curproc)->p_comm
> A syntax error in expression, near `proc)gd_curproc)->p_comm'.
> (kgdb) p gd_curproc
> $1 = 0xc78760c0

Oops, that's what comes of typing hurriedly early in the morning.

  p ((struct proc *)gd_curproc)->p_comm
  p ((struct proc *)gd_curproc)->p_pid

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-smp  Sun Sep 17 15:29:57 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7D8DA37B42C; Sun, 17 Sep 2000 15:29:53 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id AAA02659;
	Mon, 18 Sep 2000 00:29:34 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8HNTZO01607;
	Mon, 18 Sep 2000 01:29:35 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Mon, 18 Sep 2000 01:29:34 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Greg Lehey <grog@lemis.com>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <20000918075518.H67912@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0009180129040.1035-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> Oops, that's what comes of typing hurriedly early in the morning.
> 
>   p ((struct proc *)gd_curproc)->p_comm
>   p ((struct proc *)gd_curproc)->p_pid
Works better:
(kgdb) p ((struct proc *)gd_curproc)->p_comm
$6 = "irq1: atkbd0\000\000\000\000"
(kgdb) p ((struct proc *)gd_curproc)->p_pid
$7 = 0x10

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Sun Sep 17 15:36:16 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7C81F37B424; Sun, 17 Sep 2000 15:36:10 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e8HMZxs06146;
	Mon, 18 Sep 2000 08:05:59 +0930 (CST)
	(envelope-from grog)
Date: Mon, 18 Sep 2000 08:05:59 +0930
From: Greg Lehey <grog@lemis.com>
To: Michael Reifenberger <root@nihil.plaut.de>
Cc: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
Message-ID: <20000918080559.J67912@wantadilla.lemis.com>
References: <20000918075518.H67912@wantadilla.lemis.com> <Pine.BSF.4.21.0009180129040.1035-100000@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0009180129040.1035-100000@localhost>; from root@nihil.plaut.de on Mon, Sep 18, 2000 at 01:29:34AM +0200
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Monday, 18 September 2000 at  1:29:34 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> Oops, that's what comes of typing hurriedly early in the morning.
>>
>>   p ((struct proc *)gd_curproc)->p_comm
>>   p ((struct proc *)gd_curproc)->p_pid
> Works better:
> (kgdb) p ((struct proc *)gd_curproc)->p_comm
> $6 = "irq1: atkbd0\000\000\000\000"
> (kgdb) p ((struct proc *)gd_curproc)->p_pid
> $7 = 0x10

Hmm.  I suppose that's reasonable, since you've just pressed a key.

We obviously have a problem here, but I'm not going to be able to look
at it myself until Friday or Saturday.  Anybody else want to take a
look?  There's also the possibility that a problem I had seen and not
investigated could in fact be the same problem: I got it tarring and
untarring across an NFS connection.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-smp  Sun Sep 17 18:45:42 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP
	id 075D137B422; Sun, 17 Sep 2000 18:45:38 -0700 (PDT)
Received: (from jhb@localhost)
	by pike.osd.bsdi.com (8.9.3/8.9.3) id SAA52872;
	Sun, 17 Sep 2000 18:44:39 -0700 (PDT)
	(envelope-from jhb)
From: John Baldwin <jhb@pike.osd.bsdi.com>
Message-Id: <200009180144.SAA52872@pike.osd.bsdi.com>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <20000918080559.J67912@wantadilla.lemis.com> from Greg Lehey at
 "Sep 18, 2000 08:05:59 am"
To: Greg Lehey <grog@lemis.com>
Date: Sun, 17 Sep 2000 18:44:39 -0700 (PDT)
Cc: Michael Reifenberger <root@nihil.plaut.de>,
	FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Greg Lehey wrote:
> On Monday, 18 September 2000 at  1:29:34 +0200, Michael Reifenberger wrote:
> > On Mon, 18 Sep 2000, Greg Lehey wrote:
> > ...
> >> Oops, that's what comes of typing hurriedly early in the morning.
> >>
> >>   p ((struct proc *)gd_curproc)->p_comm
> >>   p ((struct proc *)gd_curproc)->p_pid
> > Works better:
> > (kgdb) p ((struct proc *)gd_curproc)->p_comm
> > $6 = "irq1: atkbd0\000\000\000\000"
> > (kgdb) p ((struct proc *)gd_curproc)->p_pid
> > $7 = 0x10
> 
> Hmm.  I suppose that's reasonable, since you've just pressed a key.
> 
> We obviously have a problem here, but I'm not going to be able to look
> at it myself until Friday or Saturday.  Anybody else want to take a
> look?  There's also the possibility that a problem I had seen and not
> investigated could in fact be the same problem: I got it tarring and
> untarring across an NFS connection.

Hmm, could it be lockmgr() related?

-- 

John Baldwin <jhb@bsdi.com> -- http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Sun Sep 17 21: 5:42 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from server.baldwin.cx (server.geekhouse.net [64.81.6.52])
	by hub.freebsd.org (Postfix) with ESMTP
	id DC4FE37B422; Sun, 17 Sep 2000 21:05:37 -0700 (PDT)
Received: from john.baldwin.cx (root@john.baldwin.cx [192.168.1.18])
	by server.baldwin.cx (8.9.3/8.9.3) with ESMTP id VAA63765;
	Sun, 17 Sep 2000 21:07:13 -0700 (PDT)
	(envelope-from john@baldwin.cx)
Received: (from john@localhost)
	by john.baldwin.cx (8.9.3/8.9.3) id VAA03411;
	Sun, 17 Sep 2000 21:06:49 -0700 (PDT)
	(envelope-from john)
Message-Id: <200009180406.VAA03411@john.baldwin.cx>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.BSF.4.21.0009171418520.86297-100000@salmon.nlsystems.com>
Date: Sun, 17 Sep 2000 21:06:49 -0700 (PDT)
From: John Baldwin <jhb@FreeBSD.ORG>
To: alpha@FreeBSD.ORG
Subject: Re: Prelimiary interrupt thread patches for alpha
Cc: smp@FreeBSD.ORG
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 17-Sep-00 Doug Rabson wrote:
> On Sat, 16 Sep 2000, Matthew Jacob wrote:
> 
>> 
>> I can look at Rawhide and TurboLaser next week when my temperature comes down
>> (flu). Doug- you have a rawhide. I have the turbolasers. Why don't y'all
>> check the patch in so we have something sane to work with?
> 
> I'm looking at rawhide right now, so hopefully the patch should contain
> support for that when it's committed. We can leave turbolaser for you then.

I've taken some things out of the ithreads patch and updated it again.
Basically, I split the KTR changes and kern_shutdown.c changes out as
they are unrelated to ithreads.  Once this is committed I'll look at doing
softinterrupts next.  I think I can just rip out the x86 softinterrupt
thread code that we have and make it MI with just a few tweaks.

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Mon Sep 18  1: 6:56 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id 740E937B440; Mon, 18 Sep 2000 01:06:53 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id KAA05062;
	Mon, 18 Sep 2000 10:06:01 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8I93No00378;
	Mon, 18 Sep 2000 11:03:23 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Mon, 18 Sep 2000 11:03:17 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: John Baldwin <jhb@pike.osd.bsdi.com>
Cc: Greg Lehey <grog@lemis.com>,
	FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <200009180144.SAA52872@pike.osd.bsdi.com>
Message-ID: <Pine.BSF.4.21.0009181102150.369-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sun, 17 Sep 2000, John Baldwin wrote:
...
> Hmm, could it be lockmgr() related?
How can I prove that?

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Tue Sep 19 11:29:38 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from midten.fast.no (midten.fast.no [213.188.8.11])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0CC4837B423; Tue, 19 Sep 2000 11:29:34 -0700 (PDT)
Received: from fast.no (IDENT:tegge@midten.fast.no [213.188.8.11])
	by midten.fast.no (8.9.3/8.9.3) with ESMTP id UAA68967;
	Tue, 19 Sep 2000 20:29:19 +0200 (CEST)
Message-Id: <200009191829.UAA68967@midten.fast.no>
To: root@nihil.plaut.de
Cc: grog@lemis.com, current@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
From: Tor.Egge@fast.no
In-Reply-To: Your message of "Sun, 17 Sep 2000 16:29:41 +0200 (CEST)"
References: <Pine.BSF.4.21.0009171622240.840-100000@localhost>
X-Mailer: Mew version 1.70 on Emacs 19.34.1
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Tue, 19 Sep 2000 20:29:18 +0200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> (kgdb) ps
>   pid    proc    addr    uid  pri ppid  pgrp   flag stat comm         wchan
>    37 c7874a00 c9665000    0  32     6    36  004086  3  tar          piperd c9663f20
>    36 c7874bc0 c960a000    0  32     6    36  004006  3  tar          FFS node c02f4220

This looks like you've hit the limit for the FFS node memory type.

vmstat -m will indicate if this is correct.

If you see something like

  Memory statistics by type                      Type  Kern
      Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
[....]
   FFS node262144 65536K  65536K 65536K  2024460    0     6  256
[....]
Memory Totals:  In Use    Free    Requests
                93897K    608K     9482590

(i.e. MemUse == Limit), then you've hit the limit.  The process
allocating an FFS node normally holds a vnode lock, resulting in
a cascade of vnode locks and a frozen system.

Increasing the kmem_map size (by setting a loader variable
(kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
the kernel config file) should help.
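
The check described above can be sketched as a small script.  This is
only an illustration, not part of vmstat: the function names are made up,
and the regular expression assumes the column layout of the sample row
shown earlier (including the type name running straight into the InUse
count).  A type name that itself ends in a digit would confuse it.

```python
import re

# Matches one row of the "Memory statistics by type" table, e.g.
#    FFS node262144 65536K  65536K 65536K  2024460    0     6  256
# Assumption: MemUse, HighUse and Limit always carry a "K" suffix.
ROW = re.compile(
    r"^\s*(?P<type>.*?)\s*(?P<inuse>\d+)\s+(?P<memuse>\d+)K"
    r"\s+(?P<highuse>\d+)K\s+(?P<limit>\d+)K\s+(?P<requests>\d+)"
)


def parse_row(line):
    """Return (type, memuse_kb, limit_kb), or None if the line is not a data row."""
    m = ROW.match(line)
    if m is None:
        return None
    return m.group("type"), int(m.group("memuse")), int(m.group("limit"))


def hit_limit(line):
    """True if the row shows MemUse at (or above) Limit."""
    row = parse_row(line)
    return row is not None and row[1] >= row[2]


if __name__ == "__main__":
    sample = "   FFS node262144 65536K  65536K 65536K  2024460    0     6  256"
    print(parse_row(sample), hit_limit(sample))
```

Feeding each line of the vmstat -m output through hit_limit() would flag
any memory type that has reached its limit, not just FFS node.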

- Tor Egge




From owner-freebsd-smp  Tue Sep 19 12:50:54 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id E65DD37B422; Tue, 19 Sep 2000 12:50:50 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id VAA20798;
	Tue, 19 Sep 2000 21:49:54 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8JK9Ud00665;
	Tue, 19 Sep 2000 22:09:30 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Tue, 19 Sep 2000 22:09:30 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Tor.Egge@fast.no
Cc: root@nihil.plaut.de, grog@lemis.com,
	FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <200009191829.UAA68967@midten.fast.no>
Message-ID: <Pine.BSF.4.21.0009192206550.600-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, 19 Sep 2000 Tor.Egge@fast.no wrote:
...
> This looks like you've hit the limit for the FFS node memory type.
> 
> vmstat -m will indicate if this is correct.
> 
...
> Increasing the kmem_map size (by setting a loader variable
> (kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
> the kernel config file) should help.
I'll try. Thanks for the hint!
BTW: Is it possible to dynamically adjust the limit of the node mem?
The system shouldn't freeze anyway.

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Tue Sep 19 12:51:46 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ns.plaut.de (ns.plaut.de [194.99.75.166])
	by hub.freebsd.org (Postfix) with ESMTP
	id D53A337B424; Tue, 19 Sep 2000 12:51:39 -0700 (PDT)
Received: (from uucp@localhost)
	by ns.plaut.de (8.9.3/8.9.3) with UUCP id VAA20806;
	Tue, 19 Sep 2000 21:50:45 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Received: from localhost (root@localhost)
	by nihil.plaut.de (8.11.0/8.8.8) with ESMTP id e8JKnm000329;
	Tue, 19 Sep 2000 22:49:48 +0200 (CEST)
	(envelope-from root@nihil.plaut.de)
Date: Tue, 19 Sep 2000 22:49:47 +0200 (CEST)
From: Michael Reifenberger <root@nihil.plaut.de>
To: Tor.Egge@fast.no
Cc: grog@lemis.com, FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Re: Debugging -current SMPNG HANG on heavy disk-io
In-Reply-To: <200009191829.UAA68967@midten.fast.no>
Message-ID: <Pine.BSF.4.21.0009192238100.289-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,
On Tue, 19 Sep 2000 Tor.Egge@fast.no wrote:
...
> This looks like you've hit the limit for the FFS node memory type.

BINGO!

>    FFS node262144 65536K  65536K 65536K  2024460    0     6  256

So the symptom is clear. But the cause?

With pre-SMPng I had the default kmem size (which is 12MB, I think).
Now I bumped kern.vm.kmem.size 4 times to 40960000 (which leads to 20k max-mem
for FFS node) and still can't tar /usr/ports to /dev/null!

Where does the increased memory consumption come from?!?

BTW: What is the KMEM exactly:
Kernel real memory? 
Kernel virtual memory?

Thanks anyway for your answer and your efforts!

Bye!
----
Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS




From owner-freebsd-smp  Wed Sep 20  0:46:55 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4B9C937B424; Wed, 20 Sep 2000 00:46:51 -0700 (PDT)
Received: from foo.osd.bsdi.com (root@foo.osd.bsdi.com [204.216.28.137])
	by pike.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8K7kji66814;
	Wed, 20 Sep 2000 00:46:45 -0700 (PDT)
	(envelope-from jhb@foo.osd.bsdi.com)
Received: (from jhb@localhost)
	by foo.osd.bsdi.com (8.11.0/8.11.0) id e8K7iPx93296;
	Wed, 20 Sep 2000 00:44:25 -0700 (PDT)
	(envelope-from jhb)
Message-ID: <XFMail.000920004424.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.BSF.4.21.0009200514520.2289-100000@besplex.bde.org>
Date: Wed, 20 Sep 2000 00:44:24 -0700 (PDT)
Organization: BSD, Inc.
From: John Baldwin <jhb@FreeBSD.ORG>
To: Bruce Evans <bde@zeta.org.au>
Subject: Re: recent kernel, microuptime went backwards
Cc: current@FreeBSD.ORG, "Andrey A. Chernov" <ache@nagual.pp.ru>,
	smp@FreeBSD.ORG
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 19-Sep-00 Bruce Evans wrote:
> On Tue, 19 Sep 2000, Andrey A. Chernov wrote:
> 
>> With very latest kernel I got lots of
>> 
>> microuptime() went backwards (1.3624050 -> 1.998840)
>> 
>> messages just before
>> 
>> Mounting root from ufs:/dev/da0s1a
> 
> It really does go backwards.  This is caused by the giant lock preventing
> the clock interrupt task from running soon enough.  The giant lock can
> also prevent the clock interrupt task from running often enough even
> after booting.  E.g., "dd if=/dev/random of=/dev/null bs=large" does
> several bad things.

It's not the Giant lock that is at fault.  We give up Giant during mi_switch().
The scheduling problem is in the way that the top-level scheduler runs.  We
decide to schedule another process due to the timeslice ending during the clk
interrupt thread.  In the past, this was not run as a thread, so it ran, set
the AST_* constant for needing a resched and then exited.  During doreti, we
notice an AST is pending and call ast(). ast() calls userret() which notices
that a resched is needed and calls mi_switch().  In the New World Order, when
the clock interrupt occurs, we set the AST_* constant for every interrupt
before returning from sched_ithd().  This results in the actual interrupt
threads being scheduled from ast().  However, when the clk ithread finishes, it
simply calls mi_switch() to enter the next process in ithd_loop().  The
need_resched() that it sets isn't handled until the next call to userret()
either via a hardware interrupt or a syscall return.  Thus, the problem isn't
due to Giant, but rather to interrupt threads.  As for the microuptime()
messages on boot, they only occur here on a UP kernel.  On an SMP kernel I
don't get them.  Also, they always occur during mi_switch() when an interrupt
thread is finishing and going back to sleep.  The first such thread to be run
to generate that error message is the irq0: clk ithread, so the clk ithread is
running fine.

> Bruce

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Wed Sep 20  6:59: 8 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from gidora.zeta.org.au (gidora.zeta.org.au [203.26.10.25])
	by hub.freebsd.org (Postfix) with SMTP id 9B2EB37B43E
	for <smp@FreeBSD.ORG>; Wed, 20 Sep 2000 06:59:00 -0700 (PDT)
Received: (qmail 23775 invoked from network); 20 Sep 2000 13:58:51 -0000
Received: from unknown (HELO bde.zeta.org.au) (203.2.228.102)
  by gidora.zeta.org.au with SMTP; 20 Sep 2000 13:58:51 -0000
Date: Thu, 21 Sep 2000 00:58:47 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-Sender: bde@besplex.bde.org
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: current@FreeBSD.ORG, "Andrey A. Chernov" <ache@nagual.pp.ru>,
	smp@FreeBSD.ORG
Subject: Re: recent kernel, microuptime went backwards
In-Reply-To: <XFMail.000920004424.jhb@FreeBSD.org>
Message-ID: <Pine.BSF.4.21.0009210008050.3475-100000@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, 20 Sep 2000, John Baldwin wrote:

> On 19-Sep-00 Bruce Evans wrote:
> > It really does go backwards.  This is caused by the giant lock preventing
> > the clock interrupt task from running soon enough.  The giant lock can
> > also prevent the clock interrupt task from running often enough even
> > after booting.  E.g., "dd if=/dev/random of=/dev/null bs=large" does
> > several bad things.
> 
> It's not the Giant lock that is at fault.  We give up Giant during mi_switch().
> The scheduling problem is in the way that the top-level scheduler runs.

Then the scheduler is more broken than I thought :-).

Initially there may be a locking problem as well as a scheduling problem.
Giving up Giant in the first mi_switch() is a bit late.  mi_switch() uses
microuptime(), and the clock task needs to be run before then to finish
initialization of the timecounter.

> We
> decide to schedule another process due to the timeslice ending during the clk
> interrupt thread.

How can this work?  The timeslice accounting stuff doesn't get updated until
the clock task runs.

> In the past, this was not run as a thread, so it ran, set
> the AST_* constant for needing a resched and then exited.  During doreti, we
> notice an AST is pending and call ast(). ast() calls userret() which notices
> that a resched is needed and calls mi_switch().  In the New World Order, when
> the clock interrupt occurs, we set the AST_* constant for every interrupt
> before returning from sched_ithd().  This results in the actual interrupt
> threads being scheduled from ast().

What should happen is for ast() to normally schedule the clock interrupt
(and other interrupts) immediately (unless they are blocked).  This doesn't
seem to be working, and I can't see how it can work, since there is nothing
except the giant lock to tell us whether interrupts are blocked, and the
giant lock is held most of the time in system mode.  Previously, cpl told
us, but cpl is no longer maintained.


> However, when the clk ithread finishes, it
> simply calls mi_switch() to enter the next process in ithd_loop().  The
> need_resched() that it sets isn't handled until the next call to userret()
> either via a hardware interrupt or a syscall return.  Thus, the problem isn't
> due to Giant, but rather to interrupt threads.

I think this is a different problem.  It is similar to a problem for
scheduling netisrs from non-interrupt context.  schednetisr() sets the
AST flag and some other flags.  Nothing looks at these flags until an
interrupt occurs or the process sleeps.  Previously, this was handled in
splx():

	s = splnet();		/* s == 0 in process context. */
	queue_net_output(...);
	schednetisr(...);
	splx(s);		/* Since s == 0, the netisr gets run here. */

Will this work again as soon as splx() is replaced by mtx_exit(), etc?
We only have a few thousand spls to change :(.

> As for the microuptime()
> messages on boot, they only occur here on a UP kernel.  On an SMP kernel I
> don't get them.  Also, they always occur during mi_switch() when an interrupt
> thread is finishing and going back to sleep.  The first such thread to be run
> to generate that error message is the irq0: clk ithread, so the clk ithread is
> running fine.

They are very timing dependent, and probably also very task-mix
dependent.  The primary cause of microuptime() going backwards is
tv_nsec overflowing if the system takes longer than 2^32 nsec (about
4 seconds) between the initialization of the timecounter and the
timecounter maintenance for the first clock interrupt.  On one of my
systems, the first thread to call mi_switch() is the generic thread
(proc0?) that executes run_interrupt_driven_hooks().  mi_switch() is
called for the first time when the ata hook goes to sleep.  Things
would be a little different for SMP.  Hopefully another cpu handles
the clock interrupt.
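
The arithmetic behind the "about 4 seconds" figure, as a minimal sketch.
It assumes that the nanosecond count is kept in an unsigned 32-bit
quantity, which is what the 2^32 figure suggests; the names are
illustrative and not taken from the kernel source.

```python
NSEC_PER_SEC = 10**9
WRAP = 2**32  # assumed width of the nanosecond counter: 32 bits


def wrapped_nsec(elapsed_nsec):
    """Value a 32-bit counter would hold after elapsed_nsec nanoseconds."""
    return elapsed_nsec % WRAP


if __name__ == "__main__":
    # Longest interval that still fits: 2^32 ns, i.e. about 4.29 seconds.
    print(WRAP / NSEC_PER_SEC)
    # A 5 second gap wraps around and looks like roughly 0.7 seconds,
    # so a later reading can appear to be earlier than a previous one.
    print(wrapped_nsec(5 * NSEC_PER_SEC) / NSEC_PER_SEC)
```

Any gap shorter than 2^32 nsec comes back unchanged from wrapped_nsec(),
which is why the messages only show up when the first clock interrupt is
handled late.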

Bruce




From owner-freebsd-smp  Wed Sep 20  7:13:50 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from critter.freebsd.dk (flutter.freebsd.dk [212.242.40.147])
	by hub.freebsd.org (Postfix) with ESMTP
	id 2B8CC37B422; Wed, 20 Sep 2000 07:13:45 -0700 (PDT)
Received: from critter (localhost [127.0.0.1])
	by critter.freebsd.dk (8.11.0/8.9.3) with ESMTP id e8KD2HN96421;
	Wed, 20 Sep 2000 15:02:18 +0200 (CEST)
	(envelope-from phk@critter.freebsd.dk)
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: Bruce Evans <bde@zeta.org.au>, current@FreeBSD.ORG,
	"Andrey A. Chernov" <ache@nagual.pp.ru>, smp@FreeBSD.ORG
Subject: Re: recent kernel, microuptime went backwards 
In-Reply-To: Your message of "Wed, 20 Sep 2000 00:44:24 PDT."
             <XFMail.000920004424.jhb@FreeBSD.org> 
Date: Wed, 20 Sep 2000 15:02:17 +0200
Message-ID: <96419.969454937@critter>
From: Poul-Henning Kamp <phk@critter.freebsd.dk>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

In message <XFMail.000920004424.jhb@FreeBSD.org>, John Baldwin writes:

>As for the microuptime()
>messages on boot, they only occur here on a UP kernel.  On an SMP kernel I
>don't get them.  Also, they always occur during mi_switch() when an interrupt
>thread is finishing and going back to sleep.  The first such thread to be run
>to generate that error message is the irq0: clk ithread, so the clk ithread is
>running fine.

The microuptime() messages occur because the timecounters expect the
i8254 clock interrupt to run "hz" times per second, and it doesn't.

In particular, it doesn't during the 10-20 seconds we spend probing and
attaching devices.
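
To make the failure mode concrete, here is a rough sketch in C of why
missed clock interrupts can make the computed uptime step backwards. It
is a toy model, not the kernel's timecounter code: the names
COUNTS_PER_TICK, model_uptime() and count_backward_steps() are invented
for illustration, the counter is assumed to count up and wrap once per
tick (the real i8254 counts down), and the tick count is assumed to
advance only when the interrupt is serviced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model constant: counter steps between two clock interrupts
 * (roughly 1193182 Hz / hz with hz = 100; the exact value does not matter). */
#define COUNTS_PER_TICK 11932u

/* Uptime as the software sees it: the ticks it has accounted for, plus the
 * current reading of the wrapping hardware counter. */
static uint64_t model_uptime(uint64_t ticks_handled, uint32_t hw_count)
{
    return ticks_handled * COUNTS_PER_TICK + hw_count;
}

/* Sample the modelled uptime every `step` counts of true time and return
 * how often a sample is smaller than the previous one. */
static int count_backward_steps(uint64_t total_counts, uint32_t step,
                                int irq_serviced)
{
    uint64_t prev = 0;
    int backwards = 0;

    assert(step > 0);
    for (uint64_t t = 0; t < total_counts; t += step) {
        uint32_t hw_count = (uint32_t)(t % COUNTS_PER_TICK);
        /* With the interrupt serviced, every wrap is accounted for;
         * without it, the tick count stays stuck at zero. */
        uint64_t ticks = irq_serviced ? t / COUNTS_PER_TICK : 0;
        uint64_t now = model_uptime(ticks, hw_count);

        if (now < prev)
            backwards++;
        prev = now;
    }
    return backwards;
}
```

In this model, sampling five ticks' worth of time every 1000 counts
gives no backward steps while the interrupt is serviced, and one
backward step per counter wrap (four in total) when it is not.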

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.




From owner-freebsd-smp  Wed Sep 20  7:54:26 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from superconductor.rush.net (superconductor.rush.net [208.9.155.8])
	by hub.freebsd.org (Postfix) with ESMTP
	id DEBCD37B423; Wed, 20 Sep 2000 07:54:22 -0700 (PDT)
Received: from localhost (trish@localhost)
	by superconductor.rush.net (8.9.3/8.9.3) with ESMTP id KAA18244;
	Wed, 20 Sep 2000 10:54:21 -0400 (EDT)
Date: Wed, 20 Sep 2000 10:54:21 -0400 (EDT)
From: Siobhan Patricia Lynch <trish@bsdunix.net>
X-Sender: trish@superconductor.rush.net
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: Bruce Evans <bde@zeta.org.au>, current@FreeBSD.ORG,
	"Andrey A. Chernov" <ache@nagual.pp.ru>, smp@FreeBSD.ORG
Subject: Re: recent kernel, microuptime went backwards
In-Reply-To: <XFMail.000920004424.jhb@FreeBSD.org>
Message-ID: <Pine.BSO.4.21.0009201047110.16963-100000@superconductor.rush.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

John,
	I get these on an SMP kernel, and the box locks up; I can't even
figure out exactly where it's happening. Maybe I'm just missing something
in my kernel config file? I assumed (from UPDATING) that no real change
was needed to the SMP options.

The hardware is an Intel N440BX motherboard with two PII333 procs.

I'm not opposed to the possibility that this is *my* error and not SMPNG's.

PRE_SMPNG is VERY stable on this box, however the constant crashes (mostly
during high I/O periods, such as make world) have made it impossible to
use the machine for any type of real development under SMPNG. (which if I
had time since the promotion, I'd be a bit crazy right now)

If there is anything I can do to help debug this, let me know; you can
email me privately.

-Trish


__

Trish Lynch
FreeBSD - The Power to Serve 		trish@bsdunix.net
Rush Networking				trish@rush.net
VA Linux Systems			trish@valinux.com
O|S|D|N					trish@osdn.com
---

	"So if you ask me how do I feel inside I could honestly
         Tell you we've been taken on a very long ride
         And if my owners let me have some free time some day
         With all good intention I would probably run away
         Clutching the short straw"
		-Marillion, That Time of the Night




From owner-freebsd-smp  Wed Sep 20  8:14:59 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from superconductor.rush.net (superconductor.rush.net [208.9.155.8])
	by hub.freebsd.org (Postfix) with ESMTP
	id 6A64437B422; Wed, 20 Sep 2000 08:14:55 -0700 (PDT)
Received: from localhost (trish@localhost)
	by superconductor.rush.net (8.9.3/8.9.3) with ESMTP id LAA27302;
	Wed, 20 Sep 2000 11:14:54 -0400 (EDT)
Date: Wed, 20 Sep 2000 11:14:54 -0400 (EDT)
From: Siobhan Patricia Lynch <trish@bsdunix.net>
X-Sender: trish@superconductor.rush.net
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: current@FreeBSD.ORG, smp@FreeBSD.ORG
Subject: Re: recent kernel, microuptime went backwards
In-Reply-To: <Pine.BSO.4.21.0009201047110.16963-100000@superconductor.rush.net>
Message-ID: <Pine.BSO.4.21.0009201112330.16963-100000@superconductor.rush.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

More info:
the N440BX has an onboard Symbios SCSI controller and an onboard fxp
Ethernet interface. I'm using relatively old SCSI drives (4500 RPM
Seagates and HP drives). I have softupdates on, and I'm using
COMPAT_OLDPCI for now, although I only turned it back on after seeing
that it was the only real change I had made to my kernel config between
pre- and post-SMPNG.

-Trish

__

Trish Lynch
FreeBSD - The Power to Serve 		trish@bsdunix.net
Rush Networking				trish@rush.net
VA Linux Systems			trish@valinux.com
O|S|D|N					trish@osdn.com
---

	"what makes me think i could start clean slated
	 the hardest to learn was the least complicated"
		-Indigo Girls, Least Complicated




From owner-freebsd-smp  Wed Sep 20  9:28:46 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from earth.backplane.com (backplane-inc.SanFranciscosfd.cw.net [206.24.214.242])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5B50F37B423; Wed, 20 Sep 2000 09:28:43 -0700 (PDT)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.0/8.9.3) id e8KGSha44044;
	Wed, 20 Sep 2000 09:28:43 -0700 (PDT)
	(envelope-from dillon)
Date: Wed, 20 Sep 2000 09:28:43 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200009201628.e8KGSha44044@earth.backplane.com>
To: FreeBSD-Current <current@FreeBSD.ORG>,
	FreeBSD-SMP <freebsd-smp@FreeBSD.ORG>
Subject: Congrats on the SMPng work!
References: <20000918075518.H67912@wantadilla.lemis.com> <Pine.BSF.4.21.0009180129040.1035-100000@localhost> <20000918080559.J67912@wantadilla.lemis.com>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

    Sniff!  I feel so left out, I have so little time to play these days.

    You guys are all doing really exciting work, congratulations!

						-Matt




From owner-freebsd-smp  Wed Sep 20 10:11:13 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP
	id 9BDB537B424; Wed, 20 Sep 2000 10:11:09 -0700 (PDT)
Received: from foo.osd.bsdi.com (root@foo.osd.bsdi.com [204.216.28.137])
	by pike.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8KHAvi79595;
	Wed, 20 Sep 2000 10:10:57 -0700 (PDT)
	(envelope-from jhb@foo.osd.bsdi.com)
Received: (from jhb@localhost)
	by foo.osd.bsdi.com (8.11.0/8.11.0) id e8KH9Na15429;
	Wed, 20 Sep 2000 10:09:23 -0700 (PDT)
	(envelope-from jhb)
Message-ID: <XFMail.000920100923.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.BSO.4.21.0009201047110.16963-100000@superconductor.rush.net>
Date: Wed, 20 Sep 2000 10:09:23 -0700 (PDT)
Organization: BSD, Inc.
From: John Baldwin <jhb@FreeBSD.ORG>
To: Siobhan Patricia Lynch <trish@bsdunix.net>
Subject: Re: recent kernel, microuptime went backwards
Cc: smp@FreeBSD.ORG, "Andrey A. Chernov" <ache@nagual.pp.ru>,
	current@FreeBSD.ORG, Bruce Evans <bde@zeta.org.au>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 20-Sep-00 Siobhan Patricia Lynch wrote:
> John,
>       I get these on an SMP kernel, which locks up the box, I can't even
> figure out where exactly its happening. Maybe I'm just missing something
> in my kernel config file? I assumed (from UPDATING) that no real change
> was
> needed to the SMP options?

No, it is probably a bug in the code with this scheduling mess.  On the
quad xeon here, very recent kernels are spinning during the acd0
probe with interrupts disabled, so there may be a few places where
interrupts aren't being handled properly.

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Wed Sep 20 13:23:41 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ipamzlx.physik.uni-mainz.de (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by hub.freebsd.org (Postfix) with ESMTP
	id 57A5837B422; Wed, 20 Sep 2000 13:23:37 -0700 (PDT)
Received: from ipamzlx.Physik.Uni-Mainz.DE (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by ipamzlx.physik.uni-mainz.de (8.11.0/8.9.3) with ESMTP id e8KKQHo00515;
	Wed, 20 Sep 2000 22:26:17 +0200 (CEST)
	(envelope-from ohartman@ipamzlx.physik.uni-mainz.de)
Date: Wed, 20 Sep 2000 22:26:16 +0200 (CEST)
From: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
To: freebsd-stable@freebsd.org
Cc: freebsd-smp@freebsd.org
Subject: Whats is this? FBSD 4.1 isn't stable!
Message-ID: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Today I got a fresh cvsup update. The server had been running for 5 days
non-stop without any problem. Then, after I got the latest delta files
about 4 hours ago, I tried to compile a "new world", but compilation
stopped due to an error in some OPIE stuff. Well, this is well known:
some stuff had been updated and we caught new stuff in the middle of an
update on the server. So, after waiting, I did a second cvsup and then
tried again to make world. And what happened then was simply "horror"!
First, the compiler stopped and reported some kind of error and SIG 11.
Well, SIG 11 is a rare error message on my machine. But then, a few
seconds later, the system rebooted: without any warning, without a core
dump, without anything! Does this count as stable, when I simply compile
something and then the machine crashes? I think this is a kind of
behaviour that must not happen! I don't think I have faulty hardware,
because this server has now been running for about two and a half years
without any hardware fault!

Has anyone out there had the same experience? What happened?

Gruss O. Hartmann
-------------------------------------------------------------------
ohartman@ipamzlx.physik.uni-mainz.de

Klimadatenserver des IPA, Universitaet Mainz
Netzwerk- und Systembetreuung




From owner-freebsd-smp  Wed Sep 20 13:34: 3 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from veldy.net (w028.z064001117.msp-mn.dsl.cnc.net [64.1.117.28])
	by hub.freebsd.org (Postfix) with ESMTP
	id 52CBD37B422; Wed, 20 Sep 2000 13:33:58 -0700 (PDT)
Received: from tdtemp26 (veldy.net [64.1.117.28])
	by veldy.net (Postfix) with SMTP
	id 2A7868C37; Wed, 20 Sep 2000 15:35:17 -0500 (CDT)
Message-ID: <009201c02342$1a12fe80$ff01060a@metamoris.com>
From: "Thomas T. Veldhouse" <veldy@veldy.net>
To: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>,
	<freebsd-stable@freebsd.org>
Cc: <freebsd-smp@freebsd.org>
References: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de>
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Date: Wed, 20 Sep 2000 15:33:54 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

You can hardly say it is not stable.  You might not be able to build it -
but your system is plenty stable.  Let's be clear here :)

Tom Veldhouse
veldy@veldy.net





From owner-freebsd-smp  Wed Sep 20 13:36:33 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0EDCE37B424; Wed, 20 Sep 2000 13:36:30 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8KKZ8914257;
	Wed, 20 Sep 2000 13:35:08 -0700 (PDT)
Date: Wed, 20 Sep 2000 13:35:08 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
Cc: freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Message-ID: <20000920133507.J9141@fw.wintelcom.net>
References: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de>; from ohartman@ipamzlx.physik.uni-mainz.de on Wed, Sep 20, 2000 at 10:26:16PM +0200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* O. Hartmann <ohartman@ipamzlx.physik.uni-mainz.de> [000920 13:25] wrote:
> Today I got a fresh CVSUPdate. The server is running since 5 days non
> stop without
> any problem within this time. Then, when I got the last delta files
> about 4 hours ago, I tried to compile a "new world" - but compilation stops
> due to an error in some OPIE stuff. Well, this is well known, some stuff has
> been updated and we catch new stuff in the middle of the update duty on the 
> server. So, waiting ... I did a second cvsupdate and the I tried again to
> make a world. And what then happened is simply "horror"! First, the compiler stops
> and reported aome kind of error and SIG 11. Well, SIG 11 is a rar error message
> on my machine ... but then, a few seconds later, the system reboots - without any warnings,
> without core dump, without nothing! Well - does this mean stable when I simply compile
> something and then the machine crashes? I think this is a kind of behaviour that must
> not happen! I think not to have faulty hardware, because this server runs now for about
> 2 and a half year without any hardware fault!
> 
> Is anyone out here who made the same experiences? What happened?

First off, please wrap lines at 70 characters.

The compiler giving "sig 11" problems and rebooting is a sure
indication of a memory, connectivity (loose connectors) or a heating
issue.

Swap out the memory, check your cooling systems and make sure all
your cards and cables are seated correctly.

-Alfred




From owner-freebsd-smp  Wed Sep 20 13:37: 4 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id 0270737B422
	for <freebsd-smp@FreeBSD.ORG>; Wed, 20 Sep 2000 13:37:02 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8KKaDC14332;
	Wed, 20 Sep 2000 13:36:13 -0700 (PDT)
Date: Wed, 20 Sep 2000 13:36:13 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: "Thomas T. Veldhouse" <veldy@veldy.net>
Cc: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>,
	freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Message-ID: <20000920133613.K9141@fw.wintelcom.net>
References: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de> <009201c02342$1a12fe80$ff01060a@metamoris.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <009201c02342$1a12fe80$ff01060a@metamoris.com>; from veldy@veldy.net on Wed, Sep 20, 2000 at 03:33:54PM -0500
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* Thomas T. Veldhouse <veldy@veldy.net> [000920 13:34] wrote:
> You can hardly say it is not stable.  You might not be able to build it -
> but your system is plenty stable.  Let's be clear here :)

No, if you're tolerant enough of eye strain to parse his mail, he
complains that after starting the second make world his system
rebooted.

-Alfred


-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."




From owner-freebsd-smp  Wed Sep 20 13:37:40 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP
	id 0198237B422; Wed, 20 Sep 2000 13:37:37 -0700 (PDT)
Received: from foo.osd.bsdi.com (root@foo.osd.bsdi.com [204.216.28.137])
	by pike.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8KKbUi87352;
	Wed, 20 Sep 2000 13:37:30 -0700 (PDT)
	(envelope-from jhb@foo.osd.bsdi.com)
Received: (from jhb@localhost)
	by foo.osd.bsdi.com (8.11.0/8.11.0) id e8KKZwC00818;
	Wed, 20 Sep 2000 13:35:58 -0700 (PDT)
	(envelope-from jhb)
Message-ID: <XFMail.000920133558.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de>
Date: Wed, 20 Sep 2000 13:35:58 -0700 (PDT)
Organization: BSD, Inc.
From: John Baldwin <jhb@FreeBSD.ORG>
To: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
Subject: RE: Whats is this? FBSD 4.1 isn't stable!
Cc: freebsd-smp@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 20-Sep-00 O. Hartmann wrote:
> not happen! I think not to have faulty hardware, because this server runs now
> for about 2 and a half year without any hardware fault!

So?  Hardware goes bad.  I.e., it works fine for a while, and then
*wham* it breaks one day.  In fact, your hardware could have already been
somewhat marginal to begin with, and the stress of making world may have
caused something to overheat and cause the hardware to break, esp. memory
or CPU.  Bad memory and CPUs often cause sig 11s.

> Is anyone out here who made the same experiences? What happened?

I've only seen it rarely, and in every case it was hardware.

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Wed Sep 20 13:40:34 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from veldy.net (w028.z064001117.msp-mn.dsl.cnc.net [64.1.117.28])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7BBB537B423; Wed, 20 Sep 2000 13:40:28 -0700 (PDT)
Received: from tdtemp26 (veldy.net [64.1.117.28])
	by veldy.net (Postfix) with SMTP
	id 0E7CE8C27; Wed, 20 Sep 2000 15:41:42 -0500 (CDT)
Message-ID: <00b301c02343$02ad32a0$ff01060a@metamoris.com>
From: "Thomas T. Veldhouse" <veldy@veldy.net>
To: "Alfred Perlstein" <bright@wintelcom.net>
Cc: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>,
	<freebsd-smp@FreeBSD.ORG>, <freebsd-stable@freebsd.org>
References: <Pine.BSF.4.21.0009202218001.469-100000@ipamzlx.physik.uni-mainz.de> <009201c02342$1a12fe80$ff01060a@metamoris.com> <20000920133613.K9141@fw.wintelcom.net>
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Date: Wed, 20 Sep 2000 15:40:19 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

You are correct, and I apologize.

Tom Veldhouse
veldy@veldy.net





From owner-freebsd-smp  Wed Sep 20 13:47:29 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ipamzlx.physik.uni-mainz.de (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by hub.freebsd.org (Postfix) with ESMTP
	id E552F37B424; Wed, 20 Sep 2000 13:47:23 -0700 (PDT)
Received: from ipamzlx.Physik.Uni-Mainz.DE (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by ipamzlx.physik.uni-mainz.de (8.11.0/8.9.3) with ESMTP id e8KKo1o44888;
	Wed, 20 Sep 2000 22:50:02 +0200 (CEST)
	(envelope-from ohartman@ipamzlx.physik.uni-mainz.de)
Date: Wed, 20 Sep 2000 22:50:01 +0200 (CEST)
From: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
In-Reply-To: <20000920133507.J9141@fw.wintelcom.net>
Message-ID: <Pine.BSF.4.21.0009202239400.469-100000@ipamzlx.physik.uni-mainz.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, 20 Sep 2000, Alfred Perlstein wrote:
Dear A. Perlstein,
Well ... my sendmail is configured to wrap lines at 70 characters; I do not
know why it is not working.

The problem I described is simply not what I would expect from a hardware
malfunction. I already checked all connectors and the memory: nothing!
Nothing seems to be different from a few days ago. Overheating is not
on the list of possibilities: due to changing weather conditions it is
really "cold" in here (about 20 degrees Celsius). And in the summertime I
never had this error. Also, at this time, the server is not under heavy load.
It is strange ... really strange ...

Well, on the other hand: it is not ECC RAM, so ...


:>* O. Hartmann <ohartman@ipamzlx.physik.uni-mainz.de> [000920 13:25] wrote:
:>> Today I got a fresh CVSUPdate. The server is running since 5 days non
:>> stop without
:>> any problem within this time. Then, when I got the last delta files
:>> about 4 hours ago, I tried to compile a "new world" - but compilation stops
:>> due to an error in some OPIE stuff. Well, this is well known, some stuff has
:>> been updated and we catch new stuff in the middle of the update duty on the 
:>> server. So, waiting ... I did a second cvsupdate and the I tried again to
:>> make a world. And what then happened is simply "horror"! First, the compiler stops
:>> and reported aome kind of error and SIG 11. Well, SIG 11 is a rar error message
:>> on my machine ... but then, a few seconds later, the system reboots - without any warnings,
:>> without core dump, without nothing! Well - does this mean stable when I simply compile
:>> something and then the machine crashes? I think this is a kind of behaviour that must
:>> not happen! I think not to have faulty hardware, because this server runs now for about
:>> 2 and a half year without any hardware fault!
:>> 
:>> Is anyone out here who made the same experiences? What happened?
:>
:>First off, please wrap lines at 70 characters.
:>
:>The compiler giving "sig 11" problems and rebooting is a sure
:>indication of a memory, connectivity (loose connectors) or a heating
:>issue.
:>
:>Swap out the memory, check your cooling systems and make sure all
:>your cards and cables are seated correctly.
:>
:>-Alfred
:>

Gruss O. Hartmann
-------------------------------------------------------------------
ohartman@ipamzlx.physik.uni-mainz.de

Klimadatenserver des IPA, Universitaet Mainz
Netzwerk- und Systembetreuung




From owner-freebsd-smp  Wed Sep 20 13:55: 7 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8310E37B422; Wed, 20 Sep 2000 13:55:03 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8KKrkP15469;
	Wed, 20 Sep 2000 13:53:46 -0700 (PDT)
Date: Wed, 20 Sep 2000 13:53:45 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
Cc: freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Message-ID: <20000920135345.M9141@fw.wintelcom.net>
References: <20000920133507.J9141@fw.wintelcom.net> <Pine.BSF.4.21.0009202239400.469-100000@ipamzlx.physik.uni-mainz.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009202239400.469-100000@ipamzlx.physik.uni-mainz.de>; from ohartman@ipamzlx.physik.uni-mainz.de on Wed, Sep 20, 2000 at 10:50:01PM +0200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* O. Hartmann <ohartman@ipamzlx.physik.uni-mainz.de> [000920 13:47] wrote:
> On Wed, 20 Sep 2000, Alfred Perlstein wrote:
> Dear  A. Perlstein.
> Well ... my sendmail is configured to warp lines at 70 characters, I do not
> know why it is not working.
> 
> The problem I described is simply not that what I expect due to a hardware
> malfunction. I already checked all connectors, memory - nothing! There is
> nothing that seems to be different from a few days ago. Overheating is not
> on the list of possibilities - due to changing weather conditions its 
> really "cold" herein (about 20 degrees Celsius). But in the summertime I never
> have had this error - and at this time, the server is not under heavy load.
> It is strange ... really strange ...
> 
> Well, on the other hand: it is not ECC RAM, so ...

Here's what you're saying:

  I have a relatively quiescent system, basically it's not stressed.

  After doing cvsup, makeworld and another cvsup in a pretty short
  amount of time, ie. making the CPUs spin pretty hard for a long
  time, I get a classical memory error/heating problem symptom.

Lastly where is your crashdump so someone can actually do something
about this?

http://www.freebsd.org/handbook/kerneldebug.html

thanks,
-Alfred


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message


From owner-freebsd-smp  Wed Sep 20 14: 5:30 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ipamzlx.physik.uni-mainz.de (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by hub.freebsd.org (Postfix) with ESMTP
	id DC9CF37B422; Wed, 20 Sep 2000 14:05:25 -0700 (PDT)
Received: from ipamzlx.Physik.Uni-Mainz.DE (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by ipamzlx.physik.uni-mainz.de (8.11.0/8.9.3) with ESMTP id e8KL83o76527;
	Wed, 20 Sep 2000 23:08:03 +0200 (CEST)
	(envelope-from ohartman@ipamzlx.physik.uni-mainz.de)
Date: Wed, 20 Sep 2000 23:08:03 +0200 (CEST)
From: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
In-Reply-To: <20000920135345.M9141@fw.wintelcom.net>
Message-ID: <Pine.BSF.4.21.0009202302550.469-100000@ipamzlx.physik.uni-mainz.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, 20 Sep 2000, Alfred Perlstein wrote:
Well, sorry, but this is not science: laying hands on something and
saying "well - it's this kind of symptom!"

We develop equipment here for airborne measurement facilities for
meteorological science, and the way we stress things has convinced me, over
time, that a system which has been stressed much harder under much worse
conditions does not fail in a phase of not being stressed that hard.

Maybe you're right and I caught some kind of "hardware failure", but surely
not the kind of failure that we expect from "overheating" the CPUs. The
machine here in front of me is much better air-conditioned than other
systems I have stressed.

Kind regards, O. Hartmann

Regards, O. Hartmann
-------------------------------------------------------------------
ohartman@ipamzlx.physik.uni-mainz.de

Climate data server of the IPA, University of Mainz
Network and system administration




From owner-freebsd-smp  Wed Sep 20 14:20:42 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP
	id 857AC37B423; Wed, 20 Sep 2000 14:20:38 -0700 (PDT)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e8KLJpT16418;
	Wed, 20 Sep 2000 14:19:51 -0700 (PDT)
Date: Wed, 20 Sep 2000 14:19:50 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
Cc: freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Message-ID: <20000920141950.O9141@fw.wintelcom.net>
References: <20000920135345.M9141@fw.wintelcom.net> <Pine.BSF.4.21.0009202302550.469-100000@ipamzlx.physik.uni-mainz.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
In-Reply-To: <Pine.BSF.4.21.0009202302550.469-100000@ipamzlx.physik.uni-mainz.de>; from ohartman@ipamzlx.physik.uni-mainz.de on Wed, Sep 20, 2000 at 11:08:03PM +0200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

* O. Hartmann <ohartman@ipamzlx.physik.uni-mainz.de> [000920 14:05] wrote:
> On Wed, 20 Sep 2000, Alfred Perlstein wrote:
> Well, sorry, but this is not science: laying hands on something and
> saying "well - it's this kind of symptom!"
> 
> We develop equipment here for airborne measurement facilities for
> meteorological science, and the way we stress things has convinced me, over
> time, that a system which has been stressed much harder under much worse
> conditions does not fail in a phase of not being stressed that hard.
> 
> Maybe you're right and I caught some kind of "hardware failure", but surely
> not the kind of failure that we expect from "overheating" the CPUs. The
> machine here in front of me is much better air-conditioned than other
> systems I have stressed.

Please read this:
  http://www.bitwizard.nl/sig11/

Try not to talk to me like I don't know what I'm talking about.

And don't forget about the email formatting. :)

thanks,
-Alfred




From owner-freebsd-smp  Wed Sep 20 16:46:22 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from charon.khoral.com (charon.khoral.com [209.75.155.97])
	by hub.freebsd.org (Postfix) with SMTP
	id 31C7F37B423; Wed, 20 Sep 2000 16:46:18 -0700 (PDT)
Received: from zen.alb.khoral.com by charon.khoral.com
          via smtpd (for hub.FreeBSD.org [216.136.204.18]) with SMTP; 20 Sep 2000 23:46:18 UT
Received: from benson (benson.alb.khoral.com [10.1.2.11])
	by zen.alb.khoral.com (8.9.3/8.9.3) with ESMTP id RAA01996;
	Wed, 20 Sep 2000 17:46:15 -0600 (MDT)
From: Steve Jorgensen <steve@khoral.com>
Message-Id: <200009202345.RAA29210@benson>
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
To: ohartman@ipamzlx.physik.uni-mainz.de (O. Hartmann)
Date: Wed, 20 Sep 2000 17:45:42 -0600 (MDT)
Cc: bright@wintelcom.net (Alfred Perlstein),
	freebsd-stable@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
In-Reply-To: <Pine.BSF.4.21.0009202302550.469-100000@ipamzlx.physik.uni-mainz.de> from "O. Hartmann" at Sep 20, 2000 11:08:03 PM
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

O. Hartmann wrote
>> On Wed, 20 Sep 2000, Alfred Perlstein wrote:
>> Well, sorry, but this is not science: laying hands on something and
>> saying "well - it's this kind of symptom!"
>> 
>> We develop equipment here for airborne measurement facilities for
>> meteorological science, and the way we stress things has convinced me,
>> over time, that a system which has been stressed much harder under much
>> worse conditions does not fail in a phase of not being stressed that hard.
>> 
>> Maybe you're right and I caught some kind of "hardware failure", but
>> surely not the kind of failure that we expect from "overheating" the
>> CPUs. The machine here in front of me is much better air-conditioned than
>> other systems I have stressed.

	One of the earlier emails said you ran into a problem while doing
	a "make world".  If you were doing a make world, it was replacing
	binaries as it compiled.

	So, if it isn't a hardware problem as the others have suggested,
	my guess is you replaced enough of your binaries to really hose
	the OS.  That's why it's a really good idea to do the 'make
	buildworld' and 'make installworld' sequence separately, so you
	know if it's going to build BEFORE you start replacing your
	working operating system.

	All I can suggest is going back to your 4.1 CD and reinstalling
	that, and then trying the cvsup, make buildworld, make
	installworld sequence again.  You could also look at which files
	were replaced in your installed system, but that would take a
	long time.. :)

	Anyway, I hope this helps.

					Steve
	

-- 
-----------------------------------------------------------
Steven Jorgensen      steve@khoral.com	 steve@spukhaus.com
------------------------------+----------------------------
Khoral Research Inc.          | PHONE: (505) 837-6500
6200 Uptown Blvd, Suite 200   | FAX:   (505) 881-3842
Albuquerque, NM 87110         | URL: http://www.khoral.com/
-----------------------------------------------------------




From owner-freebsd-smp  Wed Sep 20 17:28:33 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from mail.rdc1.ne.home.com (ha1.rdc1.ne.home.com [24.2.4.66])
	by hub.freebsd.org (Postfix) with ESMTP
	id 801CC37B422; Wed, 20 Sep 2000 17:28:29 -0700 (PDT)
Received: from cx443070b ([24.0.36.170]) by mail.rdc1.ne.home.com
          (InterMail vM.4.01.03.00 201-229-121) with SMTP
          id <20000921002828.JXE29235.mail.rdc1.ne.home.com@cx443070b>;
          Wed, 20 Sep 2000 17:28:28 -0700
Message-ID: <000901c02363$06e9bd50$aa240018@cx443070b>
From: "Jeremiah Gowdy" <jgowdy@home.com>
To: "Steve Jorgensen" <steve@khoral.com>,
	"O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
Cc: "Alfred Perlstein" <bright@wintelcom.net>,
	<freebsd-stable@FreeBSD.ORG>, <freebsd-smp@FreeBSD.ORG>
References: <200009202345.RAA29210@benson>
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
Date: Wed, 20 Sep 2000 17:29:38 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> One of the earlier emails said you ran into a problem while doing
> a "make world".  If you were doing a make world, it was replacing
> binaries as it compiled.

I didn't think make world installed until after it built everything.  Just
as if you did make buildworld, make installworld.




From owner-freebsd-smp  Thu Sep 21 19:45: 9 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from ipamzlx.physik.uni-mainz.de (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by hub.freebsd.org (Postfix) with ESMTP
	id E934D37B42C; Thu, 21 Sep 2000 19:45:04 -0700 (PDT)
Received: from ipamzlx.Physik.Uni-Mainz.DE (ipamzlx.Physik.Uni-Mainz.DE [134.93.180.54])
	by ipamzlx.physik.uni-mainz.de (8.11.0/8.9.3) with ESMTP id e8M2kER01189;
	Fri, 22 Sep 2000 04:46:15 +0200 (CEST)
	(envelope-from ohartman@ipamzlx.physik.uni-mainz.de)
Date: Fri, 22 Sep 2000 04:46:14 +0200 (CEST)
From: "O. Hartmann" <ohartman@ipamzlx.physik.uni-mainz.de>
To: Mike Smith <msmith@freebsd.org>
Cc: freebsd-hardware@freebsd.org, freebsd-smp@freebsd.org
Subject: Re: TYAN Thunder 2500-80 and FreeBSD 4.1 
In-Reply-To: <200009142152.OAA01453@mass.osd.bsdi.com>
Message-ID: <Pine.BSF.4.21.0009220437450.449-100000@ipamzlx.physik.uni-mainz.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, 14 Sep 2000, Mike Smith wrote:
Tonight I swapped our old SMP server, based on a GigaByte GA686DX dual
mainboard (Intel 440BX), for the new system based on the TYAN Thunder
2500-80 with the ServerWorks III/HE chipset. Well, it seems that FreeBSD
likes expensive hardware much more than the cheaper kind :-). To put it
briefly: the system is up without any kind of problem! The only problem
occurred with my prepared SMP kernel, because it detected 2 APICs instead of
one and 4 busses instead of three (I forgot the ISA bus). This may be
something for others who swap boards to be aware of, too. Except for the
larger amount of memory (the old server had 512 MB PC100-222 memory, the new
machine has 1 GB PC133-333 ECC), I am using the same hardware at the moment:
the same hard drive array with vinum, but now with the LSI Symbios Logic 896
driver (the prior system had an Adaptec 2940U2W on PCI). The NIC was an
Intel EtherExpress PRO 10/100+ server adapter; the new NIC is the built-in
one. Well, in summary I have the subjective feeling that the system works
slightly faster than before, and that does not seem to be only the effect
of more memory!
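
For readers hitting the same mismatch: in FreeBSD 4.x the SMP limits
are compile-time kernel options. The fragment below is only a sketch,
assuming the counts reported above (2 I/O APICs, 4 busses including
ISA). The NCPU value is an assumption for a dual-CPU board and is not
taken from this message.

```
options 	SMP			# Symmetric MultiProcessor kernel
options 	APIC_IO			# Symmetric (APIC) I/O
options 	NCPU=2			# number of CPUs (assumed: dual board)
options 	NBUS=4			# number of busses, ISA included
options 	NAPIC=2			# number of I/O APICs
```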
:>> 
:>> I need a little help and assistance with some hardware/SMP questions.
:>> At the moment, one of our main servers is based on a dual-CPU GigaByte
:>> mainboard with the Intel 440BX chipset. We are about to change: we want
:>> to swap over to a new mainboard and new CPUs. The mainboard chosen is
:>> from TYAN, the TYAN Thunder 2500-80. For compatibility reasons I took care
:>> not to get the Symbios Logic 53C1010 (SCSI-3/160) based board, because
:>> the Symbios Logic 53C896 seems to be fully supported.
:>
:>The 53c1010 is also very well supported.
:>
:>> It seems, that the
:>> TYAN Thunder 2500 is one of the best mainboards on market - but has anyone
:>> tested it using FBSD? 
:>
:>Yes.  We (FreeBSD Labs) evaluated a sample of this board about a month 
:>ago.  I had no problems with it at all, and qualified it for FreeBSD 4.1 
:>and above.
:>
:>-- 
:>... every activity meets with opposition, everyone who acts has his
:>rivals and unfortunately opponents also.  But not because people want
:>to be opponents, rather because the tasks and relationships force
:>people to take different points of view.  [Dr. Fritz Todt]
:>
:>
:>

Regards, O. Hartmann
-------------------------------------------------------------------
ohartman@ipamzlx.physik.uni-mainz.de

Climate data server of the IPA, University of Mainz
Network and system administration




From owner-freebsd-smp  Fri Sep 22  9: 8:13 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from cmh-dial.columbus.rr.com (cmh-dial.columbus.rr.com [204.210.252.23])
	by hub.freebsd.org (Postfix) with ESMTP
	id 9F57E37B42C; Fri, 22 Sep 2000 09:08:05 -0700 (PDT)
Received: from columbus.rr.com (dhcp16466029.columbus.rr.com [24.164.66.29])
	by cmh-dial.columbus.rr.com (8.9.3/8.9.3) with ESMTP id MAA25561;
	Fri, 22 Sep 2000 12:07:14 -0400 (EDT)
Message-ID: <39CB8507.F9F786BB@columbus.rr.com>
Date: Fri, 22 Sep 2000 12:12:55 -0400
From: Bill Moran <wmoran@columbus.rr.com>
X-Mailer: Mozilla 4.72 [en] (X11; I; FreeBSD 4.1-STABLE i386)
X-Accept-Language: en
MIME-Version: 1.0
Cc: freebsd-smp@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: Whats is this? FBSD 4.1 isn't stable!
References: <XFMail.000920133558.jhb@FreeBSD.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

John Baldwin wrote:
> On 20-Sep-00 O. Hartmann wrote:
> > not happen! I think not to have faulty hardware, because this server runs now
> > for about 2 and a half year without any hardware fault!
> 
> So?  Hardware goes bad.  I.e., it works fine for a while, and then
> *wham* it breaks one day.  In fact, your hardware could have already been
> somewhat marginal to begin with, and the stress of making world may have
> caused something to overheat and cause the hardware to break, esp. memory
> or CPU.  Bad memory and CPU's often cause Sig 11's.

To add my $.02 to this: we generally consider hardware suspect after 2
years of heavy use (servers, for example) simply because it has seen
roughly 17,500 hours of use (2 x 365 x 24). Equipment does wear out, and
depending on the actual usage, a 2-year-old server can easily be scrap
(especially when it's plugged into the same circuit as the air
conditioning ... stupid clients - I didn't say that!)

-Bill




From owner-freebsd-smp  Fri Sep 22 20: 2:21 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP
	id 2A08D37B422; Fri, 22 Sep 2000 20:02:16 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e8N322p58015;
	Sat, 23 Sep 2000 12:32:02 +0930 (CST)
	(envelope-from grog)
Date: Sat, 23 Sep 2000 12:32:02 +0930
From: Greg Lehey <grog@lemis.com>
To: Julian Elischer <julian@elischer.org>
Cc: current@FreeBSD.ORG, FreeBSD SMP list <FreeBSD-smp@FreeBSD.ORG>
Subject: Re: Locking doc.?
Message-ID: <20000923123202.C3999@wantadilla.lemis.com>
References: <39CC1A63.41C67EA6@elischer.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <39CC1A63.41C67EA6@elischer.org>; from julian@elischer.org on Fri, Sep 22, 2000 at 07:50:11PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Friday, 22 September 2000 at 19:50:11 -0700, Julian Elischer wrote:
> Do we have a document that describes in great detail the
> locking policy that the SMPng code should follow?
>
> I've seen several descriptions as to how it might be done,
> but I haven't seen a "Ok we've decided that this is the strategy
> we are using"  document.

I haven't seen one either.  On the one hand it might be a little early
to come up with a (restrictive) policy document, but I do think we
should be discussing more actively how we go about subdividing the
locking.  There's been plenty of experience in the past to show that
it's madness to just go about subdividing locks.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-smp  Sat Sep 23  0:36:13 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP
	id A55D837B423; Sat, 23 Sep 2000 00:36:06 -0700 (PDT)
Received: from foo.osd.bsdi.com (root@foo.osd.bsdi.com [204.216.28.137])
	by pike.osd.bsdi.com (8.11.0/8.9.3) with ESMTP id e8N7Zqi82290;
	Sat, 23 Sep 2000 00:35:52 -0700 (PDT)
	(envelope-from jhb@foo.osd.bsdi.com)
Received: (from jhb@localhost)
	by foo.osd.bsdi.com (8.11.0/8.11.0) id e8N7YMi09670;
	Sat, 23 Sep 2000 00:34:22 -0700 (PDT)
	(envelope-from jhb)
Message-ID: <XFMail.000923003422.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Date: Sat, 23 Sep 2000 00:34:22 -0700 (PDT)
Organization: BSD, Inc.
From: John Baldwin <jhb@FreeBSD.org>
To: smp@FreeBSD.org, cp@bsdi.com, alpha@FreeBSD.org
Subject: Status update
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Ok, the alpha seems to be rather stable now without the need for obscene hacks
to the mutex code to dink with mtx_saveipl.  To summarize, here are the changes
thus far:

- software interrupts (SWI's) are now MI except for a few constants.  
  Currently we still only have 8 SWI's on the x86 due to old compatibility
  nonsense.  We should be able to bump this to 32 like it is on the alpha very
  easily if it proves beneficial.  Also software interrupts are completely
  divorced from the x86 hardware interrupt code.  The softinterrupt thread is
  also now a simple kthread instead of an ithread.
- interrupt threads on the alpha for device I/O interrupts.  Note that two
  bus chipsets (dwlpx and mcpcia) still need a couple of low-level functions to
  handle enable/disable of interrupt sources.
- spl()'s are stubbed out on the alpha.  Actually, they are now stubbed out in
  MI code (kern_intr.c specifically).  As a side effect of the IPL code
  becoming mostly MI now, there is a <sys/ipl.h> that includes <machine/ipl.h>
  and should be used instead.  The individual machine/ipl.h files are now quite
  short and simple.
- The interrupt state of the sched_lock is now saved in a process's PCB during
  cpu_switch().  This way, code before and after a call to either mi_switch()
  or cpu_switch() is guaranteed to be run at the same interrupt state.  Without
  this I was having problems on the alpha where the idle loop was running at
  ALPHA_PSL_IPL_SOFT (1) and as a result init's child process was never run,
  among other things.
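
The last item can be pictured with a small user-space model. This is
only a sketch under stated assumptions, not the patch itself: struct
pcb, saved_ipl, current_ipl and sim_cpu_switch() are hypothetical
stand-ins, and the real kernel structures and switch code differ. It
shows the property described above: a context that switches out at
some interrupt level resumes at that same level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process control block. */
struct pcb {
	int saved_ipl;	/* interrupt level when this context switched out */
};

static int current_ipl;	/* simulated interrupt level of the CPU */

/*
 * Simulated context switch: record the outgoing context's interrupt
 * level in its PCB, then reinstate the level the incoming context had
 * when it last switched out.
 */
void
sim_cpu_switch(struct pcb *from, struct pcb *to)
{
	assert(from != NULL && to != NULL);
	from->saved_ipl = current_ipl;
	current_ipl = to->saved_ipl;
}
```

In this model, an idle loop that switches out at level 1 gets level 1
back when it resumes, while a process saved at level 0 is resumed
with nothing masked.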

This last change is something I'd like some feedback on.  I've checked the
BSD/OS x86 code, and it only saves the recursion count of the sched_lock in the
pcb.  However, after the problems with the alpha and some discussion with Peter
Wemm on IRC, I decided that we should be doing this.  I'm not completely
certain, though, and any thoughts that anyone has would be appreciated.

There are also a few more weirdisms on the alpha.  In a few places in
sys/kern, we call spl0() instead of splx().  I've added some debugging code to
do a printf() if we aren't actually at IPL_0 (what spl0 used to do) after the
mtx_exit().  It does trigger in several cases during /etc/rc at least, but the
machine seems to be running stably regardless (I'll be running a buildworld -j
8 tonight to stress test it).  My question is: is it ok for the code to run
with some interrupts disabled, or do we need to replace the calls to spl0()
with enable_intr()?
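
A debugging check of that kind might look roughly like the sketch
below. This is an illustration only: sim_ipl, spl0_check() and
SPL0_CHECK() are hypothetical names, the interrupt level is a plain
variable rather than real processor state, and the message format is
simply one plausible choice of file, line and function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int sim_ipl;	/* simulated interrupt level; 0 = nothing masked */

/*
 * Report a call site that expected to be at level 0 but is not.
 * Returns 1 if a warning was printed, 0 otherwise.
 */
int
spl0_check(const char *file, int line, const char *func)
{
	assert(file != NULL && func != NULL);
	if (sim_ipl == 0)
		return (0);
	printf("%s:%d:%s() spl0 needs fixing\n", file, line, func);
	return (1);
}

/* Convenience wrapper that fills in the call site automatically. */
#define	SPL0_CHECK()	spl0_check(__FILE__, __LINE__, __func__)
```

The check only reports; it does not change the interrupt level, so
the surrounding code keeps running exactly as before.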
 
I'll be updating my patchset at
http://www.FreeBSD.org/~jhb/patches/alpha.ithreads.patch shortly.  If you have
time, please test this stuff out so we can get it committed.  Thanks.

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Sat Sep 23 10:28: 4 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from berserker.bsdi.com (berserker.twistedbit.com [199.79.183.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 9E50B37B422; Sat, 23 Sep 2000 10:27:55 -0700 (PDT)
Received: from berserker.bsdi.com (cp@[127.0.0.1])
	by berserker.bsdi.com (8.9.3/8.9.3) with ESMTP id LAA17839;
	Sat, 23 Sep 2000 11:27:51 -0600 (MDT)
Message-Id: <200009231727.LAA17839@berserker.bsdi.com>
To: John Baldwin <jhb@freebsd.org>
Cc: smp@freebsd.org, alpha@freebsd.org
Subject: Re: Status update 
In-reply-to: Your message of "Sat, 23 Sep 2000 00:34:22 PDT."
             <XFMail.000923003422.jhb@FreeBSD.org> 
From: Chuck Paterson <cp@bsdi.com>
Date: Sat, 23 Sep 2000 11:27:51 -0600
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


John,
}
}- software interrupts (SWI's) are now MI except for a few constants.  
}  Currently we still only have 8 SWI's on the x86 due to old compatibility
}  nonsense.  We should be able to bump this to 32 like it is on the alpha very
}  easily if it proves beneficial.  Also software interrupts are completely
}  divorced from the x86 hardware interrupt code.  The softinterrupt thread is
}  also now a simple kthread instead of an ithread.


	I guess I'm a little unclear about this. We certainly don't want
any hardware goop with software interrupts, at least in the case
where there is no hardware support for software interrupts,
but it seems like we do want them to be more than just another
kthread: we want to be able to schedule them on the fly like hardware
interrupts and do light-weight context switches to them.

	Am I missing something?

}- The interrupt state of the sched_lock is now saved in a process's PCB during
}  cpu_switch().  This way, code before and after a call to either mi_switch()
}  or cpu_switch() is guaranteed to be run at the same interrupt state.  Without
}  this I was having problems on the alpha where the idle loop was running at
}  ALPHA_PSL_IPL_SOFT (1) and as a result init's child process was never run,
}  among other things.


BSD/OS on Sparc actually has interrupt levels that are associated
with some mutexes. However, the code that uses these levels always
uses spin locks, so there isn't an issue. Whenever cpu_switch
is called all interrupts are blocked, but the spl level, which I
think is ~= interrupt state, is not raised at all. On sparc these
are used for the low-level code for devices like the com driver.

I guess the first question is why there is any kernel code calling
switch with the interrupt priority up. This sounds like it may be
a result of software interrupts being changed into kthreads?

If alpha has hardware support for software interrupts then we might
be able to treat them fully like hardware interrupts. The case in
BSD/OS where a software interrupt is scheduled and the no-switch
flag is passed in will just work; the reason for the no-switch flag
is that the thread currently holds spin locks, and in the case of
hardware-supported interrupts the interrupts should be blocked anyway.

Chuck






From owner-freebsd-smp  Sat Sep 23 10:53:46 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from server.baldwin.cx (server.geekhouse.net [64.81.6.52])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8D16D37B422; Sat, 23 Sep 2000 10:53:35 -0700 (PDT)
Received: from john.baldwin.cx (root@john.baldwin.cx [192.168.1.18])
	by server.baldwin.cx (8.9.3/8.9.3) with ESMTP id KAA91697;
	Sat, 23 Sep 2000 10:55:51 -0700 (PDT)
	(envelope-from john@baldwin.cx)
Received: (from john@localhost)
	by john.baldwin.cx (8.9.3/8.9.3) id KAA51845;
	Sat, 23 Sep 2000 10:55:12 -0700 (PDT)
	(envelope-from john)
Message-Id: <200009231755.KAA51845@john.baldwin.cx>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200009231727.LAA17839@berserker.bsdi.com>
Date: Sat, 23 Sep 2000 10:55:12 -0700 (PDT)
From: John Baldwin <jhb@freebsd.org>
To: Chuck Paterson <cp@bsdi.com>
Subject: Re: Status update
Cc: alpha@freebsd.org, smp@freebsd.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 23-Sep-00 Chuck Paterson wrote:
> 
> John,
> }
> }- software interrupts (SWI's) are now MI except for a few constants.  
> }  Currently we still only have 8 SWI's on the x86 due to old compatibility
> }  nonsense.  We should be able to bump this to 32 like it is on the alpha very
> }  easily if it proves beneficial.  Also software interrupts are completely
> }  divorced from the x86 hardware interrupt code.  The softinterrupt thread is
> }  also now a simple kthread instead of an ithread.
> 
> 
>       I guess I'm a little unclear about this. We certainly don't want
> any hardware goop with software interrupts, at least in the case
> where there is no hardware support of software interrupts,
>  but it seems like we do want them to be more than just another
> kthread, we want to be able to schedule them on the fly like hardware
> interrupts and do light weight context switches to them.

Hmm, ok.  Currently the only thing we were using from the ithd structure
was the it_need flag (like the runstatus flag in BSD/OS' ithreads) to know
when to keep looping.  I changed it to just use spending for this.  It is still
scheduled and run using a method similar to the hardware ithreads; it just
doesn't have an ithd struct.  However, it is quite easy to add the ithd struct
back in when we go to light-weight context switches.

> }- The interrupt state of the sched_lock is now saved in a process's PCB during
> }  cpu_switch().  This way, code before and after a call to either mi_switch()
> }  or cpu_switch() is guaranteed to be run at the same interrupt state.  Without
> }  this I was having problems on the alpha where the idle loop was running at
> }  ALPHA_PSL_IPL_SOFT (1) and as a result init's child process was never run,
> }  among other things.
> 
> 
> BSD/OS on Sparc actually has interrupt levels that are associated
> with some mutexes. However, the code that uses these levels always
> uses spin locks, so there isn't an issue. Whenever cpu_switch
> is called all interrupts are blocked, but the spl level, which I
> think is ~= interrupt state, is not raised at all. On sparc these
> are used for the low-level code for devices like the com driver.
> 
> I guess the first question is why there is any kernel code calling
> switch with the interrupt priority up. This sounds like it may be
> a result of software interrupts being changed into kthreads?

It just happens on the alpha; it doesn't happen on the x86.  I'm not sure
where it is happening though.  If we are sure of that guarantee (interrupts
should always be enabled before grabbing sched_lock and calling either
cpu_switch or mi_switch) then it should be easier to write KASSERT()'s to
find this, and I can get rid of this code.

> If alpha has hardware support for software interrupts then we might
> be able to treat them fully like hardware interrupts. The case in
> BSD/OS where a software interrupt is scheduled and the no-switch
> flag is passed in will just work; the reason for the no-switch flag
> is that the thread currently holds spin locks, and in the case of
> hardware-supported interrupts the interrupts should be blocked anyway.

All SWI's are now triggered by setting the appropriate bit
in spending and calling sched_softintr() to make the softinterrupt
thread runnable.  sched_softintr() is very similar in functionality to
sched_ithd in the x86 hardware interrupt code.
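
As a rough user-space model of that scheme: the sketch below keeps one
pending bit per software interrupt in a word named spending, as above.
Everything else is assumed for illustration: swi_post(),
softintr_run(), swi_handlers[], softintr_runnable and the demo handler
are hypothetical names, NSWI is set to 32 only as an example, and a
plain flag stands in for what sched_softintr() does in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	NSWI	32			/* assumed number of software interrupts */

typedef void (*swi_handler_t)(void);

static uint32_t spending;		/* one pending bit per software interrupt */
static swi_handler_t swi_handlers[NSWI];
static int softintr_runnable;		/* stand-in for "thread made runnable" */

static int demo_hits;			/* bumped by the demo handler below */

/* Hypothetical handler used only to demonstrate dispatch. */
void
demo_handler(void)
{
	demo_hits++;
}

/* Post software interrupt n: set its pending bit, then wake the thread. */
void
swi_post(int n)
{
	assert(n >= 0 && n < NSWI);
	spending |= (uint32_t)1 << n;
	softintr_runnable = 1;		/* stand-in for sched_softintr() */
}

/*
 * Body of the soft interrupt thread: keep looping while any bit is
 * pending.  Returns the number of handlers that were run.
 */
int
softintr_run(void)
{
	int ran = 0;

	while (spending != 0) {
		for (int n = 0; n < NSWI; n++) {
			uint32_t bit = (uint32_t)1 << n;

			if ((spending & bit) == 0)
				continue;
			/* Clear first so a re-post during the handler is seen. */
			spending &= ~bit;
			if (swi_handlers[n] != NULL) {
				swi_handlers[n]();
				ran++;
			}
		}
	}
	softintr_runnable = 0;
	return (ran);
}
```

Using the pending word itself as the loop condition is what lets the
separate "need" flag go away in this model: the thread keeps running
for as long as any bit is still set.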

> Chuck

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




From owner-freebsd-smp  Sat Sep 23 16: 0:18 2000
Delivered-To: freebsd-smp@freebsd.org
Received: from mail.du.gtn.com (mail.du.gtn.com [194.77.9.57])
	by hub.freebsd.org (Postfix) with ESMTP
	id C762737B424; Sat, 23 Sep 2000 16:00:04 -0700 (PDT)
Received: from mail.cicely.de (cicely.de [194.231.9.142])
	by mail.du.gtn.com (8.11.0.Beta3/8.11.0.Beta3) with ESMTP id e8NN00g06265
	(using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified OK);
	Sun, 24 Sep 2000 01:00:02 +0200 (MET DST)
Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10])
	by mail.cicely.de (8.11.0.Beta1/8.11.0.Beta1) with ESMTP id e8NN0G518938;
	Sun, 24 Sep 2000 01:00:16 +0200 (CEST)
Received: (from ticso@localhost)
	by cicely8.cicely.de (8.11.0/8.9.2) id e8NN0GK68893;
	Sun, 24 Sep 2000 01:00:16 +0200 (CEST)
	(envelope-from ticso)
Date: Sun, 24 Sep 2000 01:00:15 +0200
From: Bernd Walter <ticso@cicely8.cicely.de>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: smp@FreeBSD.ORG, cp@bsdi.com, alpha@FreeBSD.ORG
Subject: Re: Status update
Message-ID: <20000924010015.A68775@cicely8.cicely.de>
References: <XFMail.000923003422.jhb@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <XFMail.000923003422.jhb@FreeBSD.org>; from jhb@FreeBSD.ORG on Sat, Sep 23, 2000 at 12:34:22AM -0700
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sat, Sep 23, 2000 at 12:34:22AM -0700, John Baldwin wrote:
> I'll be updating my patchset at
> http://www.FreeBSD.org/~jhb/patches/alpha.ithreads.patch shortly.  If you have
> time, please test this stuff out so we can get it committed.  Thanks.

It still doesn't work for me:
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sun Sep 24 00:14:11 CEST 2000
    ticso@cicely9.cicely.de:/var/d7/src-2000-09-23/src/sys/compile/CICELY9
EB164
Digital AlphaPC 164 500 MHz, 500MHz
8192 byte page size, 1 processor.
CPU: EV56 (21164A) major=7 minor=2 extensions=0x1<BWX>
OSF PAL rev: 0x1000800020117
real memory  = 265904128 (259672K bytes)
avail memory = 253419520 (247480K bytes)
Preloaded elf kernel "kernel.ko" at 0xfffffc0000628000.
../../kern/kern_fork.c:537:fork1() spl0 needs fixing
cia0: <2117x Core Logic chipset>
cia0: ALCOR/ALCOR2, pass 3
cia0: extended capabilities: 21<DWEN,BWEN>
pcib0: <2117x PCI host bus adapter> on cia0
pci0: <PCI bus> on pcib0
sym0: <895> port 0x10200-0x102ff mem 0x82030000-0x82030fff,0x82031200-0x820312ff irq 2 at device 5.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: interrupting at CIA irq 2
sym1: <810a> port 0x10100-0x101ff mem 0x82031100-0x820311ff irq 0 at device 6.0 on pci0
sym1: No NVRAM, ID 7, Fast-10, SE, parity checking
sym1: interrupting at CIA irq 0
sym2: <810a> port 0x10000-0x100ff mem 0x82031000-0x820310ff irq 1 at device 7.0 on pci0
sym2: No NVRAM, ID 7, Fast-10, SE, parity checking
sym2: interrupting at CIA irq 1
isab0: <Intel 82378IB PCI to ISA bridge> at device 8.0 on pci0
isa0: <ISA bus> on isab0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0x10300-0x1037f mem 0x82031300-0x8203137f irq 3 at device 9.0 on pci0
xl0: interrupting at CIA irq 3
xl0: Ethernet address: 00:10:5a:30:1c:1a
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
atapci0: <CMD 646 ATA controller> port 0x10380-0x1038f irq 5 at device 11.0 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: interrupting at ISA irq 14
fdc0: cannot reserve I/O port range
mcclock0: <MC146818A real time clock> at port 0x70-0x71 on isa0
sio0 at port 0x3f8-0x3ff irq 4 on isa0
sio0: type 16550A, console
sio0: interrupting at ISA irq 4
sio1: reserved for low-level i/o
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "alpha"  frequency 500006125 Hz
ad0: 1219MB <Conner Peripherals 1275MB - CFS1275A> [2477/16/63] at ata0-master WDMA2
Waiting 15 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.

And nothing more happens...

boot -v shows this:
[...]
ata0-master: success setting WDMA2 on CMD646 chip
ad0: <Conner Peripherals 1275MB - CFS1275A/0.28> ATA-0 disk at ata0-master
ad0: 1219MB (2496816 sectors), 2477 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 1 secs/int, 1 depth queue, WDMA2
ad0: piomode=4 dmamode=2 udmamode=-1 cblid=0
Creating DISK ad0
Waiting 15 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
sym0: enabling clock multiplier
sym0: Downloading SCSI SCRIPTS.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym2:0:-1:-1): SCSI BUS reset delivered.
(probe15:sym1:0:0:0): INQUIRY. CDB: 12 1 80 0 ff 0 
(probe15:sym1:0:0:0): ILLEGAL REQUEST asc:24,0
(probe15:sym1:0:0:0): Invalid field in CDB sks:c0,2
(probe16:sym1:0:1:0): INQUIRY. CDB: 12 1 80 0 ff 0 
(probe16:sym1:0:1:0): ILLEGAL REQUEST asc:24,0
(probe16:sym1:0:1:0): Invalid field in CDB sks:c0,2
(probe17:sym1:0:2:0): INQUIRY. CDB: 12 1 80 0 ff 0 
(probe17:sym1:0:2:0): ILLEGAL REQUEST asc:24,0
(probe17:sym1:0:2:0): Invalid field in CDB sks:c0,2
(probe18:sym1:0:3:0): INQUIRY. CDB: 12 1 80 0 ff 0 
(probe18:sym1:0:3:0): ILLEGAL REQUEST asc:24,0
(probe18:sym1:0:3:0): Invalid field in CDB sks:c0,2
(probe19:sym1:0:4:0): INQUIRY. CDB: 12 1 80 0 ff 0 
(probe19:sym1:0:4:0): ILLEGAL REQUEST asc:24,0
(probe19:sym1:0:4:0): Invalid field in CDB sks:c0,2

The SCSI messages are normal for the disks.
Sometimes it gets as far as printing the first SCSI "Creating DISK" lines.

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de



