Date:      Sun, 24 Oct 1999 15:46:35 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Aernoudt Bottemanne <bottemanne@capitolonline.nl>
Cc:        "freebsd-alpha@freebsd.org" <freebsd-alpha@FreeBSD.ORG>, marcel@scc.nl
Subject:   Re: buildworld problem  + received processor correctable error message on PWS433au
Message-ID:  <14355.24089.447439.7651@grasshopper.cs.duke.edu>
In-Reply-To: <3813170B.5A22F06F@capitolonline.nl>
References:  <3813170B.5A22F06F@capitolonline.nl>


Aernoudt Bottemanne writes:
 > Hi,
 > 
 > 
 > Now that the irq problem is fixed, I tried to build a new world:
 > (make -j 32 buildworld > make.out 2>&1 )
 > 
 > It starts the job, but somewhere along the way it stops. The machine
 > does not hang; I can log in on other consoles etc., but the make process
 > does not continue. During compilation I get these messages:
 > "received processor correctable errors". From the mailing list archives
 > I found that Andrew already mentioned them before, as a hardware problem
 > with ECC memory (e.g. ECC memory not being in good shape).

This is probably a red herring.  

 > In the make.out, however, there is no error message on the last line
 > to indicate what the problem could be.

Did the make/cc/cpp callchain die?  Is it a zombie?  If it is not
dead, what state is it in?  Are there any jobs with a WCHAN of objtrm?
(To see, use ps axl, or break into the debugger & do a ps there if ps
doesn't work because of a kernel/userland mismatch.)
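
As a rough sketch of the check above, the small sh helper below filters
ps axl output down to the processes sleeping on a given wait channel.
The name wchan_filter is made up for illustration.  It assumes the
header labels the column WCHAN or MWCHAN (this differs between ps
versions) and that no field before that column contains spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): read `ps axl`-style output
# on stdin and print the header plus every process whose wait channel
# equals the first argument.
wchan_filter() {
    awk -v want="$1" '
        NR == 1 {
            # Find the wait-channel column by its header name.
            # Assumption: it is labelled WCHAN or MWCHAN.
            for (i = 1; i <= NF; i++)
                if ($i ~ /WCHAN$/) col = i
            print
            next
        }
        # Keep only rows whose wait-channel field matches exactly.
        col && $col == want { print }
    '
}
```

Used as 'ps axl | wchan_filter objtrm', it prints only the header line
if nothing is stuck in objtrm.  The STAT column of any process it does
list shows whether that process is sleeping or a zombie.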

Try running your buildworld with 'make buildworld' & avoid using any
-j args.  Something in the vm system is not using the atomic macros to
change object state & there is a chance that under extreme load, jobs
will hang in objtrm.

BTW, the things you've highlighted in your dmesg (the nfs stuff) are
also red herrings.

=> cia0: Pyxis, pass 1
=> cia0: extended capabilities: 1<BWEN>
=> cia0: WARNING: Pyxis pass 1 DMA bug; no bets...

Read this as "Don't use this machine as a high volume server".  ;-)

The first generation pyxis (the chipset in your machine) has several
problems.  Foremost is that PCI DMA reads that cross a page boundary
don't work right.  This is not a problem for the 32-bit slots because
the PCI-PCI bridge breaks up transfers and prevents this from occurring.
The firmware prevents you from putting "unknown" cards in the 64-bit
slots.  You can override this by doing 'set pci_device_override
<dev_id><vendor_id>' at the srm console prompt.  Its other problems
include piss-poor performance for DMA reads and a tendency to lock
solid when the PCI bus is pushed hard.

Although they sound bad, none of these things should affect its use as
a personal workstation.  In fact, I wish I had one at home ;-)


=> struct nfssvc_sock bloated (> 256bytes)
=> Try reducing NFS_UIDHASHSIZ
=> struct nfsuid bloated (> 128bytes)
=> Try unionizing the nu_nickname and nu_flag fields

This is a red herring.




------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590



