Date:      Thu, 10 Apr 2003 23:09:03 +0200
From:      Wilko Bulte <wkb@freebie.xs4all.nl>
To:        Jens Röder <j.roeder@tu-bs.de>
Cc:        freebsd-alpha@freebsd.org
Subject:   Re: alpha/50659: reboot causes SRM console to loop endless error and needs to be restetted hard
Message-ID:  <20030410210903.GA19654@freebie.xs4all.nl>
In-Reply-To: <Pine.HPX.4.33.0304092208130.3140-100000@rzsrv1.rz.tu-bs.de>
References:  <20030409180753.GB14966@freebie.xs4all.nl> <Pine.HPX.4.33.0304092208130.3140-100000@rzsrv1.rz.tu-bs.de>

On Thu, Apr 10, 2003 at 12:33:03AM +0200, Jens Röder wrote:

> > That is a kernel panic, not a memory problem ;)
> >
> > Most Alphas, and your AS500 too, have ECC (error correction) memory. That allows
> > single bit memory errors to be corrected. The kernel will tell you if a
> > correction was applied, these are the processor correctable errors I
> > mentioned.
> 
> Hm, sounds interesting, so that does mean for me that in the case of a
> hardware memory problem I would get a kernel-message and don't need to do
> any memory checks?

Yes. If there are multibit errors, ECC won't be able to correct them. I
don't know what the kernel will tell you in such a case as I have never had
such a problem ;-) But it will most likely crash after reporting
a machine check (see the code in /sys/alpha/alpha/interrupt.c).
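
To make the single-bit versus multibit distinction concrete, here is a toy
sketch of a single-error-correct, double-error-detect code: Hamming(7,4) plus
one overall parity bit, protecting 4 data bits in an 8-bit codeword. This is
an assumption about the general technique only, not the AS500 hardware or the
kernel code; real memory controllers use much wider codewords. All names
(ecc_encode, ecc_decode, struct ecc_result) are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

enum ecc_status { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

struct ecc_result {
    enum ecc_status status;
    uint8_t nibble;            /* decoded data, 0 if uncorrectable */
};

/* Codeword positions 1..7 follow the classic Hamming layout:
 * check bits at 1, 2, 4 and data bits at 3, 5, 6, 7.
 * Position 0 holds an overall parity bit. */
static const unsigned data_pos[4] = { 3, 5, 6, 7 };

/* XOR of the positions (1..7) of all set bits; 0 for a valid codeword. */
static unsigned syndrome(uint8_t cw)
{
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 7; pos++)
        if (cw & (1u << pos))
            s ^= pos;
    return s;
}

/* 1 if the codeword has an odd number of set bits, else 0. */
static unsigned parity8(uint8_t cw)
{
    unsigned p = 0;
    for (unsigned pos = 0; pos < 8; pos++)
        p ^= (cw >> pos) & 1u;
    return p;
}

uint8_t ecc_encode(uint8_t nibble)
{
    assert(nibble <= 0xF);     /* only four data bits are protected */
    uint8_t cw = 0;
    for (unsigned i = 0; i < 4; i++)
        if (nibble & (1u << i))
            cw |= (uint8_t)(1u << data_pos[i]);
    /* Setting the check bit at position p toggles syndrome bit p,
     * so this drives the syndrome to zero. */
    unsigned s = syndrome(cw);
    for (unsigned p = 1; p <= 4; p <<= 1)
        if (s & p)
            cw |= (uint8_t)(1u << p);
    /* Overall parity bit makes the total number of set bits even. */
    cw |= (uint8_t)parity8(cw);
    return cw;
}

struct ecc_result ecc_decode(uint8_t cw)
{
    struct ecc_result r = { ECC_OK, 0 };
    unsigned s = syndrome(cw);

    if (parity8(cw)) {
        /* Odd overall parity: exactly one bit flipped. The syndrome is
         * its position; syndrome 0 means the parity bit itself. */
        cw ^= (uint8_t)(1u << s);
        r.status = ECC_CORRECTED;
    } else if (s != 0) {
        /* Even parity but nonzero syndrome: two bits flipped.
         * Detectable, but not correctable. */
        r.status = ECC_UNCORRECTABLE;
        return r;
    }
    for (unsigned i = 0; i < 4; i++)
        if (cw & (1u << data_pos[i]))
            r.nibble |= (uint8_t)(1u << i);
    return r;
}
```

Flipping any one bit of ecc_encode(x) still decodes to x with status
ECC_CORRECTED (the analogue of a correctable error report), while flipping
any two bits yields ECC_UNCORRECTABLE, which is the situation where a real
system would raise a machine check.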

> > Unaligned accesses in kernel mode are Bad(TM). Check the handbook on
> > creating more debug info on the crash please.
> 
> I am not sure if I did the right thing, so there is a core file now
> available at:
> 
> http://octopus.homeunix.net/jens@piero.ptch.nat.tu-bs.de.gz


> fatal kerneltrap:
> 
> trapentry	= 0x4	(unaligned access fault)
> cpuid		= 0
> faulting va	= 0xfffffc0031d12d0c
> opcode		= 0x2d
> register	= 0x9
> pc		= 0xfffffe0004138bc0
> ra		= 0xfffffe0004138bb4
> sp		= 0xfffffe001da7db70
> usp 		= 0x11fff628
> 
> curthread	= 0xfffffc003e2c87c0
> pid 593, comm ipfw
> Stopped at ipfw_ctl+0x1c0; or 	zero, s0,t2
> 			<zero=0x0,s0=0xfffffc0031d12d0c,t2=0x2710>

Right.. looks like ipfw is not really up to snuff for the 64-bit
Alpha ;)
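
For illustration: the faulting addresses in these reports (0x...2d0c in the
kernel trap, 0x...80b4 and 0x...80bc from userland) are multiples of 4 but
not of 8, and a quadword access such as ldq needs 8-byte alignment on Alpha.
Below is a minimal sketch of how such an address can be checked and how a
64-bit value can be read or written at an arbitrary address without a direct
unaligned access. The helper names (is_aligned, load_u64, store_u64) are
hypothetical and this is not the actual ipfw code or a proposed fix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if addr is a multiple of align; align must be a power of two. */
int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Read a 64-bit value from any address. Copying through memcpy avoids
 * dereferencing a misaligned uint64_t pointer; the compiler picks
 * loads that are safe for the target. */
uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 64-bit value to any address, same idea as load_u64. */
void store_u64(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}
```

With these, is_aligned(0x1200a80b4, 8) is false while is_aligned(0x1200a80b4, 4)
is true, which matches code that lays out 32-bit fields and then reads a
64-bit counter at an offset that is only 4-byte aligned.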

> Again, when you use "ipfw show" on 5.0 on alpha, you get messages like
> this:
> 
> ptchgate# ipfw show
> 00100         94      10410 allow ip from any to any via lo0
> 00200          0          0 deny ip from any to 127.0.0.0/8
> pid 585 (ipfw): unaligned access: va=0x1200a80b4 pc=0x120001780
> ra=0x120001764 op=ldq
> pid 585 (ipfw): unaligned access: va=0x1200a80bc pc=0x120001784
> ra=0x120001764 op=ldq
> 00300          0          0 deny ip from 127.0.0.0/8 to any
> 65000        921      89561 allow ip from any to any
> 65535          0          0 deny ip from any to any
> 
> 
> It is more likely to crash when my own set of rules is loaded and I then
> list the rules.

I will give this ruleset a try on my AS500

> This does not occur on 5.0 for i386, which seems to run stably so far.

x86 is what ipfw was developed on, a 32bit CPU ;)

> 
> Bus 00 	Slot 12: Vendor: 10ec	Device: 8139 Sub_id 813910ec

          ^-- is that a network card?

> Well, I hope, there was something productive for debugging in my mail. I
> am sorry if I appear to be very unexperienced in FreeBSD, but I am just
> getting started.

Sure, this helps. I will try to see if I can make my AS500 fail in the same
way. 
-- 
|   / o / /_  _   		wilko@FreeBSD.org
|/|/ / / /(  (_)  Bulte				


