Date:      Wed, 9 Apr 2003 10:56:35 +0200 (METDST)
From:      Jens Röder <j.roeder@tu-bs.de>
To:        Wilko Bulte <wkb@freebie.xs4all.nl>
Cc:        freebsd-alpha@freebsd.org
Subject:   Re: alpha/50659: reboot causes SRM console to loop endless error and needs to be restetted hard
Message-ID:  <Pine.HPX.4.33.0304090927390.16819-100000@rzsrv1.rz.tu-bs.de>
In-Reply-To: <20030408181856.GA10163@freebie.xs4all.nl>



Hello Wilko,

Thanks a lot for the kind reply. I will go into more detail below:


On Tue, 8 Apr 2003, Wilko Bulte wrote:


> >  It ran perfectly under FreeBSD 4.7, but unfortunately the kernel was not
> >  stable, probably because of memory problems, so I tried 5.0.
>
> Do you mean it reports Processor Correctable memory errors? How much memory
> does it have?

The machine has about 1 GB of RAM. Honestly, I am not sure what "processor
correctable memory errors" are; maybe it helps to show the output. It comes
from a self-compiled kernel under 4.7, but I had the same problems when
trying a generic kernel.



Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: fatal kernel trap:
Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: trap entry = 0x4 (unaligned access fault)
Mar 20 10:54:55 ptchgate /kernel: a0         = 0xfffffca900010021
Mar 20 10:54:55 ptchgate /kernel: a1         = 0x2c
Mar 20 10:54:55 ptchgate /kernel: a2         = 0x11
Mar 20 10:54:55 ptchgate /kernel: pc         = 0xfffffc00004f8564
Mar 20 10:54:55 ptchgate /kernel: ra         = 0xfffffc00004942b4
Mar 20 10:54:55 ptchgate /kernel: curproc    = 0
Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: panic: trap
Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: syncing disks...
Mar 20 10:54:55 ptchgate /kernel: fatal kernel trap:
Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: trap entry = 0x2 (memory management fault)
Mar 20 10:54:55 ptchgate /kernel: a0         = 0x58
Mar 20 10:54:55 ptchgate /kernel: a1         = 0x1
Mar 20 10:54:55 ptchgate /kernel: a2         = 0x0
Mar 20 10:54:55 ptchgate /kernel: pc         = 0xfffffc00004a3d24
Mar 20 10:54:55 ptchgate /kernel: ra         = 0xfffffc00004aaec8
Mar 20 10:54:55 ptchgate /kernel: curproc    = 0
Mar 20 10:54:55 ptchgate /kernel:
Mar 20 10:54:55 ptchgate /kernel: panic: trap
Mar 20 10:54:55 ptchgate /kernel: Uptime: 9d18h16m20s
Mar 20 10:54:56 ptchgate /kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Mar 20 10:54:56 ptchgate /kernel: Rebooting...
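
For anyone who wants to pull the register values out of a log like this one,
here is a minimal Python sketch. The names REG_LINE, parse_traps and
is_aligned are made up for illustration and are not part of any FreeBSD tool.
It also assumes that for an unaligned access fault a0 holds the faulting
address, which is my understanding of the Alpha trap convention and not
something confirmed in this thread.

```python
import re

# Matches register lines such as "... /kernel: a0         = 0x2c".
# "=(?:3D)?" also accepts the quoted-printable form "=3D" seen in raw mail.
REG_LINE = re.compile(
    r"/kernel:\s+(trap entry|a0|a1|a2|pc|ra|curproc)\s*=(?:3D)?\s*"
    r"(0x[0-9a-fA-F]+|\d+)"
)

def parse_traps(lines):
    """Group register lines into one dict per 'fatal kernel trap:' block."""
    traps = []
    for line in lines:
        if "fatal kernel trap:" in line:
            traps.append({})
            continue
        match = REG_LINE.search(line)
        if match and traps:
            text = match.group(2)
            base = 16 if text.lower().startswith("0x") else 10
            traps[-1][match.group(1)] = int(text, base)
    return traps

def is_aligned(address, size):
    """True if address is a multiple of size (in bytes)."""
    return address % size == 0

sample = [
    "Mar 20 10:54:55 ptchgate /kernel: fatal kernel trap:",
    "Mar 20 10:54:55 ptchgate /kernel: trap entry = 0x4 (unaligned access fault)",
    "Mar 20 10:54:55 ptchgate /kernel: a0         = 0xfffffca900010021",
    "Mar 20 10:54:55 ptchgate /kernel: a1         = 0x2c",
    "Mar 20 10:54:55 ptchgate /kernel: a2         = 0x11",
    "Mar 20 10:54:55 ptchgate /kernel: pc         = 0xfffffc00004f8564",
]

traps = parse_traps(sample)
first = traps[0]
# ASSUMPTION: a0 is the faulting address for trap entry 0x4.
print(hex(first["a0"]), is_aligned(first["a0"], 4))
```

With the values from the first trap, a0 ends in 0x21 and is therefore not a
multiple of 4, which would at least be consistent with the reported
unaligned access fault if the assumption about a0 holds.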


At the moment I also suspect defective memory and will check that as soon as
I have a temporary replacement for this institute gateway and a free night
to cope with the long delays before the university's routers update their
ARP tables.


> >  This runs significantly more stably as long as one does not use the ipfw
> >  command. "ipfw show" sometimes panics the kernel and on the reboot, as
>
> Is ipfw compiled in or kldload-ed?

This time I was using a generic 5.0 kernel with ipfw kldload-ed, but the
same problem occurred earlier with a self-compiled kernel, which is why I
changed back to the generic one.

Meanwhile I have compiled a kernel with sufficient debugging enabled, in
the hope that it will provide proper error messages.


>
> >  well as on any other reboot, the SRM console gets into a desolate
> >  condition. Using "halt" works fine as expected.
>
> I am currently running a tight loop doing 'ipfw show' all the time on my
> DS10. I need to get my AS500 up to 5.0 before I can try to reproduce your
> problem there. That'll take considerable time, I have to install 4.8R
> first.

I think the "unaligned access error" when listing the firewall rules only
showed up in the 5.0 version. I will probably downgrade to 4.7 or 4.8
(which is better to use?) again, recompile with ipfw2, and let you know
then. Before that I will try to produce proper error messages with the
5.0 debug kernel.

Maybe you can try to reproduce the SRM console problem without upgrading to
5.0: as far as I remember, I first noticed it when I booted from floppy or
CD and told the machine to abort. At first I thought the errors were my
fault because of the abort. Again, 4.7 did not have that problem.

Thanks again, and best regards from Germany

JR


-----------------------------------------------------------------------------
Physikalische und Theoretische Chemie der TU-Braunschweig
Jens Röder, Hans-Sommer Str.10, 38106 Braunschweig
-----------------------------------------------------------------------------


