Date:      Sun, 16 Mar 2014 10:35:20 +0700
From:      Olivier Nicole <olivier2553@gmail.com>
To:        Doug Hardie <bc979@lafn.org>
Cc:        "questions@freebsd.org FreeBSD" <questions@freebsd.org>
Subject:   Re: Frequent Page Faults
Message-ID:  <CA%2Bg%2BBvjRy45%2BbSKttWx=R3J-WyBBLa9i_qLXFDFAysSTcNJ8yw@mail.gmail.com>
In-Reply-To: <F0344C79-D919-43B5-88B2-DCDE590741B1@lafn.org>
References:  <F0344C79-D919-43B5-88B2-DCDE590741B1@lafn.org>

Hi,

Wild guess, but could you have someone check the hardware for you: a
dirty memory connector, a dust-clogged CPU fan... As you said, the
system has been running for over 4 years, so this could very well be
hardware.

Olivier

On Sun, Mar 16, 2014 at 7:45 AM, Doug Hardie <bc979@lafn.org> wrote:
> I have a system running:
>
> FreeBSD zoon.lafn.org 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #3: Thu Aug 19 20:09:11 PDT 2010
>
> This morning it started crashing frequently.  The system had no issues
> prior to today.  Sometimes it auto-reboots, other times it just hangs.
> The console messages are always very similar.  I do have core dumps for
> 3 of them.  At first it appeared that the problem was being caused by an
> attack on port 110.  While the attack was in progress, the system would
> stay up for only a few minutes.  After discovering and blocking the
> attack, the system remained up a couple of hours but crashed while I was
> watching it.  I am beginning to suspect a HW issue that was worsened by
> the load of the attack, but not caused directly by the attack.
>
> I have been in the process of building a new set of disks for this
> system using 9.2, but that's not complete yet.  In addition, it will take
> a couple of days to get the disks on site.  It's a remote facility.  I
> found several references to issues with this problem and 7.2 that have
> apparently been fixed.  However, I have been running 7.2 since it first
> came out on this system without any similar issues.  Actually, I don't
> recall any issues at all with 7.2.  The source for this system no longer
> exists.  My development system has been upgraded to 9.2.  It's a modified
> kernel with some of the older processors commented out and includes
> QUOTA and ALTQ.  I don't recall any other changes.
>
> I can hurry up the setup of the 9.2 system, but it would be at least
> until Wed before it could be installed and tried.  If that would correct
> the problem, that would be great.  However, I have a concern that there
> is also a HW issue here and am not sure how to identify such.  My review
> of the dumps shows that acpi is always involved.  Don't know for sure
> what that implies though.  I don't believe the OS has degraded, and it
> has not been touched since 2010.  That pretty much leaves me with a HW
> issue.  Any ideas where the problem is will be appreciated.
>
> Here are a couple of the dumps.  The first one was while the attack was
> in progress.  The second after it was terminated.
>
> --------------------------------------------------------------
>
> zoon# kgdb /boot/kernel/kernel vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> 3)
> trap number             = 12
> panic: page fault
> cpuid = 3
> Uptime: 3m48s
> Physical memory: 1993 MB
> Dumping 187 MB: 172 156 140 124
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address   = 0x4
> fault code              = supervisor write, page not present
> instruction pointer     = 0x20:0xc0c72c00
> stack pointer           = 0x28:0xe566faf4
> frame pointer           = 0x28:0xe566fb14
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 4 (g_down)
> trap number             = 12
>  108 92 76 60 44 28 12
>
> Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/fdescfs.ko
> Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /boot/kernel/pflog.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pflog.ko
> Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pf.ko
> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/acpi.ko
> #0  doadump () at pcpu.h:196
> 196     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) where
> #0  doadump () at pcpu.h:196
> #1  0xc07a5e27 in boot (howto=260) at /usr2/src/sys/kern/kern_shutdown.c:418
> #2  0xc07a60f9 in panic (fmt=Variable "fmt" is not available.
> ) at /usr2/src/sys/kern/kern_shutdown.c:574
> #3  0xc0aa792c in trap_fatal (frame=0xc4ff8c48, eva=1361334589)
>     at /usr2/src/sys/i386/i386/trap.c:939
> #4  0xc0aa7b90 in trap_pfault (frame=0xc4ff8c48, usermode=0, eva=1361334589)
>     at /usr2/src/sys/i386/i386/trap.c:852
> #5  0xc0aa8512 in trap (frame=0xc4ff8c48) at /usr2/src/sys/i386/i386/trap.c:530
> #6  0xc0a8d62b in calltrap () at /usr2/src/sys/i386/i386/exception.s:159
> #7  0xc0e21715 in acpi_cpu_c1 ()
>     at /usr2/src/sys/modules/acpi/acpi/../../../i386/acpica/acpi_machdep.c:550
> #8  0xc0e1a594 in acpi_cpu_idle ()
>     at /usr2/src/sys/modules/acpi/acpi/../../../dev/acpica/acpi_cpu.c:943
> #9  0xc0a97f78 in cpu_idle () at /usr2/src/sys/i386/i386/machdep.c:1183
> #10 0xc07c7904 in sched_idletd (dummy=0x0)
>     at /usr2/src/sys/kern/sched_ule.c:2681
> #11 0xc07808d9 in fork_exit (callout=0xc07c7640 <sched_idletd>, arg=0x0,
>     frame=0xc4ff8d38) at /usr2/src/sys/kern/kern_fork.c:810
> #12 0xc0a8d6a0 in fork_trampoline () at /usr2/src/sys/i386/i386/exception.s:264
>
>
>
>
> --------------------------------------------------------------
>
>
> zoon# kgdb /boot/kernel/kernel vmcore.2
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault:
> eip = 0xc0e21715
> esp = 0xc4ff8d80
> ebp = 0xc4ff8c88
> cpuid = 3; apic id = 03
> panic: double fault
> cpuid = 3
> Uptime: 2h19m49s
> Physical memory: 1993 MB
> Dumping 187 MB: 172 156 140 124 108 92 76 60 44 28 12
>
> Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/fdescfs.ko
> Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /boot/kernel/pflog.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pflog.ko
> Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/pf.ko
> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/acpi.ko
> #0  doadump () at pcpu.h:196
> 196     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) up
> #1  0xc07a5e27 in boot (howto=260) at /usr2/src/sys/kern/kern_shutdown.c:418
> 418     /usr2/src/sys/kern/kern_shutdown.c: No such file or directory.
>         in /usr2/src/sys/kern/kern_shutdown.c
> (kgdb) down
> #0  doadump () at pcpu.h:196
> 196     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) where
> #0  doadump () at pcpu.h:196
> #1  0xc07a5e27 in boot (howto=260) at /usr2/src/sys/kern/kern_shutdown.c:418
> #2  0xc07a60f9 in panic (fmt=Variable "fmt" is not available.
> ) at /usr2/src/sys/kern/kern_shutdown.c:574
> #3  0xc0aa763b in dblfault_handler () at /usr2/src/sys/i386/i386/trap.c:972
> #4  0xc0e21715 in acpi_cpu_c1 ()
>     at /usr2/src/sys/modules/acpi/acpi/../../../i386/acpica/acpi_machdep.c:550
> #5  0xc0e1a594 in acpi_cpu_idle ()
>     at /usr2/src/sys/modules/acpi/acpi/../../../dev/acpica/acpi_cpu.c:943
> #6  0xc0a97f78 in cpu_idle () at /usr2/src/sys/i386/i386/machdep.c:1183
> #7  0xc07c7904 in sched_idletd (dummy=0x0)
>     at /usr2/src/sys/kern/sched_ule.c:2681
> #8  0xc07808d9 in fork_exit (callout=0xc07c7640 <sched_idletd>, arg=0x0,
>     frame=0xc4ff8d38) at /usr2/src/sys/kern/kern_fork.c:810
> #9  0xc0a8d6a0 in fork_trampoline () at /usr2/src/sys/i386/i386/exception.s:264
> (kgdb)
>
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"


