Date:      Tue, 18 May 2010 22:48:19 +0200
From:      Fabian Keil <fk@fabiankeil.de>
To:        Roman Bogorodskiy <bogorodskiy@gmail.com>
Cc:        current@freebsd.org
Subject:   Re: ffs_copyonwrite panics
Message-ID:  <20100518224819.28d9624b@r500.local>
In-Reply-To: <20100518185201.GA2745@fsol>
References:  <20100518185201.GA2745@fsol>


Roman Bogorodskiy <bogorodskiy@gmail.com> wrote:

> I've been using -CURRENT last update in February for quite a long time
> and few weeks ago decided to finally update it. The update was quite
> unfortunate as system became very unstable: it just hangs few times a
> day and panics sometimes.
>
> Some things can be reproduced, some cannot. Reproducible ones:
>
> 1. background fsck always makes system hang
> 2. system crashes on operations with nullfs mounts (disabled that for
> now)
>
> The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
> The thing is that if I will run 'startx' on it with some X apps it will
> panic just in few minutes. When I leave the box with nearly no stress
> (just use it as internet gateway for my laptop) it behaves a little
> better but will eventually crash in few hours anyway.
>
> The even more annoying thing is that I cannot save the dump,
> because when the system boots and runs 'savecore' it leads to an
> ffs_copyonwrite panic as well. The panic happens when about 90% complete
> (as seen via ctrl-t).
>
> Any ideas how to debug and get rid of this issue?
>
> System arch is amd64. I don't know what other details could be useful.

I'm not familiar with the background fsck issue, but if the nullfs
panic looks like this one, there's a fair chance it's already fixed:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x10
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff82412f14
stack pointer	        = 0x28:0xffffff803e564620
frame pointer	        = 0x28:0xffffff803e564770
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 1825 (jail)
panic: from debugger
cpuid = 0
Uptime: 38s
Dumping 1992 MB (5 chunks)
  chunk 0: 1MB (155 pages) ... ok
  chunk 1: 1990MB (509345 pages) 1974 [...] 6 ... ok
  chunk 2: 2MB (273 pages) ... ok
  chunk 3: 1MB (184 pages)

#0  doadump () at pcpu.h:223
223	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0xffffffff803c506f in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:416
#2  0xffffffff803c546c in panic (fmt=Variable "fmt" is not available.
)
    at /usr/src/sys/kern/kern_shutdown.c:590
#3  0xffffffff801f6e77 in db_panic (addr=Variable "addr" is not available.
)
    at /usr/src/sys/ddb/db_command.c:478
#4  0xffffffff801f7281 in db_command (last_cmdp=0xffffffff808bfd80, cmd_table=Variable "cmd_table" is not available.

) at /usr/src/sys/ddb/db_command.c:445
#5  0xffffffff801f74d0 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:498
#6  0xffffffff801f9429 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0xffffffff803f3c25 in kdb_trap (type=12, code=0, tf=0xffffff803e564570)
    at /usr/src/sys/kern/subr_kdb.c:535
#8  0xffffffff8062ad9d in trap_fatal (frame=0xffffff803e564570, eva=Variable "eva" is not available.
)
    at /usr/src/sys/amd64/amd64/trap.c:773
#9  0xffffffff8062b0fc in trap_pfault (frame=0xffffff803e564570, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:694
#10 0xffffffff8062b8ff in trap (frame=0xffffff803e564570)
    at /usr/src/sys/amd64/amd64/trap.c:451
#11 0xffffffff80611f33 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:223
#12 0xffffffff82412f14 in null_bypass (ap=0xffffff803e564780)
    at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:269
#13 0xffffffff80448104 in vgonel (vp=0xffffff0005e05780) at vnode_if.h:1099
#14 0xffffffff8044835e in vrecycle (vp=0xffffff0005e05780, td=Variable "td" is not available.
)
    at /usr/src/sys/kern/vfs_subr.c:2505
#15 0xffffffff82412e6f in null_inactive (ap=Variable "ap" is not available.
)
    at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:665
#16 0xffffffff80444ff8 in vinactive (vp=0xffffff0005e05780,
    td=0xffffff00054743e0) at vnode_if.h:807
#17 0xffffffff804495dd in vputx (vp=0xffffff0005e05780, func=2)
    at /usr/src/sys/kern/vfs_subr.c:2226
#18 0xffffffff8043e1ae in lookup (ndp=0xffffff803e564a50)
    at /usr/src/sys/kern/vfs_lookup.c:905
#19 0xffffffff8043eef7 in namei (ndp=0xffffff803e564a50)
    at /usr/src/sys/kern/vfs_lookup.c:269
#20 0xffffffff8044ec86 in kern_accessat (td=0xffffff00054743e0, fd=-100,
    path=0x800537000 <Address 0x800537000 out of bounds>, pathseg=Variable "pathseg" is not available.
)
    at /usr/src/sys/kern/vfs_syscalls.c:2140
#21 0xffffffff8062b21d in syscall (frame=0xffffff803e564c80)
    at /usr/src/sys/amd64/amd64/trap.c:946
#22 0xffffffff80612211 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:374
#23 0x000000080050e5ec in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
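
As a rough way to check whether another panic "looks like this one", the
sketch below compares the function names in a kgdb backtrace against the
nullfs frames seen above (null_bypass called from vgonel, vrecycle and
null_inactive). It is only an illustration and not part of the original
report: the helper names frame_functions and looks_like_nullfs_panic are
hypothetical, and a matching frame sequence suggests, but does not prove,
the same bug.

```python
import re

# Matches kgdb frame lines such as
#   #12 0xffffffff82412f14 in null_bypass (ap=0x...)
#   (kgdb) #0  doadump () at pcpu.h:223
# Frames without a symbol name ("?? ()") are deliberately not matched.
FRAME_RE = re.compile(
    r"^(?:\(kgdb\)\s*)?#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\("
)

# The nullfs-related frames from the trace above, innermost first.
NULLFS_SIGNATURE = ["null_bypass", "vgonel", "vrecycle", "null_inactive"]


def frame_functions(backtrace):
    """Return the function names of a kgdb backtrace, innermost frame first."""
    frames = {}
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            # Key by frame number so a repeated "#0" line is counted once.
            frames[int(match.group(1))] = match.group(2)
    return [frames[number] for number in sorted(frames)]


def looks_like_nullfs_panic(backtrace):
    """True if the signature frames appear consecutively in the backtrace."""
    names = frame_functions(backtrace)
    n = len(NULLFS_SIGNATURE)
    return any(
        names[i:i + n] == NULLFS_SIGNATURE
        for i in range(len(names) - n + 1)
    )
```

Feeding it the text of a saved kgdb session (for example, the output of
"bt" copied into a file) returns True only if those four frames appear in
that order without other named frames in between.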

I could reproduce it with:

FreeBSD 9.0-CURRENT #66 r+3fe665b: Fri May 14 17:45:10 CEST 2010
    fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64

but it had already been fixed in Subversion/CVS on Saturday, so I
didn't investigate which commit caused it or which one fixed it.

My previous kernel without the issue was:
FreeBSD 9.0-CURRENT #65 r+6f48909: Sat May  8 19:28:58 CEST 2010
I'm currently using:
FreeBSD 9.0-CURRENT #69 r+3a7afc7: Sun May 16 20:04:53 CEST 2010
without any issues either. I don't use background fsck, though.

Fabian



