Date:      Thu, 26 Feb 2009 16:47:20 +0100
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        freebsd-current@freebsd.org
Subject:   Re: zfs: using, then destroying a snapshot sometimes panics zfs
Message-ID:  <20090226164720.390dbd14@fabiankeil.de>
In-Reply-To: <AE3309D0-DF67-4098-9312-8C7919A862CA@lassitu.de>
References:  <76873DDF-D21B-48AF-9AFB-5A2747BE406B@lassitu.de> <3A302EE1-F54D-4415-BC13-CA8ABBA320EC@lassitu.de> <171C5946-63D1-4AC7-89F7-A951BEF3D1C6@lassitu.de> <7EFAB629-75C5-41C1-BDAC-ADE5F69D9EF6@lassitu.de> <AE3309D0-DF67-4098-9312-8C7919A862CA@lassitu.de>

Stefan Bethke <stb@lassitu.de> wrote:

> On 18.02.2009 at 07:55, Stefan Bethke wrote:
> 
> > # cd /tank/foo/.zfs
> > # ls -l
> > ls: snapshot: Bad file descriptor
> > total 0
> > # cd snapshot
> > -su: cd: snapshot: Not a directory
> 
> > Trying to umount produces a panic:
> > # zfs umount /jail/foo
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address	= 0xa8
> > fault code		= supervisor write data, page not present
> > instruction pointer	= 0x8:0xffffffff802ee565
> > stack pointer	        = 0x10:0xfffffffea29c39e0
> > frame pointer	        = 0x10:0xfffffffea29c39f0
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= interrupt enabled, resume, IOPL = 0
> > current process		= 51383 (zfs)
> > [thread pid 51383 tid 100298 ]
> > Stopped at      _sx_xlock+0x15: lock cmpxchgq   %rsi,0x18(%rdi)
> > db> bt
> > Tracing pid 51383 tid 100298 td 0xffffff00a598e720
> > _sx_xlock() at _sx_xlock+0x15
> > zfsctl_umount_snapshots() at zfsctl_umount_snapshots+0xa5
> > zfs_umount() at zfs_umount+0xdd
> > dounmount() at dounmount+0x2b4
> > unmount() at unmount+0x24b
> > syscall() at syscall+0x1a5
> > Xfast_syscall() at Xfast_syscall+0xab
> > --- syscall (22, FreeBSD ELF64, unmount), rip = 0x800f412fc, rsp = 0x7fffffffd1a8, rbp = 0x801202300 ---
> > db> call doadump
> 
> 
> The script I am using used to do:
> 1. create snapshot
> 2. copy data with rsync from the snapshot
> 3. destroy snapshot
> 
> Some time afterwards (anywhere between minutes and hours), the problem
> would manifest itself.
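
The three quoted steps could be sketched roughly as the shell function below. This is not Stefan's actual script, which was not posted. The dataset, mountpoint, destination and snapshot name are hypothetical, and reading the snapshot through the .zfs/snapshot control directory is an assumption based on the paths shown in the transcript above. The run helper and the DRY_RUN and SNAP_NAME variables are also made up for illustration; DRY_RUN=1 only prints the commands, so the sequence can be inspected without touching a pool.

```shell
#!/bin/sh
# Sketch of the snapshot / rsync / destroy cycle (assumed, not the original script).

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: backup_via_snapshot DATASET MOUNTPOINT DEST
backup_via_snapshot() {
    dataset=$1
    mountpoint=$2
    dest=$3
    snap=${SNAP_NAME:-backup-tmp}   # hypothetical snapshot name

    # 1. create snapshot
    run zfs snapshot "${dataset}@${snap}" || return 1

    # 2. copy data with rsync from the snapshot
    #    (assumes access via the .zfs control directory)
    run rsync -a "${mountpoint}/.zfs/snapshot/${snap}/" "${dest}/"
    rc=$?

    # 3. destroy snapshot, even if rsync failed
    run zfs destroy "${dataset}@${snap}"
    return $rc
}
```

For example, DRY_RUN=1 backup_via_snapshot tank/foo /tank/foo /backup/foo prints the three commands in order. The names tank/foo and /backup/foo are placeholders.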

Until I stopped doing it, I often got panics this way:

1) Build some port
2) Update the ports tree by rolling back to the last snapshot
   and receiving a newer one (zfs receive -vF)
3) Run mergemaster -a. Wait a few seconds.

Dumping core doesn't work on that system (known problem),
and as I usually did this from Xorg, I couldn't get to
the debugger either.

Fabian


