Date:      Tue, 20 Jun 2017 23:37:10 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Ken Merry <ken@FreeBSD.org>
Cc:        src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, svn-src-head@FreeBSD.org
Subject:   Re: svn commit: r320156 - in head: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contri...
Message-ID:  <fc648de9-576d-b5c4-0436-e9597decadf2@FreeBSD.org>
In-Reply-To: <81F84BCA-E973-4D78-B81C-1D398ADFA47E@freebsd.org>
References:  <201706201739.v5KHdPhO051256@repo.freebsd.org> <81F84BCA-E973-4D78-B81C-1D398ADFA47E@freebsd.org>

On 20/06/2017 23:29, Ken Merry wrote:
> I don’t know for sure that this commit is the cause, but it (and r320153) are the only ZFS commits between a version of head from June 14th that boots off a ZFS mirror, and one that panics.
> 
> Here’s the stack trace:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 22; 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 9; apic id = 09
> fault virtual address   = 0x0
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff81e47f21
> stack pointer           = 0x28:0xfffffe08b37f8810
> frame pointer           = 0x28:0xfffffe08b37f8860
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (zio_free_issue_0_3)
> [ thread pid 0 tid 100478 ]
> Stopped at      0xffffffff81e47f21 = zio_vdev_io_start+0x1f1:   testb   $0x1,(%rax)
> db> bt
> Tracing pid 0 tid 100478 td 0xfffff80193156000
> zio_vdev_io_start() at 0xffffffff81e47f21 = zio_vdev_io_start+0x1f1/frame 0xfffffe08b37f8860
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f88b0
> zio_nowait() at 0xffffffff81e422b8 = zio_nowait+0xb8/frame 0xfffffe08b37f88e0
> vdev_mirror_io_start() at 0xffffffff81e224fc = vdev_mirror_io_start+0x38c/frame 0xfffffe08b37f8930
> zio_vdev_io_start() at 0xffffffff81e48030 = zio_vdev_io_start+0x300/frame 0xfffffe08b37f8990
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f89e0
> taskqueue_run_locked() at 0xffffffff809a9d6d = taskqueue_run_locked+0x13d/frame 0xfffffe08b37f8a40
> taskqueue_thread_loop() at 0xffffffff809aab28 = taskqueue_thread_loop+0x88/frame 0xfffffe08b37f8a70
> fork_exit() at 0xffffffff8091e3e4 = fork_exit+0x84/frame 0xfffffe08b37f8ab0
> fork_trampoline() at 0xffffffff80d930fe = fork_trampoline+0xe/frame 0xfffffe08b37f8ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> db> 
> 
> (kgdb) list *(zio_vdev_io_start+0x1f1)
> 0xd9f21 is in zio_vdev_io_start (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:350).
> 345
> 346             /*
> 347              * Ensure that anyone expecting this zio to contain a linear ABD isn't
> 348              * going to get a nasty surprise when they try to access the data.
> 349              */
> 350             IMPLY(abd_is_linear(zio->io_abd), abd_is_linear(data));
> 351
> 352             zt->zt_orig_abd = zio->io_abd;
> 353             zt->zt_orig_size = zio->io_size;
> 354             zt->zt_bufsize = bufsize;
> 
> I’ll try rebooting and see if the problem goes away.  If not, I’ll roll back the ABD change and see if the problem goes away.

Judging from the thread that panicked (zio_free_issue_0_3), the problem may be
related to our TRIM support.  Unfortunately, I did not have a chance to test the
change on a system with working TRIM, so I missed it.
I will look into this further, but the problem is almost certainly caused by
zio->io_abd being NULL for a zio of type ZIO_TYPE_FREE.
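
To illustrate the suspected failure mode: the faulting instruction in the trace
(testb $0x1,(%rax) with fault virtual address 0x0) is consistent with a flag
test through a NULL buffer pointer.  Below is a minimal, self-contained sketch
of how such a check could be guarded.  It is not the actual fix; every mock_*
name, the flag value, and the implies() helper are hypothetical stand-ins for
illustration, not the real ZFS definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real ZFS types; illustration only. */
enum mock_zio_type { MOCK_ZIO_TYPE_WRITE, MOCK_ZIO_TYPE_FREE };

#define MOCK_ABD_FLAG_LINEAR 0x1u   /* assumed: bit 0 marks a linear buffer */

struct mock_abd {
    unsigned flags;
};

struct mock_zio {
    enum mock_zio_type type;
    struct mock_abd *abd;           /* assumed NULL for a free (TRIM) zio */
};

/* Reads the flags through the pointer, so a NULL abd would fault here. */
static bool mock_abd_is_linear(const struct mock_abd *abd)
{
    return (abd->flags & MOCK_ABD_FLAG_LINEAR) != 0;
}

/* Logical implication: "a implies b". */
static bool implies(bool a, bool b)
{
    return !a || b;
}

/*
 * Guarded variant of the linearity check: a zio that carries no data
 * buffer has nothing to compare, so the check is skipped for it.
 */
static bool mock_linear_check(const struct mock_zio *zio,
    const struct mock_abd *data)
{
    if (zio->abd == NULL)
        return true;
    return implies(mock_abd_is_linear(zio->abd), mock_abd_is_linear(data));
}
```

With this guard, a mock free zio whose abd is NULL passes the check without
dereferencing anything, while a zio with a linear buffer still fails the check
if the replacement buffer is not linear.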

-- 
Andriy Gapon
