Date: Sun, 10 Jan 2016 18:38:10 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 206109] zpool import of corrupt pool causes system to reboot
Message-ID: <bug-206109-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206109

            Bug ID: 206109
           Summary: zpool import of corrupt pool causes system to reboot
           Product: Base System
           Version: 10.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: emilec@clarotech.co.za

I recently set up a new RAIDZ2 pool with 5 x 4TB Seagate NAS drives using
NAS4Free 10.2.0.2 (revision 2235). After copying data from an existing NAS to
the new pool, I discovered that some corruption had been detected. I attempted
to run a scrub, but partway through the system crashed and went into a boot
loop.

I reloaded NAS4Free and tried to import the pool, but each time it rebooted
the system. I then tried FreeBSD-10.2-RELEASE-amd64-mini-memstick, and an
import of the pool also caused the system to reboot. I could, however, mount
the pool read-only and access the data.

From the NAS4Free logs I was able to obtain the following when the system
crashed after attempting an import:

Jan 1 16:21:28 nas4free syslogd: kernel boot file is /boot/kernel/kernel
Jan 1 16:21:28 nas4free kernel: Solaris: WARNING: blkptr at 0xfffffe0003a5fa40 DVA 1 has invalid VDEV 16384
Jan 1 16:21:28 nas4free kernel:
Jan 1 16:21:28 nas4free kernel:
Jan 1 16:21:28 nas4free kernel: Fatal trap 12: page fault while in kernel mode
Jan 1 16:21:28 nas4free kernel: cpuid = 1; apic id = 01
Jan 1 16:21:28 nas4free kernel: fault virtual address   = 0x50
Jan 1 16:21:28 nas4free kernel: fault code              = supervisor read data, page not present
Jan 1 16:21:28 nas4free kernel: instruction pointer     = 0x20:0xffffffff81e79f94
Jan 1 16:21:28 nas4free kernel: stack pointer           = 0x28:0xfffffe0169ef5740
Jan 1 16:21:28 nas4free kernel: frame pointer           = 0x28:0xfffffe0169ef5750
Jan 1 16:21:28 nas4free kernel: code segment            = base 0x0, limit 0xfffff, type 0x1b
Jan 1 16:21:28 nas4free kernel:                         = DPL 0, pres 1, long 1, def32 0, gran 1
Jan 1 16:21:28 nas4free kernel: processor eflags        = interrupt enabled, resume, IOPL = 0
Jan 1 16:21:28 nas4free kernel: current process         = 6 (txg_thread_enter)
Jan 1 16:21:28 nas4free kernel: trap number             = 12
Jan 1 16:21:28 nas4free kernel: panic: page fault
Jan 1 16:21:28 nas4free kernel: cpuid = 1
Jan 1 16:21:28 nas4free kernel: KDB: stack backtrace:
Jan 1 16:21:28 nas4free kernel: #0 0xffffffff80a86a70 at kdb_backtrace+0x60
Jan 1 16:21:28 nas4free kernel: #1 0xffffffff80a4a1d6 at vpanic+0x126
Jan 1 16:21:28 nas4free kernel: #2 0xffffffff80a4a0a3 at panic+0x43
Jan 1 16:21:28 nas4free kernel: #3 0xffffffff80ecaedb at trap_fatal+0x36b
Jan 1 16:21:28 nas4free kernel: #4 0xffffffff80ecb1dd at trap_pfault+0x2ed
Jan 1 16:21:28 nas4free kernel: #5 0xffffffff80eca87a at trap+0x47a
Jan 1 16:21:28 nas4free kernel: #6 0xffffffff80eb0c72 at calltrap+0x8
Jan 1 16:21:28 nas4free kernel: #7 0xffffffff81e8071f at vdev_mirror_child_select+0x6f
Jan 1 16:21:28 nas4free kernel: #8 0xffffffff81e802d0 at vdev_mirror_io_start+0x270
Jan 1 16:21:28 nas4free kernel: #9 0xffffffff81e9cd86 at zio_vdev_io_start+0x1d6
Jan 1 16:21:28 nas4free kernel: #10 0xffffffff81e998b2 at zio_execute+0x162
Jan 1 16:21:28 nas4free kernel: #11 0xffffffff81e991b9 at zio_nowait+0x49
Jan 1 16:21:28 nas4free kernel: #12 0xffffffff81e1c91e at arc_read+0x8fe
Jan 1 16:21:28 nas4free kernel: #13 0xffffffff81e577b2 at dsl_scan_prefetch+0xc2
Jan 1 16:21:28 nas4free kernel: #14 0xffffffff81e574a3 at dsl_scan_visitbp+0x583
Jan 1 16:21:28 nas4free kernel: #15 0xffffffff81e5722f at dsl_scan_visitbp+0x30f
Jan 1 16:21:28 nas4free kernel: #16 0xffffffff81e5722f at dsl_scan_visitbp+0x30f
Jan 1 16:21:28 nas4free kernel: Copyright (c) 1992-2015 The FreeBSD Project.

Status of the pool after a read-only import:

zpool import -F -f -o readonly=on -R /pool0 pool0
zpool status

  pool: pool0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Wed Dec 30 13:34:03 2015
        1.06T scanned out of 8.53T at 1/s, (scan is slow, no estimated time)
        0 repaired, 12.45% done
config:

        NAME        STATE     READ WRITE CKSUM
        pool0       ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
            ada4    ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list

I eventually discovered that the corruption was caused by faulty RAM (it fails
memtest), so I accept that the pool is corrupt. Since NAS4Free is built on
FreeBSD and the behaviour is the same on both, I thought this would be the
best place to log a bug, but feel free to point me back to NAS4Free. Their
forums, however, suggested that ZFS is enterprise software and an enterprise
would simply restore from backup. I believe it would be better to catch this
condition and report an error rather than reboot the system.
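For illustration only, here is a minimal user-space sketch of the kind of
sanity check meant above. Every DVA in a block pointer names a top-level vdev,
and an id outside that range (such as the 16384 reported here, against a pool
whose only top-level vdev is the raidz2 group) could be rejected with an error
before anything dereferences it. The structures and names below are simplified
stand-ins, not the real ZFS blkptr_t/dva_t definitions or the actual kernel
code path:

/*
 * Hedged sketch: simplified stand-ins for dva_t/blkptr_t, showing the
 * bounds check that would let the read path fail with an error instead
 * of dereferencing a nonexistent vdev.
 */
#include <stdint.h>
#include <stdio.h>

struct dva { uint64_t vdev; uint64_t offset; };   /* stand-in for dva_t */
struct blkptr { struct dva dva[3]; int ndvas; };  /* stand-in for blkptr_t */

/* Return 0 if any DVA names a vdev id outside the pool's top-level vdevs. */
static int
blkptr_dvas_are_sane(const struct blkptr *bp, uint64_t top_vdev_count)
{
	for (int d = 0; d < bp->ndvas; d++) {
		if (bp->dva[d].vdev >= top_vdev_count) {
			fprintf(stderr, "blkptr DVA %d has invalid VDEV %ju\n",
			    d, (uintmax_t)bp->dva[d].vdev);
			return (0);
		}
	}
	return (1);
}

int
main(void)
{
	/* This pool has a single top-level raidz2 vdev, so only id 0 is valid. */
	struct blkptr corrupt = { .dva = { { 0, 0 }, { 16384, 0 } }, .ndvas = 2 };

	if (!blkptr_dvas_are_sane(&corrupt, 1))
		fprintf(stderr, "refusing I/O on damaged block pointer\n");
	return (0);
}

In the kernel the equivalent check would presumably have to run on the
scrub/import read path (the backtrace above faults under
vdev_mirror_child_select), where failing the I/O with an error could let the
import abort cleanly instead of panicking.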