From: Morgan Reed
Date: Sun, 5 Feb 2012 15:57:06 +1100
To: freebsd-stable@freebsd.org
Subject: ZFS panics on pool moved from OpenSolaris

Hi all,

I'm experiencing an issue migrating my NAS from OpenSolaris to FreeBSD. I've tried both releng_8_2 and releng_9 and have similar issues in both cases.

The pool is a RAID-Z pool comprising four 1TB drives. It was originally created on OpenSolaris (not sure which version, 2010.09 maybe; it was one of the last ones prior to the Oracle acquisition) as a V14 pool. Initially I built a FreeBSD-8.2 system to migrate the pool to. It migrated over OK and I upgraded it from V14 to V15, but later testing revealed something wasn't happy: listing certain directories (and even doing an ls -la at the root of the pool) resulted in a kernel panic (mostly GENERIC kernel, rebuilt with KVA_PAGES 512 but otherwise stock):

panic: avl_find() succeeded inside avl_add()
cpuid = 0
KDB: stack backtrace:
#0 0x808e0d07 at kdb_backtrace+0x47
#1 0x808b1dc7 at panic+0x117
#2 0x862e6602 at avl_add+0x52
#3 0x8635c136 at zfs_fuid_table_load+0x1f6
#4 0x8635c3ee at zfs_fuid_init+0x14e
#5 0x8635c4d7 at zfs_fuid_find_by_idx+0xb7
#6 0x8635c52d at zfs_fuid_map_id+0x2d
#7 0x8635d56f at zfs_groupmember+0x2f
#8 0x8636df0b at zfs_zaccess_aces_check+0x1db
#9 0x8636377 at zfs_zaccess+0x57
#10 0x8636d6fb at zfs_zaccess_rwx+0x3b
#11 0x86385f61 at zfs_freebsd_access+0xf1
#12 0x80c02ea2 at VOP_ACCESS_APV+0x42
#13 0x809457cf at change_dir+0x5f
#14 0x809467b1 at kern_chdir+0x81
#15 0x80946a22 at chdir+0x22
#16 0x808eca39 at syscallenter+0x329
#17 0x80be4e14 at syscall+0x34

It looks like something in the permissions structure was causing grief. I tried running a scrub across the pool, but that didn't resolve the issue. After spending some time fighting with it I decided it wasn't worth the effort and upgraded to FreeBSD-9.0 to see if that would help (I normally avoid x.0 releases). Once again the pool imported fine, but I was still seeing similar panics. I ran a scrub across the pool and it was still not happy. I also upgraded the pool to v28 and tried again; when that failed I scrubbed again, but still no joy.

As a matter of interest, I booted an OpenIndiana live CD and tried copying the directories' contents to another location, and I am now able to list the directories.

However, there are still issues. The problem seems to have shifted slightly; the stack trace from a recent panic is below (GENERIC kernel on 9.0-RELEASE):

panic: avl_find() succeeded inside avl_add()
cpuid = 0
KDB: stack backtrace:
#0 0xc0a4b157 at kdb_backtrace+0x47
#1 0xc0a186b7 at panic+0x117
#2 0xc5a2d7b2 at avl_add+0x52
#3 0xc5ac44e6 at zfs_fuid_table_load+0x1f6
#4 0xc5ac479e at zfs_fuid_init+0x14e
#5 0xc5ac4893 at zfs_fuid_find_by_idx+0xc3
#6 0xc5ac48ed at zfs_fuid_map_id+0x2d
#7 0xc5ac492f at zfs_groupmember+0x2f
#8 0xc5adbdcb at zfs_zaccess_aces_check+0x1db
#9 0xc5adc257 at zfs_zaccess+0xb7
#10 0xc5afa7d4 at zfs_freebsd_getattr+0x1f4
#11 0xc0d69322 at VOP_GETATTR_APV+0x42
#12 0xc0ab81c9 at vn_stat+0x79
#13 0xc0aaefdd at kern_statat_vnhook+0xfd
#14 0xc0aaf1cc at kern_statat+0x3c
#15 0xc0aaf156 at kern_lstat+0x36
#16 0xc0aaf1ff at sys_lstat+0x2f
#17 0xc0d49315 at syscall+0x355

This time it appears to be related to some extended attribute(s). I can do an ls on one of the directories in question, but an ls -la causes a panic, so it would seem that some attribute which is only shown in the long form of the ls output is causing the issue.

I've done some digging around via the magic of Google and this seems to be a fairly common issue, but I've not found a solution for it (barring copying the data off, recreating the pool and restoring the data, which I'd like to avoid if at all possible). If I could determine what the problematic attribute is and find a means to strip it (be that from FreeBSD or from an OpenIndiana live CD), I think that would get me back up and running.

If anybody can provide some suggestions as to what I may be able to do to resolve this issue in situ, I would be very grateful.

Thanks,
Morgan