From owner-freebsd-fs@FreeBSD.ORG Fri Mar 21 10:43:47 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 96825924; Fri, 21 Mar 2014 10:43:47 +0000 (UTC) Received: from mail-we0-x22c.google.com (mail-we0-x22c.google.com [IPv6:2a00:1450:400c:c03::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D2BA7881; Fri, 21 Mar 2014 10:43:46 +0000 (UTC) Received: by mail-we0-f172.google.com with SMTP id t61so1467529wes.31 for ; Fri, 21 Mar 2014 03:43:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=ieiFr+1sVMKRc4JSgjatA4Q/kfjAfJwX2YFkebMKHQM=; b=m2uqmf65ZZ+6UqSPbuQU0hjDbT2WLb+B2CxpDvwKkWz4nGEsFE/g8TgsCRTFbNgV8z xBxHMvOEb5/KmHkzwd+skWFKFyVwI5Wbqr/acpuVkIiMmDVT+8nXHt7qYpXg44nzWqN6 Adj5yWAaHl06UcacAobj+rhJeoI+PqlrBV5aBg4BxZvh3O3NvKANGLONmgcDgirTI8nX 6Fhk4iTeT6+Pqk4arOLNZnnYm+3ccMbw74ratFQ80RvcUQiFCfw1ug1c+vzmzM7QjKQ/ Dlo8l1fAUDUn+9t8j6YLH8vpxnQWRb8a33T0iNpwb6cWhT/BTImRtWHSFMnLAM8nRBqZ HPyg== X-Received: by 10.194.92.228 with SMTP id cp4mr955659wjb.81.1395398625269; Fri, 21 Mar 2014 03:43:45 -0700 (PDT) Received: from mavbook.mavhome.dp.ua ([134.249.139.101]) by mx.google.com with ESMTPSA id q15sm12223741wjw.18.2014.03.21.03.43.42 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 21 Mar 2014 03:43:44 -0700 (PDT) Sender: Alexander Motin Message-ID: <532C17DD.9030704@FreeBSD.org> Date: Fri, 21 Mar 2014 12:43:41 +0200 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: Andriy Gapon , Andreas Longwitz , freebsd-fs@FreeBSD.org, freebsd-geom@FreeBSD.org Subject: Re: g_mirror_access() dropping geom topology_lock [Was: Kernel crash trying to import a ZFS pool with log device] References: <532B5A0C.1010008@incore.de> <532C085D.3020201@FreeBSD.org> In-Reply-To: <532C085D.3020201@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Mar 2014 10:43:47 -0000 On 21.03.2014 11:37, Andriy Gapon wrote: > on 20/03/2014 23:13 Andreas Longwitz said the following: > [snip] >> But if I now run "zpool export" and "zpool import" the kernel crashes: >> ... >> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C325BL31. >> vdev_geom_attach:102[1]: Attaching to label/C325BL31. >> g_access(0xffffff0096b23a00(label/C325BL31), 1, 0, 1) >> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0] >> 0xffffff0002c0d400(label/C325BL31) >> g_access(0xffffff0002ba7300(da2), 1, 0, 2) >> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23800(da2) >> g_disk_access(da2, 1, 0, 2) >> vdev_geom_attach:123[1]: Created geom and consumer for label/C325BL31. >> vdev_geom_read_config:248[1]: Reading config from label/C325BL31... >> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C325BL31. >> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C330CJHW. >> vdev_geom_attach:102[1]: Attaching to label/C330CJHW. >> g_access(0xffffff00969a5280(label/C330CJHW), 1, 0, 1) >> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0] >> 0xffffff0002c02100(label/C330CJHW) >> g_access(0xffffff0002b96280(da3), 1, 0, 2) >> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23b00(da3) >> g_disk_access(da3, 1, 0, 2) >> vdev_geom_attach:143[1]: Created consumer for label/C330CJHW. >> vdev_geom_read_config:248[1]: Reading config from label/C330CJHW... >> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C330CJHW. >> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3. >> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3. >> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1) >> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0] >> 0xffffff00969c7e00(mirror/gm0p3) >> g_part_access(mirror/gm0p3,1,0,1) >> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1) >> open delta:[r1w0e1] old:[r8w8e16] provider:[r8w8e16] >> 0xffffff00969c7d00(mirror/gm0) >> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1. > > The following part of the log is very informative. > Thank you for capturing it. > >> vdev_geom_attach:143[1]: Created consumer for mirror/gm0p3. >> vdev_geom_read_config:248[1]: Reading config from mirror/gm0p3... >> vdev_geom_open_by_path:569[1]: guid match for provider /dev/mirror/gm0p3. > > I read the above as thread A entering vdev_geom_open_by_path and trying to > "taste" /dev/mirror/gm0p3. > >> g_post_event_x(0xffffffff80b16830, 0xffffff0096b24180, 2, 0) >> vdev_geom_detach:163[1]: Closing access to mirror/gm0p3. >> g_access(0xffffff0096b24180(mirror/gm0p3), -1, 0, -1) >> open delta:[r-1w0e-1] old:[r1w0e1] provider:[r1w0e1] >> 0xffffff00969c7e00(mirror/gm0p3) > > > Simultaneously thread B is closing access /dev/mirror/gm0p3. > It is not clear from the quoted log when and how this thread B got access to the > device in the first place. > >> g_part_access(mirror/gm0p3,-1,0,-1) >> g_access(0xffffff0096c0f800(mirror/gm0), -1, 0, -1) >> open delta:[r-1w0e-1] old:[r9w8e17] provider:[r9w8e17] >> 0xffffff00969c7d00(mirror/gm0) >> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e-1. >> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3. >> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3. >> vdev_geom_attach:128[1]: Found consumer for mirror/gm0p3. >> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1) >> open delta:[r1w0e1] old:[r1w0e1] provider:[r1w0e1] >> 0xffffff00969c7e00(mirror/gm0p3) > > Thread A is accessing the device. > >> g_part_access(mirror/gm0p3,1,0,1) >> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1) >> Open delta:[r1w0e1] old:[r9w8e17] provider:[r9w8e17] >> 0xffffff00969c7d00(mirror/gm0) >> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1. >> vdev_geom_detach:167[1]: Destroyed consumer to mirror/gm0p3. >> g_detach(0xffffff0096b24180) >> g_destroy_consumer(0xffffff0096b24180) > > Thread B is destroying a special ZFS "taster" consumer. > >> vdev_geom_attach:147[1]: Used existing consumer for mirror/gm0p3. > > Thread A is trying to re-use the taster consumer, which has just been destroyed. > >> vdev_geom_read_config:248[1]: >> >> Fatal trap 12: page fault while in kernel mode > > Boom! > > I see two issues here. > First, the ZFS tasting code could be made more robust. If it never tried to > re-use the consumer and always created a new one, then most likely this crash > could be avoided. But there is no bug in the code. The code is correct and it > it uses GEOM topology lock to avoid any concurrency issues. > > But GEOM mirror code breaks a contract on which the ZFS code relies. > g_access() must be called with the topology lock hold. > I extend this requirement to a requirement that access method of any GEOM > provider must operate under the topology lock and must never drop it. > In other words, if a caller must acquire g_topology_lock before calling > g_access, then in return it must have a guarantee that the GEOM topology stays > unchanged across the call to g_access(). > g_mirror_access() breaks the above contract. > > So, the code in vdev_geom_attach() obtains g_topology_lock, then it finds an > existing valid consumer and calls g_access() on it. It reasonably expects that > the consumer remains valid, but because g_mirror_access() drops and requires the > topology lock, there is a chance that the topology can change and the consumer > may become invalid. > > I am not very familiar with gmirror code, so I am not sure how to fix the > problem from that end. I can confirm this. I know about this problem for some time already. The same issue as shown in GMIRROR is also present in GRAID. AFAIR the problem is in keeping lock order between GEOM topology lock and class' own lock. The only "excuse" is that it is not very reasonable to have ZFS on top of GMIRROR or GRAID. -- Alexander Motin