From owner-freebsd-geom@freebsd.org Mon Mar 12 13:17:52 2018
To: freebsd-geom@FreeBSD.org, freebsd-arch@FreeBSD.ORG
From: Andriy Gapon
Subject: geom->access problem and workaround
Message-ID: <809d9254-ee56-59d8-69a4-08838e985cea@FreeBSD.org>
Date: Mon, 12 Mar 2018 15:17:48 +0200
List-Id: GEOM-specific discussions and implementations

According to Poul-Henning (phk@), the principal author of GEOM, a GEOM class's
access method was intended to be a light-weight operation involving mostly
access counts. That is, it should be (have been) close in spirit to what the
g_access() function does. The method is only called from g_access() and it is
always done under the GEOM topology lock (like with most GEOM "control plane"
methods). The lock ensures that the method and the function operate on a
consistent state of the topology and all geoms in it.

In reality, many classes have their access method do a lot more than just
checking and modifying access bits. And often, what the method does is
incompatible with the topology lock.

Some examples.

g_mirror_access() has to drop and reacquire the topology lock to avoid a LOR
(deadlock) because the method needs to use the class's internal sc_lock.

zvol_geom_access() also has to drop and reacquire the topology lock when it
interacts with ZFS internals involving many locks. The main issue here is that
ZFS is both "above" GEOM, when ZFS uses GEOM for storage access, and "below"
GEOM, when ZFS is accessed through the ZVOL provider.

g_disk_access() -> daopen(). In this case the topology lock is never dropped,
but the operation issues multiple SCSI commands and waits for their completion.
So, if something goes wrong and takes a long time to complete then the whole
topology will be frozen for all that time.
[Perhaps doing the lock dance would be a better alternative]

But, of course, dropping the lock does not come free.
It opens races where two (at least) sets of incompatible access counts may get
granted. Or a special action, that should be done only on a first access to a
geom, could be executed more than once.

Bringing everything to conformance with the original design would be an ideal
solution, but it will take a lot of work both in the individual nonconforming
classes and in at least some of their consumers. It seems to require moving all
the complex operations from access methods to the GEOM "data plane". E.g.,
doing those things upon the first I/O operation. Or having a new special
BIO_GETATTR (kind of) operation that could be executed after g_access() but
before the actual I/O is allowed.

I am proposing an interim solution, so really a workaround, for the problem of
dropping the topology lock:

https://reviews.freebsd.org/D14533

That workaround cannot guarantee, of course, the complete stability of the
topology, but it prevents concurrent calls to access methods.
The idea is very simple. Before calling a geom's access method the geom is
marked with a special flag unless the flag is already set in which case the code
waits until the flag is cleared. The flag is cleared after the call, of course.
The topology lock is released while waiting for the flag.

I think that having this new flag may help to get more visibility into the
problem.

P.S.
The workaround does not help daopen() at all.
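
[Editorial illustration, not part of D14533.] The flag scheme described above
can be modelled in user space. The following is only a minimal sketch, assuming
a Python threading.Condition as a stand-in for the topology lock; the names
Geom, access_in_progress, call_access and run_demo are hypothetical and do not
come from the patch. It shows the one property the workaround claims: an access
method that drops the lock still never runs concurrently with another access
call on the same geom.

```python
import threading
import time

# Stand-in for the GEOM topology lock; the Condition adds a wait queue to it.
topology = threading.Condition(threading.Lock())


class Geom:
    """Toy geom that carries only the proposed flag (the field name is made up)."""

    def __init__(self, name):
        self.name = name
        self.access_in_progress = False


def call_access(geom, access_method):
    """Call access_method(geom) with at most one call per geom in flight."""
    with topology:
        # Flag already set: sleep until it clears. wait() releases the
        # topology lock while sleeping and retakes it before returning.
        while geom.access_in_progress:
            topology.wait()
        geom.access_in_progress = True
        try:
            return access_method(geom)
        finally:
            geom.access_in_progress = False
            topology.notify_all()


def run_demo(callers=4):
    """Return the largest number of access methods seen running at once."""
    geom = Geom("mirror/gm0")
    stats = {"active": 0, "max_active": 0}
    stats_lock = threading.Lock()

    def slow_access(g):
        # Like g_mirror_access() or zvol_geom_access(): drop the topology
        # lock, do slow work, then retake the lock before returning.
        topology.release()
        try:
            with stats_lock:
                stats["active"] += 1
                stats["max_active"] = max(stats["max_active"], stats["active"])
            time.sleep(0.01)
            with stats_lock:
                stats["active"] -= 1
        finally:
            topology.acquire()
        return 0

    threads = [threading.Thread(target=call_access, args=(geom, slow_access))
               for _ in range(callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["max_active"]


if __name__ == "__main__":
    print(run_demo())
```

In this model run_demo() should return 1 even though every caller drops the
lock in the middle of its access method. As with the real workaround, nothing
here keeps the rest of the topology stable while the lock is dropped.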
-- 
Andriy Gapon

From owner-freebsd-geom@freebsd.org Mon Mar 12 17:11:42 2018
In-Reply-To: <809d9254-ee56-59d8-69a4-08838e985cea@FreeBSD.org>
References: <809d9254-ee56-59d8-69a4-08838e985cea@FreeBSD.org>
From: Warner Losh
Date: Mon, 12 Mar 2018 11:11:40 -0600
Subject: Re: geom->access problem and workaround
To: Andriy Gapon
Cc: freebsd-geom@freebsd.org, "freebsd-arch@freebsd.org"

On Mon, Mar 12, 2018 at 7:17 AM, Andriy Gapon wrote:

> P.S.
> The workaround does not help daopen() at all.

The storage layer generally doesn't expect higher-level locks around calls
to it, and feels that it's free to sleep in the open routine for resources
to become available. This is true across most 'open' routines (eg, tty will
wait for the right signals, etc). In a world of removable media, I'm not
sure that one can avoid this.

But I'm not sure that calling open on the underlying device is at all
compatible with the design goal of access being cheap. I think you can't
have both: either you open the device, and cope with the fact that open may
sleep, or it looks like you'll have broken code. Once we've updated the
access counts, we can drop the topology lock to call open. If it succeeds,
all is good. If it fails, then we have to reacquire it to "unaccess" the
device after the failure... However, that doesn't help with the concurrent
attempts to do the first open for the device.

g_disk_access will still have issues of sleeping indefinitely, which of
course can lead to deadlock in complicated geometry situations (I say of
course, but I'm not 100% sure). The whole reason that daopen may (but not
always) sleep is that it may need to do I/O to the device to get its
media-status / size / SN if it's a removable device... Just like with the RO
flag, we'd want the open routine to fail if it can't reasonably access the
device.

Warner

From owner-freebsd-geom@freebsd.org Mon Mar 12 18:07:29 2018
To: Warner Losh
cc: Andriy Gapon, "freebsd-arch@freebsd.org", freebsd-geom@freebsd.org
Subject: Re: geom->access problem and workaround
From: "Poul-Henning Kamp"
References: <809d9254-ee56-59d8-69a4-08838e985cea@FreeBSD.org>
Date: Mon, 12 Mar 2018 18:07:02 +0000
Message-ID: <56619.1520878022@critter.freebsd.dk>

--------
In message , Warner Losh
writes:

>The storage layer generally doesn't expect higher-level locks around calls
>to it, and feels that it's free to sleep in the open routine for resources
>to become available. This is true across most 'open' routines (eg, tty will
>wait for the right signals, etc). In a world of removable media, I'm not
>sure that one can avoid this.

The original intent was that we would.

Things would probably have been clearer if I had called it g_reserve()
instead of g_access().

Removable media state was supposed to be a job for the driver(s
background polling), and the geom event queue was supposed to do
what needed to be done as a result of the g_access() calls.

The primary reason is that messing around with the geom topology
is a global operation, in order to keep things simple[1], and we
really don't want global locks held for any amount of time and
certainly not for mechanical-movement / failure-retry kinds of time.

The secondary reason was to be able to present a consistent and
precise view of the system *without* opening devices, so that disk
maintenance tools would not spin up all disks, rattle all drawers
and bang all doors before telling you what you wanted to know.

>But I'm not sure that calling open on the underlying device is at all
>compatible with the design goal of access being cheap. I think you can't
>have both: either you open the device, and cope with the fact that open may
>sleep, or it looks like you'll have broken code. Once we've updated the
>access counts, we can drop the topology lock to call open.

So this is where it gets slightly tricky:

When you open /dev/foobar, do you open the media or only a
drive mechanism that *may* hold a media?

For any normal "hard-disk", there is no difference.

But for floppies, CDROMs, ZIP drives, WORM drives, Robots-with-ATA-disks
and other interesting hardware, which were relevant when GEOM was
designed, and to some extent still are, you only open the drive,
and will have to find out next if it has a media in it or not.

In particular CDROMs forced this design decision, because the ioctls
to open & close the tray on CDROM drives operated on the media
access device node, and too many ports knew about that.

Compare that with a tape-changer, which has one device node for the
robotic parts and another for (each of) the tape drive(s).

If we want to have an architecturally sound way to do slow operations
before any "user-I/O" is initiated, the right way to do so is to
define new BIO_OPEN and BIO_CLOSE operations, and insist via asserts
that all BIO_{READ|WRITE|DELETE} are wrapped in these.

BIO_GETATTR should probably not require a BIO_OPEN/BIO_CLOSE.

Poul-Henning

[1] The alternative would be to have different sub-trees, each of
which can be locked individually, but that requires a LOT of
housekeeping and class-complexity in order to find out what those
sub-trees actually are.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
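
[Editorial illustration.] The BIO_OPEN/BIO_CLOSE rule proposed in the message
above can be sketched as a small user-space model. This is a hedged sketch
only: BIO_OPEN and BIO_CLOSE do not exist in GEOM as described in this thread,
and the Bio enum, the Provider class and its open_count field are hypothetical
names chosen for the example. Python's assert stands in for a kernel
assertion.

```python
from enum import Enum, auto


class Bio(Enum):
    """Request types; OPEN and CLOSE are the proposed (hypothetical) additions."""
    OPEN = auto()
    CLOSE = auto()
    READ = auto()
    WRITE = auto()
    DELETE = auto()
    GETATTR = auto()


# Requests that the proposal wants wrapped in an OPEN/CLOSE pair.
NEEDS_OPEN = {Bio.READ, Bio.WRITE, Bio.DELETE}


class Provider:
    """Toy provider that only checks the OPEN/CLOSE bracketing rule."""

    def __init__(self):
        self.open_count = 0
        self.handled = []

    def submit(self, cmd):
        if cmd is Bio.OPEN:
            # Slow work (spin-up, media check) would happen here, on the
            # data path, instead of under the topology lock in g_access().
            self.open_count += 1
        elif cmd is Bio.CLOSE:
            assert self.open_count > 0, "CLOSE without a matching OPEN"
            self.open_count -= 1
        elif cmd in NEEDS_OPEN:
            assert self.open_count > 0, f"{cmd.name} outside OPEN/CLOSE"
        # GETATTR falls through: it is allowed without OPEN/CLOSE.
        self.handled.append(cmd)


if __name__ == "__main__":
    p = Provider()
    p.submit(Bio.GETATTR)          # fine without OPEN
    for cmd in (Bio.OPEN, Bio.READ, Bio.WRITE, Bio.CLOSE):
        p.submit(cmd)              # properly bracketed I/O
    try:
        p.submit(Bio.READ)         # no OPEN in effect any more
    except AssertionError as err:
        print("rejected:", err)
```

The design point the model tries to capture is that the potentially slow step
moves out of the access method and into an ordinary request, so the cheap
reservation and the expensive open are separated.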

From owner-freebsd-geom@freebsd.org Mon Mar 12 22:18:17 2018
From: Rick Miller
Date: Mon, 12 Mar 2018 18:18:14 -0400
Subject: FreeBSD 11.x fails to boot w/ SAS3 4Kn HDD and LSISAS3008 on SuperMicro X10DRH-iT
To: FreeBSD Questions, freebsd-geom@freebsd.org, freebsd-stable@freebsd.org

Hi all,

Thanks in advance to anyone that might be able to help. I subscribed to
freebsd-geom@ so that the list did not need to "reply-all". Having trouble
getting FreeBSD 11-STABLE @ r329011 to boot from SAS3 4Kn HDDs via LSISAS
3008 HBA on a SuperMicro X10DRH-iT motherboard after an apparent
installation. All internal media (including all other disks attached to the
HBA) were removed to eliminate other storage being the reason the system
won't boot. This occurs specifically in CSM mode, but the preference is to
boot via UEFI mode instead.

Anyway...booting the machine via the memstick image demonstrates the LSISAS
3008 controller attaching via mpr(4) (whose manpage describes the
controller being supported[1]):

mpr0@pci0:3:0:0: class=0x010700 card=0x080815d9 chip=0x00971000 rev=0x02 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class      = mass storage
    subclass   = SAS

The only inserted disk attaches as da0 as illustrated by dmesg:

ses0: pass0,da0: Element descriptor: 'Slot00'
da0 at mpr0 bus 0 scbus0 target 8 lun 0
ses0: pass0,da0: SAS Device Slot Element: 1 Phys at Slot 0
ses0: phy 0: SAS device type 1 id 0
ses0: phy 0: protocols: initiator( None ) Target( SSP )
ses0: phy 0: parent 500304801e870bff addr 5000c500a012814d
da0: Fixed Direct Access SPC-4 SCSI device
da0: Serial Number $serial_number
da0: 1200.000MB/s transfers
da0: Command queueing enabled
da0: 1907729MB (488378648 4096 byte sectors)

The original goal was to boot via zfs root, but when that failed,
subsequent installations used the "Auto (UFS) option" to partition the
disk. For example, the first installation gpart'd the disk as:

# gpart show da0
=>        6  488378635  da0  GPT  (1.8T)
          6        128    1  freebsd-boot  (512K)
        134  487325568    2  freebsd-ufs  (1.8T)
  487325702    1048576    3  freebsd-swap  (4.0G)
  488374278       4363       - free -  (17M)

The result was a reboot loop. When the system reached the point of reading
the disk, it just rebooted and continued doing so. There was no loader or
beastie menu. Thus, thinking that it could be the partition layout
requirements of the 4Kn disks, it was gpart'd like the below[2][3]. This
was done by exiting to the shell during the partition phase of bsdinstall
and manually gpart'ing the disk according to the below, mounting da0p2 at
/mnt and placing an fstab at /tmp/bsdinstall_etc/fstab that included mount
entries for /dev/da0p2 at / and /dev/da0p3 as swap.

# gpart show da0
=>        6  488378635  da0  GPT  (1.8T)
          6         34       - free -  (136K)
         40        512    1  freebsd-boot  (2.0M)
        552  419430400    2  freebsd-ufs  (1.6T)
  419430952    1048576    3  freebsd-swap  (4.0G)
  420479528   67899113       - free -  (259G)

When configured as such, the system rebooted at the completion of the
install and appeared to roll through the boot order, which specifies the
HDD first, then CD/DVD, then network. It did attempt to boot via network,
but that is irrelevant here.

All the hardware is alleged to be supported by FreeBSD as best I can tell
and OS installation apparently works. I'm at a loss as to why the OS won't
boot. Does someone have feedback or input that may expose why it doesn't
boot?

FWIW, a RHEL7 install was also attempted, which also does not boot.

[1] https://www.freebsd.org/cgi/man.cgi?query=mpr&apropos=0&sektion=4&manpath=FreeBSD+11.1-RELEASE&arch=default&format=html
[2] https://lists.freebsd.org/pipermail/freebsd-hardware/2013-September/007380.html
[3] http://www.wonkity.com/~wblock/docs/html/disksetup.html

-- 
Take care
Rick Miller

From owner-freebsd-geom@freebsd.org Tue Mar 13 06:45:19 2018
From: Steven Hartland
Date: Tue, 13 Mar 2018 06:45:05 +0000
Subject: Re: FreeBSD 11.x fails to boot w/ SAS3 4Kn HDD and LSISAS3008 on SuperMicro X10DRH-iT
To: Rick Miller
Cc: FreeBSD Questions, freebsd-geom@freebsd.org, freebsd-stable@freebsd.org

Have you tried setting dumpdev to AUTO in rc.conf to see if you can obtain
a panic dump?

You could also try disabling reboot on panic using the sysctl.

On Mon, 12 Mar 2018 at 22:18, Rick Miller wrote:

> The result was a reboot loop. When the system reached the point of reading
> the disk, it just rebooted and continued doing so. There was no loader or
> beastie menu.

From owner-freebsd-geom@freebsd.org Tue Mar 13 12:25:59 2018
s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=SeGlG3ooT8T8szI81EK2LLLSBaO8vVI8uYji8Z3GIks=; b=YNd5CAL0GSArR+Ew8bsn2zkIbUZwaxBrLRwBWg18g2Ks8SYxf7+Gdz0QRcpBeLJPNr 1McxD8OSfvg6Xt8kqLfrG6YC5dITYuclBGKn1JUacO6ddStv6U5LbEr9QXfiQulRrAQv pgIHH0WMTR2nWJ3tQwXjjYmEbdwpS+GBsP6aZiHqqIqt4DL774vv7pQmCe78TeKkX25O YaIFIviW8JXLs5kE+SJVeKyk9Iz65+KmNBYLEUDJa2NXX6dQh589E0XsADqVmiNJOLvG UPwq659zcD/A2YIsNMdcVBAflKgMA07zZ+Y7UYLW6Fjzz1CL61OJSQ8jQMpCEYzkoAc7 ZBzQ== X-Gm-Message-State: AElRT7GM1++IJBU5mv8u1BPnkmix770/5WRopLd3vnPRV366KwZi400B lUhd+mkn6BI9+MWdphN+HFQcYTLlNTi2dhredYk= X-Google-Smtp-Source: AG47ELsDNRXeYk1DogHGU4C2re+CaWYTrAdwg2CUQLkQr02QVgGluokGahHlCZGmT2qJw1Y1o74ZG3d39s4hzRLnaE0= X-Received: by 10.159.60.89 with SMTP id w25mr273819uah.59.1520943957657; Tue, 13 Mar 2018 05:25:57 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Rick Miller Date: Tue, 13 Mar 2018 12:25:47 +0000 Message-ID: Subject: Re: FreeBSD 11.x fails to boot w/ SAS3 4Kn HDD and LSISAS3008 on SuperMicro X10DRH-iT To: FreeBSD Questions , freebsd-geom@freebsd.org, freebsd-stable@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.25 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2018 12:25:59 -0000 As it turns out, it seems EFI must be specified as the boot environment as opposed to legacy or BIOS+EFI. Setting the boot environment to EFI only resulted in successful system boot. Thanks for the replies. The answer came from Allan Jude on Twitter. On Mon, Mar 12, 2018 at 6:18 PM Rick Miller wrote: > Hi all, > > Thanks in advance to anyone that might be able to help. I subscribed to > freebsd-geom@ so that the list did not need to "reply-all". 
Having
> trouble getting FreeBSD 11-STABLE @ r329011 to boot from SAS3 4Kn HDDs via
> LSISAS 3008 HBA on a SuperMicro X10DRH-iT motherboard after an apparent
> installation. All internal media (including all other disks attached to the
> HBA) were removed to eliminate other storage being the reason the system
> won't boot. This occurs specifically in CSM mode, but the preference is to
> boot via UEFI mode instead.
>
> Anyway...booting the machine via the memstick image demonstrates the
> LSISAS 3008 controller attaching via mpr(4) (whose manpage describes the
> controller being supported[1]):
>
> mpr0@pci0:3:0:0: class=0x010700 card=0x080815d9 chip=0x00971000 rev=0x02
> hdr=0x00
>     vendor     = 'LSI Logic / Symbios Logic'
>     device     = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
>     class      = mass storage
>     subclass   = SAS
>
> The only inserted disk attaches as da0 as illustrated by dmesg:
>
> ses0: pass0,da0: Element descriptor: 'Slot00'
> da0 at mpr0 bus 0 scbus0 target 8 lun 0
> ses0: pass0,da0: SAS Device Slot Element: 1 Phys at Slot 0
> ses0: phy 0: SAS device type 1 id 0
> ses0: phy 0: protocols: initiator( None ) Target( SSP )
> ses0: phy 0: parent 500304801e870bff addr 5000c500a012814d
> da0: Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number $serial_number
> da0: 1200.000MB/s transfers
> da0: Command queueing enabled
> da0: 1907729MB (488378648 4096 byte sectors)
>
> The original goal was to boot via zfs root, but when that failed,
> subsequent installations used the "Auto (UFS) option" to partition the
> disk. For example, the first installation gpart'd the disk as:
>
> # gpart show da0
> =>        6  488378635  da0  GPT  (1.8T)
>           6        128    1  freebsd-boot  (512K)
>         134  487325568    2  freebsd-ufs  (1.8T)
>   487325702    1048576    3  freebsd-swap  (4.0G)
>   488374278       4363       - free -  (17M)
>
> The result was a reboot loop. When the system reached the point of reading
> the disk, it just rebooted and continued doing so. There was no loader or
> beastie menu. Thus, thinking that it could be the partition layout
> requirements of the 4Kn disks, it was gpart'd like the below[2][3]. This
> was done by exiting to the shell during the partition phase of bsdinstall
> and manually gpart'ing the disk according to the below, mounting da0p2 at
> /mnt and placing an fstab at /tmp/bsdinstall_etc/fstab that included mount
> entries for /dev/da0p2 at / and /dev/da0p3 as swap.
>
> # gpart show da0
> =>        6  488378635  da0  GPT  (1.8T)
>           6         34       - free -  (136K)
>          40        512    1  freebsd-boot  (2.0M)
>         552  419430400    2  freebsd-ufs  (1.6T)
>   419430952    1048576    3  freebsd-swap  (4.0G)
>   420479528   67899113       - free -  (259G)
>
> When configured as such, the system rebooted at the completion of the
> install and appeared to roll through the boot order, which specifies the
> HDD first, then CD/DVD, then network. It did attempt to boot via network,
> but is irrelevant here.
>
> All the hardware is alleged to be supported by FreeBSD as best I can tell
> and OS installation apparently works. I'm at a loss as to why the OS won't
> boot. Does someone have feedback or input that may expose why it doesn't
> boot?
>
> FWIW, a RHEL7 install was also attempted, which also does not boot.
> > [1] > https://www.freebsd.org/cgi/man.cgi?query=mpr&apropos=0&sektion=4&manpath=FreeBSD+11.1-RELEASE&arch=default&format=html > [2] > https://lists.freebsd.org/pipermail/freebsd-hardware/2013-September/007380.html > [3] http://www.wonkity.com/~wblock/docs/html/disksetup.html > > -- > Take care > Rick Miller > -- Take care Rick Miller From owner-freebsd-geom@freebsd.org Thu Mar 15 09:17:12 2018 Return-Path: Delivered-To: freebsd-geom@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A97E4F5BEBA for ; Thu, 15 Mar 2018 09:17:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 907E876123 for ; Thu, 15 Mar 2018 09:17:11 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id D547210B6E for ; Thu, 15 Mar 2018 09:17:10 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w2F9HAQx045964 for ; Thu, 15 Mar 2018 09:17:10 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from bugzilla@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w2F9HAmK045963 for freebsd-geom@FreeBSD.org; Thu, 15 Mar 2018 09:17:10 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: bugzilla set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: 
freebsd-geom@FreeBSD.org Subject: [Bug 225960] zfs: g_access leak when unmounting UFS on a zvol Date: Thu, 15 Mar 2018 09:17:10 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Open X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-geom@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2018 09:17:12 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225960

--- Comment #15 from commit-hook@freebsd.org ---
A commit references this bug:

Author: avg
Date: Thu Mar 15 09:16:11 UTC 2018
New revision: 330977
URL: https://svnweb.freebsd.org/changeset/base/330977

Log:
  g_access: deal with races created by geoms that drop the topology lock

  The problem is that g_access() must be called with the GEOM topology
  lock held. And that gives a false impression that the lock is indeed
  held across the call. But this isn't always true because many classes,
  ZVOL being one of the many, need to drop the lock. It's either to
  perform an I/O on the first open or to acquire a different lock (like
  in g_mirror_access).

  That, of course, can break many assumptions. For example,
  g_slice_access() adds an extra exclusive count on the first open. As
  described above, an underlying geom may drop the topology lock and that
  would open a race with another thread that would also request another
  extra exclusive count. In general, two consumers may be granted
  incompatible accesses.

  To avoid this problem the code is changed to mark a geom with special
  flag before calling its access method and clear the flag afterwards. If
  another thread sees that flag, then it means that the topology lock has
  been dropped (either by the geom in question or downstream from it), so
  it is not safe to make another access call. So, the second thread would
  use g_topology_sleep() to wait until the flag is cleared and only then
  would it proceed with the access.

  Also see
  http://docs.freebsd.org/cgi/mid.cgi?809d9254-ee56-59d8-69a4-08838e985cea

  PR: 225960
  Reported by: asomers
  Reviewed by: markj, mav
  MFC after: 3 weeks
  Differential Revision: https://reviews.freebsd.org/D14533

Changes:
  head/sys/geom/geom.h
  head/sys/geom/geom_subr.c

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-geom@freebsd.org Thu Mar 15 09:28:24 2018 Return-Path: Delivered-To: freebsd-geom@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C750F5CB0A for ; Thu, 15 Mar 2018 09:28:24 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AD1037698D for ; Thu, 15 Mar 2018 09:28:23 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate)
by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id F23DB10D00 for ; Thu, 15 Mar 2018 09:28:22 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w2F9SMia072203 for ; Thu, 15 Mar 2018 09:28:22 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from bugzilla@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w2F9SMdm072202 for freebsd-geom@FreeBSD.org; Thu, 15 Mar 2018 09:28:22 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: bugzilla set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: freebsd-geom@FreeBSD.org Subject: [Bug 225960] zfs: g_access leak when unmounting UFS on a zvol Date: Thu, 15 Mar 2018 09:28:23 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Open X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-geom@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2018 09:28:24 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225960

--- Comment #16 from commit-hook@freebsd.org ---
A commit references this bug:

Author: avg
Date: Thu Mar 15 09:28:11 UTC 2018
New revision: 330979
URL: https://svnweb.freebsd.org/changeset/base/330979

Log:
  re-enable zfs_copies_006_pos test after a fix in r330977

  The test was disabled in r329408.

  PR: 225960

Changes:
  head/tests/sys/cddl/zfs/tests/cli_root/zfs_copies/zfs_copies_test.sh

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-geom@freebsd.org Thu Mar 15 09:41:11 2018 Return-Path: Delivered-To: freebsd-geom@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AB99FF5DC7E for ; Thu, 15 Mar 2018 09:41:11 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4998C7765F for ; Thu, 15 Mar 2018 09:41:11 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 7C96110F78 for ; Thu, 15 Mar 2018 09:41:10 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w2F9fAw1005365 for ; Thu, 15 Mar 2018 09:41:10 GMT (envelope-from bugzilla-noreply@freebsd.org) Received: (from www@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w2F9fACZ005364 for freebsd-geom@FreeBSD.org; Thu, 15 Mar 2018 09:41:10 GMT (envelope-from bugzilla-noreply@freebsd.org) X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f From: bugzilla-noreply@freebsd.org To: freebsd-geom@FreeBSD.org Subject: [Bug
225960] zfs: g_access leak when unmounting UFS on a zvol Date: Thu, 15 Mar 2018 09:41:10 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_status assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2018 09:41:11 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225960

Andriy Gapon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Open                        |In Progress
           Assignee|freebsd-geom@FreeBSD.org    |avg@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
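
The flag-and-wait scheme described in the r330977 commit log for this bug can be illustrated in user space. What follows is only a minimal Python sketch under stated assumptions, not the kernel change itself (that lives in sys/geom/geom.h and sys/geom/geom_subr.c and is not reproduced here). The names Geom, in_access, owners, class_access and run_openers are hypothetical, a threading.Condition stands in for both the topology lock and g_topology_sleep(), and the toy access method deliberately drops the lock between its check and its update, the way the log says ZVOL and g_mirror do.

```python
import threading
import time


class Geom:
    """Toy stand-in for a geom. Field names are hypothetical."""

    def __init__(self):
        self.in_access = False  # set while the access method is running
        self.owners = []        # consumers granted exclusive access


# One Condition plays both roles: its lock models the topology lock and
# wait() models sleeping with that lock released.
topology = threading.Condition(threading.Lock())


def class_access(geom, who):
    """Access method that drops the lock between its check and its update."""
    free = not geom.owners       # decision made under the lock
    topology.release()           # lock dropped, e.g. for I/O on first open
    time.sleep(0.01)
    topology.acquire()
    if free:
        geom.owners.append(who)  # the earlier decision may be stale by now
    return free


def g_access(geom, who, use_flag=True):
    """Call the access method, optionally serialised by the in-access flag."""
    with topology:
        if not use_flag:
            return class_access(geom, who)
        while geom.in_access:    # another thread is inside the access method
            topology.wait()      # sleep until the flag is cleared
        geom.in_access = True
        try:
            return class_access(geom, who)
        finally:
            geom.in_access = False
            topology.notify_all()


def run_openers(geom, count, use_flag=True):
    """Start `count` threads that all request exclusive access at once."""
    results = [None] * count

    def opener(i):
        results[i] = g_access(geom, i, use_flag)

    threads = [threading.Thread(target=opener, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    geom = Geom()
    granted = run_openers(geom, 4)
    print("granted with flag:", sum(granted), "owners:", geom.owners)
```

With the flag in use, a second opener cannot enter the access method until the first has finished, so it sees the updated owner list and is refused. Passing use_flag=False removes that serialisation and lets several openers act on the same stale check, which is the "incompatible accesses" case the log describes; how many get through in that mode depends on thread timing.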