From owner-svn-src-stable-11@freebsd.org Sun Sep 30 08:41:15 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7805B10B8D69; Sun, 30 Sep 2018 08:41:15 +0000 (UTC) (envelope-from slavash@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2DC0B79931; Sun, 30 Sep 2018 08:41:15 +0000 (UTC) (envelope-from slavash@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 28BBE7F12; Sun, 30 Sep 2018 08:41:15 +0000 (UTC) (envelope-from slavash@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8U8fFY2087830; Sun, 30 Sep 2018 08:41:15 GMT (envelope-from slavash@FreeBSD.org) Received: (from slavash@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8U8fF9E087829; Sun, 30 Sep 2018 08:41:15 GMT (envelope-from slavash@FreeBSD.org) Message-Id: <201809300841.w8U8fF9E087829@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: slavash set sender to slavash@FreeBSD.org using -f From: Slava Shwartsman Date: Sun, 30 Sep 2018 08:41:15 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339021 - stable/11/sys/dev/pci X-SVN-Group: stable-11 X-SVN-Commit-Author: slavash X-SVN-Commit-Paths: stable/11/sys/dev/pci X-SVN-Commit-Revision: 339021 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 08:41:15 -0000 Author: slavash Date: Sun Sep 30 08:41:14 2018 New Revision: 339021 URL: https://svnweb.freebsd.org/changeset/base/339021 Log: MFC r338942: Add PCIV_INVALID definition From PCI Spec rev 2.2, 6.2.1. Device Identification: Vendor ID This field identifies the manufacturer of the device. Valid vendor identifiers are allocated by the PCI SIG to ensure uniqueness. 0FFFFh is an invalid value for Vendor ID. Approved by: kib (mentor) Sponsored by: Mellanox Technologies Modified: stable/11/sys/dev/pci/pcireg.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/pci/pcireg.h ============================================================================== --- stable/11/sys/dev/pci/pcireg.h Sat Sep 29 21:14:54 2018 (r339020) +++ stable/11/sys/dev/pci/pcireg.h Sun Sep 30 08:41:14 2018 (r339021) @@ -120,6 +120,9 @@ #define PCIM_MFDEV 0x80 #define PCIR_BIST 0x0f +/* PCI Spec rev 2.2: 0FFFFh is an invalid value for Vendor ID. */ +#define PCIV_INVALID 0xffff + /* Capability Register Offsets */ #define PCICAP_ID 0x0 From owner-svn-src-stable-11@freebsd.org Sun Sep 30 12:25:39 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9274810BE7B7; Sun, 30 Sep 2018 12:25:39 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 389DC80FE1; Sun, 30 Sep 2018 12:25:39 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 24E3912456; Sun, 30 Sep 2018 12:25:39 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8UCPcuM022956; Sun, 30 Sep 2018 12:25:38 GMT (envelope-from oshogbo@FreeBSD.org) Received: (from oshogbo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8UCPcOR022953; Sun, 30 Sep 2018 12:25:38 GMT (envelope-from oshogbo@FreeBSD.org) Message-Id: <201809301225.w8UCPcOR022953@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: oshogbo set sender to oshogbo@FreeBSD.org using -f From: Mariusz Zaborski Date: Sun, 30 Sep 2018 12:25:38 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339023 - stable/11/sys/geom/eli X-SVN-Group: stable-11 X-SVN-Commit-Author: oshogbo X-SVN-Commit-Paths: stable/11/sys/geom/eli X-SVN-Commit-Revision: 339023 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 12:25:39 -0000 Author: oshogbo Date: Sun Sep 30 12:25:38 2018 New Revision: 339023 URL: https://svnweb.freebsd.org/changeset/base/339023 Log: MFC r336310: Let geli deal with lost devices without crashing. PR: 162036 Submitted by: Fabian Keil Obtained from: ElectroBSD Discussed with: pjd@ Modified: stable/11/sys/geom/eli/g_eli.c stable/11/sys/geom/eli/g_eli_privacy.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/geom/eli/g_eli.c ============================================================================== --- stable/11/sys/geom/eli/g_eli.c Sun Sep 30 12:16:06 2018 (r339022) +++ stable/11/sys/geom/eli/g_eli.c Sun Sep 30 12:25:38 2018 (r339023) @@ -252,7 +252,8 @@ g_eli_read_done(struct bio *bp) pbp->bio_driver2 = NULL; } g_io_deliver(pbp, pbp->bio_error); - atomic_subtract_int(&sc->sc_inflight, 1); + if (sc != NULL) + atomic_subtract_int(&sc->sc_inflight, 1); return; } mtx_lock(&sc->sc_queue_mtx); @@ -297,7 +298,8 @@ g_eli_write_done(struct bio *bp) */ sc = pbp->bio_to->geom->softc; g_io_deliver(pbp, pbp->bio_error); - atomic_subtract_int(&sc->sc_inflight, 1); + if (sc != NULL) + atomic_subtract_int(&sc->sc_inflight, 1); } /* Modified: stable/11/sys/geom/eli/g_eli_privacy.c ============================================================================== --- stable/11/sys/geom/eli/g_eli_privacy.c Sun Sep 30 12:16:06 2018 (r339022) +++ stable/11/sys/geom/eli/g_eli_privacy.c Sun Sep 30 12:25:38 2018 (r339023) @@ -87,7 +87,8 @@ g_eli_crypto_read_done(struct cryptop *crp) bp->bio_error = crp->crp_etype; } sc = bp->bio_to->geom->softc; - g_eli_key_drop(sc, crp->crp_desc->crd_key); + if (sc != NULL) + g_eli_key_drop(sc, crp->crp_desc->crd_key); /* * Do we have all sectors already? */ @@ -104,7 +105,8 @@ g_eli_crypto_read_done(struct cryptop *crp) * Read is finished, send it up. */ g_io_deliver(bp, bp->bio_error); - atomic_subtract_int(&sc->sc_inflight, 1); + if (sc != NULL) + atomic_subtract_int(&sc->sc_inflight, 1); return (0); } From owner-svn-src-stable-11@freebsd.org Sun Sep 30 23:14:08 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7A6F010AAC5A; Sun, 30 Sep 2018 23:14:08 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1D7417649A; Sun, 30 Sep 2018 23:14:08 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 12B3F18D54; Sun, 30 Sep 2018 23:14:08 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8UNE7nG056400; Sun, 30 Sep 2018 23:14:07 GMT (envelope-from gonzo@FreeBSD.org) Received: (from gonzo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8UNE7MN056397; Sun, 30 Sep 2018 23:14:07 GMT (envelope-from gonzo@FreeBSD.org) Message-Id: <201809302314.w8UNE7MN056397@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gonzo set sender to gonzo@FreeBSD.org using -f From: Oleksandr Tymoshenko Date: Sun, 30 Sep 2018 23:14:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339029 - stable/11/sys/dev/ichiic X-SVN-Group: stable-11 X-SVN-Commit-Author: gonzo X-SVN-Commit-Paths: stable/11/sys/dev/ichiic X-SVN-Commit-Revision: 339029 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 23:14:08 -0000 Author: gonzo Date: Sun Sep 30 23:14:07 2018 New Revision: 339029 URL: https://svnweb.freebsd.org/changeset/base/339029 Log: MFC r336050-r336051, r336142, r336326, r337719 r336050: ig4(4): add support for Apollo Lake I2C controllers Add PCI ids for I2C controllers on Apollo Lake platform. Also convert switch/case probe logic into a table. Reviewed by: avg Differential Revision: https://reviews.freebsd.org/D16120 r336051: ig4(4): Fix Apollo lake entries platform identifier Identify Apollo Lake controllers as IG4_APL and not as a IG4_SKYLAKE Reported by: rpokala@ r336142: ig4(4): add devmatch(8) PNP info Now that we have all devices ids in a table add MODULE_PNP_INFO macro to let devmatch autoload module r336326: Remove MODULE_PNP_INFO for ig4(4) driver ig4(4) does not support suspend/resume but present on the hardware where such functionality is critical, like laptops. Remove PNP info to avoid breaking suspend/resume on the systems where ig4(4) load is not explicitly requested by the user. PR: 229791 Reported by: Ali Abdallah r337719: [ig4] Fix initialization sequence for newer ig4 chips Newer chips may require assert/deassert after power down for proper startup. Check respective flag in DEVIDLE_CTRL and perform operation if neccesssary. PR: 221777 Submitted by: marc.priggemeyer@gmail.com Obtained from: DragonFly BSD Tested on: Thinkpad T470 Modified: stable/11/sys/dev/ichiic/ig4_iic.c stable/11/sys/dev/ichiic/ig4_pci.c stable/11/sys/dev/ichiic/ig4_reg.h stable/11/sys/dev/ichiic/ig4_var.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/ichiic/ig4_iic.c ============================================================================== --- stable/11/sys/dev/ichiic/ig4_iic.c Sun Sep 30 21:54:02 2018 (r339028) +++ stable/11/sys/dev/ichiic/ig4_iic.c Sun Sep 30 23:14:07 2018 (r339029) @@ -525,6 +525,16 @@ ig4iic_attach(ig4iic_softc_t *sc) mtx_init(&sc->io_lock, "IG4 I/O lock", NULL, MTX_DEF); sx_init(&sc->call_lock, "IG4 call lock"); + v = reg_read(sc, IG4_REG_DEVIDLE_CTRL); + if (sc->version == IG4_SKYLAKE && (v & IG4_RESTORE_REQUIRED) ) { + reg_write(sc, IG4_REG_DEVIDLE_CTRL, IG4_DEVICE_IDLE | IG4_RESTORE_REQUIRED); + reg_write(sc, IG4_REG_DEVIDLE_CTRL, 0); + + reg_write(sc, IG4_REG_RESETS_SKL, IG4_RESETS_ASSERT_SKL); + reg_write(sc, IG4_REG_RESETS_SKL, IG4_RESETS_DEASSERT_SKL); + DELAY(1000); + } + if (sc->version == IG4_ATOM) v = reg_read(sc, IG4_REG_COMP_TYPE); Modified: stable/11/sys/dev/ichiic/ig4_pci.c ============================================================================== --- stable/11/sys/dev/ichiic/ig4_pci.c Sun Sep 30 21:54:02 2018 (r339028) +++ stable/11/sys/dev/ichiic/ig4_pci.c Sun Sep 30 23:14:07 2018 (r339029) @@ -80,73 +80,62 @@ static int ig4iic_pci_detach(device_t dev); #define PCI_CHIP_SKYLAKE_I2C_3 0x9d638086 #define PCI_CHIP_SKYLAKE_I2C_4 0x9d648086 #define PCI_CHIP_SKYLAKE_I2C_5 0x9d658086 +#define PCI_CHIP_APL_I2C_0 0x5aac8086 +#define PCI_CHIP_APL_I2C_1 0x5aae8086 +#define PCI_CHIP_APL_I2C_2 0x5ab08086 +#define PCI_CHIP_APL_I2C_3 0x5ab28086 +#define PCI_CHIP_APL_I2C_4 0x5ab48086 +#define PCI_CHIP_APL_I2C_5 0x5ab68086 +#define PCI_CHIP_APL_I2C_6 0x5ab88086 +#define PCI_CHIP_APL_I2C_7 0x5aba8086 +struct ig4iic_pci_device { + uint32_t devid; + const char *desc; + enum ig4_vers version; +}; + +static struct ig4iic_pci_device ig4iic_pci_devices[] = { + { PCI_CHIP_LYNXPT_LP_I2C_1, "Intel Lynx Point-LP I2C Controller-1", IG4_HASWELL}, + { PCI_CHIP_LYNXPT_LP_I2C_2, "Intel Lynx Point-LP I2C Controller-2", IG4_HASWELL}, + { PCI_CHIP_BRASWELL_I2C_1, "Intel Braswell Serial I/O I2C Port 1", IG4_ATOM}, + { PCI_CHIP_BRASWELL_I2C_2, "Intel Braswell Serial I/O I2C Port 2", IG4_ATOM}, + { PCI_CHIP_BRASWELL_I2C_3, "Intel Braswell Serial I/O I2C Port 3", IG4_ATOM}, + { PCI_CHIP_BRASWELL_I2C_5, "Intel Braswell Serial I/O I2C Port 5", IG4_ATOM}, + { PCI_CHIP_BRASWELL_I2C_6, "Intel Braswell Serial I/O I2C Port 6", IG4_ATOM}, + { PCI_CHIP_BRASWELL_I2C_7, "Intel Braswell Serial I/O I2C Port 7", IG4_ATOM}, + { PCI_CHIP_SKYLAKE_I2C_0, "Intel Sunrise Point-LP I2C Controller-0", IG4_SKYLAKE}, + { PCI_CHIP_SKYLAKE_I2C_1, "Intel Sunrise Point-LP I2C Controller-1", IG4_SKYLAKE}, + { PCI_CHIP_SKYLAKE_I2C_2, "Intel Sunrise Point-LP I2C Controller-2", IG4_SKYLAKE}, + { PCI_CHIP_SKYLAKE_I2C_3, "Intel Sunrise Point-LP I2C Controller-3", IG4_SKYLAKE}, + { PCI_CHIP_SKYLAKE_I2C_4, "Intel Sunrise Point-LP I2C Controller-4", IG4_SKYLAKE}, + { PCI_CHIP_SKYLAKE_I2C_5, "Intel Sunrise Point-LP I2C Controller-5", IG4_SKYLAKE}, + { PCI_CHIP_APL_I2C_0, "Intel Apollo Lake I2C Controller-0", IG4_APL}, + { PCI_CHIP_APL_I2C_1, "Intel Apollo Lake I2C Controller-1", IG4_APL}, + { PCI_CHIP_APL_I2C_2, "Intel Apollo Lake I2C Controller-2", IG4_APL}, + { PCI_CHIP_APL_I2C_3, "Intel Apollo Lake I2C Controller-3", IG4_APL}, + { PCI_CHIP_APL_I2C_4, "Intel Apollo Lake I2C Controller-4", IG4_APL}, + { PCI_CHIP_APL_I2C_5, "Intel Apollo Lake I2C Controller-5", IG4_APL}, + { PCI_CHIP_APL_I2C_6, "Intel Apollo Lake I2C Controller-6", IG4_APL}, + { PCI_CHIP_APL_I2C_7, "Intel Apollo Lake I2C Controller-7", IG4_APL} +}; + static int ig4iic_pci_probe(device_t dev) { ig4iic_softc_t *sc = device_get_softc(dev); + uint32_t devid; + int i; - switch(pci_get_devid(dev)) { - case PCI_CHIP_LYNXPT_LP_I2C_1: - device_set_desc(dev, "Intel Lynx Point-LP I2C Controller-1"); - sc->version = IG4_HASWELL; - break; - case PCI_CHIP_LYNXPT_LP_I2C_2: - device_set_desc(dev, "Intel Lynx Point-LP I2C Controller-2"); - sc->version = IG4_HASWELL; - break; - case PCI_CHIP_BRASWELL_I2C_1: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 1"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_BRASWELL_I2C_2: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 2"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_BRASWELL_I2C_3: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 3"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_BRASWELL_I2C_5: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 5"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_BRASWELL_I2C_6: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 6"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_BRASWELL_I2C_7: - device_set_desc(dev, "Intel Braswell Serial I/O I2C Port 7"); - sc->version = IG4_ATOM; - break; - case PCI_CHIP_SKYLAKE_I2C_0: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-0"); - sc->version = IG4_SKYLAKE; - break; - case PCI_CHIP_SKYLAKE_I2C_1: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-1"); - sc->version = IG4_SKYLAKE; - break; - case PCI_CHIP_SKYLAKE_I2C_2: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-2"); - sc->version = IG4_SKYLAKE; - break; - case PCI_CHIP_SKYLAKE_I2C_3: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-3"); - sc->version = IG4_SKYLAKE; - break; - case PCI_CHIP_SKYLAKE_I2C_4: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-4"); - sc->version = IG4_SKYLAKE; - break; - case PCI_CHIP_SKYLAKE_I2C_5: - device_set_desc(dev, "Intel Sunrise Point-LP I2C Controller-5"); - sc->version = IG4_SKYLAKE; - break; - default: - return (ENXIO); + devid = pci_get_devid(dev); + for (i = 0; i < nitems(ig4iic_pci_devices); i++) { + if (ig4iic_pci_devices[i].devid == devid) { + device_set_desc(dev, ig4iic_pci_devices[i].desc); + sc->version = ig4iic_pci_devices[i].version; + return (BUS_PROBE_DEFAULT); + } } - return (BUS_PROBE_DEFAULT); + return (ENXIO); } static int @@ -239,3 +228,7 @@ DRIVER_MODULE_ORDERED(ig4iic_pci, pci, ig4iic_pci_driv MODULE_DEPEND(ig4iic_pci, pci, 1, 1, 1); MODULE_DEPEND(ig4iic_pci, iicbus, IICBUS_MINVER, IICBUS_PREFVER, IICBUS_MAXVER); MODULE_VERSION(ig4iic_pci, 1); +/* + * Loading this module breaks suspend/resume on laptops + * Do not add MODULE_PNP_INFO until it's impleneted + */ Modified: stable/11/sys/dev/ichiic/ig4_reg.h ============================================================================== --- stable/11/sys/dev/ichiic/ig4_reg.h Sun Sep 30 21:54:02 2018 (r339028) +++ stable/11/sys/dev/ichiic/ig4_reg.h Sun Sep 30 23:14:07 2018 (r339029) @@ -78,6 +78,7 @@ #define IG4_REG_CTL 0x0000 /* RW Control Register */ #define IG4_REG_TAR_ADD 0x0004 /* RW Target Address */ +#define IG4_REG_HS_MADDR 0x000C /* RW High Speed Master Mode Code Address*/ #define IG4_REG_DATA_CMD 0x0010 /* RW Data Buffer and Command */ #define IG4_REG_SS_SCL_HCNT 0x0014 /* RW Std Speed clock High Count */ #define IG4_REG_SS_SCL_LCNT 0x0018 /* RW Std Speed clock Low Count */ @@ -92,7 +93,9 @@ #define IG4_REG_CLR_RX_UNDER 0x0044 /* RO Clear RX_Under Interrupt */ #define IG4_REG_CLR_RX_OVER 0x0048 /* RO Clear RX_Over Interrupt */ #define IG4_REG_CLR_TX_OVER 0x004C /* RO Clear TX_Over Interrupt */ +#define IG4_REG_CLR_RD_REQ 0x0050 /* RO Clear RD_Req Interrupt */ #define IG4_REG_CLR_TX_ABORT 0x0054 /* RO Clear TX_Abort Interrupt */ +#define IG4_REG_CLR_RX_DONE 0x0058 /* RO Clear RX_Done Interrupt */ #define IG4_REG_CLR_ACTIVITY 0x005C /* RO Clear Activity Interrupt */ #define IG4_REG_CLR_STOP_DET 0x0060 /* RO Clear STOP Detection Int */ #define IG4_REG_CLR_START_DET 0x0064 /* RO Clear START Detection Int */ @@ -108,6 +111,7 @@ #define IG4_REG_DMA_TDLR 0x008C /* RW DMA Transmit Data Level */ #define IG4_REG_DMA_RDLR 0x0090 /* RW DMA Receive Data Level */ #define IG4_REG_SDA_SETUP 0x0094 /* RW SDA Setup */ +#define IG4_REG_ACK_GENERAL_CALL 0x0098 /* RW I2C ACK General Call */ #define IG4_REG_ENABLE_STATUS 0x009C /* RO Enable Status */ /* Available at least on Atom SoCs and Haswell mobile. */ #define IG4_REG_COMP_PARAM1 0x00F4 /* RO Component Parameter */ @@ -118,6 +122,9 @@ #define IG4_REG_RESETS_SKL 0x0204 /* RW Reset Register */ #define IG4_REG_ACTIVE_LTR_VALUE 0x0210 /* RW Active LTR Value */ #define IG4_REG_IDLE_LTR_VALUE 0x0214 /* RW Idle LTR Value */ +#define IG4_REG_TX_ACK_COUNT 0x0218 /* RO TX ACK Count */ +#define IG4_REG_RX_BYTE_COUNT 0x021C /* RO RX ACK Count */ +#define IG4_REG_DEVIDLE_CTRL 0x024C /* RW Device Control */ /* Available at least on Atom SoCs */ #define IG4_REG_CLK_PARMS 0x0800 /* RW Clock Parameters */ /* Available at least on Atom SoCs and Haswell mobile */ @@ -581,6 +588,17 @@ /* Skylake-U/Y and Kaby Lake-U/Y have the reset bits inverted */ #define IG4_RESETS_DEASSERT_SKL 0x0003 #define IG4_RESETS_ASSERT_SKL 0x0000 + +/* Newer versions of the I2C controller allow to check whether + * the above ASSERT/DEASSERT is necessary by querying the DEVIDLE_CONTROL + * register. + * + * the RESTORE_REQUIRED bit can be cleared by writing 1 + * the DEVICE_IDLE status can be set to put the controller in an idle state + * + */ +#define IG4_RESTORE_REQUIRED 0x0008 +#define IG4_DEVICE_IDLE 0x0004 /* * GENERAL - (RW) General Reigster 22.2.38 Modified: stable/11/sys/dev/ichiic/ig4_var.h ============================================================================== --- stable/11/sys/dev/ichiic/ig4_var.h Sun Sep 30 21:54:02 2018 (r339028) +++ stable/11/sys/dev/ichiic/ig4_var.h Sun Sep 30 23:14:07 2018 (r339029) @@ -47,7 +47,7 @@ #define IG4_RBUFMASK (IG4_RBUFSIZE - 1) enum ig4_op { IG4_IDLE, IG4_READ, IG4_WRITE }; -enum ig4_vers { IG4_HASWELL, IG4_ATOM, IG4_SKYLAKE }; +enum ig4_vers { IG4_HASWELL, IG4_ATOM, IG4_SKYLAKE, IG4_APL }; struct ig4iic_softc { device_t dev; From owner-svn-src-stable-11@freebsd.org Sun Sep 30 23:15:46 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E1AC310AAD72; Sun, 30 Sep 2018 23:15:45 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7AC0376671; Sun, 30 Sep 2018 23:15:45 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6FF2218D58; Sun, 30 Sep 2018 23:15:45 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8UNFjVu056553; Sun, 30 Sep 2018 23:15:45 GMT (envelope-from gonzo@FreeBSD.org) Received: (from gonzo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8UNFikV056550; Sun, 30 Sep 2018 23:15:44 GMT (envelope-from gonzo@FreeBSD.org) Message-Id: <201809302315.w8UNFikV056550@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gonzo set sender to gonzo@FreeBSD.org using -f From: Oleksandr Tymoshenko Date: Sun, 30 Sep 2018 23:15:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339030 - stable/11/sys/dev/ichiic X-SVN-Group: stable-11 X-SVN-Commit-Author: gonzo X-SVN-Commit-Paths: stable/11/sys/dev/ichiic X-SVN-Commit-Revision: 339030 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 23:15:46 -0000 Author: gonzo Date: Sun Sep 30 23:15:44 2018 New Revision: 339030 URL: https://svnweb.freebsd.org/changeset/base/339030 Log: MFC r338111, r338215 r338111: [ig4] add ACPI Device HID for AMD platforms Added ACPI Device HID AMDI0010 for the designware I2C controllers in future AMD platforms. Also, when verifying component version check for minimal value instead of exact match. PR: 230641 Submitted by: Rajesh Reviewed by: cem, gonzo Differential Revision: https://reviews.freebsd.org/D16670 r338215: [ig4] Fix I/O timeout issue with Designware I2C controller on AMD platforms Due to hardware limitation AMD I2C controller can't trigger pending interrupt if interrupt status has been changed after clearing interrupt status bits. So, I2C will lose the interrupt and IO will be timed out. Implements a workaround to disable I2C controller interrupt and re-enable I2C interrupt before existing interrupt handler. Submitted by: rajfbsd@gmail.com Differential Revision: https://reviews.freebsd.org/D16720 Modified: stable/11/sys/dev/ichiic/ig4_acpi.c stable/11/sys/dev/ichiic/ig4_iic.c stable/11/sys/dev/ichiic/ig4_reg.h stable/11/sys/dev/ichiic/ig4_var.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/ichiic/ig4_acpi.c ============================================================================== --- stable/11/sys/dev/ichiic/ig4_acpi.c Sun Sep 30 23:14:07 2018 (r339029) +++ stable/11/sys/dev/ichiic/ig4_acpi.c Sun Sep 30 23:15:44 2018 (r339030) @@ -60,6 +60,7 @@ static char *ig4iic_ids[] = { "80860F41", "808622C1", "AMDI0510", + "AMDI0010", "APMC0D0F", NULL }; @@ -67,10 +68,20 @@ static char *ig4iic_ids[] = { static int ig4iic_acpi_probe(device_t dev) { + ig4iic_softc_t *sc; + char *hid; - if (acpi_disabled("ig4iic") || - ACPI_ID_PROBE(device_get_parent(dev), dev, ig4iic_ids) == NULL) - return (ENXIO); + sc = device_get_softc(dev); + + if (acpi_disabled("ig4iic")) + return (ENXIO); + + hid = ACPI_ID_PROBE(device_get_parent(dev), dev, ig4iic_ids); + if (hid == NULL) + return (ENXIO); + + if (strcmp("AMDI0010", hid) == 0) + sc->access_intr_mask = 1; device_set_desc(dev, "Designware I2C Controller"); return (0); Modified: stable/11/sys/dev/ichiic/ig4_iic.c ============================================================================== --- stable/11/sys/dev/ichiic/ig4_iic.c Sun Sep 30 23:14:07 2018 (r339029) +++ stable/11/sys/dev/ichiic/ig4_iic.c Sun Sep 30 23:15:44 2018 (r339030) @@ -563,7 +563,7 @@ ig4iic_attach(ig4iic_softc_t *sc) if (sc->version == IG4_HASWELL || sc->version == IG4_ATOM) { v = reg_read(sc, IG4_REG_COMP_VER); - if (v != IG4_COMP_VER) { + if (v < IG4_COMP_MIN_VER) { error = ENXIO; goto done; } @@ -724,6 +724,19 @@ ig4iic_intr(void *cookie) ++sc->rnext; status = reg_read(sc, IG4_REG_I2C_STA); } + + /* + * Workaround to trigger pending interrupt if IG4_REG_INTR_STAT + * is changed after clearing it + */ + if(sc->access_intr_mask) { + status = reg_read(sc, IG4_REG_INTR_MASK); + if(status) { + reg_write(sc, IG4_REG_INTR_MASK, 0); + reg_write(sc, IG4_REG_INTR_MASK, status); + } + } + wakeup(sc); mtx_unlock(&sc->io_lock); } Modified: stable/11/sys/dev/ichiic/ig4_reg.h ============================================================================== --- stable/11/sys/dev/ichiic/ig4_reg.h Sun Sep 30 23:14:07 2018 (r339029) +++ stable/11/sys/dev/ichiic/ig4_reg.h Sun Sep 30 23:15:44 2018 (r339030) @@ -73,7 +73,6 @@ * SDA_HOLD 0x00000001 * SDA_SETUP 0x00000064 * COMP_PARAM1 0x00FFFF6E - * COMP_VER 0x3131352A */ #define IG4_REG_CTL 0x0000 /* RW Control Register */ @@ -552,11 +551,10 @@ /* * COMP_VER - (RO) Component Version Register 22.2.36 - * Default Value 0x3131352A * * Contains the chip version number. All 32 bits. */ -#define IG4_COMP_VER 0x3131352A +#define IG4_COMP_MIN_VER 0x3131352A /* * COMP_TYPE - (RO) (linux) Endian and bus width probe Modified: stable/11/sys/dev/ichiic/ig4_var.h ============================================================================== --- stable/11/sys/dev/ichiic/ig4_var.h Sun Sep 30 23:14:07 2018 (r339029) +++ stable/11/sys/dev/ichiic/ig4_var.h Sun Sep 30 23:15:44 2018 (r339030) @@ -72,6 +72,7 @@ struct ig4iic_softc { int slave_valid : 1; int read_started : 1; int write_started : 1; + int access_intr_mask : 1; /* * Locking semantics: From owner-svn-src-stable-11@freebsd.org Sun Sep 30 23:17:34 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E748610AAE35; Sun, 30 Sep 2018 23:17:33 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8BC31767D0; Sun, 30 Sep 2018 23:17:33 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 806EA18D5B; Sun, 30 Sep 2018 23:17:33 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8UNHXag056694; Sun, 30 Sep 2018 23:17:33 GMT (envelope-from gonzo@FreeBSD.org) Received: (from gonzo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8UNHXLB056693; Sun, 30 Sep 2018 23:17:33 GMT (envelope-from gonzo@FreeBSD.org) Message-Id: <201809302317.w8UNHXLB056693@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gonzo set sender to gonzo@FreeBSD.org using -f From: Oleksandr Tymoshenko Date: Sun, 30 Sep 2018 23:17:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339031 - stable/11/sys/dev/ichiic X-SVN-Group: stable-11 X-SVN-Commit-Author: gonzo X-SVN-Commit-Paths: stable/11/sys/dev/ichiic X-SVN-Commit-Revision: 339031 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 23:17:34 -0000 Author: gonzo Date: Sun Sep 30 23:17:33 2018 New Revision: 339031 URL: https://svnweb.freebsd.org/changeset/base/339031 Log: MFC r338654, r338701 r338654: [ig4] Add PCI IDs for I2C controller on Intel Kaby Lake systems PR: 221777 Approved by: re (kib) Submitted by: marc.priggemeyer@gmail.com r338701: [ig4] Fix device description for Kaby Lake systems Kaby Lake I2C controller is Intel Sunrise Point-H not Intel Sunrise Point-LP. Submitted by: Dmitry Luhtionov Approved by: re (kib) Modified: stable/11/sys/dev/ichiic/ig4_pci.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/ichiic/ig4_pci.c ============================================================================== --- stable/11/sys/dev/ichiic/ig4_pci.c Sun Sep 30 23:15:44 2018 (r339030) +++ stable/11/sys/dev/ichiic/ig4_pci.c Sun Sep 30 23:17:33 2018 (r339031) @@ -80,6 +80,8 @@ static int ig4iic_pci_detach(device_t dev); #define PCI_CHIP_SKYLAKE_I2C_3 0x9d638086 #define PCI_CHIP_SKYLAKE_I2C_4 0x9d648086 #define PCI_CHIP_SKYLAKE_I2C_5 0x9d658086 +#define PCI_CHIP_KABYLAKE_I2C_0 0xa1608086 +#define PCI_CHIP_KABYLAKE_I2C_1 0xa1618086 #define PCI_CHIP_APL_I2C_0 0x5aac8086 #define PCI_CHIP_APL_I2C_1 0x5aae8086 #define PCI_CHIP_APL_I2C_2 0x5ab08086 @@ -110,6 +112,8 @@ static struct ig4iic_pci_device ig4iic_pci_devices[] = { PCI_CHIP_SKYLAKE_I2C_3, "Intel Sunrise Point-LP I2C Controller-3", IG4_SKYLAKE}, { PCI_CHIP_SKYLAKE_I2C_4, "Intel Sunrise Point-LP I2C Controller-4", IG4_SKYLAKE}, { PCI_CHIP_SKYLAKE_I2C_5, "Intel Sunrise Point-LP I2C Controller-5", IG4_SKYLAKE}, + { PCI_CHIP_KABYLAKE_I2C_0, "Intel Sunrise Point-H I2C Controller-0", IG4_SKYLAKE}, + { PCI_CHIP_KABYLAKE_I2C_1, "Intel Sunrise Point-H I2C Controller-1", IG4_SKYLAKE}, { PCI_CHIP_APL_I2C_0, "Intel Apollo Lake I2C Controller-0", IG4_APL}, { PCI_CHIP_APL_I2C_1, "Intel Apollo Lake I2C Controller-1", IG4_APL}, { PCI_CHIP_APL_I2C_2, "Intel Apollo Lake I2C Controller-2", IG4_APL}, From owner-svn-src-stable-11@freebsd.org Sun Sep 30 23:18:43 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 47D9F10AAEC6; Sun, 30 Sep 2018 23:18:43 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id F28747693D; Sun, 30 Sep 2018 23:18:42 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id ED67418D5D; Sun, 30 Sep 2018 23:18:42 +0000 (UTC) (envelope-from gonzo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w8UNIgR7056806; Sun, 30 Sep 2018 23:18:42 GMT (envelope-from gonzo@FreeBSD.org) Received: (from gonzo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w8UNIgrQ056805; Sun, 30 Sep 2018 23:18:42 GMT (envelope-from gonzo@FreeBSD.org) Message-Id: <201809302318.w8UNIgrQ056805@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gonzo set sender to gonzo@FreeBSD.org using -f From: Oleksandr Tymoshenko Date: Sun, 30 Sep 2018 23:18:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339032 - stable/11/share/man/man4 X-SVN-Group: stable-11 X-SVN-Commit-Author: gonzo X-SVN-Commit-Paths: stable/11/share/man/man4 X-SVN-Commit-Revision: 339032 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Sep 2018 23:18:43 -0000 Author: gonzo Date: Sun Sep 30 23:18:42 2018 New Revision: 339032 URL: https://svnweb.freebsd.org/changeset/base/339032 Log: MFC r338655: [ig4] Update list of supported hardware Reflect the fact that ig4(4) is not an Intel-specific device but a driver for Synopsys DesignWare I2C controller that now ships in AMD systems too. Approved by: re (kib), rpokala Modified: stable/11/share/man/man4/ig4.4 Directory Properties: stable/11/ (props changed) Modified: stable/11/share/man/man4/ig4.4 ============================================================================== --- stable/11/share/man/man4/ig4.4 Sun Sep 30 23:17:33 2018 (r339031) +++ stable/11/share/man/man4/ig4.4 Sun Sep 30 23:18:42 2018 (r339032) @@ -24,12 +24,12 @@ .\" .\" $FreeBSD$ .\" -.Dd October 03, 2016 +.Dd September 13, 2018 .Dt IG4 4 .Os .Sh NAME .Nm ig4 -.Nd Intel(R) fourth generation mobile CPU integrated I2C driver +.Nd Synopsys DesignWare I2C Controller .Sh SYNOPSIS To compile this driver into the kernel, place the following lines into the kernel configuration file: @@ -49,9 +49,9 @@ The driver provides access to peripherals attached to an I2C controller. .Sh HARDWARE .Nm -supports the I2C controllers found in fourth generation Intel(R) Core(TM) -processors based on the mobile U-processor line for intelligent systems. -This includes the i7-4650U, i5-4300U, i3-4010U, and 2980U. +supports the I2C controllers based on Synopsys DesignWare IP that can be found +in Intel(R) Core(TM) processors starting from the fourth generation, Intel(R) +Bay Trail, Apollo Lake SoC families, and some AMD systems. .Sh SYSCTL VARIABLES These .Xr sysctl 8 From owner-svn-src-stable-11@freebsd.org Mon Oct 1 04:02:01 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 64F4F10B1A47; Mon, 1 Oct 2018 04:02:01 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 141C77FA4B; Mon, 1 Oct 2018 04:02:01 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0BC091BB90; Mon, 1 Oct 2018 04:02:01 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91420P5005351; Mon, 1 Oct 2018 04:02:00 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91420Ii005350; Mon, 1 Oct 2018 04:02:00 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810010402.w91420Ii005350@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 1 Oct 2018 04:02:00 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339033 - stable/11/sys/geom/raid X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/geom/raid X-SVN-Commit-Revision: 339033 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 04:02:01 -0000 Author: mav Date: Mon Oct 1 04:02:00 2018 New Revision: 339033 URL: https://svnweb.freebsd.org/changeset/base/339033 Log: MFC r338913: Fix use-after-free in RAID0 error reporting of GEOM_RAID. Modified: stable/11/sys/geom/raid/tr_raid0.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/geom/raid/tr_raid0.c ============================================================================== --- stable/11/sys/geom/raid/tr_raid0.c Sun Sep 30 23:18:42 2018 (r339032) +++ stable/11/sys/geom/raid/tr_raid0.c Mon Oct 1 04:02:00 2018 (r339033) @@ -321,7 +321,7 @@ g_raid_tr_iodone_raid0(struct g_raid_tr_object *tr, pbp->bio_inbed++; if (pbp->bio_children == pbp->bio_inbed) { pbp->bio_completed = pbp->bio_length; - g_raid_iodone(pbp, bp->bio_error); + g_raid_iodone(pbp, pbp->bio_error); } } From owner-svn-src-stable-11@freebsd.org Mon Oct 1 04:08:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 817AC10B1E18; Mon, 1 Oct 2018 04:08:50 +0000 (UTC) (envelope-from sef@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3552A7FE60; Mon, 1 Oct 2018 04:08:50 +0000 (UTC) (envelope-from sef@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 2BD0D1BCB9; Mon, 1 Oct 2018 04:08:50 +0000 (UTC) (envelope-from sef@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w9148oTR005667; Mon, 1 Oct 2018 04:08:50 GMT (envelope-from sef@FreeBSD.org) Received: (from sef@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w9148lxH005654; Mon, 1 Oct 2018 04:08:47 GMT (envelope-from sef@FreeBSD.org) Message-Id: <201810010408.w9148lxH005654@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: sef set sender to sef@FreeBSD.org using -f From: Sean Eric Fagan Date: Mon, 1 Oct 2018 04:08:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339034 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/op... X-SVN-Group: stable-11 X-SVN-Commit-Author: sef X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzpool/co... X-SVN-Commit-Revision: 339034 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 04:08:51 -0000 Author: sef Date: Mon Oct 1 04:08:47 2018 New Revision: 339034 URL: https://svnweb.freebsd.org/changeset/base/339034 Log: MFC r334844, r336180, r336458 r334844 This originated from ZFS On Linux, as https://github.com/zfsonlinux/zfs/commit/d4a72f23863382bdf6d0ae33196f5b5decbc48fd During scans (scrubs or resilvers), it sorts the blocks in each transaction group by block offset; the result can be a significant improvement. (On my test system just now, which I put some effort to introduce fragmentation into the pool since I set it up yesterday, a scrub went from 1h2m to 33.5m with the changes.) I've seen similar rations on production systems. r336180 Fix up some missed and mis-merges from the sequential scan code (r334844). Most of the changes involve moving some code around to reduce conflicts with future merges. One of the missing changes included a notification on scrub cancellation. r336458 Fix a couple of typos in r334844 noticed by Richard Kojedzinszky Approved by: mav Sponsored by: iXsystems, Inc Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h stable/11/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_scan.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Mon Oct 1 04:08:47 2018 (r339034) @@ -2281,14 +2281,14 @@ dump_dir(objset_t *os) object_count++; } - ASSERT3U(object_count, ==, usedobjs); - (void) printf("\n"); if (error != ESRCH) { (void) fprintf(stderr, "dmu_object_next() = %d\n", error); abort(); } + + ASSERT3U(object_count, ==, usedobjs); } static void @@ -2788,6 +2788,7 @@ zdb_blkptr_done(zio_t *zio) mutex_enter(&spa->spa_scrub_lock); spa->spa_scrub_inflight--; + spa->spa_load_verify_ios--; cv_broadcast(&spa->spa_scrub_io_cv); if (ioerr && !(zio->io_flags & ZIO_FLAG_SPECULATIVE)) { @@ -2859,9 +2860,10 @@ zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr flags |= ZIO_FLAG_SPECULATIVE; mutex_enter(&spa->spa_scrub_lock); - while (spa->spa_scrub_inflight > max_inflight) + while (spa->spa_load_verify_ios > max_inflight) cv_wait(&spa->spa_scrub_io_cv, &spa->spa_scrub_lock); spa->spa_scrub_inflight++; + spa->spa_load_verify_ios++; mutex_exit(&spa->spa_scrub_lock); zio_nowait(zio_read(NULL, spa, bp, abd, size, Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c Mon Oct 1 04:08:47 2018 (r339034) @@ -1643,7 +1643,7 @@ print_status_config(zpool_handle_t *zhp, const char *n (void) nvlist_lookup_uint64_array(nv, ZPOOL_CONFIG_SCAN_STATS, (uint64_t **)&ps, &c); - if (ps && ps->pss_state == DSS_SCANNING && + if (ps != NULL && ps->pss_state == DSS_SCANNING && vs->vs_scan_processed != 0 && children == 0) { (void) printf(gettext(" (%s)"), (ps->pss_func == POOL_SCAN_RESILVER) ? @@ -4254,11 +4254,13 @@ static void print_scan_status(pool_scan_stat_t *ps) { time_t start, end, pause; - uint64_t elapsed, mins_left, hours_left; - uint64_t pass_exam, examined, total; - uint_t rate; + uint64_t total_secs_left; + uint64_t elapsed, secs_left, mins_left, hours_left, days_left; + uint64_t pass_scanned, scanned, pass_issued, issued, total; + uint_t scan_rate, issue_rate; double fraction_done; - char processed_buf[7], examined_buf[7], total_buf[7], rate_buf[7]; + char processed_buf[7], scanned_buf[7], issued_buf[7], total_buf[7]; + char srate_buf[7], irate_buf[7]; (void) printf(gettext(" scan: ")); @@ -4272,30 +4274,37 @@ print_scan_status(pool_scan_stat_t *ps) start = ps->pss_start_time; end = ps->pss_end_time; pause = ps->pss_pass_scrub_pause; + zfs_nicenum(ps->pss_processed, processed_buf, sizeof (processed_buf)); assert(ps->pss_func == POOL_SCAN_SCRUB || ps->pss_func == POOL_SCAN_RESILVER); - /* - * Scan is finished or canceled. - */ - if (ps->pss_state == DSS_FINISHED) { - uint64_t minutes_taken = (end - start) / 60; - char *fmt = NULL; + /* Scan is finished or canceled. */ + if (ps->pss_state == DSS_FINISHED) { + total_secs_left = end - start; + days_left = total_secs_left / 60 / 60 / 24; + hours_left = (total_secs_left / 60 / 60) % 24; + mins_left = (total_secs_left / 60) % 60; + secs_left = (total_secs_left % 60); + if (ps->pss_func == POOL_SCAN_SCRUB) { - fmt = gettext("scrub repaired %s in %lluh%um with " - "%llu errors on %s"); + (void) printf(gettext("scrub repaired %s " + "in %llu days %02llu:%02llu:%02llu " + "with %llu errors on %s"), processed_buf, + (u_longlong_t)days_left, (u_longlong_t)hours_left, + (u_longlong_t)mins_left, (u_longlong_t)secs_left, + (u_longlong_t)ps->pss_errors, ctime(&end)); } else if (ps->pss_func == POOL_SCAN_RESILVER) { - fmt = gettext("resilvered %s in %lluh%um with " - "%llu errors on %s"); + (void) printf(gettext("resilvered %s " + "in %llu days %02llu:%02llu:%02llu " + "with %llu errors on %s"), processed_buf, + (u_longlong_t)days_left, (u_longlong_t)hours_left, + (u_longlong_t)mins_left, (u_longlong_t)secs_left, + (u_longlong_t)ps->pss_errors, ctime(&end)); + } - /* LINTED */ - (void) printf(fmt, processed_buf, - (u_longlong_t)(minutes_taken / 60), - (uint_t)(minutes_taken % 60), - (u_longlong_t)ps->pss_errors, - ctime((time_t *)&end)); + return; } else if (ps->pss_state == DSS_CANCELED) { if (ps->pss_func == POOL_SCAN_SCRUB) { @@ -4310,19 +4319,15 @@ print_scan_status(pool_scan_stat_t *ps) assert(ps->pss_state == DSS_SCANNING); - /* - * Scan is in progress. - */ + /* Scan is in progress. Resilvers can't be paused. */ if (ps->pss_func == POOL_SCAN_SCRUB) { if (pause == 0) { (void) printf(gettext("scrub in progress since %s"), ctime(&start)); } else { - char buf[32]; - struct tm *p = localtime(&pause); - (void) strftime(buf, sizeof (buf), "%a %b %e %T %Y", p); - (void) printf(gettext("scrub paused since %s\n"), buf); - (void) printf(gettext("\tscrub started on %s"), + (void) printf(gettext("scrub paused since %s"), + ctime(&pause)); + (void) printf(gettext("\tscrub started on %s"), ctime(&start)); } } else if (ps->pss_func == POOL_SCAN_RESILVER) { @@ -4330,49 +4335,67 @@ print_scan_status(pool_scan_stat_t *ps) ctime(&start)); } - examined = ps->pss_examined ? ps->pss_examined : 1; + scanned = ps->pss_examined; + pass_scanned = ps->pss_pass_exam; + issued = ps->pss_issued; + pass_issued = ps->pss_pass_issued; total = ps->pss_to_examine; - fraction_done = (double)examined / total; - /* elapsed time for this pass */ + /* we are only done with a block once we have issued the IO for it */ + fraction_done = (double)issued / total; + + /* elapsed time for this pass, rounding up to 1 if it's 0 */ elapsed = time(NULL) - ps->pss_pass_start; elapsed -= ps->pss_pass_scrub_spent_paused; - elapsed = elapsed ? elapsed : 1; - pass_exam = ps->pss_pass_exam ? ps->pss_pass_exam : 1; - rate = pass_exam / elapsed; - rate = rate ? rate : 1; - mins_left = ((total - examined) / rate) / 60; - hours_left = mins_left / 60; + elapsed = (elapsed != 0) ? elapsed : 1; - zfs_nicenum(examined, examined_buf, sizeof (examined_buf)); + scan_rate = pass_scanned / elapsed; + issue_rate = pass_issued / elapsed; + total_secs_left = (issue_rate != 0) ? + ((total - issued) / issue_rate) : UINT64_MAX; + + days_left = total_secs_left / 60 / 60 / 24; + hours_left = (total_secs_left / 60 / 60) % 24; + mins_left = (total_secs_left / 60) % 60; + secs_left = (total_secs_left % 60); + + /* format all of the numbers we will be reporting */ + zfs_nicenum(scanned, scanned_buf, sizeof (scanned_buf)); + zfs_nicenum(issued, issued_buf, sizeof (issued_buf)); zfs_nicenum(total, total_buf, sizeof (total_buf)); + zfs_nicenum(scan_rate, srate_buf, sizeof (srate_buf)); + zfs_nicenum(issue_rate, irate_buf, sizeof (irate_buf)); - /* - * do not print estimated time if hours_left is more than 30 days - * or we have a paused scrub - */ + /* doo not print estimated time if we have a paused scrub */ if (pause == 0) { - zfs_nicenum(rate, rate_buf, sizeof (rate_buf)); - (void) printf(gettext("\t%s scanned out of %s at %s/s"), - examined_buf, total_buf, rate_buf); - if (hours_left < (30 * 24)) { - (void) printf(gettext(", %lluh%um to go\n"), - (u_longlong_t)hours_left, (uint_t)(mins_left % 60)); - } else { - (void) printf(gettext( - ", (scan is slow, no estimated time)\n")); - } + (void) printf(gettext("\t%s scanned at %s/s, " + "%s issued at %s/s, %s total\n"), + scanned_buf, srate_buf, issued_buf, irate_buf, total_buf); } else { - (void) printf(gettext("\t%s scanned out of %s\n"), - examined_buf, total_buf); + (void) printf(gettext("\t%s scanned, %s issued, %s total\n"), + scanned_buf, issued_buf, total_buf); } if (ps->pss_func == POOL_SCAN_RESILVER) { - (void) printf(gettext(" %s resilvered, %.2f%% done\n"), + (void) printf(gettext("\t%s resilvered, %.2f%% done"), processed_buf, 100 * fraction_done); } else if (ps->pss_func == POOL_SCAN_SCRUB) { - (void) printf(gettext(" %s repaired, %.2f%% done\n"), + (void) printf(gettext("\t%s repaired, %.2f%% done"), processed_buf, 100 * fraction_done); + } + + if (pause == 0) { + if (issue_rate >= 10 * 1024 * 1024) { + (void) printf(gettext(", %llu days " + "%02llu:%02llu:%02llu to go\n"), + (u_longlong_t)days_left, (u_longlong_t)hours_left, + (u_longlong_t)mins_left, (u_longlong_t)secs_left); + } else { + (void) printf(gettext(", no estimated " + "completion time\n")); + } + } else { + (void) printf(gettext("\n")); } } Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Mon Oct 1 04:08:47 2018 (r339034) @@ -374,15 +374,15 @@ ztest_info_t ztest_info[] = { { ztest_fzap, 1, &zopt_sometimes }, { ztest_dmu_snapshot_create_destroy, 1, &zopt_sometimes }, { ztest_spa_create_destroy, 1, &zopt_sometimes }, - { ztest_fault_inject, 1, &zopt_sometimes }, + { ztest_fault_inject, 1, &zopt_incessant }, { ztest_ddt_repair, 1, &zopt_sometimes }, { ztest_dmu_snapshot_hold, 1, &zopt_sometimes }, { ztest_reguid, 1, &zopt_rarely }, { ztest_spa_rename, 1, &zopt_rarely }, - { ztest_scrub, 1, &zopt_rarely }, + { ztest_scrub, 1, &zopt_often }, { ztest_spa_upgrade, 1, &zopt_rarely }, { ztest_dsl_dataset_promote_busy, 1, &zopt_rarely }, - { ztest_vdev_attach_detach, 1, &zopt_sometimes }, + { ztest_vdev_attach_detach, 1, &zopt_incessant }, { ztest_vdev_LUN_growth, 1, &zopt_rarely }, { ztest_vdev_add_remove, 1, &ztest_opts.zo_vdevtime }, Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c Mon Oct 1 04:08:47 2018 (r339034) @@ -219,7 +219,7 @@ check_status(nvlist_t *config, boolean_t isimport) */ (void) nvlist_lookup_uint64_array(nvroot, ZPOOL_CONFIG_SCAN_STATS, (uint64_t **)&ps, &psc); - if (ps && ps->pss_func == POOL_SCAN_RESILVER && + if (ps != NULL && ps->pss_func == POOL_SCAN_RESILVER && ps->pss_state == DSS_SCANNING) return (ZPOOL_STATUS_RESILVERING); Modified: stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Mon Oct 1 04:08:47 2018 (r339034) @@ -408,6 +408,7 @@ typedef struct taskq_ent { #define TQ_NOQUEUE 0x02 /* Do not enqueue if can't dispatch */ #define TQ_FRONT 0x08 /* Queue in front */ +#define TASKQID_INVALID ((taskqid_t)0) extern taskq_t *system_taskq; @@ -421,6 +422,7 @@ extern void taskq_dispatch_ent(taskq_t *, task_func_t, taskq_ent_t *); extern void taskq_destroy(taskq_t *); extern void taskq_wait(taskq_t *); +extern void taskq_wait_id(taskq_t *, taskqid_t); extern int taskq_member(taskq_t *, void *); extern void system_taskq_init(void); extern void system_taskq_fini(void); Modified: stable/11/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c Mon Oct 1 04:08:47 2018 (r339034) @@ -187,6 +187,12 @@ taskq_wait(taskq_t *tq) mutex_exit(&tq->tq_lock); } +void +taskq_wait_id(taskq_t *tq, taskqid_t id) +{ + taskq_wait(tq); +} + static void * taskq_thread(void *arg) { Modified: stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c ============================================================================== --- stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Mon Oct 1 04:08:47 2018 (r339034) @@ -173,3 +173,9 @@ taskq_wait(taskq_t *tq) { taskqueue_drain_all(tq->tq_queue); } + +void +taskq_wait_id(taskq_t *tq, taskqid_t id) +{ + taskq_wait(tq); +} Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Mon Oct 1 04:08:47 2018 (r339034) @@ -339,7 +339,8 @@ int arc_no_grow_shift = 5; * minimum lifespan of a prefetch block in clock ticks * (initialized in arc_init()) */ -static int arc_min_prefetch_lifespan; +static int zfs_arc_min_prefetch_ms = 1; +static int zfs_arc_min_prescient_prefetch_ms = 6; /* * If this percent of memory is free, don't throttle. @@ -783,8 +784,9 @@ typedef struct arc_stats { kstat_named_t arcstat_meta_limit; kstat_named_t arcstat_meta_max; kstat_named_t arcstat_meta_min; - kstat_named_t arcstat_sync_wait_for_async; + kstat_named_t arcstat_async_upgrade_sync; kstat_named_t arcstat_demand_hit_predictive_prefetch; + kstat_named_t arcstat_demand_hit_prescient_prefetch; } arc_stats_t; static arc_stats_t arc_stats = { @@ -881,8 +883,9 @@ static arc_stats_t arc_stats = { { "arc_meta_limit", KSTAT_DATA_UINT64 }, { "arc_meta_max", KSTAT_DATA_UINT64 }, { "arc_meta_min", KSTAT_DATA_UINT64 }, - { "sync_wait_for_async", KSTAT_DATA_UINT64 }, + { "async_upgrade_sync", KSTAT_DATA_UINT64 }, { "demand_hit_predictive_prefetch", KSTAT_DATA_UINT64 }, + { "demand_hit_prescient_prefetch", KSTAT_DATA_UINT64 }, }; #define ARCSTAT(stat) (arc_stats.stat.value.ui64) @@ -978,22 +981,23 @@ typedef struct arc_callback arc_callback_t; struct arc_callback { void *acb_private; - arc_done_func_t *acb_done; + arc_read_done_func_t *acb_done; arc_buf_t *acb_buf; boolean_t acb_compressed; zio_t *acb_zio_dummy; + zio_t *acb_zio_head; arc_callback_t *acb_next; }; typedef struct arc_write_callback arc_write_callback_t; struct arc_write_callback { - void *awcb_private; - arc_done_func_t *awcb_ready; - arc_done_func_t *awcb_children_ready; - arc_done_func_t *awcb_physdone; - arc_done_func_t *awcb_done; - arc_buf_t *awcb_buf; + void *awcb_private; + arc_write_done_func_t *awcb_ready; + arc_write_done_func_t *awcb_children_ready; + arc_write_done_func_t *awcb_physdone; + arc_write_done_func_t *awcb_done; + arc_buf_t *awcb_buf; }; /* @@ -1233,6 +1237,8 @@ sysctl_vfs_zfs_arc_min(SYSCTL_HANDLER_ARGS) #define HDR_IO_IN_PROGRESS(hdr) ((hdr)->b_flags & ARC_FLAG_IO_IN_PROGRESS) #define HDR_IO_ERROR(hdr) ((hdr)->b_flags & ARC_FLAG_IO_ERROR) #define HDR_PREFETCH(hdr) ((hdr)->b_flags & ARC_FLAG_PREFETCH) +#define HDR_PRESCIENT_PREFETCH(hdr) \ + ((hdr)->b_flags & ARC_FLAG_PRESCIENT_PREFETCH) #define HDR_COMPRESSION_ENABLED(hdr) \ ((hdr)->b_flags & ARC_FLAG_COMPRESSED_ARC) @@ -1396,6 +1402,11 @@ SYSCTL_UQUAD(_vfs_zfs, OID_AUTO, mfu_ghost_data_esize, SYSCTL_UQUAD(_vfs_zfs, OID_AUTO, l2c_only_size, CTLFLAG_RD, &ARC_l2c_only.arcs_size.rc_count, 0, "size of mru state"); +SYSCTL_UINT(_vfs_zfs, OID_AUTO, arc_min_prefetch_ms, CTLFLAG_RW, + &zfs_arc_min_prefetch_ms, 0, "Min life of prefetch block in ms"); +SYSCTL_UINT(_vfs_zfs, OID_AUTO, arc_min_prescient_prefetch_ms, CTLFLAG_RW, + &zfs_arc_min_prescient_prefetch_ms, 0, "Min life of prescient prefetched block in ms"); + /* * L2ARC Internals */ @@ -3548,6 +3559,8 @@ arc_evict_hdr(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) { arc_state_t *evicted_state, *state; int64_t bytes_evicted = 0; + int min_lifetime = HDR_PRESCIENT_PREFETCH(hdr) ? + zfs_arc_min_prescient_prefetch_ms : zfs_arc_min_prefetch_ms; ASSERT(MUTEX_HELD(hash_lock)); ASSERT(HDR_HAS_L1HDR(hdr)); @@ -3600,8 +3613,7 @@ arc_evict_hdr(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) /* prefetch buffers have a minimum lifespan */ if (HDR_IO_IN_PROGRESS(hdr) || ((hdr->b_flags & (ARC_FLAG_PREFETCH | ARC_FLAG_INDIRECT)) && - ddi_get_lbolt() - hdr->b_l1hdr.b_arc_access < - arc_min_prefetch_lifespan)) { + ddi_get_lbolt() - hdr->b_l1hdr.b_arc_access < min_lifetime * hz)) { ARCSTAT_BUMP(arcstat_evict_skip); return (bytes_evicted); } @@ -4997,13 +5009,15 @@ arc_access(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) * - move the buffer to the head of the list if this is * another prefetch (to make it less likely to be evicted). */ - if (HDR_PREFETCH(hdr)) { + if (HDR_PREFETCH(hdr) || HDR_PRESCIENT_PREFETCH(hdr)) { if (refcount_count(&hdr->b_l1hdr.b_refcnt) == 0) { /* link protected by hash lock */ ASSERT(multilist_link_active( &hdr->b_l1hdr.b_arc_node)); } else { - arc_hdr_clear_flags(hdr, ARC_FLAG_PREFETCH); + arc_hdr_clear_flags(hdr, + ARC_FLAG_PREFETCH | + ARC_FLAG_PRESCIENT_PREFETCH); ARCSTAT_BUMP(arcstat_mru_hits); } hdr->b_l1hdr.b_arc_access = now; @@ -5034,10 +5048,13 @@ arc_access(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) * MFU state. */ - if (HDR_PREFETCH(hdr)) { + if (HDR_PREFETCH(hdr) || HDR_PRESCIENT_PREFETCH(hdr)) { new_state = arc_mru; - if (refcount_count(&hdr->b_l1hdr.b_refcnt) > 0) - arc_hdr_clear_flags(hdr, ARC_FLAG_PREFETCH); + if (refcount_count(&hdr->b_l1hdr.b_refcnt) > 0) { + arc_hdr_clear_flags(hdr, + ARC_FLAG_PREFETCH | + ARC_FLAG_PRESCIENT_PREFETCH); + } DTRACE_PROBE1(new_state__mru, arc_buf_hdr_t *, hdr); } else { new_state = arc_mfu; @@ -5058,11 +5075,7 @@ arc_access(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) * If it was a prefetch, we will explicitly move it to * the head of the list now. */ - if ((HDR_PREFETCH(hdr)) != 0) { - ASSERT(refcount_is_zero(&hdr->b_l1hdr.b_refcnt)); - /* link protected by hash_lock */ - ASSERT(multilist_link_active(&hdr->b_l1hdr.b_arc_node)); - } + ARCSTAT_BUMP(arcstat_mfu_hits); hdr->b_l1hdr.b_arc_access = ddi_get_lbolt(); } else if (hdr->b_l1hdr.b_state == arc_mfu_ghost) { @@ -5073,12 +5086,11 @@ arc_access(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) * MFU state. */ - if (HDR_PREFETCH(hdr)) { + if (HDR_PREFETCH(hdr) || HDR_PRESCIENT_PREFETCH(hdr)) { /* * This is a prefetch access... * move this block back to the MRU state. */ - ASSERT0(refcount_count(&hdr->b_l1hdr.b_refcnt)); new_state = arc_mru; } @@ -5145,23 +5157,28 @@ arc_buf_access(arc_buf_t *buf) demand, prefetch, !HDR_ISTYPE_METADATA(hdr), data, metadata, hits); } -/* a generic arc_done_func_t which you can use */ +/* a generic arc_read_done_func_t which you can use */ /* ARGSUSED */ void -arc_bcopy_func(zio_t *zio, arc_buf_t *buf, void *arg) +arc_bcopy_func(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp, + arc_buf_t *buf, void *arg) { - if (zio == NULL || zio->io_error == 0) - bcopy(buf->b_data, arg, arc_buf_size(buf)); + if (buf == NULL) + return; + + bcopy(buf->b_data, arg, arc_buf_size(buf)); arc_buf_destroy(buf, arg); } -/* a generic arc_done_func_t */ +/* a generic arc_read_done_func_t */ +/* ARGSUSED */ void -arc_getbuf_func(zio_t *zio, arc_buf_t *buf, void *arg) +arc_getbuf_func(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp, + arc_buf_t *buf, void *arg) { arc_buf_t **bufp = arg; - if (zio && zio->io_error) { - arc_buf_destroy(buf, arg); + + if (buf == NULL) { *bufp = NULL; } else { *bufp = buf; @@ -5193,7 +5210,6 @@ arc_read_done(zio_t *zio) arc_callback_t *callback_list; arc_callback_t *acb; boolean_t freeable = B_FALSE; - boolean_t no_zio_error = (zio->io_error == 0); /* * The hdr was inserted into hash-table and removed from lists @@ -5219,7 +5235,7 @@ arc_read_done(zio_t *zio) ASSERT3P(hash_lock, !=, NULL); } - if (no_zio_error) { + if (zio->io_error == 0) { /* byteswap if necessary */ if (BP_SHOULD_BYTESWAP(zio->io_bp)) { if (BP_GET_LEVEL(zio->io_bp) > 0) { @@ -5240,7 +5256,8 @@ arc_read_done(zio_t *zio) callback_list = hdr->b_l1hdr.b_acb; ASSERT3P(callback_list, !=, NULL); - if (hash_lock && no_zio_error && hdr->b_l1hdr.b_state == arc_anon) { + if (hash_lock && zio->io_error == 0 && + hdr->b_l1hdr.b_state == arc_anon) { /* * Only call arc_access on anonymous buffers. This is because * if we've issued an I/O for an evicted buffer, we've already @@ -5261,14 +5278,21 @@ arc_read_done(zio_t *zio) if (!acb->acb_done) continue; - /* This is a demand read since prefetches don't use callbacks */ callback_cnt++; + if (zio->io_error != 0) + continue; + int error = arc_buf_alloc_impl(hdr, acb->acb_private, - acb->acb_compressed, no_zio_error, &acb->acb_buf); - if (no_zio_error) { - zio->io_error = error; + acb->acb_compressed, + B_TRUE, &acb->acb_buf); + if (error != 0) { + arc_buf_destroy(acb->acb_buf, acb->acb_private); + acb->acb_buf = NULL; } + + if (zio->io_error == 0) + zio->io_error = error; } hdr->b_l1hdr.b_acb = NULL; arc_hdr_clear_flags(hdr, ARC_FLAG_IO_IN_PROGRESS); @@ -5281,7 +5305,7 @@ arc_read_done(zio_t *zio) ASSERT(refcount_is_zero(&hdr->b_l1hdr.b_refcnt) || callback_list != NULL); - if (no_zio_error) { + if (zio->io_error == 0) { arc_hdr_verify(hdr, zio->io_bp); } else { arc_hdr_set_flags(hdr, ARC_FLAG_IO_ERROR); @@ -5314,8 +5338,10 @@ arc_read_done(zio_t *zio) /* execute each callback and free its structure */ while ((acb = callback_list) != NULL) { - if (acb->acb_done) - acb->acb_done(zio, acb->acb_buf, acb->acb_private); + if (acb->acb_done) { + acb->acb_done(zio, &zio->io_bookmark, zio->io_bp, + acb->acb_buf, acb->acb_private); + } if (acb->acb_zio_dummy != NULL) { acb->acb_zio_dummy->io_error = zio->io_error; @@ -5349,7 +5375,7 @@ arc_read_done(zio_t *zio) * for readers of this block. */ int -arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, +arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_read_done_func_t *done, void *private, zio_priority_t priority, int zio_flags, arc_flags_t *arc_flags, const zbookmark_phys_t *zb) { @@ -5358,7 +5384,8 @@ arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, a zio_t *rzio; uint64_t guid = spa_load_guid(spa); boolean_t compressed_read = (zio_flags & ZIO_FLAG_RAW) != 0; - + int rc = 0; + ASSERT(!BP_IS_EMBEDDED(bp) || BPE_GET_ETYPE(bp) == BP_EMBEDDED_TYPE_DATA); @@ -5376,32 +5403,20 @@ top: *arc_flags |= ARC_FLAG_CACHED; if (HDR_IO_IN_PROGRESS(hdr)) { + zio_t *head_zio = hdr->b_l1hdr.b_acb->acb_zio_head; + ASSERT3P(head_zio, !=, NULL); if ((hdr->b_flags & ARC_FLAG_PRIO_ASYNC_READ) && priority == ZIO_PRIORITY_SYNC_READ) { /* - * This sync read must wait for an - * in-progress async read (e.g. a predictive - * prefetch). Async reads are queued - * separately at the vdev_queue layer, so - * this is a form of priority inversion. - * Ideally, we would "inherit" the demand - * i/o's priority by moving the i/o from - * the async queue to the synchronous queue, - * but there is currently no mechanism to do - * so. Track this so that we can evaluate - * the magnitude of this potential performance - * problem. - * - * Note that if the prefetch i/o is already - * active (has been issued to the device), - * the prefetch improved performance, because - * we issued it sooner than we would have - * without the prefetch. + * This is a sync read that needs to wait for + * an in-flight async read. Request that the + * zio have its priority upgraded. */ - DTRACE_PROBE1(arc__sync__wait__for__async, + zio_change_priority(head_zio, priority); + DTRACE_PROBE1(arc__async__upgrade__sync, arc_buf_hdr_t *, hdr); - ARCSTAT_BUMP(arcstat_sync_wait_for_async); + ARCSTAT_BUMP(arcstat_async_upgrade_sync); } if (hdr->b_flags & ARC_FLAG_PREDICTIVE_PREFETCH) { arc_hdr_clear_flags(hdr, @@ -5428,6 +5443,7 @@ top: spa, NULL, NULL, NULL, zio_flags); ASSERT3P(acb->acb_done, !=, NULL); + acb->acb_zio_head = head_zio; acb->acb_next = hdr->b_l1hdr.b_acb; hdr->b_l1hdr.b_acb = acb; mutex_exit(hash_lock); @@ -5455,17 +5471,32 @@ top: arc_hdr_clear_flags(hdr, ARC_FLAG_PREDICTIVE_PREFETCH); } - ASSERT(!BP_IS_EMBEDDED(bp) || !BP_IS_HOLE(bp)); + if (hdr->b_flags & ARC_FLAG_PRESCIENT_PREFETCH) { + ARCSTAT_BUMP( + arcstat_demand_hit_prescient_prefetch); + arc_hdr_clear_flags(hdr, + ARC_FLAG_PRESCIENT_PREFETCH); + } + + ASSERT(!BP_IS_EMBEDDED(bp) || !BP_IS_HOLE(bp)); /* Get a buf with the desired data in it. */ - VERIFY0(arc_buf_alloc_impl(hdr, private, - compressed_read, B_TRUE, &buf)); + rc = arc_buf_alloc_impl(hdr, private, + compressed_read, B_TRUE, &buf); + if (rc != 0) { + arc_buf_destroy(buf, private); + buf = NULL; + } + ASSERT((zio_flags & ZIO_FLAG_SPECULATIVE) || + rc == 0 || rc != ENOENT); } else if (*arc_flags & ARC_FLAG_PREFETCH && refcount_count(&hdr->b_l1hdr.b_refcnt) == 0) { arc_hdr_set_flags(hdr, ARC_FLAG_PREFETCH); } DTRACE_PROBE1(arc__hit, arc_buf_hdr_t *, hdr); arc_access(hdr, hash_lock); + if (*arc_flags & ARC_FLAG_PRESCIENT_PREFETCH) + arc_hdr_set_flags(hdr, ARC_FLAG_PRESCIENT_PREFETCH); if (*arc_flags & ARC_FLAG_L2CACHE) arc_hdr_set_flags(hdr, ARC_FLAG_L2CACHE); mutex_exit(hash_lock); @@ -5475,7 +5506,7 @@ top: data, metadata, hits); if (done) - done(NULL, buf, private); + done(NULL, zb, bp, buf, private); } else { uint64_t lsize = BP_GET_LSIZE(bp); uint64_t psize = BP_GET_PSIZE(bp); @@ -5549,6 +5580,9 @@ top: if (*arc_flags & ARC_FLAG_PREFETCH) arc_hdr_set_flags(hdr, ARC_FLAG_PREFETCH); + if (*arc_flags & ARC_FLAG_PRESCIENT_PREFETCH) + arc_hdr_set_flags(hdr, ARC_FLAG_PRESCIENT_PREFETCH); + if (*arc_flags & ARC_FLAG_L2CACHE) arc_hdr_set_flags(hdr, ARC_FLAG_L2CACHE); if (BP_GET_LEVEL(bp) > 0) @@ -5578,14 +5612,17 @@ top: vd = NULL; } - if (priority == ZIO_PRIORITY_ASYNC_READ) + /* + * We count both async reads and scrub IOs as asynchronous so + * that both can be upgraded in the event of a cache hit while + * the read IO is still in-flight. + */ + if (priority == ZIO_PRIORITY_ASYNC_READ || + priority == ZIO_PRIORITY_SCRUB) arc_hdr_set_flags(hdr, ARC_FLAG_PRIO_ASYNC_READ); else arc_hdr_clear_flags(hdr, ARC_FLAG_PRIO_ASYNC_READ); - if (hash_lock != NULL) - mutex_exit(hash_lock); - /* * At this point, we have a level 1 cache miss. Try again in * L2ARC if possible. @@ -5666,6 +5703,11 @@ top: ZIO_FLAG_CANFAIL | ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY, B_FALSE); + acb->acb_zio_head = rzio; + + if (hash_lock != NULL) + mutex_exit(hash_lock); + DTRACE_PROBE2(l2arc__read, vdev_t *, vd, zio_t *, rzio); ARCSTAT_INCR(arcstat_l2_read_bytes, size); @@ -5680,6 +5722,8 @@ top: return (0); /* l2arc read error; goto zio_read() */ + if (hash_lock != NULL) + mutex_enter(hash_lock); } else { DTRACE_PROBE1(l2arc__miss, arc_buf_hdr_t *, hdr); @@ -5700,7 +5744,11 @@ top: rzio = zio_read(pio, spa, bp, hdr->b_l1hdr.b_pabd, size, arc_read_done, hdr, priority, zio_flags, zb); + acb->acb_zio_head = rzio; + if (hash_lock != NULL) + mutex_exit(hash_lock); + if (*arc_flags & ARC_FLAG_WAIT) return (zio_wait(rzio)); @@ -6191,9 +6239,9 @@ arc_write_done(zio_t *zio) zio_t * arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr_t *bp, arc_buf_t *buf, - boolean_t l2arc, const zio_prop_t *zp, arc_done_func_t *ready, - arc_done_func_t *children_ready, arc_done_func_t *physdone, - arc_done_func_t *done, void *private, zio_priority_t priority, + boolean_t l2arc, const zio_prop_t *zp, arc_write_done_func_t *ready, + arc_write_done_func_t *children_ready, arc_write_done_func_t *physdone, + arc_write_done_func_t *done, void *private, zio_priority_t priority, int zio_flags, const zbookmark_phys_t *zb) { arc_buf_hdr_t *hdr = buf->b_hdr; @@ -6620,9 +6668,6 @@ arc_init(void) mutex_init(&arc_dnlc_evicts_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&arc_dnlc_evicts_cv, NULL, CV_DEFAULT, NULL); - - /* Convert seconds to clock ticks */ - arc_min_prefetch_lifespan = 1 * hz; /* set min cache to 1/32 of all memory, or arc_abs_min, whichever is more */ arc_c_min = MAX(allmem / 32, arc_abs_min); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Mon Oct 1 04:08:47 2018 (r339034) @@ -902,7 +902,8 @@ dbuf_whichblock(dnode_t *dn, int64_t level, uint64_t o } static void -dbuf_read_done(zio_t *zio, arc_buf_t *buf, void *vdb) +dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp, + arc_buf_t *buf, void *vdb) { dmu_buf_impl_t *db = vdb; @@ -916,19 +917,22 @@ dbuf_read_done(zio_t *zio, arc_buf_t *buf, void *vdb) ASSERT(db->db.db_data == NULL); if (db->db_level == 0 && db->db_freed_in_flight) { /* we were freed in flight; disregard any error */ + if (buf == NULL) { + buf = arc_alloc_buf(db->db_objset->os_spa, + db, DBUF_GET_BUFC_TYPE(db), db->db.db_size); + } arc_release(buf, db); bzero(buf->b_data, db->db.db_size); arc_buf_freeze(buf); db->db_freed_in_flight = FALSE; dbuf_set_data(db, buf); db->db_state = DB_CACHED; - } else if (zio == NULL || zio->io_error == 0) { + } else if (buf != NULL) { dbuf_set_data(db, buf); db->db_state = DB_CACHED; } else { ASSERT(db->db_blkid != DMU_BONUS_BLKID); ASSERT3P(db->db_buf, ==, NULL); - arc_buf_destroy(buf, db); db->db_state = DB_UNCACHED; } cv_broadcast(&db->db_changed); @@ -2326,7 +2330,8 @@ dbuf_issue_final_prefetch(dbuf_prefetch_arg_t *dpa, bl * prefetch if the next block down is our target. */ static void -dbuf_prefetch_indirect_done(zio_t *zio, arc_buf_t *abuf, void *private) +dbuf_prefetch_indirect_done(zio_t *zio, const zbookmark_phys_t *zb, + const blkptr_t *iobp, arc_buf_t *abuf, void *private) { dbuf_prefetch_arg_t *dpa = private; @@ -2365,13 +2370,18 @@ dbuf_prefetch_indirect_done(zio_t *zio, arc_buf_t *abu dbuf_rele(db, FTAG); } + if (abuf == NULL) { + kmem_free(dpa, sizeof(*dpa)); + return; + } + dpa->dpa_curlevel--; uint64_t nextblkid = dpa->dpa_zb.zb_blkid >> (dpa->dpa_epbs * (dpa->dpa_curlevel - dpa->dpa_zb.zb_level)); blkptr_t *bp = ((blkptr_t *)abuf->b_data) + P2PHASE(nextblkid, 1ULL << dpa->dpa_epbs); - if (BP_IS_HOLE(bp) || (zio != NULL && zio->io_error != 0)) { + if (BP_IS_HOLE(bp)) { kmem_free(dpa, sizeof (*dpa)); } else if (dpa->dpa_curlevel == dpa->dpa_zb.zb_level) { ASSERT3U(nextblkid, ==, dpa->dpa_zb.zb_blkid); @@ -3746,7 +3756,7 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_buf_t *data, d * ready callback so that we can properly handle an indirect * block that only contains holes. */ - arc_done_func_t *children_ready_cb = NULL; + arc_write_done_func_t *children_ready_cb = NULL; if (db->db_level != 0) children_ready_cb = dbuf_write_children_ready; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c Mon Oct 1 04:08:47 2018 (r339034) @@ -1112,14 +1112,26 @@ ddt_sync_table(ddt_t *ddt, dmu_tx_t *tx, uint64_t txg) void ddt_sync(spa_t *spa, uint64_t txg) { + dsl_scan_t *scn = spa->spa_dsl_pool->dp_scan; dmu_tx_t *tx; - zio_t *rio = zio_root(spa, NULL, NULL, - ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE | ZIO_FLAG_SELF_HEAL); + zio_t *rio; ASSERT(spa_syncing_txg(spa) == txg); tx = dmu_tx_create_assigned(spa->spa_dsl_pool, txg); + rio = zio_root(spa, NULL, NULL, + ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE | ZIO_FLAG_SELF_HEAL); + + /* + * This function may cause an immediate scan of ddt blocks (see + * the comment above dsl_scan_ddt() for details). We set the + * scan's root zio here so that we can wait for any scan IOs in + * addition to the regular ddt IOs. + */ + ASSERT3P(scn->scn_zio_root, ==, NULL); + scn->scn_zio_root = rio; + for (enum zio_checksum c = 0; c < ZIO_CHECKSUM_FUNCTIONS; c++) { ddt_t *ddt = spa->spa_ddt[c]; if (ddt == NULL) @@ -1129,6 +1141,7 @@ ddt_sync(spa_t *spa, uint64_t txg) } (void) zio_wait(rio); + scn->scn_zio_root = NULL; dmu_tx_commit(tx); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Mon Oct 1 04:08:47 2018 (r339034) @@ -349,6 +349,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, bl ASSERT(ds == NULL || MUTEX_HELD(&ds->ds_opening_lock)); +#if 0 /* * The $ORIGIN dataset (if it exists) doesn't have an associated * objset, so there's no reason to open it. The $ORIGIN dataset @@ -359,6 +360,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, bl ASSERT3P(ds->ds_dir, !=, spa_get_dsl(spa)->dp_origin_snap->ds_dir); } +#endif os = kmem_zalloc(sizeof (objset_t), KM_SLEEP); os->os_dsl_dataset = ds; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c Mon Oct 1 04:08:47 2018 (r339034) @@ -499,8 +499,9 @@ traverse_prefetcher(spa_t *spa, zilog_t *zilog, const const zbookmark_phys_t *zb, const dnode_phys_t *dnp, void *arg) { prefetch_data_t *pfd = arg; - arc_flags_t aflags = ARC_FLAG_NOWAIT | ARC_FLAG_PREFETCH; - + arc_flags_t aflags = ARC_FLAG_NOWAIT | ARC_FLAG_PREFETCH | + ARC_FLAG_PRESCIENT_PREFETCH; + ASSERT(pfd->pd_bytes_fetched >= 0); if (bp == NULL) return (0); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Mon Oct 1 04:02:00 2018 (r339033) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Mon Oct 1 04:08:47 2018 (r339034) @@ -51,28 +51,136 @@ #include #include #include +#include #ifdef _KERNEL #include #endif +/* + * Grand theory statement on scan queue sorting + * + * Scanning is implemented by recursively traversing all indirection levels + * in an object and reading all blocks referenced from said objects. This + * results in us approximately traversing the object from lowest logical + * offset to the highest. For best performance, we would want the logical + * blocks to be physically contiguous. However, this is frequently not the + * case with pools given the allocation patterns of copy-on-write filesystems. + * So instead, we put the I/Os into a reordering queue and issue them in a + * way that will most benefit physical disks (LBA-order). + * + * Queue management: + * + * Ideally, we would want to scan all metadata and queue up all block I/O + * prior to starting to issue it, because that allows us to do an optimal + * sorting job. This can however consume large amounts of memory. Therefore + * we continuously monitor the size of the queues and constrain them to 5% + * (zfs_scan_mem_lim_fact) of physmem. If the queues grow larger than this + * limit, we clear out a few of the largest extents at the head of the queues + * to make room for more scanning. Hopefully, these extents will be fairly + * large and contiguous, allowing us to approach sequential I/O throughput + * even without a fully sorted tree. + * + * Metadata scanning takes place in dsl_scan_visit(), which is called from + * dsl_scan_sync() every spa_sync(). If we have either fully scanned all + * metadata on the pool, or we need to make room in memory because our + * queues are too large, dsl_scan_visit() is postponed and + * scan_io_queues_run() is called from dsl_scan_sync() instead. This implies + * that metadata scanning and queued I/O issuing are mutually exclusive. This + * allows us to provide maximum sequential I/O throughput for the majority of + * I/O's issued since sequential I/O performance is significantly negatively + * impacted if it is interleaved with random I/O. + * + * Implementation Notes + * + * One side effect of the queued scanning algorithm is that the scanning code + * needs to be notified whenever a block is freed. This is needed to allow + * the scanning code to remove these I/Os from the issuing queue. Additionally, + * we do not attempt to queue gang blocks to be issued sequentially since this + * is very hard to do and would have an extremely limitted performance benefit. + * Instead, we simply issue gang I/Os as soon as we find them using the legacy + * algorithm. *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Mon Oct 1 07:49:17 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2C09E10971AA; Mon, 1 Oct 2018 07:49:17 +0000 (UTC) (envelope-from smh@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C917D92E4C; Mon, 1 Oct 2018 07:49:16 +0000 (UTC) (envelope-from smh@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id BE3C51E059; Mon, 1 Oct 2018 07:49:16 +0000 (UTC) (envelope-from smh@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w917nGvx017923; Mon, 1 Oct 2018 07:49:16 GMT (envelope-from smh@FreeBSD.org) Received: (from smh@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w917nGIp017922; Mon, 1 Oct 2018 07:49:16 GMT (envelope-from smh@FreeBSD.org) Message-Id: <201810010749.w917nGIp017922@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: smh set sender to smh@FreeBSD.org using -f From: Steven Hartland Date: Mon, 1 Oct 2018 07:49:16 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339035 - stable/11/sys/netinet X-SVN-Group: stable-11 X-SVN-Commit-Author: smh X-SVN-Commit-Paths: stable/11/sys/netinet X-SVN-Commit-Revision: 339035 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 07:49:17 -0000 Author: smh Date: Mon Oct 1 07:49:16 2018 New Revision: 339035 URL: https://svnweb.freebsd.org/changeset/base/339035 Log: MFC r336165: Removed pointless NULL check in rip_pcblist. Sponsored by: Multiplay Modified: stable/11/sys/netinet/raw_ip.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netinet/raw_ip.c ============================================================================== --- stable/11/sys/netinet/raw_ip.c Mon Oct 1 04:08:47 2018 (r339034) +++ stable/11/sys/netinet/raw_ip.c Mon Oct 1 07:49:16 2018 (r339035) @@ -1053,8 +1053,6 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) return (error); inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); - if (inp_list == NULL) - return (ENOMEM); INP_INFO_RLOCK(&V_ripcbinfo); for (inp = LIST_FIRST(V_ripcbinfo.ipi_listhead), i = 0; inp && i < n; From owner-svn-src-stable-11@freebsd.org Mon Oct 1 08:49:48 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4F8CE10A60EB; Mon, 1 Oct 2018 08:49:48 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 062659535C; Mon, 1 Oct 2018 08:49:48 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id F0DB91EA0A; Mon, 1 Oct 2018 08:49:47 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w918nleu049509; Mon, 1 Oct 2018 08:49:47 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w918nlLl049508; Mon, 1 Oct 2018 08:49:47 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201810010849.w918nlLl049508@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Mon, 1 Oct 2018 08:49:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339036 - stable/11/sbin/ifconfig X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sbin/ifconfig X-SVN-Commit-Revision: 339036 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 08:49:48 -0000 Author: ae Date: Mon Oct 1 08:49:47 2018 New Revision: 339036 URL: https://svnweb.freebsd.org/changeset/base/339036 Log: MFC r338890: Update ifr_name before invoking IPSECSREQID ioctl, this fixes the case, when `ifconfig ipsec create reqid N` command invoked without interface unit number. The "name" global variable is updated after interface cloning in the ifclonecreate() and contains actual interface name. Reported by: lev Modified: stable/11/sbin/ifconfig/ifipsec.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sbin/ifconfig/ifipsec.c ============================================================================== --- stable/11/sbin/ifconfig/ifipsec.c Mon Oct 1 07:49:16 2018 (r339035) +++ stable/11/sbin/ifconfig/ifipsec.c Mon Oct 1 08:49:47 2018 (r339036) @@ -72,6 +72,7 @@ DECL_CMD_FUNC(setreqid, val, arg) warn("Invalid reqid value %s", val); return; } + strlcpy(ifr.ifr_name, name, sizeof(ifr.ifr_name)); ifr.ifr_data = (char *)&v; if (ioctl(s, IPSECSREQID, &ifr) == -1) { warn("ioctl(IPSECSREQID)"); From owner-svn-src-stable-11@freebsd.org Mon Oct 1 09:40:42 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6488110AAB35; Mon, 1 Oct 2018 09:40:42 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1AECD96ECA; Mon, 1 Oct 2018 09:40:42 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 15C9B1F222; Mon, 1 Oct 2018 09:40:42 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w919efnJ078366; Mon, 1 Oct 2018 09:40:41 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w919ef7m078196; Mon, 1 Oct 2018 09:40:41 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201810010940.w919ef7m078196@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Mon, 1 Oct 2018 09:40:41 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339037 - stable/11/sys/netinet X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sys/netinet X-SVN-Commit-Revision: 339037 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 09:40:42 -0000 Author: ae Date: Mon Oct 1 09:40:41 2018 New Revision: 339037 URL: https://svnweb.freebsd.org/changeset/base/339037 Log: MFC r313168 (by pkelsey): Fix VIMAGE-related bugs in TFO. The autokey callout vnet context was not being initialized, and the per-vnet fastopen context was only being initialized for the default vnet. PR: 216613 Modified: stable/11/sys/netinet/tcp_fastopen.c stable/11/sys/netinet/tcp_subr.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netinet/tcp_fastopen.c ============================================================================== --- stable/11/sys/netinet/tcp_fastopen.c Mon Oct 1 08:49:47 2018 (r339036) +++ stable/11/sys/netinet/tcp_fastopen.c Mon Oct 1 09:40:41 2018 (r339037) @@ -209,6 +209,7 @@ tcp_fastopen_init(void) rm_init(&V_tcp_fastopen_keylock, "tfo_keylock"); callout_init_rm(&V_tcp_fastopen_autokey_ctx.c, &V_tcp_fastopen_keylock, 0); + V_tcp_fastopen_autokey_ctx.v = curvnet; V_tcp_fastopen_keys.newest = TCP_FASTOPEN_MAX_KEYS - 1; } Modified: stable/11/sys/netinet/tcp_subr.c ============================================================================== --- stable/11/sys/netinet/tcp_subr.c Mon Oct 1 08:49:47 2018 (r339036) +++ stable/11/sys/netinet/tcp_subr.c Mon Oct 1 09:40:41 2018 (r339037) @@ -655,6 +655,10 @@ tcp_init(void) V_sack_hole_zone = uma_zcreate("sackhole", sizeof(struct sackhole), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); +#ifdef TCP_RFC7413 + tcp_fastopen_init(); +#endif + /* Skip initialization of globals for non-default instances. */ if (!IS_DEFAULT_VNET(curvnet)) return; @@ -707,10 +711,6 @@ tcp_init(void) EVENTHANDLER_PRI_ANY); #ifdef TCPPCAP tcp_pcap_init(); -#endif - -#ifdef TCP_RFC7413 - tcp_fastopen_init(); #endif } From owner-svn-src-stable-11@freebsd.org Mon Oct 1 14:40:00 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57B6110BF318; Mon, 1 Oct 2018 14:40:00 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0A7AF79842; Mon, 1 Oct 2018 14:40:00 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 04E9522379; Mon, 1 Oct 2018 14:40:00 +0000 (UTC) (envelope-from oshogbo@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91EdxcY053691; Mon, 1 Oct 2018 14:39:59 GMT (envelope-from oshogbo@FreeBSD.org) Received: (from oshogbo@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91EdxdG053690; Mon, 1 Oct 2018 14:39:59 GMT (envelope-from oshogbo@FreeBSD.org) Message-Id: <201810011439.w91EdxdG053690@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: oshogbo set sender to oshogbo@FreeBSD.org using -f From: Mariusz Zaborski Date: Mon, 1 Oct 2018 14:39:59 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339045 - stable/11/contrib/traceroute X-SVN-Group: stable-11 X-SVN-Commit-Author: oshogbo X-SVN-Commit-Paths: stable/11/contrib/traceroute X-SVN-Commit-Revision: 339045 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 14:40:00 -0000 Author: oshogbo Date: Mon Oct 1 14:39:59 2018 New Revision: 339045 URL: https://svnweb.freebsd.org/changeset/base/339045 Log: MFC r315411 (mmel): Unbreak traceroute on system built without CAPSICUM Modified: stable/11/contrib/traceroute/traceroute.c Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/traceroute/traceroute.c ============================================================================== --- stable/11/contrib/traceroute/traceroute.c Mon Oct 1 14:27:53 2018 (r339044) +++ stable/11/contrib/traceroute/traceroute.c Mon Oct 1 14:39:59 2018 (r339045) @@ -1021,8 +1021,13 @@ main(int argc, char **argv) * We must connect(2) our socket before this point. */ if (cansandbox && cap_enter() < 0) { - Fprintf(stderr, "%s: cap_enter: %s\n", prog, strerror(errno)); - exit(1); + if (errno != ENOSYS) { + Fprintf(stderr, "%s: cap_enter: %s\n", prog, + strerror(errno)); + exit(1); + } else { + cansandbox = false; + } } cap_rights_init(&rights, CAP_SEND, CAP_SETSOCKOPT); From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:40:07 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9CF9610C1A46; Mon, 1 Oct 2018 15:40:07 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CA267C0C7; Mon, 1 Oct 2018 15:40:07 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4103A22D80; Mon, 1 Oct 2018 15:40:07 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91Fe74K096504; Mon, 1 Oct 2018 15:40:07 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91Fe7js096503; Mon, 1 Oct 2018 15:40:07 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011540.w91Fe7js096503@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:40:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339048 - stable/11/usr.sbin/makefs/tests X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/usr.sbin/makefs/tests X-SVN-Commit-Revision: 339048 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:40:07 -0000 Author: asomers Date: Mon Oct 1 15:40:06 2018 New Revision: 339048 URL: https://svnweb.freebsd.org/changeset/base/339048 Log: MFC r336582: makefs(8): add test case for PR 229929 Fix two failing makefs test cases by adding "-M 1m", which was already used for every other FFS test case. Add a new test case for the underlying issue: with no -M, -m, or -s options, makefs can underestimate image size. PR: 229929 Reported by: Jenkins Modified: stable/11/usr.sbin/makefs/tests/makefs_ffs_tests.sh Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/makefs/tests/makefs_ffs_tests.sh ============================================================================== --- stable/11/usr.sbin/makefs/tests/makefs_ffs_tests.sh Mon Oct 1 14:57:33 2018 (r339047) +++ stable/11/usr.sbin/makefs/tests/makefs_ffs_tests.sh Mon Oct 1 15:40:06 2018 (r339048) @@ -53,6 +53,29 @@ check_ffs_image_contents() check_image_contents "$@" } +# With no -M, -m, or -s options, makefs should autocalculate the image size +atf_test_case autocalculate_image_size cleanup +autocalculate_image_size_body() +{ + atf_expect_fail "PR 229929 makefs(8) can underestimate image size" + create_test_inputs + + atf_check -e empty -o save:$TEST_SPEC_FILE -s exit:0 \ + mtree -c -k "$DEFAULT_MTREE_KEYWORDS" -p $TEST_INPUTS_DIR + + cd $TEST_INPUTS_DIR + atf_check -e empty -o not-empty -s exit:0 \ + $MAKEFS $TEST_IMAGE $TEST_SPEC_FILE + cd - + + mount_image + check_ffs_image_contents +} +autocalculate_image_size_cleanup() +{ + common_cleanup +} + atf_test_case D_flag cleanup D_flag_body() { @@ -109,7 +132,7 @@ from_mtree_spec_file_body() cd $TEST_INPUTS_DIR atf_check -e empty -o not-empty -s exit:0 \ - $MAKEFS $TEST_IMAGE $TEST_SPEC_FILE + $MAKEFS -M 1m $TEST_IMAGE $TEST_SPEC_FILE cd - mount_image @@ -132,7 +155,7 @@ from_multiple_dirs_body() touch $test_inputs_dir2/multiple_dirs_test_file atf_check -e empty -o not-empty -s exit:0 \ - $MAKEFS $TEST_IMAGE $TEST_INPUTS_DIR $test_inputs_dir2 + $MAKEFS -M 1m $TEST_IMAGE $TEST_INPUTS_DIR $test_inputs_dir2 mount_image check_image_contents -d $test_inputs_dir2 @@ -224,6 +247,8 @@ o_flag_version_2_cleanup() atf_init_test_cases() { + + atf_add_test_case autocalculate_image_size atf_add_test_case D_flag atf_add_test_case F_flag From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:49:44 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D2BD810C1D42; Mon, 1 Oct 2018 15:49:44 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 772387C8F6; Mon, 1 Oct 2018 15:49:44 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6C35C22F19; Mon, 1 Oct 2018 15:49:44 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FniYL003177; Mon, 1 Oct 2018 15:49:44 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91FniTD003176; Mon, 1 Oct 2018 15:49:44 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011549.w91FniTD003176@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:49:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339052 - stable/11/sys/compat/freebsd32 X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/sys/compat/freebsd32 X-SVN-Commit-Revision: 339052 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:49:45 -0000 Author: asomers Date: Mon Oct 1 15:49:43 2018 New Revision: 339052 URL: https://svnweb.freebsd.org/changeset/base/339052 Log: MFC r336871, r336874 r336871: getrusage(2): fix return value under 32-bit emulation According to the man page, getrusage(2) should return EFAULT if the rusage argument lies outside of the process's address space. But due to an oversight in r100384, that's never been the case during 32-bit emulation. Fix it. PR: 230153 Reported by: tests(7) Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D16500 r336874: freebsd32_getrusage(2): skip freebsd32_rusage_out on error PR: 230153 Reported by: kib X-MFC-With: 336871 Differential Revision: https://reviews.freebsd.org/D16500 Modified: stable/11/sys/compat/freebsd32/freebsd32_misc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/compat/freebsd32/freebsd32_misc.c ============================================================================== --- stable/11/sys/compat/freebsd32/freebsd32_misc.c Mon Oct 1 15:47:34 2018 (r339051) +++ stable/11/sys/compat/freebsd32/freebsd32_misc.c Mon Oct 1 15:49:43 2018 (r339052) @@ -724,9 +724,7 @@ freebsd32_getrusage(struct thread *td, struct freebsd3 int error; error = kern_getrusage(td, uap->who, &s); - if (error) - return (error); - if (uap->rusage != NULL) { + if (error == 0) { freebsd32_rusage_out(&s, &s32); error = copyout(&s32, uap->rusage, sizeof(s32)); } From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:43:57 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0E8F310C1C2E; Mon, 1 Oct 2018 15:43:57 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B7A297C4EC; Mon, 1 Oct 2018 15:43:56 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B263322F11; Mon, 1 Oct 2018 15:43:56 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FhuGg001999; Mon, 1 Oct 2018 15:43:56 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91FhuEY001998; Mon, 1 Oct 2018 15:43:56 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011543.w91FhuEY001998@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:43:56 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339049 - stable/11/libexec/tftpd X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/libexec/tftpd X-SVN-Commit-Revision: 339049 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:43:57 -0000 Author: asomers Date: Mon Oct 1 15:43:56 2018 New Revision: 339049 URL: https://svnweb.freebsd.org/changeset/base/339049 Log: MFC r336587: tftpd(8): when completing an WRQ, flush the file before acknowleding receipt tftpd(8) should flush a newly written file to disk before ACKing the final DATA packet. Otherwise there is a narrow race window when a subsequent read may not see the file. This is somewhat related to r330710, but the race window is much smaller. Hopefully this will fix the intermittent tests in Jenkins. Reported by: Jenkins Modified: stable/11/libexec/tftpd/tftp-transfer.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/tftpd/tftp-transfer.c ============================================================================== --- stable/11/libexec/tftpd/tftp-transfer.c Mon Oct 1 15:40:06 2018 (r339048) +++ stable/11/libexec/tftpd/tftp-transfer.c Mon Oct 1 15:43:56 2018 (r339049) @@ -277,6 +277,8 @@ tftp_receive(int peer, uint16_t *block, struct tftp_st send_error(peer, ENOSPACE); goto abort; } + if (n_data != segsize) + write_close(); } send_ack: @@ -301,8 +303,6 @@ send_ack: } gettimeofday(&(ts->tstop), NULL); } while (n_data == segsize); - - write_close(); /* Don't do late packet management for the client implementation */ if (acting_as_client) From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:57:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8EF2110C2232; Mon, 1 Oct 2018 15:57:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 438367D0A5; Mon, 1 Oct 2018 15:57:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3E945230D2; Mon, 1 Oct 2018 15:57:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FvoWV008419; Mon, 1 Oct 2018 15:57:50 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91FvoW1008418; Mon, 1 Oct 2018 15:57:50 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011557.w91FvoW1008418@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:57:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339054 - stable/11/usr.bin/tftp X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/usr.bin/tftp X-SVN-Commit-Revision: 339054 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:57:50 -0000 Author: asomers Date: Mon Oct 1 15:57:49 2018 New Revision: 339054 URL: https://svnweb.freebsd.org/changeset/base/339054 Log: MFC r337779: tftp: Close a resource leak when putting files Reported by: Coverity CID: 1394842 Modified: stable/11/usr.bin/tftp/main.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/tftp/main.c ============================================================================== --- stable/11/usr.bin/tftp/main.c Mon Oct 1 15:56:42 2018 (r339053) +++ stable/11/usr.bin/tftp/main.c Mon Oct 1 15:57:49 2018 (r339054) @@ -472,6 +472,7 @@ put(int argc, char *argv[]) printf("putting %s to %s:%s [%s]\n", cp, hostname, targ, mode); xmitfile(peer, port, fd, targ, mode); + close(fd); return; } /* this assumes the target is a directory */ From owner-svn-src-stable-11@freebsd.org Mon Oct 1 16:01:22 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 451BB10C2479; Mon, 1 Oct 2018 16:01:22 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EF5857D4BB; Mon, 1 Oct 2018 16:01:21 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id EA2CD23116; Mon, 1 Oct 2018 16:01:21 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91G1LEx011862; Mon, 1 Oct 2018 16:01:21 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91G1LbA011861; Mon, 1 Oct 2018 16:01:21 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011601.w91G1LbA011861@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 16:01:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339056 - stable/11/etc X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/etc X-SVN-Commit-Revision: 339056 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 16:01:22 -0000 Author: asomers Date: Mon Oct 1 16:01:21 2018 New Revision: 339056 URL: https://svnweb.freebsd.org/changeset/base/339056 Log: MFC r337973: Add Modbus Application Protocol to /etc/services IANA reassigned ports 502 and 802 on 2014-06-10 PR: 213276 Submitted by: Mark.Martinec@ijs.si Modified: stable/11/etc/services Directory Properties: stable/11/ (props changed) Modified: stable/11/etc/services ============================================================================== --- stable/11/etc/services Mon Oct 1 15:59:06 2018 (r339055) +++ stable/11/etc/services Mon Oct 1 16:01:21 2018 (r339056) @@ -873,8 +873,8 @@ isakmp 500/tcp isakmp 500/udp stmf 501/tcp stmf 501/udp -asa-appl-proto 502/tcp -asa-appl-proto 502/udp +mbap 502/tcp # Modbus Application Protocol +mbap 502/udp # Modbus Application Protocol intrinsa 503/tcp intrinsa 503/udp citadel 504/tcp @@ -1414,6 +1414,8 @@ mdbs_daemon 800/tcp mdbs_daemon 800/udp device 801/tcp device 801/udp +mbap-s 802/tcp # Modbus Application Protocol Secure +mbap-s 802/udp # Modbus Application Protocol Secure fcp-udp 810/tcp #FCP fcp-udp 810/udp #FCP Datagram itm-mcell-s 828/tcp From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:31:47 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CD25010C18CF; Mon, 1 Oct 2018 15:31:47 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-lf1-f52.google.com (mail-lf1-f52.google.com [209.85.167.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2FAB07BE42; Mon, 1 Oct 2018 15:31:47 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-lf1-f52.google.com with SMTP id q39-v6so8797085lfi.8; Mon, 01 Oct 2018 08:31:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=lyYrnjdVXX4OP21pavuuYoH1wJzgbTiheGHIRHe6ddA=; b=TeG/peb8CZaKZVDhQuh9ii/x/embJ2OgL5NEl85+5Pi6MJr1J6aGJDsYQLgUkPWrvQ G6kn2VHhun1i+eGDGIhMMwx6pWDd8qqFbT7OWeDB5+wXJdhICtzu2iPBcw8b6HeCqd0G 3+L7uW0885HpOQItrihlDViTbz5WGPfqY6XOCLPtYgsngT6hpugyn1+SiIjmA2tt23MM IICOz2SzsfsW3mdSTboKgJhErIZfI/WjHlLbJiE4n4muue7DGndYZYUWgGjU3lfTfNXT tfHkUQ9o0Gc13dbvQls2Q33gUK4/qMCQJqVVfaXoa0kR3KuP5hDNk94Skj/QZsvEr6YY C1yA== X-Gm-Message-State: ABuFfojNiLjcLdk8FX+2zHcwGTWtWDUT5LIHZCaHJETKcqxQHv/Fh5JA SmNhuuhA8WL5/nR0YzOHYSDEcKXFyYLWXUneHw6Odq2E X-Google-Smtp-Source: ACcGV60apfbe9ZLW98HCRqvwradpnNXiJkmyG/XtJE8vqMWeWIPuG5FkgRw2LJ3Awk7+u0TXSYqYoTvx5yACnXqYIIA= X-Received: by 2002:ac2:5082:: with SMTP id f2-v6mr5622282lfm.47.1538407899459; Mon, 01 Oct 2018 08:31:39 -0700 (PDT) MIME-Version: 1.0 References: <201809121852.w8CIqJrm046105@repo.freebsd.org> In-Reply-To: <201809121852.w8CIqJrm046105@repo.freebsd.org> From: Alan Somers Date: Mon, 1 Oct 2018 09:31:27 -0600 Message-ID: Subject: Re: svn commit: r338617 - in stable/11: lib/libc/sys sys/compat/freebsd32 sys/kern sys/netinet sys/netinet6 sys/sys tools/regression/sockets/udp_pingpong tools/regression/sockets/unix_cmsg To: Maxim Sobolev Cc: src-committers , svn-src-all , svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.27 X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:31:48 -0000 On Wed, Sep 12, 2018 at 12:52 PM Maxim Sobolev wrote: > Author: sobomax > Date: Wed Sep 12 18:52:18 2018 > New Revision: 338617 > URL: https://svnweb.freebsd.org/changeset/base/338617 > > Log: > MFC r312296 and r323254, which is new a socket option > SO_TS_CLOCK to pick from several different clock sources to > return timestamps when SO_TIMESTAMP is enabled and two > new nanosecond-precision timestamp types. This also fixes > recvmsg32() system call to properly down-convert layout of the > 64-bit structures to match what 32-bit app(s) expect. > > Bump __FreeBSD_version to indicate presence of a new > functionality. > > Differential Revision: https://reviews.freebsd.org/D9171 > > Added: > stable/11/tools/regression/sockets/udp_pingpong/ > - copied from r312296, head/tools/regression/sockets/udp_pingpong/ > Modified: > stable/11/lib/libc/sys/getsockopt.2 > stable/11/sys/compat/freebsd32/freebsd32.h > stable/11/sys/compat/freebsd32/freebsd32_misc.c > stable/11/sys/kern/uipc_socket.c > stable/11/sys/kern/uipc_usrreq.c > stable/11/sys/netinet/ip_input.c > stable/11/sys/netinet6/ip6_input.c > stable/11/sys/sys/param.h > stable/11/sys/sys/socket.h > stable/11/sys/sys/socketvar.h > stable/11/tools/regression/sockets/unix_cmsg/Makefile > stable/11/tools/regression/sockets/unix_cmsg/unix_cmsg.c > Directory Properties: > stable/11/ (props changed) > This change broke the build of tools/regression/sockets/unix_cmsg on stable/11. It looks like you need to MFC r309554 first. unix_cmsg.c:55:10: fatal error: 'uc_common.h' file not found #include "uc_common.h" ^~~~~~~~~~~~~ 1 error generated. *** Error code 1 Stop. -Alan From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:45:21 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6E50710C1C6B; Mon, 1 Oct 2018 15:45:21 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 232C57C605; Mon, 1 Oct 2018 15:45:21 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1E08122F14; Mon, 1 Oct 2018 15:45:21 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FjLPu002356; Mon, 1 Oct 2018 15:45:21 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91FjKw6002355; Mon, 1 Oct 2018 15:45:21 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011545.w91FjKw6002355@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:45:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339050 - stable/11/contrib/netbsd-tests/fs X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/contrib/netbsd-tests/fs X-SVN-Commit-Revision: 339050 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:45:21 -0000 Author: asomers Date: Mon Oct 1 15:45:20 2018 New Revision: 339050 URL: https://svnweb.freebsd.org/changeset/base/339050 Log: MFC r336594: Fix tmpfs detection in the sys/fs/tmpfs tests This code was originally written for NetBSD. r306031 tried to adapt it to FreeBSD, but didn't correctly handle the case that tmpfs was available, but not already loaded. Fix the logic to load the module if necessary. The tmpfs tests shouldn't be skipped anymore. Also, fix a comment that was dislocated by r306031. Reported by: Jenkins Modified: stable/11/contrib/netbsd-tests/fs/h_funcs.subr Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/netbsd-tests/fs/h_funcs.subr ============================================================================== --- stable/11/contrib/netbsd-tests/fs/h_funcs.subr Mon Oct 1 15:43:56 2018 (r339049) +++ stable/11/contrib/netbsd-tests/fs/h_funcs.subr Mon Oct 1 15:45:20 2018 (r339050) @@ -43,17 +43,17 @@ require_fs() { atf_require_prog mount_${name} atf_require_prog umount - # if we have autoloadable modules, just assume the file system - atf_require_prog sysctl # Begin FreeBSD if true; then - if kldstat -m ${name}; then + if kldload -n ${name}; then found=yes else found=no fi else # End FreeBSD + # if we have autoloadable modules, just assume the file system + atf_require_prog sysctl autoload=$(sysctl -n kern.module.autoload) [ "${autoload}" = "1" ] && return 0 From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:56:43 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 709DB10C21B1; Mon, 1 Oct 2018 15:56:43 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 254707CF65; Mon, 1 Oct 2018 15:56:43 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1FD93230CF; Mon, 1 Oct 2018 15:56:43 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FugPm008320; Mon, 1 Oct 2018 15:56:42 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91Fug1h008319; Mon, 1 Oct 2018 15:56:42 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011556.w91Fug1h008319@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:56:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339053 - stable/11/share/man/man9 X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/share/man/man9 X-SVN-Commit-Revision: 339053 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:56:43 -0000 Author: asomers Date: Mon Oct 1 15:56:42 2018 New Revision: 339053 URL: https://svnweb.freebsd.org/changeset/base/339053 Log: MFC r337482: Bring VOP_LOOKUP(9) up to date * Remove the cn_hash field (removed by r51906) * Add the cn_lkflags field (added by r144285) * Remove duplicate definition of cnp. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D16629 Modified: stable/11/share/man/man9/VOP_LOOKUP.9 Directory Properties: stable/11/ (props changed) Modified: stable/11/share/man/man9/VOP_LOOKUP.9 ============================================================================== --- stable/11/share/man/man9/VOP_LOOKUP.9 Mon Oct 1 15:49:43 2018 (r339052) +++ stable/11/share/man/man9/VOP_LOOKUP.9 Mon Oct 1 15:56:42 2018 (r339053) @@ -28,7 +28,7 @@ .\" .\" $FreeBSD$ .\" -.Dd November 17, 2017 +.Dd August 8, 2018 .Dt VOP_LOOKUP 9 .Os .Sh NAME @@ -51,10 +51,7 @@ The locked vnode of the directory to search. The address of a variable where the resulting locked vnode should be stored. .It Fa cnp The pathname component to be searched for. -.El -.Pp -.Fa Cnp -is a pointer to a componentname structure defined as follows: +It is a pointer to a componentname structure defined as follows: .Bd -literal struct componentname { /* @@ -64,13 +61,13 @@ struct componentname { u_long cn_flags; /* flags to namei */ struct thread *cn_thread; /* thread requesting lookup */ struct ucred *cn_cred; /* credentials */ + int cn_lkflags; /* Lock flags LK_EXCLUSIVE or LK_SHARED */ /* * Shared between lookup and commit routines. */ char *cn_pnbuf; /* pathname buffer */ char *cn_nameptr; /* pointer to looked up name */ long cn_namelen; /* length of looked up component */ - u_long cn_hash; /* hash value of looked up name */ }; .Ed .Pp From owner-svn-src-stable-11@freebsd.org Mon Oct 1 16:04:08 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4244610C25C6; Mon, 1 Oct 2018 16:04:08 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EBF5E7D88B; Mon, 1 Oct 2018 16:04:07 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E6D4C23276; Mon, 1 Oct 2018 16:04:07 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91G47cD013640; Mon, 1 Oct 2018 16:04:07 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91G47iV013639; Mon, 1 Oct 2018 16:04:07 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011604.w91G47iV013639@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 16:04:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339057 - stable/11/libexec/tftpd X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/libexec/tftpd X-SVN-Commit-Revision: 339057 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 16:04:08 -0000 Author: asomers Date: Mon Oct 1 16:04:07 2018 New Revision: 339057 URL: https://svnweb.freebsd.org/changeset/base/339057 Log: MFC r338216: tftpd: Fix data corruption bug with netascii Transferring files in netascii format requires, among other things, translating all CR characters to a CR,NUL pair. tftpd does this correctly except when the CR occurs as the last octet of a packet. In that case, it erroneously drops the NUL which should be part of the following packet. The bug was caused by using 0 as a sentinel value in a variable that could legitimately hold 0. Fix it by switching the sentinel value to -1. PR: 178055 Reported by: Richard Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D16853 Modified: stable/11/libexec/tftpd/tftp-file.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/tftpd/tftp-file.c ============================================================================== --- stable/11/libexec/tftpd/tftp-file.c Mon Oct 1 16:01:21 2018 (r339056) +++ stable/11/libexec/tftpd/tftp-file.c Mon Oct 1 16:04:07 2018 (r339057) @@ -108,10 +108,10 @@ convert_to_net(char *buffer, size_t count, int init) { size_t i; static size_t n = 0, in = 0; - static int newline = 0; + static int newline = -1; if (init) { - newline = 0; + newline = -1; n = 0; in = 0; return 0 ; @@ -122,9 +122,9 @@ convert_to_net(char *buffer, size_t count, int init) */ i = 0; - if (newline) { + if (newline != -1) { buffer[i++] = newline; - newline = 0; + newline = -1; } while (i < count) { @@ -159,7 +159,7 @@ convert_to_net(char *buffer, size_t count, int init) if (i > count) { /* - * Whoops... that isn't alllowed (but it will happen + * Whoops... that isn't allowed (but it will happen * when there is a CR or LF at the end of the buffer) */ newline = buffer[i-1]; From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:59:06 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CADDB10C22A4; Mon, 1 Oct 2018 15:59:06 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 716417D1AD; Mon, 1 Oct 2018 15:59:06 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6C50D230D3; Mon, 1 Oct 2018 15:59:06 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91Fx6ZX008526; Mon, 1 Oct 2018 15:59:06 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91Fx6SL008525; Mon, 1 Oct 2018 15:59:06 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011559.w91Fx6SL008525@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:59:06 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339055 - stable/11/tests/sys/opencrypto X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/opencrypto X-SVN-Commit-Revision: 339055 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:59:07 -0000 Author: asomers Date: Mon Oct 1 15:59:06 2018 New Revision: 339055 URL: https://svnweb.freebsd.org/changeset/base/339055 Log: MFC r337911: Fix the sys/opencrypto/runtests test when aesni(4) is already loaded Apparently kldstat requires the full module name, including busname Reported by: Jenkins Modified: stable/11/tests/sys/opencrypto/runtests.sh Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/opencrypto/runtests.sh ============================================================================== --- stable/11/tests/sys/opencrypto/runtests.sh Mon Oct 1 15:57:49 2018 (r339054) +++ stable/11/tests/sys/opencrypto/runtests.sh Mon Oct 1 15:59:06 2018 (r339055) @@ -50,9 +50,9 @@ cleanup_tests() } trap cleanup_tests EXIT INT TERM -for required_module in aesni cryptodev; do +for required_module in nexus/aesni cryptodev; do if ! kldstat -q -m $required_module; then - kldload $required_module + kldload ${required_module#nexus/} loaded_modules="$loaded_modules $required_module" fi done From owner-svn-src-stable-11@freebsd.org Mon Oct 1 15:47:36 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C5AE510C1CDA; Mon, 1 Oct 2018 15:47:35 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 72C747C6FC; Mon, 1 Oct 2018 15:47:35 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 63DA622F15; Mon, 1 Oct 2018 15:47:35 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91FlZOG002790; Mon, 1 Oct 2018 15:47:35 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91FlYP4002785; Mon, 1 Oct 2018 15:47:34 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011547.w91FlYP4002785@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 15:47:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339051 - stable/11/libexec/tftpd X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/libexec/tftpd X-SVN-Commit-Revision: 339051 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 15:47:36 -0000 Author: asomers Date: Mon Oct 1 15:47:34 2018 New Revision: 339051 URL: https://svnweb.freebsd.org/changeset/base/339051 Log: MFC r336605: Fix multiple Coverity warnings in tftpd(8) * Initialize uninitialized variable (CID 1006502) * strcpy => strlcpy (CID 1006792, 1006791, 1006790) * Check function return values (CID 1009442, 1009441, 1009440) * Delete dead code in receive_packet (not reported by Coverity) * Remove redundant alarm(3) in receive_packet (not reported by Coverity) Reported by: Coverity CID: 1006502, 1006792, 1006791, 1006790, 1009442, 1009441, 1009440 Differential Revision: https://reviews.freebsd.org/D11287 Modified: stable/11/libexec/tftpd/tftp-file.c stable/11/libexec/tftpd/tftp-io.c stable/11/libexec/tftpd/tftp-utils.c stable/11/libexec/tftpd/tftpd.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/tftpd/tftp-file.c ============================================================================== --- stable/11/libexec/tftpd/tftp-file.c Mon Oct 1 15:45:20 2018 (r339050) +++ stable/11/libexec/tftpd/tftp-file.c Mon Oct 1 15:47:34 2018 (r339051) @@ -34,6 +34,7 @@ __FBSDID("$FreeBSD$"); #include #include +#include #include #include #include @@ -78,7 +79,8 @@ convert_from_net(char *buffer, size_t count) if (buffer[i] == '\n') { if (n == 0) { if (ftell(file) != 0) { - fseek(file, -1, SEEK_END); + int r = fseek(file, -1, SEEK_END); + assert(r == 0); convbuffer[n++] = '\n'; } else { /* This shouldn't happen */ Modified: stable/11/libexec/tftpd/tftp-io.c ============================================================================== --- stable/11/libexec/tftpd/tftp-io.c Mon Oct 1 15:45:20 2018 (r339050) +++ stable/11/libexec/tftpd/tftp-io.c Mon Oct 1 15:47:34 2018 (r339051) @@ -34,6 +34,7 @@ __FBSDID("$FreeBSD$"); #include #include +#include #include #include #include @@ -393,7 +394,7 @@ receive_packet(int peer, char *data, int size, struct struct sockaddr_storage *pfrom; socklen_t fromlen; int n; - static int waiting; + static int timed_out; if (debug&DEBUG_PACKETS) tftp_log(LOG_DEBUG, @@ -401,23 +402,16 @@ receive_packet(int peer, char *data, int size, struct pkt = (struct tftphdr *)data; - waiting = 0; signal(SIGALRM, timeout); - setjmp(timeoutbuf); + timed_out = setjmp(timeoutbuf); alarm(thistimeout); - if (waiting > 0) { - alarm(0); - return (RP_TIMEOUT); - } - - if (waiting > 0) { + if (timed_out != 0) { tftp_log(LOG_ERR, "receive_packet: timeout"); alarm(0); return (RP_TIMEOUT); } - waiting++; pfrom = (from == NULL) ? &from_local : from; fromlen = sizeof(*pfrom); n = recvfrom(peer, data, size, 0, (struct sockaddr *)pfrom, &fromlen); @@ -430,8 +424,6 @@ receive_packet(int peer, char *data, int size, struct tftp_log(LOG_ERR, "receive_packet: timeout"); return (RP_TIMEOUT); } - - alarm(0); if (n < 0) { /* No idea what could have happened if it isn't a timeout */ Modified: stable/11/libexec/tftpd/tftp-utils.c ============================================================================== --- stable/11/libexec/tftpd/tftp-utils.c Mon Oct 1 15:45:20 2018 (r339050) +++ stable/11/libexec/tftpd/tftp-utils.c Mon Oct 1 15:47:34 2018 (r339051) @@ -268,11 +268,13 @@ char * rp_strerror(int error) { static char s[100]; + size_t space = sizeof(s); int i = 0; while (rp_errors[i].desc != NULL) { if (rp_errors[i].error == error) { - strcpy(s, rp_errors[i].desc); + strlcpy(s, rp_errors[i].desc, space); + space -= strlen(rp_errors[i].desc); } i++; } Modified: stable/11/libexec/tftpd/tftpd.c ============================================================================== --- stable/11/libexec/tftpd/tftpd.c Mon Oct 1 15:45:20 2018 (r339050) +++ stable/11/libexec/tftpd/tftpd.c Mon Oct 1 15:47:34 2018 (r339051) @@ -372,7 +372,10 @@ main(int argc, char *argv[]) exit(1); } chdir("/"); - setgroups(1, &nobody->pw_gid); + if (setgroups(1, &nobody->pw_gid) != 0) { + tftp_log(LOG_ERR, "setgroups failed"); + exit(1); + } if (setuid(nobody->pw_uid) != 0) { tftp_log(LOG_ERR, "setuid failed"); exit(1); @@ -520,7 +523,7 @@ tftp_wrq(int peer, char *recvbuffer, ssize_t size) cp = parse_header(peer, recvbuffer, size, &filename, &mode); size -= (cp - recvbuffer) + 1; - strcpy(fnbuf, filename); + strlcpy(fnbuf, filename, sizeof(fnbuf)); reduce_path(fnbuf); filename = fnbuf; @@ -565,7 +568,7 @@ tftp_rrq(int peer, char *recvbuffer, ssize_t size) cp = parse_header(peer, recvbuffer, size, &filename, &mode); size -= (cp - recvbuffer) + 1; - strcpy(fnbuf, filename); + strlcpy(fnbuf, filename, sizeof(fnbuf)); reduce_path(fnbuf); filename = fnbuf; @@ -802,6 +805,7 @@ tftp_xmitfile(int peer, const char *mode) time_t now; struct tftp_stats ts; + memset(&ts, 0, sizeof(ts)); now = time(NULL); if (debug&DEBUG_SIMPLE) tftp_log(LOG_DEBUG, "Transmitting file"); From owner-svn-src-stable-11@freebsd.org Mon Oct 1 17:26:44 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9E4E010A4A5B; Mon, 1 Oct 2018 17:26:44 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5149183467; Mon, 1 Oct 2018 17:26:44 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 47A9C23F9A; Mon, 1 Oct 2018 17:26:44 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91HQieK069177; Mon, 1 Oct 2018 17:26:44 GMT (envelope-from sobomax@FreeBSD.org) Received: (from sobomax@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91HQfG5069159; Mon, 1 Oct 2018 17:26:41 GMT (envelope-from sobomax@FreeBSD.org) Message-Id: <201810011726.w91HQfG5069159@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: sobomax set sender to sobomax@FreeBSD.org using -f From: Maxim Sobolev Date: Mon, 1 Oct 2018 17:26:41 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339066 - stable/11/tools/regression/sockets/unix_cmsg X-SVN-Group: stable-11 X-SVN-Commit-Author: sobomax X-SVN-Commit-Paths: stable/11/tools/regression/sockets/unix_cmsg X-SVN-Commit-Revision: 339066 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 17:26:45 -0000 Author: sobomax Date: Mon Oct 1 17:26:41 2018 New Revision: 339066 URL: https://svnweb.freebsd.org/changeset/base/339066 Log: MFC r309554 and r309631 which breaks down overly long monolithic souce file and reduces duplication by auto-generating functions that only differ in the value of the SCM_XXX constant used. This also fixes unintentional breakage introduced in earlier MFC in r338617 that happens to rely on some of those changes. Reported by: asomers Pointy-hat goes to: sobomax Added: stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.c stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.h stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.c stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.h stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h stable/11/tools/regression/sockets/unix_cmsg/t_generic.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_generic.c stable/11/tools/regression/sockets/unix_cmsg/t_generic.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_generic.h stable/11/tools/regression/sockets/unix_cmsg/t_peercred.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.c stable/11/tools/regression/sockets/unix_cmsg/t_peercred.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.h stable/11/tools/regression/sockets/unix_cmsg/t_sockcred.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_sockcred.c stable/11/tools/regression/sockets/unix_cmsg/t_sockcred.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/t_sockcred.h stable/11/tools/regression/sockets/unix_cmsg/t_xxxtime.c.in - copied unchanged from r309631, head/tools/regression/sockets/unix_cmsg/t_xxxtime.c.in stable/11/tools/regression/sockets/unix_cmsg/t_xxxtime.h.in - copied unchanged from r309631, head/tools/regression/sockets/unix_cmsg/t_xxxtime.h.in stable/11/tools/regression/sockets/unix_cmsg/uc_common.c - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/uc_common.c stable/11/tools/regression/sockets/unix_cmsg/uc_common.h - copied unchanged from r309554, head/tools/regression/sockets/unix_cmsg/uc_common.h Modified: stable/11/tools/regression/sockets/unix_cmsg/Makefile stable/11/tools/regression/sockets/unix_cmsg/unix_cmsg.c Modified: stable/11/tools/regression/sockets/unix_cmsg/Makefile ============================================================================== --- stable/11/tools/regression/sockets/unix_cmsg/Makefile Mon Oct 1 16:23:00 2018 (r339065) +++ stable/11/tools/regression/sockets/unix_cmsg/Makefile Mon Oct 1 17:26:41 2018 (r339066) @@ -1,6 +1,11 @@ # $FreeBSD$ PROG= unix_cmsg +SRCS= ${AUTOSRCS} unix_cmsg.c uc_common.h uc_common.c \ + t_generic.h t_generic.c t_peercred.h t_peercred.c \ + t_cmsgcred.h t_cmsgcred.c t_sockcred.h t_sockcred.c \ + t_cmsgcred_sockcred.h t_cmsgcred_sockcred.c t_cmsg_len.h t_cmsg_len.c +CLEANFILES+= ${AUTOSRCS} MAN= WARNS?= 3 Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.c) @@ -0,0 +1,138 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" +#include "t_cmsg_len.h" + +#ifndef __LP64__ +static int +t_cmsg_len_client(int fd) +{ + struct msghdr msghdr; + struct iovec iov[1]; + struct cmsghdr *cmsghdr; + void *cmsg_data; + size_t size, cmsg_size; + socklen_t socklen; + int rv; + + if (uc_sync_recv() < 0) + return (-2); + + rv = -2; + + cmsg_size = CMSG_SPACE(sizeof(struct cmsgcred)); + cmsg_data = malloc(cmsg_size); + if (cmsg_data == NULL) { + uc_logmsg("malloc"); + goto done; + } + uc_msghdr_init_client(&msghdr, iov, cmsg_data, cmsg_size, + SCM_CREDS, sizeof(struct cmsgcred)); + cmsghdr = CMSG_FIRSTHDR(&msghdr); + + if (uc_socket_connect(fd) < 0) + goto done; + + size = msghdr.msg_iov != NULL ? msghdr.msg_iov->iov_len : 0; + rv = -1; + for (socklen = 0; socklen < CMSG_LEN(0); ++socklen) { + cmsghdr->cmsg_len = socklen; + uc_dbgmsg("send: data size %zu", size); + uc_dbgmsg("send: msghdr.msg_controllen %u", + (u_int)msghdr.msg_controllen); + uc_dbgmsg("send: cmsghdr.cmsg_len %u", + (u_int)cmsghdr->cmsg_len); + if (sendmsg(fd, &msghdr, 0) < 0) { + uc_dbgmsg("sendmsg(2) failed: %s; retrying", + strerror(errno)); + continue; + } + uc_logmsgx("sent message with cmsghdr.cmsg_len %u < %u", + (u_int)cmsghdr->cmsg_len, (u_int)CMSG_LEN(0)); + break; + } + if (socklen == CMSG_LEN(0)) + rv = 0; + + if (uc_sync_send() < 0) { + rv = -2; + goto done; + } +done: + free(cmsg_data); + return (rv); +} + +static int +t_cmsg_len_server(int fd1) +{ + int fd2, rv; + + if (uc_sync_send() < 0) + return (-2); + + rv = -2; + + if (uc_cfg.sock_type == SOCK_STREAM) { + fd2 = uc_socket_accept(fd1); + if (fd2 < 0) + goto done; + } else + fd2 = fd1; + + if (uc_sync_recv() < 0) + goto done; + + rv = 0; +done: + if (uc_cfg.sock_type == SOCK_STREAM && fd2 >= 0) + if (uc_socket_close(fd2) < 0) + rv = -2; + return (rv); +} + +int +t_cmsg_len(void) +{ + return (t_generic(t_cmsg_len_client, t_cmsg_len_server)); +} +#endif Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.h (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsg_len.h Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsg_len.h) @@ -0,0 +1,32 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * Copyright (c) 2016 Maksym Sobolyev + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef __LP64__ +int t_cmsg_len(void); +#endif Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.c) @@ -0,0 +1,139 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" +#include "t_cmsgcred.h" + +int +t_cmsgcred_client(int fd) +{ + struct msghdr msghdr; + struct iovec iov[1]; + void *cmsg_data; + size_t cmsg_size; + int rv; + + if (uc_sync_recv() < 0) + return (-2); + + rv = -2; + + cmsg_size = CMSG_SPACE(sizeof(struct cmsgcred)); + cmsg_data = malloc(cmsg_size); + if (cmsg_data == NULL) { + uc_logmsg("malloc"); + goto done; + } + uc_msghdr_init_client(&msghdr, iov, cmsg_data, cmsg_size, + SCM_CREDS, sizeof(struct cmsgcred)); + + if (uc_socket_connect(fd) < 0) + goto done; + + if (uc_message_sendn(fd, &msghdr) < 0) + goto done; + + rv = 0; +done: + free(cmsg_data); + return (rv); +} + +static int +t_cmsgcred_server(int fd1) +{ + struct msghdr msghdr; + struct iovec iov[1]; + struct cmsghdr *cmsghdr; + void *cmsg_data; + size_t cmsg_size; + u_int i; + int fd2, rv; + + if (uc_sync_send() < 0) + return (-2); + + fd2 = -1; + rv = -2; + + cmsg_size = CMSG_SPACE(sizeof(struct cmsgcred)); + cmsg_data = malloc(cmsg_size); + if (cmsg_data == NULL) { + uc_logmsg("malloc"); + goto done; + } + + if (uc_cfg.sock_type == SOCK_STREAM) { + fd2 = uc_socket_accept(fd1); + if (fd2 < 0) + goto done; + } else + fd2 = fd1; + + rv = -1; + for (i = 1; i <= uc_cfg.ipc_msg.msg_num; ++i) { + uc_dbgmsg("message #%u", i); + + uc_msghdr_init_server(&msghdr, iov, cmsg_data, cmsg_size); + if (uc_message_recv(fd2, &msghdr) < 0) { + rv = -2; + break; + } + + if (uc_check_msghdr(&msghdr, sizeof(*cmsghdr)) < 0) + break; + + cmsghdr = CMSG_FIRSTHDR(&msghdr); + if (uc_check_scm_creds_cmsgcred(cmsghdr) < 0) + break; + } + if (i > uc_cfg.ipc_msg.msg_num) + rv = 0; +done: + free(cmsg_data); + if (uc_cfg.sock_type == SOCK_STREAM && fd2 >= 0) + if (uc_socket_close(fd2) < 0) + rv = -2; + return (rv); +} + +int +t_cmsgcred(void) +{ + return (t_generic(t_cmsgcred_client, t_cmsgcred_server)); +} Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.h (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred.h Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred.h) @@ -0,0 +1,30 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +int t_cmsgcred_client(int fd); +int t_cmsgcred(void); Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.c) @@ -0,0 +1,125 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" +#include "t_cmsgcred.h" +#include "t_cmsgcred_sockcred.h" + +static int +t_cmsgcred_sockcred_server(int fd1) +{ + struct msghdr msghdr; + struct iovec iov[1]; + struct cmsghdr *cmsghdr; + void *cmsg_data, *cmsg1_data, *cmsg2_data; + size_t cmsg_size, cmsg1_size, cmsg2_size; + u_int i; + int fd2, rv, val; + + fd2 = -1; + rv = -2; + + cmsg1_size = CMSG_SPACE(SOCKCREDSIZE(uc_cfg.proc_cred.gid_num)); + cmsg2_size = CMSG_SPACE(sizeof(struct cmsgcred)); + cmsg1_data = malloc(cmsg1_size); + cmsg2_data = malloc(cmsg2_size); + if (cmsg1_data == NULL || cmsg2_data == NULL) { + uc_logmsg("malloc"); + goto done; + } + + uc_dbgmsg("setting LOCAL_CREDS"); + val = 1; + if (setsockopt(fd1, 0, LOCAL_CREDS, &val, sizeof(val)) < 0) { + uc_logmsg("setsockopt(LOCAL_CREDS)"); + goto done; + } + + if (uc_sync_send() < 0) + goto done; + + if (uc_cfg.sock_type == SOCK_STREAM) { + fd2 = uc_socket_accept(fd1); + if (fd2 < 0) + goto done; + } else + fd2 = fd1; + + cmsg_data = cmsg1_data; + cmsg_size = cmsg1_size; + rv = -1; + for (i = 1; i <= uc_cfg.ipc_msg.msg_num; ++i) { + uc_dbgmsg("message #%u", i); + + uc_msghdr_init_server(&msghdr, iov, cmsg_data, cmsg_size); + if (uc_message_recv(fd2, &msghdr) < 0) { + rv = -2; + break; + } + + if (uc_check_msghdr(&msghdr, sizeof(*cmsghdr)) < 0) + break; + + cmsghdr = CMSG_FIRSTHDR(&msghdr); + if (i == 1 || uc_cfg.sock_type == SOCK_DGRAM) { + if (uc_check_scm_creds_sockcred(cmsghdr) < 0) + break; + } else { + if (uc_check_scm_creds_cmsgcred(cmsghdr) < 0) + break; + } + + cmsg_data = cmsg2_data; + cmsg_size = cmsg2_size; + } + if (i > uc_cfg.ipc_msg.msg_num) + rv = 0; +done: + free(cmsg1_data); + free(cmsg2_data); + if (uc_cfg.sock_type == SOCK_STREAM && fd2 >= 0) + if (uc_socket_close(fd2) < 0) + rv = -2; + return (rv); +} + +int +t_cmsgcred_sockcred(void) +{ + return (t_generic(t_cmsgcred_client, t_cmsgcred_sockcred_server)); +} Copied: stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h (from r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_cmsgcred_sockcred.h) @@ -0,0 +1,30 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * Copyright (c) 2016 Maksym Sobolyev + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +int t_cmsgcred_sockcred(void); Copied: stable/11/tools/regression/sockets/unix_cmsg/t_generic.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_generic.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_generic.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_generic.c) @@ -0,0 +1,73 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" + +int +t_generic(int (*client_func)(int), int (*server_func)(int)) +{ + int fd, rv, rv_client; + + switch (uc_client_fork()) { + case 0: + fd = uc_socket_create(); + if (fd < 0) + rv = -2; + else { + rv = client_func(fd); + if (uc_socket_close(fd) < 0) + rv = -2; + } + uc_client_exit(rv); + break; + case 1: + fd = uc_socket_create(); + if (fd < 0) + rv = -2; + else { + rv = server_func(fd); + rv_client = uc_client_wait(); + if (rv == 0 || (rv == -2 && rv_client != 0)) + rv = rv_client; + if (uc_socket_close(fd) < 0) + rv = -2; + } + break; + default: + rv = -2; + } + return (rv); +} Copied: stable/11/tools/regression/sockets/unix_cmsg/t_generic.h (from r309554, head/tools/regression/sockets/unix_cmsg/t_generic.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_generic.h Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_generic.h) @@ -0,0 +1,30 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * Copyright (c) 2016 Maksym Sobolyev + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +int t_generic(int (*client_func)(int), int (*server_func)(int)); Copied: stable/11/tools/regression/sockets/unix_cmsg/t_peercred.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_peercred.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.c) @@ -0,0 +1,153 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" +#include "t_peercred.h" + +static int +check_xucred(const struct xucred *xucred, socklen_t len) +{ + int rc; + + if (len != sizeof(*xucred)) { + uc_logmsgx("option value size %zu != %zu", + (size_t)len, sizeof(*xucred)); + return (-1); + } + + uc_dbgmsg("xucred.cr_version %u", xucred->cr_version); + uc_dbgmsg("xucred.cr_uid %lu", (u_long)xucred->cr_uid); + uc_dbgmsg("xucred.cr_ngroups %d", xucred->cr_ngroups); + + rc = 0; + + if (xucred->cr_version != XUCRED_VERSION) { + uc_logmsgx("xucred.cr_version %u != %d", + xucred->cr_version, XUCRED_VERSION); + rc = -1; + } + if (xucred->cr_uid != uc_cfg.proc_cred.euid) { + uc_logmsgx("xucred.cr_uid %lu != %lu (EUID)", + (u_long)xucred->cr_uid, (u_long)uc_cfg.proc_cred.euid); + rc = -1; + } + if (xucred->cr_ngroups == 0) { + uc_logmsgx("xucred.cr_ngroups == 0"); + rc = -1; + } + if (xucred->cr_ngroups < 0) { + uc_logmsgx("xucred.cr_ngroups < 0"); + rc = -1; + } + if (xucred->cr_ngroups > XU_NGROUPS) { + uc_logmsgx("xucred.cr_ngroups %hu > %u (max)", + xucred->cr_ngroups, XU_NGROUPS); + rc = -1; + } + if (xucred->cr_groups[0] != uc_cfg.proc_cred.egid) { + uc_logmsgx("xucred.cr_groups[0] %lu != %lu (EGID)", + (u_long)xucred->cr_groups[0], (u_long)uc_cfg.proc_cred.egid); + rc = -1; + } + if (uc_check_groups("xucred.cr_groups", xucred->cr_groups, + "xucred.cr_ngroups", xucred->cr_ngroups, false) < 0) + rc = -1; + return (rc); +} + +static int +t_peercred_client(int fd) +{ + struct xucred xucred; + socklen_t len; + + if (uc_sync_recv() < 0) + return (-1); + + if (uc_socket_connect(fd) < 0) + return (-1); + + len = sizeof(xucred); + if (getsockopt(fd, 0, LOCAL_PEERCRED, &xucred, &len) < 0) { + uc_logmsg("getsockopt(LOCAL_PEERCRED)"); + return (-1); + } + + if (check_xucred(&xucred, len) < 0) + return (-1); + + return (0); +} + +static int +t_peercred_server(int fd1) +{ + struct xucred xucred; + socklen_t len; + int fd2, rv; + + if (uc_sync_send() < 0) + return (-2); + + fd2 = uc_socket_accept(fd1); + if (fd2 < 0) + return (-2); + + len = sizeof(xucred); + if (getsockopt(fd2, 0, LOCAL_PEERCRED, &xucred, &len) < 0) { + uc_logmsg("getsockopt(LOCAL_PEERCRED)"); + rv = -2; + goto done; + } + + if (check_xucred(&xucred, len) < 0) { + rv = -1; + goto done; + } + + rv = 0; +done: + if (uc_socket_close(fd2) < 0) + rv = -2; + return (rv); +} + +int +t_peercred(void) +{ + return (t_generic(t_peercred_client, t_peercred_server)); +} Copied: stable/11/tools/regression/sockets/unix_cmsg/t_peercred.h (from r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_peercred.h Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_peercred.h) @@ -0,0 +1,29 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +int t_peercred(void); Copied: stable/11/tools/regression/sockets/unix_cmsg/t_sockcred.c (from r309554, head/tools/regression/sockets/unix_cmsg/t_sockcred.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tools/regression/sockets/unix_cmsg/t_sockcred.c Mon Oct 1 17:26:41 2018 (r339066, copy of r309554, head/tools/regression/sockets/unix_cmsg/t_sockcred.c) @@ -0,0 +1,215 @@ +/*- + * Copyright (c) 2005 Andrey Simonenko + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include + +#include "uc_common.h" +#include "t_generic.h" +#include "t_sockcred.h" + +static int +t_sockcred_client(int type, int fd) +{ + struct msghdr msghdr; + struct iovec iov[1]; + int rv; + + if (uc_sync_recv() < 0) + return (-2); + + rv = -2; + + uc_msghdr_init_client(&msghdr, iov, NULL, 0, 0, 0); + + if (uc_socket_connect(fd) < 0) + goto done; + + if (type == 2) + if (uc_sync_recv() < 0) + goto done; + + if (uc_message_sendn(fd, &msghdr) < 0) + goto done; + + rv = 0; +done: + return (rv); +} + +static int +t_sockcred_server(int type, int fd1) +{ + struct msghdr msghdr; + struct iovec iov[1]; + struct cmsghdr *cmsghdr; + void *cmsg_data; + size_t cmsg_size; + u_int i; + int fd2, rv, val; + + fd2 = -1; + rv = -2; + + cmsg_size = CMSG_SPACE(SOCKCREDSIZE(uc_cfg.proc_cred.gid_num)); + cmsg_data = malloc(cmsg_size); + if (cmsg_data == NULL) { + uc_logmsg("malloc"); + goto done; + } + + if (type == 1) { + uc_dbgmsg("setting LOCAL_CREDS"); + val = 1; + if (setsockopt(fd1, 0, LOCAL_CREDS, &val, sizeof(val)) < 0) { + uc_logmsg("setsockopt(LOCAL_CREDS)"); + goto done; + } + } + + if (uc_sync_send() < 0) + goto done; + + if (uc_cfg.sock_type == SOCK_STREAM) { + fd2 = uc_socket_accept(fd1); + if (fd2 < 0) + goto done; + } else + fd2 = fd1; + + if (type == 2) { + uc_dbgmsg("setting LOCAL_CREDS"); + val = 1; + if (setsockopt(fd2, 0, LOCAL_CREDS, &val, sizeof(val)) < 0) { + uc_logmsg("setsockopt(LOCAL_CREDS)"); + goto done; + } + if (uc_sync_send() < 0) + goto done; + } + + rv = -1; + for (i = 1; i <= uc_cfg.ipc_msg.msg_num; ++i) { + uc_dbgmsg("message #%u", i); + + uc_msghdr_init_server(&msghdr, iov, cmsg_data, cmsg_size); + if (uc_message_recv(fd2, &msghdr) < 0) { + rv = -2; + break; + } + + if (i > 1 && uc_cfg.sock_type == SOCK_STREAM) { + if (uc_check_msghdr(&msghdr, 0) < 0) + break; + } else { + if (uc_check_msghdr(&msghdr, sizeof(*cmsghdr)) < 0) + break; + + cmsghdr = CMSG_FIRSTHDR(&msghdr); + if (uc_check_scm_creds_sockcred(cmsghdr) < 0) + break; + } + } + if (i > uc_cfg.ipc_msg.msg_num) + rv = 0; +done: + free(cmsg_data); + if (uc_cfg.sock_type == SOCK_STREAM && fd2 >= 0) + if (uc_socket_close(fd2) < 0) + rv = -2; + return (rv); +} + +int +t_sockcred_1(void) +{ + u_int i; + int fd, rv, rv_client; + + switch (uc_client_fork()) { + case 0: + for (i = 1; i <= 2; ++i) { + uc_dbgmsg("client #%u", i); + fd = uc_socket_create(); + if (fd < 0) + rv = -2; + else { + rv = t_sockcred_client(1, fd); + if (uc_socket_close(fd) < 0) + rv = -2; + } + if (rv != 0) + break; + } *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Mon Oct 1 17:37:00 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67F0D10A4F09; Mon, 1 Oct 2018 17:37:00 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1824283A69; Mon, 1 Oct 2018 17:37:00 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 118BD24140; Mon, 1 Oct 2018 17:37:00 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91HaxRq075309; Mon, 1 Oct 2018 17:36:59 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91HaxCh075303; Mon, 1 Oct 2018 17:36:59 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810011736.w91HaxCh075303@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Mon, 1 Oct 2018 17:36:59 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339067 - in stable/11: sys/kern sys/sys tests/sys/kern X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: in stable/11: sys/kern sys/sys tests/sys/kern X-SVN-Commit-Revision: 339067 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 17:37:00 -0000 Author: asomers Date: Mon Oct 1 17:36:58 2018 New Revision: 339067 URL: https://svnweb.freebsd.org/changeset/base/339067 Log: MFC r337222: Fix LOCAL_PEERCRED with socketpair(2) Enable the LOCAL_PEERCRED socket option for unix domain stream sockets created with socketpair(2). Previously, it only worked with unix domain stream sockets created with socket(2)/listen(2)/connect(2)/accept(2). PR: 176419 Reported by: Nicholas Wilson Differential Revision: https://reviews.freebsd.org/D16350 Added: stable/11/tests/sys/kern/unix_socketpair_test.c - copied unchanged from r337222, head/tests/sys/kern/unix_socketpair_test.c Modified: stable/11/sys/kern/uipc_syscalls.c stable/11/sys/kern/uipc_usrreq.c stable/11/sys/sys/unpcb.h stable/11/tests/sys/kern/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/uipc_syscalls.c ============================================================================== --- stable/11/sys/kern/uipc_syscalls.c Mon Oct 1 17:26:41 2018 (r339066) +++ stable/11/sys/kern/uipc_syscalls.c Mon Oct 1 17:36:58 2018 (r339067) @@ -56,6 +56,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #ifdef KTRACE #include #endif @@ -691,6 +693,15 @@ kern_socketpair(struct thread *td, int domain, int typ error = soconnect2(so2, so1); if (error != 0) goto free4; + } else if (so1->so_proto->pr_flags & PR_CONNREQUIRED) { + struct unpcb *unp, *unp2; + unp = sotounpcb(so1); + unp2 = sotounpcb(so2); + /* + * No need to lock the unps, because the sockets are brand-new. + * No other threads can be using them yet + */ + unp_copy_peercred(td, unp, unp2, unp); } finit(fp1, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp1->f_data, &socketops); Modified: stable/11/sys/kern/uipc_usrreq.c ============================================================================== --- stable/11/sys/kern/uipc_usrreq.c Mon Oct 1 17:26:41 2018 (r339066) +++ stable/11/sys/kern/uipc_usrreq.c Mon Oct 1 17:36:58 2018 (r339067) @@ -1424,26 +1424,10 @@ unp_connectat(int fd, struct socket *so, struct sockad sa = NULL; } - /* - * The connector's (client's) credentials are copied from its - * process structure at the time of connect() (which is now). - */ - cru2x(td->td_ucred, &unp3->unp_peercred); - unp3->unp_flags |= UNP_HAVEPC; - - /* - * The receiver's (server's) credentials are copied from the - * unp_peercred member of socket on which the former called - * listen(); uipc_listen() cached that process's credentials - * at that time so we can use them now. - */ KASSERT(unp2->unp_flags & UNP_HAVEPCCACHED, ("unp_connect: listener without cached peercred")); - memcpy(&unp->unp_peercred, &unp2->unp_peercred, - sizeof(unp->unp_peercred)); - unp->unp_flags |= UNP_HAVEPC; - if (unp2->unp_flags & UNP_WANTCRED) - unp3->unp_flags |= UNP_WANTCRED; + unp_copy_peercred(td, unp3, unp, unp2); + UNP_PCB_UNLOCK(unp3); UNP_PCB_UNLOCK(unp2); UNP_PCB_UNLOCK(unp); @@ -1474,6 +1458,27 @@ bad: unp->unp_flags &= ~UNP_CONNECTING; UNP_PCB_UNLOCK(unp); return (error); +} + +/* + * Set socket peer credentials at connection time. + * + * The client's PCB credentials are copied from its process structure. The + * server's PCB credentials are copied from the socket on which it called + * listen(2). uipc_listen cached that process's credentials at the time. + */ +void +unp_copy_peercred(struct thread *td, struct unpcb *client_unp, + struct unpcb *server_unp, struct unpcb *listen_unp) +{ + cru2x(td->td_ucred, &client_unp->unp_peercred); + client_unp->unp_flags |= UNP_HAVEPC; + + memcpy(&server_unp->unp_peercred, &listen_unp->unp_peercred, + sizeof(server_unp->unp_peercred)); + server_unp->unp_flags |= UNP_HAVEPC; + if (listen_unp->unp_flags & UNP_WANTCRED) + client_unp->unp_flags |= UNP_WANTCRED; } static int Modified: stable/11/sys/sys/unpcb.h ============================================================================== --- stable/11/sys/sys/unpcb.h Mon Oct 1 17:26:41 2018 (r339066) +++ stable/11/sys/sys/unpcb.h Mon Oct 1 17:36:58 2018 (r339067) @@ -150,4 +150,13 @@ struct xunpgen { }; #endif /* _SYS_SOCKETVAR_H_ */ +#if defined(_KERNEL) +struct thread; + +/* In uipc_userreq.c */ +void +unp_copy_peercred(struct thread *td, struct unpcb *client_unp, + struct unpcb *server_unp, struct unpcb *listen_unp); +#endif + #endif /* _SYS_UNPCB_H_ */ Modified: stable/11/tests/sys/kern/Makefile ============================================================================== --- stable/11/tests/sys/kern/Makefile Mon Oct 1 17:26:41 2018 (r339066) +++ stable/11/tests/sys/kern/Makefile Mon Oct 1 17:36:58 2018 (r339067) @@ -12,9 +12,10 @@ ATF_TESTS_C+= ptrace_test TEST_METADATA.ptrace_test+= timeout="15" ATF_TESTS_C+= reaper PLAIN_TESTS_C+= subr_unit_test -ATF_TESTS_C+= unix_seqpacket_test ATF_TESTS_C+= unix_passfd_test +ATF_TESTS_C+= unix_seqpacket_test TEST_METADATA.unix_seqpacket_test+= timeout="15" +ATF_TESTS_C+= unix_socketpair_test ATF_TESTS_C+= waitpid_nohang ATF_TESTS_C+= pdeathsig Copied: stable/11/tests/sys/kern/unix_socketpair_test.c (from r337222, head/tests/sys/kern/unix_socketpair_test.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tests/sys/kern/unix_socketpair_test.c Mon Oct 1 17:36:58 2018 (r339067, copy of r337222, head/tests/sys/kern/unix_socketpair_test.c) @@ -0,0 +1,76 @@ +/*- + * Copyright (c) 2018 Alan Somers + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include + +#include + +#include + +/* getpeereid(3) should work with stream sockets created via socketpair(2) */ +ATF_TC_WITHOUT_HEAD(getpeereid); +ATF_TC_BODY(getpeereid, tc) +{ + int sv[2]; + int s; + uid_t real_euid, euid; + gid_t real_egid, egid; + + real_euid = geteuid(); + real_egid = getegid(); + + s = socketpair(PF_LOCAL, SOCK_STREAM, 0, sv); + ATF_CHECK_EQ(0, s); + ATF_CHECK(sv[0] >= 0); + ATF_CHECK(sv[1] >= 0); + ATF_CHECK(sv[0] != sv[1]); + + ATF_REQUIRE_EQ(0, getpeereid(sv[0], &euid, &egid)); + ATF_CHECK_EQ(real_euid, euid); + ATF_CHECK_EQ(real_egid, egid); + + ATF_REQUIRE_EQ(0, getpeereid(sv[1], &euid, &egid)); + ATF_CHECK_EQ(real_euid, euid); + ATF_CHECK_EQ(real_egid, egid); + + close(sv[0]); + close(sv[1]); +} + + +ATF_TP_ADD_TCS(tp) +{ + ATF_TP_ADD_TC(tp, getpeereid); + + return atf_no_error(); +} From owner-svn-src-stable-11@freebsd.org Mon Oct 1 17:54:28 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B767810A55A3 for ; Mon, 1 Oct 2018 17:54:28 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: from mail-ot1-f42.google.com (mail-ot1-f42.google.com [209.85.210.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5A7E48454F for ; Mon, 1 Oct 2018 17:54:28 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: by mail-ot1-f42.google.com with SMTP id w67so4693853ota.7 for ; Mon, 01 Oct 2018 10:54:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=R8W1dXFQBQcx1TqkE/3fOaHUben7jqonsEiD+3QIyc4=; b=skqDj8bnTaMfu/KmlJ0MAUjUIkAJMG4GC6o5tqEModqfn95dWp/3WXe2NiZDDxXknb TCCdg3Gio261zCBb7bqQ6BNEh1mkIdHd6+RchQNI1lkHrIUmocs6WHmuwr6GDn/pUkGF TSHC5bsZ4KRysZy0E4UFIiFAo8lRL95SjhNK/G9waJu8+swqxWeJItN2GZNbcQmGPPkv aNmYj6i1zUc8GPdqb9TOvIE555EQQnZsH7cVmixm8UBovmTX9mYVspIQDClLmwKvWH41 GnC5wgQKXDw8KSzvWt5pmUgQPd1wYVAdVUJli0B7uTIGDzQqEQaTfD/g7ZL/FyTWqmYF h79Q== X-Gm-Message-State: ABuFfogRSr8L4QzUSv75IoLK+PWRZgBttTgCY/bX3cThAtVB19hx3ogc wbx+8V06tsdleP9rHze2y1CARPUlBPylSlBeGzw3xg== X-Google-Smtp-Source: ACcGV61VEKog5hUluQBxsqFwLCITNe5LDBXwBwQ0shlA5UU17P+zuw70P85lDnQHYki19MYuKAO9B3uU9CrExV/A54M= X-Received: by 2002:a9d:26c3:: with SMTP id i3-v6mr7450532otd.350.1538414867757; Mon, 01 Oct 2018 10:27:47 -0700 (PDT) MIME-Version: 1.0 References: <201809121852.w8CIqJrm046105@repo.freebsd.org> In-Reply-To: From: Maxim Sobolev Date: Mon, 1 Oct 2018 10:27:36 -0700 Message-ID: Subject: Re: svn commit: r338617 - in stable/11: lib/libc/sys sys/compat/freebsd32 sys/kern sys/netinet sys/netinet6 sys/sys tools/regression/sockets/udp_pingpong tools/regression/sockets/unix_cmsg To: Alan Somers Cc: src-committers , svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.27 X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 17:54:28 -0000 Thanks Alan, should be fixed now in r339066. Please let me know if you still have any issues. -Max On Mon, Oct 1, 2018 at 8:31 AM Alan Somers wrote: > On Wed, Sep 12, 2018 at 12:52 PM Maxim Sobolev > wrote: > >> Author: sobomax >> Date: Wed Sep 12 18:52:18 2018 >> New Revision: 338617 >> URL: https://svnweb.freebsd.org/changeset/base/338617 >> >> Log: >> MFC r312296 and r323254, which is new a socket option >> SO_TS_CLOCK to pick from several different clock sources to >> return timestamps when SO_TIMESTAMP is enabled and two >> new nanosecond-precision timestamp types. This also fixes >> recvmsg32() system call to properly down-convert layout of the >> 64-bit structures to match what 32-bit app(s) expect. >> >> Bump __FreeBSD_version to indicate presence of a new >> functionality. >> >> Differential Revision: https://reviews.freebsd.org/D9171 >> >> Added: >> stable/11/tools/regression/sockets/udp_pingpong/ >> - copied from r312296, head/tools/regression/sockets/udp_pingpong/ >> Modified: >> stable/11/lib/libc/sys/getsockopt.2 >> stable/11/sys/compat/freebsd32/freebsd32.h >> stable/11/sys/compat/freebsd32/freebsd32_misc.c >> stable/11/sys/kern/uipc_socket.c >> stable/11/sys/kern/uipc_usrreq.c >> stable/11/sys/netinet/ip_input.c >> stable/11/sys/netinet6/ip6_input.c >> stable/11/sys/sys/param.h >> stable/11/sys/sys/socket.h >> stable/11/sys/sys/socketvar.h >> stable/11/tools/regression/sockets/unix_cmsg/Makefile >> stable/11/tools/regression/sockets/unix_cmsg/unix_cmsg.c >> Directory Properties: >> stable/11/ (props changed) >> > > This change broke the build of tools/regression/sockets/unix_cmsg on > stable/11. It looks like you need to MFC r309554 first. > > unix_cmsg.c:55:10: fatal error: 'uc_common.h' file not found > #include "uc_common.h" > ^~~~~~~~~~~~~ > 1 error > generated. > > *** Error code 1 > > Stop. > > -Alan > From owner-svn-src-stable-11@freebsd.org Mon Oct 1 17:59:24 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 545B610A56B0; Mon, 1 Oct 2018 17:59:24 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-lf1-f48.google.com (mail-lf1-f48.google.com [209.85.167.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C6C4B8465E; Mon, 1 Oct 2018 17:59:23 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-lf1-f48.google.com with SMTP id s10-v6so3095338lfc.9; Mon, 01 Oct 2018 10:59:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kKBxg5cPAxa7+9gdmBeHHRVY0Dl8XdIevOznFJlbH0w=; b=IF0nHWK+FFYkDKbnlVcGoFYj5+9+hi2rBEwTcTSLluZuud5FHIsEmbR2r3XTTwzaRI wtnXXAVL3XVgK8TIrjsSZYMb+6+OKtIrIDgEhTFaHhsTjNAC6clbejkCLnI5qqNflZd1 xRu70/vuOGy78m6fOkXJuo7UHTLtZ386JgfEe/c6mrQi3UVQFxvqWjtJvRClBDWWF/h5 rPSbOt38pugAxkq6RdCVINo5+q4nbT32TNBzb+O5AvcAsrECYBQgAMBnhtS0dUQTUAMk GI7lbhatE8mMmI6DW7zP/kMKwKsB214lxHvLyxnU4bpk7AEk4xQEt8tsjM4tQh4WRSnO Wi2w== X-Gm-Message-State: ABuFfoigM/FZw4as378fiJI7aiW7AkMrlL6IUFNmUuR9eNL5+DDn5Ipx 091GOcKPSaNnbaktzEWpWiOGlU2ugbQ6G3mYDwC1l6OG X-Google-Smtp-Source: ACcGV62hgRd4KVfLJYfFZOceza10lGCUCS5ky8xUM02KyeQutBkxFpEXkb/PG43LRu+eXD612h0Zb2cN47Yn/jc7E+g= X-Received: by 2002:a19:6f0a:: with SMTP id k10-v6mr6495426lfc.8.1538416761921; Mon, 01 Oct 2018 10:59:21 -0700 (PDT) MIME-Version: 1.0 References: <201809121852.w8CIqJrm046105@repo.freebsd.org> In-Reply-To: From: Alan Somers Date: Mon, 1 Oct 2018 11:59:10 -0600 Message-ID: Subject: Re: svn commit: r338617 - in stable/11: lib/libc/sys sys/compat/freebsd32 sys/kern sys/netinet sys/netinet6 sys/sys tools/regression/sockets/udp_pingpong tools/regression/sockets/unix_cmsg To: Maxim Sobolev Cc: src-committers , svn-src-all , svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.27 X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 17:59:24 -0000 Everything's working fine now. Thanks for the quick turnaround. On Mon, Oct 1, 2018 at 11:51 AM Maxim Sobolev wrote: > Thanks Alan, should be fixed now in r339066. Please let me know if you > still have any issues. > > -Max > > On Mon, Oct 1, 2018 at 8:31 AM Alan Somers wrote: > >> On Wed, Sep 12, 2018 at 12:52 PM Maxim Sobolev >> wrote: >> >>> Author: sobomax >>> Date: Wed Sep 12 18:52:18 2018 >>> New Revision: 338617 >>> URL: https://svnweb.freebsd.org/changeset/base/338617 >>> >>> Log: >>> MFC r312296 and r323254, which is new a socket option >>> SO_TS_CLOCK to pick from several different clock sources to >>> return timestamps when SO_TIMESTAMP is enabled and two >>> new nanosecond-precision timestamp types. This also fixes >>> recvmsg32() system call to properly down-convert layout of the >>> 64-bit structures to match what 32-bit app(s) expect. >>> >>> Bump __FreeBSD_version to indicate presence of a new >>> functionality. >>> >>> Differential Revision: https://reviews.freebsd.org/D9171 >>> >>> Added: >>> stable/11/tools/regression/sockets/udp_pingpong/ >>> - copied from r312296, head/tools/regression/sockets/udp_pingpong/ >>> Modified: >>> stable/11/lib/libc/sys/getsockopt.2 >>> stable/11/sys/compat/freebsd32/freebsd32.h >>> stable/11/sys/compat/freebsd32/freebsd32_misc.c >>> stable/11/sys/kern/uipc_socket.c >>> stable/11/sys/kern/uipc_usrreq.c >>> stable/11/sys/netinet/ip_input.c >>> stable/11/sys/netinet6/ip6_input.c >>> stable/11/sys/sys/param.h >>> stable/11/sys/sys/socket.h >>> stable/11/sys/sys/socketvar.h >>> stable/11/tools/regression/sockets/unix_cmsg/Makefile >>> stable/11/tools/regression/sockets/unix_cmsg/unix_cmsg.c >>> Directory Properties: >>> stable/11/ (props changed) >>> >> >> This change broke the build of tools/regression/sockets/unix_cmsg on >> stable/11. It looks like you need to MFC r309554 first. >> >> unix_cmsg.c:55:10: fatal error: 'uc_common.h' file not found >> #include "uc_common.h" >> ^~~~~~~~~~~~~ >> 1 error >> generated. >> >> *** Error code 1 >> >> Stop. >> >> -Alan >> > From owner-svn-src-stable-11@freebsd.org Mon Oct 1 18:15:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8874C10A630C; Mon, 1 Oct 2018 18:15:26 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3D34785818; Mon, 1 Oct 2018 18:15:26 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 38002247AC; Mon, 1 Oct 2018 18:15:26 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w91IFQCu000220; Mon, 1 Oct 2018 18:15:26 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w91IFQRX000219; Mon, 1 Oct 2018 18:15:26 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201810011815.w91IFQRX000219@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Mon, 1 Oct 2018 18:15:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339069 - in stable: 10/release/doc/share/xml 11/release/doc/share/xml X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: in stable: 10/release/doc/share/xml 11/release/doc/share/xml X-SVN-Commit-Revision: 339069 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Oct 2018 18:15:26 -0000 Author: gjb Date: Mon Oct 1 18:15:25 2018 New Revision: 339069 URL: https://svnweb.freebsd.org/changeset/base/339069 Log: Document EN-18:09 through EN-18:12. Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/share/xml/errata.xml Changes in other areas also in this revision: Modified: stable/10/release/doc/share/xml/errata.xml Modified: stable/11/release/doc/share/xml/errata.xml ============================================================================== --- stable/11/release/doc/share/xml/errata.xml Mon Oct 1 18:00:52 2018 (r339068) +++ stable/11/release/doc/share/xml/errata.xml Mon Oct 1 18:15:25 2018 (r339069) @@ -24,6 +24,39 @@ 12 September 2018 Regression in Lazy FPU remediation + + + FreeBSD-EN-18:09.ip + 27 September 2018 + IP fragment remediation causes + IPv6 reassembly failure + + + + FreeBSD-EN-18:10.syscall + 27 September 2018 + Null pointer dereference in + freebsd4_getfsstat system + call + + + + FreeBSD-EN-18:11.listen + 27 September 2018 + Denial of service in listen + system call + + + + FreeBSD-EN-18:12.mem + 27 September 2018 + Small kernel memory disclosures in two system + calls + From owner-svn-src-stable-11@freebsd.org Tue Oct 2 09:51:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5A22910BDEB2; Tue, 2 Oct 2018 09:51:26 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0D0D1848B2; Tue, 2 Oct 2018 09:51:26 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 07DEC2E1FA; Tue, 2 Oct 2018 09:51:26 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w929pPFl041414; Tue, 2 Oct 2018 09:51:25 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w929pPc2041413; Tue, 2 Oct 2018 09:51:25 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810020951.w929pPc2041413@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Tue, 2 Oct 2018 09:51:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339081 - stable/11/sys/amd64/amd64 X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/amd64/amd64 X-SVN-Commit-Revision: 339081 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 09:51:26 -0000 Author: kib Date: Tue Oct 2 09:51:25 2018 New Revision: 339081 URL: https://svnweb.freebsd.org/changeset/base/339081 Log: MFC r338932: Fix some uses of dmaplimit. Modified: stable/11/sys/amd64/amd64/pmap.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/amd64/amd64/pmap.c ============================================================================== --- stable/11/sys/amd64/amd64/pmap.c Tue Oct 2 08:13:54 2018 (r339080) +++ stable/11/sys/amd64/amd64/pmap.c Tue Oct 2 09:51:25 2018 (r339081) @@ -1337,7 +1337,7 @@ pmap_init(void) if (ppim->va == 0) continue; /* Make the direct map consistent */ - if (ppim->pa < dmaplimit && ppim->pa + ppim->sz < dmaplimit) { + if (ppim->pa < dmaplimit && ppim->pa + ppim->sz <= dmaplimit) { (void)pmap_change_attr(PHYS_TO_DMAP(ppim->pa), ppim->sz, ppim->mode); } @@ -6876,7 +6876,7 @@ pmap_mapdev_attr(vm_paddr_t pa, vm_size_t size, int mo * If the specified range of physical addresses fits within * the direct map window, use the direct map. */ - if (pa < dmaplimit && pa + size < dmaplimit) { + if (pa < dmaplimit && pa + size <= dmaplimit) { va = PHYS_TO_DMAP(pa); if (!pmap_change_attr(va, size, mode)) return ((void *)(va + offset)); From owner-svn-src-stable-11@freebsd.org Tue Oct 2 15:18:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F356F109E107; Tue, 2 Oct 2018 15:18:49 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A58C88FD15; Tue, 2 Oct 2018 15:18:49 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 9C0BE1821; Tue, 2 Oct 2018 15:18:49 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92FInWH005136; Tue, 2 Oct 2018 15:18:49 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92FInXB005133; Tue, 2 Oct 2018 15:18:49 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021518.w92FInXB005133@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 15:18:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339084 - in stable/11: contrib/openbsm/bsm etc/mtree tests/sys tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: in stable/11: contrib/openbsm/bsm etc/mtree tests/sys tests/sys/audit X-SVN-Commit-Revision: 339084 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 15:18:50 -0000 Author: asomers Date: Tue Oct 2 15:18:48 2018 New Revision: 339084 URL: https://svnweb.freebsd.org/changeset/base/339084 Log: MFC r334360, r334362, r334388, r334395 r334360: Add initial set of tests for audit(4) This change includes the framework for testing the auditability of various syscalls, and includes changes for the first 12. The tests will start auditd(8) if needed, though they'll be much faster if it's already running. The syscalls tested in this commit include mkdir(2), mkdirat(2), mknod(2), mknodat(2), mkfifo(2), mkfifoat(2), link(2), linkat(2), symlink(2), symlinkat(2), rename(2), and renameat(2). Submitted by: aniketp Sponsored by: Google, Inc (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15286 r334362 by emaste: Temporarily disconnect audit tests Audit tests added in r334360 broke the build on a number of archs. Remove the subdir from the top level tests/sys/Makefile until they're fixed. r334388: Fix OpenBSM with GCC with -Wredundant-decls Upstream change ed47534 consciously added some redundant functional declarations, and I'm not sure why. AFAICT they were never required. On FreeBSD, they break the build with GCC (but not Clang) for any program including libbsm.h with WARNS=6. Fix by cherry-picking upstream change https://github.com/openbsm/openbsm/commit/0553c27 Reported by: emaste Reviewed by: cem Obtained from: OpenBSM Pull Request: https://github.com/openbsm/openbsm/pull/31 r334395: Revert r334362 Reconnect tests/sys/audit now that the GCC issue is fixed by 334388 X-MFC-With: 334362, 334360, 334388 Added: stable/11/tests/sys/audit/ - copied from r334360, head/tests/sys/audit/ Modified: stable/11/contrib/openbsm/bsm/libbsm.h stable/11/etc/mtree/BSD.tests.dist stable/11/tests/sys/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/openbsm/bsm/libbsm.h ============================================================================== --- stable/11/contrib/openbsm/bsm/libbsm.h Tue Oct 2 15:08:41 2018 (r339083) +++ stable/11/contrib/openbsm/bsm/libbsm.h Tue Oct 2 15:18:48 2018 (r339084) @@ -868,21 +868,6 @@ void au_print_tok_xml(FILE *outfp, tokenstr_t *tok, void au_print_xml_header(FILE *outfp); void au_print_xml_footer(FILE *outfp); -/* - * BSM library routines for converting between local and BSM constant spaces. - * (Note: some of these are replicated in audit_record.h for the benefit of - * the FreeBSD and Mac OS X kernels) - */ -int au_bsm_to_domain(u_short bsm_domain, int *local_domainp); -int au_bsm_to_errno(u_char bsm_error, int *errorp); -int au_bsm_to_fcntl_cmd(u_short bsm_fcntl_cmd, int *local_fcntl_cmdp); -int au_bsm_to_socket_type(u_short bsm_socket_type, - int *local_socket_typep); -u_short au_domain_to_bsm(int local_domain); -u_char au_errno_to_bsm(int local_errno); -u_short au_fcntl_cmd_to_bsm(int local_fcntl_command); -u_short au_socket_type_to_bsm(int local_socket_type); - const char *au_strerror(u_char bsm_error); __END_DECLS Modified: stable/11/etc/mtree/BSD.tests.dist ============================================================================== --- stable/11/etc/mtree/BSD.tests.dist Tue Oct 2 15:08:41 2018 (r339083) +++ stable/11/etc/mtree/BSD.tests.dist Tue Oct 2 15:18:48 2018 (r339084) @@ -420,6 +420,8 @@ .. aio .. + audit + .. capsicum .. fifo Modified: stable/11/tests/sys/Makefile ============================================================================== --- stable/11/tests/sys/Makefile Tue Oct 2 15:08:41 2018 (r339083) +++ stable/11/tests/sys/Makefile Tue Oct 2 15:18:48 2018 (r339084) @@ -4,6 +4,7 @@ TESTSDIR= ${TESTSBASE}/sys TESTS_SUBDIRS+= acl TESTS_SUBDIRS+= aio +TESTS_SUBDIRS+= audit TESTS_SUBDIRS+= capsicum TESTS_SUBDIRS+= fifo TESTS_SUBDIRS+= file From owner-svn-src-stable-11@freebsd.org Tue Oct 2 16:23:35 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0A40310A3F8C; Tue, 2 Oct 2018 16:23:35 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A49BB71EF4; Tue, 2 Oct 2018 16:23:34 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 9DCF2241B; Tue, 2 Oct 2018 16:23:34 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92GNYof041383; Tue, 2 Oct 2018 16:23:34 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92GNXUW041378; Tue, 2 Oct 2018 16:23:33 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021623.w92GNXUW041378@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 16:23:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339087 - stable/11/tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/audit X-SVN-Commit-Revision: 339087 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 16:23:35 -0000 Author: asomers Date: Tue Oct 2 16:23:33 2018 New Revision: 339087 URL: https://svnweb.freebsd.org/changeset/base/339087 Log: MFC many audit(4) tests. MFC r334471, r334487, r334496, r334592, r334668, r334933, r335067, r335105, r335136, r335140, r335145, r335207-r335208, r335215, and r335255-r335256. r334471: audit(4): Add tests for the fr class of syscalls readlink and readlinkat are the only syscalls in this class. open and openat are as well, but they'll be handled in a different file. Also, tidy up the copyright headers of recently added files in this area. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15636 r334487: audit(4): Add tests for the fw class of syscalls. truncate and ftruncate are the only syscalls in this class, apart from certain variations of open and openat, which will be handled in a different file. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15640 r334496: audit(4): add tests for the fd audit class The only syscalls in this class are rmdir, unlink, unlinkat, rename, and renameat. Also, set is_exclusive for all audit(4) tests, because they can start and stop auditd. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15647 r334592: audit(4): add tests for the cl audit class The only syscalls in this class are close, closefrom, munmap, and revoke. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15650 r334668: audit(4): add tests for open(2) and openat(2) These syscalls are atypical, because each one corresponds to several different audit events, and they each pass several different audit class filters. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15657 r334933: audit(4): add tests for stat(2) and friends This revision adds auditability tests for stat, lstat, fstat, and fstatat, all from the fa audit class. More tests from that audit class will follow. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15709 r335067: audit(4): Fix file descriptor leaks in ATF tests Submitted by: aniketp Reported by: Coverity CID: 1393343 1393346 1392695 1392781 1391709 1392078 1392413 CID: 1392014 1392521 1393344 1393345 1393347 1393348 1393349 CID: 1393354 1393355 1393356 1393357 1393358 1393360 1393362 CID: 1393368 1393369 1393370 1393371 1393372 1393373 1393376 CID: 1393380 1393384 1393387 1393388 1393389 Sponsored by: Google, Inc (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15782 r335105: audit(4): add tests for statfs(2), fstatfs(2), and getfsstat(2) Submitted by: aniketp Sponsored by: Google, Inc (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15750 r335136: audit(4): add tests for flock, fcntl, and fsync Submitted by: aniketp Sponsored by: Google, Inc (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15795 r335140: audit(4): fix typo from r335136 Typo in Makefile accidentally disabled some older tests X-MFC-With: 335136 r335145: audit(4): add tests for fhopen, fhstat, and fhstatfs Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15798 r335207: audit(4): add tests for access(2), chmod(2), and friends access(2), eaccess(2), faccessat(2), chmod(2), fchmod(2), lchmod(2), and fchmodat(2). Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15805 Differential Revision: https://reviews.freebsd.org/D15808 r335208: audit(4): improve formatting in tests/sys/audit/open.c [skip ci] Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15797 r335215: audit(4): Add a few tests for network-related syscalls Add tests for socket(2), socketpair(2), and setsockopt(2) Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15803 r335255: audit(4): add tests for bind(2), bindat(2), and listen(2) Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15843 r335256: audit(4): add tests for chown(2) and friends Includes chown, fchown, lchown, and fchownat Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15825 Added: stable/11/tests/sys/audit/file-attribute-access.c - copied, changed from r334933, head/tests/sys/audit/file-attribute-access.c stable/11/tests/sys/audit/file-attribute-modify.c - copied, changed from r335136, head/tests/sys/audit/file-attribute-modify.c stable/11/tests/sys/audit/file-close.c - copied, changed from r334592, head/tests/sys/audit/file-close.c stable/11/tests/sys/audit/file-delete.c - copied, changed from r334496, head/tests/sys/audit/file-delete.c stable/11/tests/sys/audit/file-read.c - copied unchanged from r334471, head/tests/sys/audit/file-read.c stable/11/tests/sys/audit/file-write.c - copied, changed from r334487, head/tests/sys/audit/file-write.c stable/11/tests/sys/audit/network.c - copied, changed from r335215, head/tests/sys/audit/network.c stable/11/tests/sys/audit/open.c - copied, changed from r334668, head/tests/sys/audit/open.c Modified: stable/11/tests/sys/audit/Makefile stable/11/tests/sys/audit/file-create.c stable/11/tests/sys/audit/utils.c stable/11/tests/sys/audit/utils.h Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/audit/Makefile ============================================================================== --- stable/11/tests/sys/audit/Makefile Tue Oct 2 16:01:33 2018 (r339086) +++ stable/11/tests/sys/audit/Makefile Tue Oct 2 16:23:33 2018 (r339087) @@ -2,16 +2,41 @@ TESTSDIR= ${TESTSBASE}/sys/audit -ATF_TESTS_C= file-create +ATF_TESTS_C= file-attribute-access +ATF_TESTS_C+= file-attribute-modify +ATF_TESTS_C+= file-create +ATF_TESTS_C+= file-delete +ATF_TESTS_C+= file-close +ATF_TESTS_C+= file-write +ATF_TESTS_C+= file-read +ATF_TESTS_C+= open +ATF_TESTS_C+= network +SRCS.file-attribute-access+= file-attribute-access.c +SRCS.file-attribute-access+= utils.c +SRCS.file-attribute-modify+= file-attribute-modify.c +SRCS.file-attribute-modify+= utils.c SRCS.file-create+= file-create.c SRCS.file-create+= utils.c +SRCS.file-delete+= file-delete.c +SRCS.file-delete+= utils.c +SRCS.file-close+= file-close.c +SRCS.file-close+= utils.c +SRCS.file-write+= file-write.c +SRCS.file-write+= utils.c +SRCS.file-read+= file-read.c +SRCS.file-read+= utils.c +SRCS.open+= open.c +SRCS.open+= utils.c +SRCS.network+= network.c +SRCS.network+= utils.c TEST_METADATA+= timeout="30" TEST_METADATA+= required_user="root" +TEST_METADATA+= is_exclusive="true" WARNS?= 6 -LDFLAGS+= -lbsm +LDFLAGS+= -lbsm -lutil .include Copied and modified: stable/11/tests/sys/audit/file-attribute-access.c (from r334933, head/tests/sys/audit/file-attribute-access.c) ============================================================================== --- head/tests/sys/audit/file-attribute-access.c Sun Jun 10 21:36:29 2018 (r334933, copy source) +++ stable/11/tests/sys/audit/file-attribute-access.c Tue Oct 2 16:23:33 2018 (r339087) @@ -25,6 +25,9 @@ * $FreeBSD$ */ +#include +#include +#include #include #include @@ -36,8 +39,12 @@ static struct pollfd fds[1]; static mode_t mode = 0777; +static pid_t pid; +static fhandle_t fht; +static int filedesc, fhdesc; static char extregex[80]; static struct stat statbuff; +static struct statfs statfsbuff; static const char *auclass = "fa"; static const char *path = "fileforaudit"; static const char *errpath = "dirdoesnotexist/fileforaudit"; @@ -55,10 +62,11 @@ ATF_TC_HEAD(stat_success, tc) ATF_TC_BODY(stat_success, tc) { /* File needs to exist to call stat(2) */ - ATF_REQUIRE(open(path, O_CREAT, mode) != -1); + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); FILE *pipefd = setup(fds, auclass); ATF_REQUIRE_EQ(0, stat(path, &statbuff)); check_audit(fds, successreg, pipefd); + close(filedesc); } ATF_TC_CLEANUP(stat_success, tc) @@ -140,7 +148,6 @@ ATF_TC_HEAD(fstat_success, tc) ATF_TC_BODY(fstat_success, tc) { - int filedesc; /* File needs to exist to call fstat(2) */ ATF_REQUIRE((filedesc = open(path, O_CREAT | O_RDWR, mode)) != -1); FILE *pipefd = setup(fds, auclass); @@ -149,6 +156,7 @@ ATF_TC_BODY(fstat_success, tc) snprintf(extregex, sizeof(extregex), "fstat.*%jd.*return,success", (intmax_t)statbuff.st_ino); check_audit(fds, extregex, pipefd); + close(filedesc); } ATF_TC_CLEANUP(fstat_success, tc) @@ -224,6 +232,435 @@ ATF_TC_CLEANUP(fstatat_failure, tc) } +ATF_TC_WITH_CLEANUP(statfs_success); +ATF_TC_HEAD(statfs_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "statfs(2) call"); +} + +ATF_TC_BODY(statfs_success, tc) +{ + /* File needs to exist to call statfs(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, statfs(path, &statfsbuff)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(statfs_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(statfs_failure); +ATF_TC_HEAD(statfs_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "statfs(2) call"); +} + +ATF_TC_BODY(statfs_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, statfs(errpath, &statfsbuff)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(statfs_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fstatfs_success); +ATF_TC_HEAD(fstatfs_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fstatfs(2) call"); +} + +ATF_TC_BODY(fstatfs_success, tc) +{ + /* File needs to exist to call fstat(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT | O_RDWR, mode)) != -1); + /* Call stat(2) to store the Inode number of 'path' */ + ATF_REQUIRE_EQ(0, stat(path, &statbuff)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fstatfs(filedesc, &statfsbuff)); + + snprintf(extregex, sizeof(extregex), "fstatfs.*%jd.*return,success", + (intmax_t)statbuff.st_ino); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fstatfs_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fstatfs_failure); +ATF_TC_HEAD(fstatfs_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fstatfs(2) call"); +} + +ATF_TC_BODY(fstatfs_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + const char *regex = "fstatfs.*return,failure : Bad file descriptor"; + /* Failure reason: bad file descriptor */ + ATF_REQUIRE_EQ(-1, fstatfs(-1, &statfsbuff)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fstatfs_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getfsstat_success); +ATF_TC_HEAD(getfsstat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "getfsstat(2) call"); +} + +ATF_TC_BODY(getfsstat_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "getfsstat.*%d.*success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE(getfsstat(NULL, 0, MNT_NOWAIT) != -1); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(getfsstat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getfsstat_failure); +ATF_TC_HEAD(getfsstat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "getfsstat(2) call"); +} + +ATF_TC_BODY(getfsstat_failure, tc) +{ + const char *regex = "getfsstat.*return,failure : Invalid argument"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid value for mode */ + ATF_REQUIRE_EQ(-1, getfsstat(NULL, 0, -1)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(getfsstat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhopen_success); +ATF_TC_HEAD(fhopen_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fhopen(2) call"); +} + +ATF_TC_BODY(fhopen_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fhopen.*%d.*return,success", pid); + + /* File needs to exist to get a file-handle */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + /* Get the file handle to be passed to fhopen(2) */ + ATF_REQUIRE_EQ(0, getfh(path, &fht)); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((fhdesc = fhopen(&fht, O_RDWR)) != -1); + check_audit(fds, extregex, pipefd); + + close(fhdesc); + close(filedesc); +} + +ATF_TC_CLEANUP(fhopen_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhopen_failure); +ATF_TC_HEAD(fhopen_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fhopen(2) call"); +} + +ATF_TC_BODY(fhopen_failure, tc) +{ + const char *regex = "fhopen.*return,failure : Invalid argument"; + FILE *pipefd = setup(fds, auclass); + /* + * Failure reason: NULL does not represent any file handle + * and O_CREAT is not allowed as the flag for fhopen(2) + */ + ATF_REQUIRE_EQ(-1, fhopen(NULL, O_CREAT)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fhopen_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhstat_success); +ATF_TC_HEAD(fhstat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fstat(2) call"); +} + +ATF_TC_BODY(fhstat_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fhstat.*%d.*return,success", pid); + + /* File needs to exist to get a file-handle */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + /* Get the file handle to be passed to fhstat(2) */ + ATF_REQUIRE_EQ(0, getfh(path, &fht)); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fhstat(&fht, &statbuff)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fhstat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhstat_failure); +ATF_TC_HEAD(fhstat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fhstat(2) call"); +} + +ATF_TC_BODY(fhstat_failure, tc) +{ + const char *regex = "fhstat.*return,failure : Bad address"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: NULL does not represent any file handle */ + ATF_REQUIRE_EQ(-1, fhstat(NULL, NULL)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fhstat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhstatfs_success); +ATF_TC_HEAD(fhstatfs_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fstatfs(2) call"); +} + +ATF_TC_BODY(fhstatfs_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fhstatfs.*%d.*success", pid); + + /* File needs to exist to get a file-handle */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + /* Get the file handle to be passed to fhstatfs(2) */ + ATF_REQUIRE_EQ(0, getfh(path, &fht)); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fhstatfs(&fht, &statfsbuff)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fhstatfs_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fhstatfs_failure); +ATF_TC_HEAD(fhstatfs_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fhstatfs(2) call"); +} + +ATF_TC_BODY(fhstatfs_failure, tc) +{ + const char *regex = "fhstatfs.*return,failure : Bad address"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: NULL does not represent any file handle */ + ATF_REQUIRE_EQ(-1, fhstatfs(NULL, NULL)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fhstatfs_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(access_success); +ATF_TC_HEAD(access_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "access(2) call"); +} + +ATF_TC_BODY(access_success, tc) +{ + /* File needs to exist to call access(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, access(path, F_OK)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(access_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(access_failure); +ATF_TC_HEAD(access_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "access(2) call"); +} + +ATF_TC_BODY(access_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, access(errpath, F_OK)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(access_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(eaccess_success); +ATF_TC_HEAD(eaccess_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "eaccess(2) call"); +} + +ATF_TC_BODY(eaccess_success, tc) +{ + /* File needs to exist to call eaccess(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, eaccess(path, F_OK)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(eaccess_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(eaccess_failure); +ATF_TC_HEAD(eaccess_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "eaccess(2) call"); +} + +ATF_TC_BODY(eaccess_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, eaccess(errpath, F_OK)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(eaccess_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(faccessat_success); +ATF_TC_HEAD(faccessat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "faccessat(2) call"); +} + +ATF_TC_BODY(faccessat_success, tc) +{ + /* File needs to exist to call faccessat(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, faccessat(AT_FDCWD, path, F_OK, AT_EACCESS)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(faccessat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(faccessat_failure); +ATF_TC_HEAD(faccessat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "faccessat(2) call"); +} + +ATF_TC_BODY(faccessat_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, faccessat(AT_FDCWD, errpath, F_OK, AT_EACCESS)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(faccessat_failure, tc) +{ + cleanup(); +} + + ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, stat_success); @@ -234,6 +671,28 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, fstat_failure); ATF_TP_ADD_TC(tp, fstatat_success); ATF_TP_ADD_TC(tp, fstatat_failure); + + ATF_TP_ADD_TC(tp, statfs_success); + ATF_TP_ADD_TC(tp, statfs_failure); + ATF_TP_ADD_TC(tp, fstatfs_success); + ATF_TP_ADD_TC(tp, fstatfs_failure); + + ATF_TP_ADD_TC(tp, getfsstat_success); + ATF_TP_ADD_TC(tp, getfsstat_failure); + + ATF_TP_ADD_TC(tp, fhopen_success); + ATF_TP_ADD_TC(tp, fhopen_failure); + ATF_TP_ADD_TC(tp, fhstat_success); + ATF_TP_ADD_TC(tp, fhstat_failure); + ATF_TP_ADD_TC(tp, fhstatfs_success); + ATF_TP_ADD_TC(tp, fhstatfs_failure); + + ATF_TP_ADD_TC(tp, access_success); + ATF_TP_ADD_TC(tp, access_failure); + ATF_TP_ADD_TC(tp, eaccess_success); + ATF_TP_ADD_TC(tp, eaccess_failure); + ATF_TP_ADD_TC(tp, faccessat_success); + ATF_TP_ADD_TC(tp, faccessat_failure); return (atf_no_error()); } Copied and modified: stable/11/tests/sys/audit/file-attribute-modify.c (from r335136, head/tests/sys/audit/file-attribute-modify.c) ============================================================================== --- head/tests/sys/audit/file-attribute-modify.c Thu Jun 14 13:42:58 2018 (r335136, copy source) +++ stable/11/tests/sys/audit/file-attribute-modify.c Tue Oct 2 16:23:33 2018 (r339087) @@ -26,6 +26,7 @@ */ #include +#include #include #include @@ -34,12 +35,17 @@ #include "utils.h" static pid_t pid; +static uid_t uid = -1; +static gid_t gid = -1; static int filedesc; static struct pollfd fds[1]; static mode_t mode = 0777; static char extregex[80]; static const char *auclass = "fm"; static const char *path = "fileforaudit"; +static const char *errpath = "adirhasnoname/fileforaudit"; +static const char *successreg = "fileforaudit.*return,success"; +static const char *failurereg = "fileforaudit.*return,failure"; ATF_TC_WITH_CLEANUP(flock_success); @@ -186,6 +192,364 @@ ATF_TC_CLEANUP(fsync_failure, tc) } +ATF_TC_WITH_CLEANUP(chmod_success); +ATF_TC_HEAD(chmod_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "chmod(2) call"); +} + +ATF_TC_BODY(chmod_success, tc) +{ + /* File needs to exist to call chmod(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, chmod(path, mode)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(chmod_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(chmod_failure); +ATF_TC_HEAD(chmod_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "chmod(2) call"); +} + +ATF_TC_BODY(chmod_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, chmod(errpath, mode)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(chmod_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchmod_success); +ATF_TC_HEAD(fchmod_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fchmod(2) call"); +} + +ATF_TC_BODY(fchmod_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fchmod.*%d.*return,success", pid); + + /* File needs to exist to call fchmod(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fchmod(filedesc, mode)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fchmod_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchmod_failure); +ATF_TC_HEAD(fchmod_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fchmod(2) call"); +} + +ATF_TC_BODY(fchmod_failure, tc) +{ + const char *regex = "fchmod.*return,failure : Bad file descriptor"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, fchmod(-1, mode)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fchmod_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lchmod_success); +ATF_TC_HEAD(lchmod_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "lchmod(2) call"); +} + +ATF_TC_BODY(lchmod_success, tc) +{ + /* Symbolic link needs to exist to call lchmod(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, lchmod(path, mode)); + check_audit(fds, successreg, pipefd); +} + +ATF_TC_CLEANUP(lchmod_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lchmod_failure); +ATF_TC_HEAD(lchmod_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "lchmod(2) call"); +} + +ATF_TC_BODY(lchmod_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, lchmod(errpath, mode)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(lchmod_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchmodat_success); +ATF_TC_HEAD(fchmodat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fchmodat(2) call"); +} + +ATF_TC_BODY(fchmodat_success, tc) +{ + /* File needs to exist to call fchmodat(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fchmodat(AT_FDCWD, path, mode, 0)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fchmodat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchmodat_failure); +ATF_TC_HEAD(fchmodat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fchmodat(2) call"); +} + +ATF_TC_BODY(fchmodat_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, fchmodat(AT_FDCWD, errpath, mode, 0)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(fchmodat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(chown_success); +ATF_TC_HEAD(chown_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "chown(2) call"); +} + +ATF_TC_BODY(chown_success, tc) +{ + /* File needs to exist to call chown(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, chown(path, uid, gid)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(chown_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(chown_failure); +ATF_TC_HEAD(chown_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "chown(2) call"); +} + +ATF_TC_BODY(chown_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, chown(errpath, uid, gid)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(chown_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchown_success); +ATF_TC_HEAD(fchown_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fchown(2) call"); +} + +ATF_TC_BODY(fchown_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fchown.*%d.*return,success", pid); + + /* File needs to exist to call fchown(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fchown(filedesc, uid, gid)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fchown_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchown_failure); +ATF_TC_HEAD(fchown_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fchown(2) call"); +} + +ATF_TC_BODY(fchown_failure, tc) +{ + const char *regex = "fchown.*return,failure : Bad file descriptor"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, fchown(-1, uid, gid)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fchown_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lchown_success); +ATF_TC_HEAD(lchown_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "lchown(2) call"); +} + +ATF_TC_BODY(lchown_success, tc) +{ + /* Symbolic link needs to exist to call lchown(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, lchown(path, uid, gid)); + check_audit(fds, successreg, pipefd); +} + +ATF_TC_CLEANUP(lchown_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lchown_failure); +ATF_TC_HEAD(lchown_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "lchown(2) call"); +} + +ATF_TC_BODY(lchown_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Symbolic link does not exist */ + ATF_REQUIRE_EQ(-1, lchown(errpath, uid, gid)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(lchown_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchownat_success); +ATF_TC_HEAD(fchownat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fchownat(2) call"); +} + +ATF_TC_BODY(fchownat_success, tc) +{ + /* File needs to exist to call fchownat(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fchownat(AT_FDCWD, path, uid, gid, 0)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fchownat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchownat_failure); +ATF_TC_HEAD(fchownat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fchownat(2) call"); +} + +ATF_TC_BODY(fchownat_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, fchownat(AT_FDCWD, errpath, uid, gid, 0)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(fchownat_failure, tc) +{ + cleanup(); +} + + ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, flock_success); @@ -194,6 +558,24 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, fcntl_failure); ATF_TP_ADD_TC(tp, fsync_success); ATF_TP_ADD_TC(tp, fsync_failure); + + ATF_TP_ADD_TC(tp, chmod_success); + ATF_TP_ADD_TC(tp, chmod_failure); + ATF_TP_ADD_TC(tp, fchmod_success); + ATF_TP_ADD_TC(tp, fchmod_failure); + ATF_TP_ADD_TC(tp, lchmod_success); + ATF_TP_ADD_TC(tp, lchmod_failure); + ATF_TP_ADD_TC(tp, fchmodat_success); + ATF_TP_ADD_TC(tp, fchmodat_failure); + + ATF_TP_ADD_TC(tp, chown_success); + ATF_TP_ADD_TC(tp, chown_failure); + ATF_TP_ADD_TC(tp, fchown_success); + ATF_TP_ADD_TC(tp, fchown_failure); + ATF_TP_ADD_TC(tp, lchown_success); + ATF_TP_ADD_TC(tp, lchown_failure); + ATF_TP_ADD_TC(tp, fchownat_success); + ATF_TP_ADD_TC(tp, fchownat_failure); return (atf_no_error()); } Copied and modified: stable/11/tests/sys/audit/file-close.c (from r334592, head/tests/sys/audit/file-close.c) ============================================================================== --- head/tests/sys/audit/file-close.c Sun Jun 3 23:36:29 2018 (r334592, copy source) +++ stable/11/tests/sys/audit/file-close.c Tue Oct 2 16:23:33 2018 (r339087) @@ -40,6 +40,7 @@ static pid_t pid; static struct pollfd fds[1]; static mode_t mode = 0777; +static int filedesc; static char extregex[80]; static struct stat statbuff; static const char *auclass = "cl"; @@ -103,7 +104,6 @@ ATF_TC_HEAD(close_success, tc) ATF_TC_BODY(close_success, tc) { - int filedesc; /* File needs to exist to call close(2) */ ATF_REQUIRE((filedesc = open(path, O_CREAT | O_RDWR, mode)) != -1); *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Tue Oct 2 17:07:12 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C5FFD10A55D6; Tue, 2 Oct 2018 17:07:11 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 76B0873974; Tue, 2 Oct 2018 17:07:11 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7152E2AD4; Tue, 2 Oct 2018 17:07:11 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92H7BpI061951; Tue, 2 Oct 2018 17:07:11 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92H7Avj061947; Tue, 2 Oct 2018 17:07:10 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021707.w92H7Avj061947@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 17:07:10 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339089 - stable/11/tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/audit X-SVN-Commit-Revision: 339089 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 17:07:12 -0000 Author: asomers Date: Tue Oct 2 17:07:10 2018 New Revision: 339089 URL: https://svnweb.freebsd.org/changeset/base/339089 Log: MFC r335261, r335275, r335284-r335285, r335294, r335318, r335320, r335703 r335261: audit(4): add tests for pathconf(2) and friends pathconf, lpathconf, and fpathconf are included Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15842 r335275: audit(4): add tests for chflags and friends chflags, fchflags, and lchflags (but not chflagsat) are included. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15854 r335284: audit(4): add tests for extattr_get_file(2) and friends This commit includes extattr_{get_file, get_fd, get_link, list_file, list_fd, list_link}. It does not include any syscalls that modify, set, or delete extended attributes, as those are in a different audit class. Submitted by: aniketpt Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15859 r335285: audit(4): Add tests for a few syscalls in the ad class The ad audit class is for administrative commands. This commit adds test for settimeofday, adjtime, and getfh. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15861 r335294: audit(4): add tests for connect, connectat, and accept Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15853 r335318: audit(4): add tests for extattr_set_file and friends Includes extattr_{set_file, _set_fd, _set_link, _delete_file, _delete_fd, _delete_link} Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15867 r335320: audit(4): Add tests for {get/set}auid, {get/set}audit, {get/set}audit_addr Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15871 r335703: audit(4): fix Coverity issues Fix several incorrect buffer size arguments and a file descriptor leak. Submitted by: aniketp Reported by: Coverity CID: 1393489 1393501 1393509 1393510 1393514 1393515 1393516 CID: 1393517 1393518 1393519 X-MFC-With: 335284 X-MFC-With: 335318 X-MFC-With: 335320 Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D16000 Added: stable/11/tests/sys/audit/administrative.c - copied, changed from r335285, head/tests/sys/audit/administrative.c Modified: stable/11/tests/sys/audit/Makefile stable/11/tests/sys/audit/file-attribute-access.c stable/11/tests/sys/audit/file-attribute-modify.c stable/11/tests/sys/audit/network.c Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/audit/Makefile ============================================================================== --- stable/11/tests/sys/audit/Makefile Tue Oct 2 17:01:42 2018 (r339088) +++ stable/11/tests/sys/audit/Makefile Tue Oct 2 17:07:10 2018 (r339089) @@ -11,6 +11,7 @@ ATF_TESTS_C+= file-write ATF_TESTS_C+= file-read ATF_TESTS_C+= open ATF_TESTS_C+= network +ATF_TESTS_C+= administrative SRCS.file-attribute-access+= file-attribute-access.c SRCS.file-attribute-access+= utils.c @@ -30,6 +31,8 @@ SRCS.open+= open.c SRCS.open+= utils.c SRCS.network+= network.c SRCS.network+= utils.c +SRCS.administrative+= administrative.c +SRCS.administrative+= utils.c TEST_METADATA+= timeout="30" TEST_METADATA+= required_user="root" Copied and modified: stable/11/tests/sys/audit/administrative.c (from r335285, head/tests/sys/audit/administrative.c) ============================================================================== --- head/tests/sys/audit/administrative.c Sun Jun 17 16:24:46 2018 (r335285, copy source) +++ stable/11/tests/sys/audit/administrative.c Tue Oct 2 17:07:10 2018 (r339089) @@ -39,7 +39,7 @@ static pid_t pid; static int filedesc; static mode_t mode = 0777; static struct pollfd fds[1]; -static char adregex[60]; +static char adregex[80]; static const char *auclass = "ad"; static const char *path = "fileforaudit"; @@ -163,7 +163,7 @@ ATF_TC_BODY(nfs_getfh_success, tc) snprintf(adregex, sizeof(adregex), "nfs_getfh.*%d.*ret.*success", pid); /* File needs to exist to call getfh(2) */ - ATF_REQUIRE(filedesc = open(path, O_CREAT, mode) != -1); + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); FILE *pipefd = setup(fds, auclass); ATF_REQUIRE_EQ(0, getfh(path, &fhp)); check_audit(fds, adregex, pipefd); @@ -200,6 +200,301 @@ ATF_TC_CLEANUP(nfs_getfh_failure, tc) } +ATF_TC_WITH_CLEANUP(getauid_success); +ATF_TC_HEAD(getauid_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "getauid(2) call"); +} + +ATF_TC_BODY(getauid_success, tc) +{ + au_id_t auid; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "getauid.*%d.*return,success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, getauid(&auid)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getauid_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getauid_failure); +ATF_TC_HEAD(getauid_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "getauid(2) call"); +} + +ATF_TC_BODY(getauid_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "getauid.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, getauid(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getauid_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setauid_success); +ATF_TC_HEAD(setauid_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "setauid(2) call"); +} + +ATF_TC_BODY(setauid_success, tc) +{ + au_id_t auid; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "setauid.*%d.*return,success", pid); + ATF_REQUIRE_EQ(0, getauid(&auid)); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, setauid(&auid)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setauid_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setauid_failure); +ATF_TC_HEAD(setauid_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "setauid(2) call"); +} + +ATF_TC_BODY(setauid_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "setauid.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, setauid(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setauid_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getaudit_success); +ATF_TC_HEAD(getaudit_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "getaudit(2) call"); +} + +ATF_TC_BODY(getaudit_success, tc) +{ + pid = getpid(); + auditinfo_t auditinfo; + snprintf(adregex, sizeof(adregex), "getaudit.*%d.*return,success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, getaudit(&auditinfo)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getaudit_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getaudit_failure); +ATF_TC_HEAD(getaudit_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "getaudit(2) call"); +} + +ATF_TC_BODY(getaudit_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "getaudit.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, getaudit(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getaudit_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setaudit_success); +ATF_TC_HEAD(setaudit_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "setaudit(2) call"); +} + +ATF_TC_BODY(setaudit_success, tc) +{ + pid = getpid(); + auditinfo_t auditinfo; + snprintf(adregex, sizeof(adregex), "setaudit.*%d.*return,success", pid); + ATF_REQUIRE_EQ(0, getaudit(&auditinfo)); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, setaudit(&auditinfo)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setaudit_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setaudit_failure); +ATF_TC_HEAD(setaudit_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "setaudit(2) call"); +} + +ATF_TC_BODY(setaudit_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "setaudit.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, setaudit(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setaudit_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getaudit_addr_success); +ATF_TC_HEAD(getaudit_addr_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "getaudit_addr(2) call"); +} + +ATF_TC_BODY(getaudit_addr_success, tc) +{ + pid = getpid(); + auditinfo_addr_t auditinfo; + snprintf(adregex, sizeof(adregex), + "getaudit_addr.*%d.*return,success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, getaudit_addr(&auditinfo, sizeof(auditinfo))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getaudit_addr_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(getaudit_addr_failure); +ATF_TC_HEAD(getaudit_addr_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "getaudit_addr(2) call"); +} + +ATF_TC_BODY(getaudit_addr_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "getaudit_addr.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, getaudit_addr(NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(getaudit_addr_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setaudit_addr_success); +ATF_TC_HEAD(setaudit_addr_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "setaudit_addr(2) call"); +} + +ATF_TC_BODY(setaudit_addr_success, tc) +{ + pid = getpid(); + auditinfo_addr_t auditinfo; + snprintf(adregex, sizeof(adregex), + "setaudit_addr.*%d.*return,success", pid); + + ATF_REQUIRE_EQ(0, getaudit_addr(&auditinfo, sizeof(auditinfo))); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, setaudit_addr(&auditinfo, sizeof(auditinfo))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setaudit_addr_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(setaudit_addr_failure); +ATF_TC_HEAD(setaudit_addr_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "setaudit_addr(2) call"); +} + +ATF_TC_BODY(setaudit_addr_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "setaudit_addr.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Bad address */ + ATF_REQUIRE_EQ(-1, setaudit_addr(NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(setaudit_addr_failure, tc) +{ + cleanup(); +} + + ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, settimeofday_success); @@ -209,6 +504,21 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, nfs_getfh_success); ATF_TP_ADD_TC(tp, nfs_getfh_failure); + + ATF_TP_ADD_TC(tp, getauid_success); + ATF_TP_ADD_TC(tp, getauid_failure); + ATF_TP_ADD_TC(tp, setauid_success); + ATF_TP_ADD_TC(tp, setauid_failure); + + ATF_TP_ADD_TC(tp, getaudit_success); + ATF_TP_ADD_TC(tp, getaudit_failure); + ATF_TP_ADD_TC(tp, setaudit_success); + ATF_TP_ADD_TC(tp, setaudit_failure); + + ATF_TP_ADD_TC(tp, getaudit_addr_success); + ATF_TP_ADD_TC(tp, getaudit_addr_failure); + ATF_TP_ADD_TC(tp, setaudit_addr_success); + ATF_TP_ADD_TC(tp, setaudit_addr_failure); return (atf_no_error()); } Modified: stable/11/tests/sys/audit/file-attribute-access.c ============================================================================== --- stable/11/tests/sys/audit/file-attribute-access.c Tue Oct 2 17:01:42 2018 (r339088) +++ stable/11/tests/sys/audit/file-attribute-access.c Tue Oct 2 17:07:10 2018 (r339089) @@ -26,6 +26,7 @@ */ #include +#include #include #include #include @@ -43,9 +44,11 @@ static pid_t pid; static fhandle_t fht; static int filedesc, fhdesc; static char extregex[80]; +static char buff[] = "ezio"; static struct stat statbuff; static struct statfs statfsbuff; static const char *auclass = "fa"; +static const char *name = "authorname"; static const char *path = "fileforaudit"; static const char *errpath = "dirdoesnotexist/fileforaudit"; static const char *successreg = "fileforaudit.*return,success"; @@ -661,6 +664,478 @@ ATF_TC_CLEANUP(faccessat_failure, tc) } +ATF_TC_WITH_CLEANUP(pathconf_success); +ATF_TC_HEAD(pathconf_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "pathconf(2) call"); +} + +ATF_TC_BODY(pathconf_success, tc) +{ + /* File needs to exist to call pathconf(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + /* Get the maximum number of bytes of filename */ + ATF_REQUIRE(pathconf(path, _PC_NAME_MAX) != -1); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(pathconf_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(pathconf_failure); +ATF_TC_HEAD(pathconf_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "pathconf(2) call"); +} + +ATF_TC_BODY(pathconf_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, pathconf(errpath, _PC_NAME_MAX)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(pathconf_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lpathconf_success); +ATF_TC_HEAD(lpathconf_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "lpathconf(2) call"); +} + +ATF_TC_BODY(lpathconf_success, tc) +{ + /* Symbolic link needs to exist to call lpathconf(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + FILE *pipefd = setup(fds, auclass); + /* Get the maximum number of bytes of symlink's name */ + ATF_REQUIRE(lpathconf(path, _PC_SYMLINK_MAX) != -1); + check_audit(fds, successreg, pipefd); +} + +ATF_TC_CLEANUP(lpathconf_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lpathconf_failure); +ATF_TC_HEAD(lpathconf_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "lpathconf(2) call"); +} + +ATF_TC_BODY(lpathconf_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: symbolic link does not exist */ + ATF_REQUIRE_EQ(-1, lpathconf(errpath, _PC_SYMLINK_MAX)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(lpathconf_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fpathconf_success); +ATF_TC_HEAD(fpathconf_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fpathconf(2) call"); +} + +ATF_TC_BODY(fpathconf_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fpathconf.*%d.*success", pid); + + /* File needs to exist to call fpathconf(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + /* Get the maximum number of bytes of filename */ + ATF_REQUIRE(fpathconf(filedesc, _PC_NAME_MAX) != -1); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fpathconf_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fpathconf_failure); +ATF_TC_HEAD(fpathconf_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fpathconf(2) call"); +} + +ATF_TC_BODY(fpathconf_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + const char *regex = "fpathconf.*return,failure : Bad file descriptor"; + /* Failure reason: Bad file descriptor */ + ATF_REQUIRE_EQ(-1, fpathconf(-1, _PC_NAME_MAX)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(fpathconf_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_file_success); +ATF_TC_HEAD(extattr_get_file_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_get_file(2) call"); +} + +ATF_TC_BODY(extattr_get_file_success, tc) +{ + /* File needs to exist to call extattr_get_file(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + /* Set an extended attribute to be retrieved later on */ + ATF_REQUIRE_EQ(sizeof(buff), extattr_set_file(path, + EXTATTR_NAMESPACE_USER, name, buff, sizeof(buff))); + + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_file.*%s.*%s.*return,success", path, name); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(sizeof(buff), extattr_get_file(path, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(extattr_get_file_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_file_failure); +ATF_TC_HEAD(extattr_get_file_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_get_file(2) call"); +} + +ATF_TC_BODY(extattr_get_file_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_file.*%s.*%s.*failure", path, name); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, extattr_get_file(path, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_get_file_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_fd_success); +ATF_TC_HEAD(extattr_get_fd_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_get_fd(2) call"); +} + +ATF_TC_BODY(extattr_get_fd_success, tc) +{ + /* File needs to exist to call extattr_get_fd(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + /* Set an extended attribute to be retrieved later on */ + ATF_REQUIRE_EQ(sizeof(buff), extattr_set_file(path, + EXTATTR_NAMESPACE_USER, name, buff, sizeof(buff))); + + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_fd.*%s.*return,success", name); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(sizeof(buff), extattr_get_fd(filedesc, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(extattr_get_fd_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_fd_failure); +ATF_TC_HEAD(extattr_get_fd_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_get_fd(2) call"); +} + +ATF_TC_BODY(extattr_get_fd_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_fd.*%s.*return,failure : Bad file descriptor", name); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, extattr_get_fd(-1, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_get_fd_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_link_success); +ATF_TC_HEAD(extattr_get_link_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_get_link(2) call"); +} + +ATF_TC_BODY(extattr_get_link_success, tc) +{ + /* Symbolic link needs to exist to call extattr_get_link(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + /* Set an extended attribute to be retrieved later on */ + ATF_REQUIRE_EQ(sizeof(buff), extattr_set_link(path, + EXTATTR_NAMESPACE_USER, name, buff, sizeof(buff))); + + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_link.*%s.*%s.*return,success", path, name); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(sizeof(buff), extattr_get_link(path, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_get_link_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_get_link_failure); +ATF_TC_HEAD(extattr_get_link_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_get_link(2) call"); +} + +ATF_TC_BODY(extattr_get_link_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_get_link.*%s.*%s.*failure", path, name); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: symbolic link does not exist */ + ATF_REQUIRE_EQ(-1, extattr_get_link(path, + EXTATTR_NAMESPACE_USER, name, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_get_link_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_file_success); +ATF_TC_HEAD(extattr_list_file_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_list_file(2) call"); +} + +ATF_TC_BODY(extattr_list_file_success, tc) +{ + int readbuff; + /* File needs to exist to call extattr_list_file(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((readbuff = extattr_list_file(path, + EXTATTR_NAMESPACE_USER, NULL, 0)) != -1); + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_file.*%s.*return,success,%d", path, readbuff); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_list_file_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_file_failure); +ATF_TC_HEAD(extattr_list_file_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_list_file(2) call"); +} + +ATF_TC_BODY(extattr_list_file_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_file.*%s.*return,failure", path); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, extattr_list_file(path, + EXTATTR_NAMESPACE_USER, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_list_file_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_fd_success); +ATF_TC_HEAD(extattr_list_fd_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_list_fd(2) call"); +} + +ATF_TC_BODY(extattr_list_fd_success, tc) +{ + int readbuff; + /* File needs to exist to call extattr_list_fd(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((readbuff = extattr_list_fd(filedesc, + EXTATTR_NAMESPACE_USER, NULL, 0)) != -1); + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_fd.*return,success,%d", readbuff); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(extattr_list_fd_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_fd_failure); +ATF_TC_HEAD(extattr_list_fd_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_list_fd(2) call"); +} + +ATF_TC_BODY(extattr_list_fd_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_fd.*return,failure : Bad file descriptor"); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, + extattr_list_fd(-1, EXTATTR_NAMESPACE_USER, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_list_fd_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_link_success); +ATF_TC_HEAD(extattr_list_link_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "extattr_list_link(2) call"); +} + +ATF_TC_BODY(extattr_list_link_success, tc) +{ + int readbuff; + /* Symbolic link needs to exist to call extattr_list_link(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + FILE *pipefd = setup(fds, auclass); + + ATF_REQUIRE((readbuff = extattr_list_link(path, + EXTATTR_NAMESPACE_USER, NULL, 0)) != -1); + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_link.*%s.*return,success,%d", path, readbuff); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_list_link_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(extattr_list_link_failure); +ATF_TC_HEAD(extattr_list_link_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "extattr_list_link(2) call"); +} + +ATF_TC_BODY(extattr_list_link_failure, tc) +{ + /* Prepare the regex to be checked in the audit record */ + snprintf(extregex, sizeof(extregex), + "extattr_list_link.*%s.*failure", path); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: symbolic link does not exist */ + ATF_REQUIRE_EQ(-1, extattr_list_link(path, + EXTATTR_NAMESPACE_USER, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(extattr_list_link_failure, tc) +{ + cleanup(); +} + + ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, stat_success); @@ -693,6 +1168,27 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, eaccess_failure); ATF_TP_ADD_TC(tp, faccessat_success); ATF_TP_ADD_TC(tp, faccessat_failure); + + ATF_TP_ADD_TC(tp, pathconf_success); + ATF_TP_ADD_TC(tp, pathconf_failure); + ATF_TP_ADD_TC(tp, lpathconf_success); + ATF_TP_ADD_TC(tp, lpathconf_failure); + ATF_TP_ADD_TC(tp, fpathconf_success); + ATF_TP_ADD_TC(tp, fpathconf_failure); + + ATF_TP_ADD_TC(tp, extattr_get_file_success); + ATF_TP_ADD_TC(tp, extattr_get_file_failure); + ATF_TP_ADD_TC(tp, extattr_get_fd_success); + ATF_TP_ADD_TC(tp, extattr_get_fd_failure); + ATF_TP_ADD_TC(tp, extattr_get_link_success); + ATF_TP_ADD_TC(tp, extattr_get_link_failure); + + ATF_TP_ADD_TC(tp, extattr_list_file_success); + ATF_TP_ADD_TC(tp, extattr_list_file_failure); + ATF_TP_ADD_TC(tp, extattr_list_fd_success); + ATF_TP_ADD_TC(tp, extattr_list_fd_failure); + ATF_TP_ADD_TC(tp, extattr_list_link_success); + ATF_TP_ADD_TC(tp, extattr_list_link_failure); return (atf_no_error()); } Modified: stable/11/tests/sys/audit/file-attribute-modify.c ============================================================================== --- stable/11/tests/sys/audit/file-attribute-modify.c Tue Oct 2 17:01:42 2018 (r339088) +++ stable/11/tests/sys/audit/file-attribute-modify.c Tue Oct 2 17:07:10 2018 (r339089) @@ -25,6 +25,8 @@ * $FreeBSD$ */ +#include +#include #include #include @@ -37,11 +39,13 @@ static pid_t pid; static uid_t uid = -1; static gid_t gid = -1; -static int filedesc; +static int filedesc, retval; static struct pollfd fds[1]; static mode_t mode = 0777; static char extregex[80]; +static char buff[] = "ezio"; static const char *auclass = "fm"; +static const char *name = "authorname"; static const char *path = "fileforaudit"; static const char *errpath = "adirhasnoname/fileforaudit"; static const char *successreg = "fileforaudit.*return,success"; @@ -550,6 +554,468 @@ ATF_TC_CLEANUP(fchownat_failure, tc) } +ATF_TC_WITH_CLEANUP(chflags_success); +ATF_TC_HEAD(chflags_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "chflags(2) call"); +} + +ATF_TC_BODY(chflags_success, tc) +{ + /* File needs to exist to call chflags(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, chflags(path, UF_OFFLINE)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(chflags_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(chflags_failure); +ATF_TC_HEAD(chflags_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "chflags(2) call"); +} + +ATF_TC_BODY(chflags_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, chflags(errpath, UF_OFFLINE)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(chflags_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchflags_success); +ATF_TC_HEAD(fchflags_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "fchflags(2) call"); +} + +ATF_TC_BODY(fchflags_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "fchflags.*%d.*ret.*success", pid); + /* File needs to exist to call fchflags(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, fchflags(filedesc, UF_OFFLINE)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(fchflags_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(fchflags_failure); +ATF_TC_HEAD(fchflags_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "fchflags(2) call"); +} + +ATF_TC_BODY(fchflags_failure, tc) *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Tue Oct 2 17:27:12 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0346A10A5DF1; Tue, 2 Oct 2018 17:27:12 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9B49D744C4; Tue, 2 Oct 2018 17:27:11 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 8FBDE2E34; Tue, 2 Oct 2018 17:27:11 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92HRBif072117; Tue, 2 Oct 2018 17:27:11 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92HRAMf072113; Tue, 2 Oct 2018 17:27:10 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021727.w92HRAMf072113@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 17:27:10 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339090 - stable/11/tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/audit X-SVN-Commit-Revision: 339090 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 17:27:12 -0000 Author: asomers Date: Tue Oct 2 17:27:10 2018 New Revision: 339090 URL: https://svnweb.freebsd.org/changeset/base/339090 Log: MFC r335319, r335354, r335374 r335319: audit(4): add tests for send, recv, sendto, and recvfrom Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15869 r335354: audit(4): add tests for ioctl(2) Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15872 r335374: audit(4): add tests for utimes(2) and friends, mprotect, and undelete Includes utimes(2), futimes(2), lutimes(2), futimesat(2), mprotect(2), and undelete(2). undelete, for now, is tested only in failure mode. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15893 Added: stable/11/tests/sys/audit/ioctl.c - copied unchanged from r335354, head/tests/sys/audit/ioctl.c Modified: stable/11/tests/sys/audit/Makefile stable/11/tests/sys/audit/file-attribute-modify.c stable/11/tests/sys/audit/network.c Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/audit/Makefile ============================================================================== --- stable/11/tests/sys/audit/Makefile Tue Oct 2 17:07:10 2018 (r339089) +++ stable/11/tests/sys/audit/Makefile Tue Oct 2 17:27:10 2018 (r339090) @@ -10,6 +10,7 @@ ATF_TESTS_C+= file-close ATF_TESTS_C+= file-write ATF_TESTS_C+= file-read ATF_TESTS_C+= open +ATF_TESTS_C+= ioctl ATF_TESTS_C+= network ATF_TESTS_C+= administrative @@ -29,6 +30,8 @@ SRCS.file-read+= file-read.c SRCS.file-read+= utils.c SRCS.open+= open.c SRCS.open+= utils.c +SRCS.ioctl+= ioctl.c +SRCS.ioctl+= utils.c SRCS.network+= network.c SRCS.network+= utils.c SRCS.administrative+= administrative.c Modified: stable/11/tests/sys/audit/file-attribute-modify.c ============================================================================== --- stable/11/tests/sys/audit/file-attribute-modify.c Tue Oct 2 17:07:10 2018 (r339089) +++ stable/11/tests/sys/audit/file-attribute-modify.c Tue Oct 2 17:27:10 2018 (r339090) @@ -28,10 +28,13 @@ #include #include #include +#include #include +#include #include #include +#include #include #include "utils.h" @@ -689,6 +692,257 @@ ATF_TC_CLEANUP(lchflags_failure, tc) } +ATF_TC_WITH_CLEANUP(utimes_success); +ATF_TC_HEAD(utimes_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "utimes(2) call"); +} + +ATF_TC_BODY(utimes_success, tc) +{ + /* File needs to exist to call utimes(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, utimes(path, NULL)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(utimes_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(utimes_failure); +ATF_TC_HEAD(utimes_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "utimes(2) call"); +} + +ATF_TC_BODY(utimes_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, utimes(errpath, NULL)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(utimes_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(futimes_success); +ATF_TC_HEAD(futimes_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "futimes(2) call"); +} + +ATF_TC_BODY(futimes_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "futimes.*%d.*ret.*success", pid); + + /* File needs to exist to call futimes(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, futimes(filedesc, NULL)); + check_audit(fds, extregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(futimes_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(futimes_failure); +ATF_TC_HEAD(futimes_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "futimes(2) call"); +} + +ATF_TC_BODY(futimes_failure, tc) +{ + const char *regex = "futimes.*return,failure : Bad file descriptor"; + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, futimes(-1, NULL)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(futimes_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lutimes_success); +ATF_TC_HEAD(lutimes_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "lutimes(2) call"); +} + +ATF_TC_BODY(lutimes_success, tc) +{ + /* Symbolic link needs to exist to call lutimes(2) */ + ATF_REQUIRE_EQ(0, symlink("symlink", path)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, lutimes(path, NULL)); + check_audit(fds, successreg, pipefd); +} + +ATF_TC_CLEANUP(lutimes_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(lutimes_failure); +ATF_TC_HEAD(lutimes_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "lutimes(2) call"); +} + +ATF_TC_BODY(lutimes_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: symbolic link does not exist */ + ATF_REQUIRE_EQ(-1, lutimes(errpath, NULL)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(lutimes_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(futimesat_success); +ATF_TC_HEAD(futimesat_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "futimesat(2) call"); +} + +ATF_TC_BODY(futimesat_success, tc) +{ + /* File needs to exist to call futimesat(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, futimesat(AT_FDCWD, path, NULL)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(futimesat_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(futimesat_failure); +ATF_TC_HEAD(futimesat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "futimesat(2) call"); +} + +ATF_TC_BODY(futimesat_failure, tc) +{ + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, futimesat(AT_FDCWD, errpath, NULL)); + check_audit(fds, failurereg, pipefd); +} + +ATF_TC_CLEANUP(futimesat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(mprotect_success); +ATF_TC_HEAD(mprotect_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "mprotect(2) call"); +} + +ATF_TC_BODY(mprotect_success, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "mprotect.*%d.*ret.*success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, mprotect(NULL, 0, PROT_NONE)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(mprotect_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(mprotect_failure); +ATF_TC_HEAD(mprotect_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "mprotect(2) call"); +} + +ATF_TC_BODY(mprotect_failure, tc) +{ + const char *regex = "mprotect.*return,failure : Invalid argument"; + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(-1, mprotect((void *)SIZE_MAX, -1, PROT_NONE)); + check_audit(fds, regex, pipefd); +} + +ATF_TC_CLEANUP(mprotect_failure, tc) +{ + cleanup(); +} + +/* + * undelete(2) only works on whiteout files in union file system. Hence, no + * test case for successful invocation. + */ + +ATF_TC_WITH_CLEANUP(undelete_failure); +ATF_TC_HEAD(undelete_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "undelete(2) call"); +} + +ATF_TC_BODY(undelete_failure, tc) +{ + pid = getpid(); + snprintf(extregex, sizeof(extregex), "undelete.*%d.*ret.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: File does not exist */ + ATF_REQUIRE_EQ(-1, undelete(errpath)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(undelete_failure, tc) +{ + cleanup(); +} + + ATF_TC_WITH_CLEANUP(extattr_set_file_success); ATF_TC_HEAD(extattr_set_file_success, tc) { @@ -1049,6 +1303,19 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, fchflags_failure); ATF_TP_ADD_TC(tp, lchflags_success); ATF_TP_ADD_TC(tp, lchflags_failure); + + ATF_TP_ADD_TC(tp, utimes_success); + ATF_TP_ADD_TC(tp, utimes_failure); + ATF_TP_ADD_TC(tp, futimes_success); + ATF_TP_ADD_TC(tp, futimes_failure); + ATF_TP_ADD_TC(tp, lutimes_success); + ATF_TP_ADD_TC(tp, lutimes_failure); + ATF_TP_ADD_TC(tp, futimesat_success); + ATF_TP_ADD_TC(tp, futimesat_failure); + + ATF_TP_ADD_TC(tp, mprotect_success); + ATF_TP_ADD_TC(tp, mprotect_failure); + ATF_TP_ADD_TC(tp, undelete_failure); ATF_TP_ADD_TC(tp, extattr_set_file_success); ATF_TP_ADD_TC(tp, extattr_set_file_failure); Copied: stable/11/tests/sys/audit/ioctl.c (from r335354, head/tests/sys/audit/ioctl.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tests/sys/audit/ioctl.c Tue Oct 2 17:27:10 2018 (r339090, copy of r335354, head/tests/sys/audit/ioctl.c) @@ -0,0 +1,103 @@ +/*- + * Copyright (c) 2018 Aniket Pandey + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include + +#include +#include + +#include +#include +#include + +#include "utils.h" + +static int filedesc; +static char ioregex[80]; +static const char *auclass = "io"; +static struct pollfd fds[1]; +static unsigned long request = AUDITPIPE_FLUSH; + + +ATF_TC_WITH_CLEANUP(ioctl_success); +ATF_TC_HEAD(ioctl_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "ioctl(2) call"); +} + +ATF_TC_BODY(ioctl_success, tc) +{ + /* auditpipe(4) supports quite a few ioctls */ + ATF_REQUIRE((filedesc = open("/dev/auditpipe", O_RDONLY)) != -1); + /* Prepare the regex to be checked in the audit record */ + snprintf(ioregex, sizeof(ioregex), + "ioctl.*%#lx.*%#x.*return,success", request, filedesc); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE(ioctl(filedesc, request) != -1); + check_audit(fds, ioregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(ioctl_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(ioctl_failure); +ATF_TC_HEAD(ioctl_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "ioctl(2) call"); +} + +ATF_TC_BODY(ioctl_failure, tc) +{ + snprintf(ioregex, sizeof(ioregex), + "ioctl.*%#lx.*return,failure : Bad file descriptor", request); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid file descriptor */ + ATF_REQUIRE_EQ(-1, ioctl(-1, request)); + check_audit(fds, ioregex, pipefd); +} + +ATF_TC_CLEANUP(ioctl_failure, tc) +{ + cleanup(); +} + + +ATF_TP_ADD_TCS(tp) +{ + ATF_TP_ADD_TC(tp, ioctl_success); + ATF_TP_ADD_TC(tp, ioctl_failure); + + return (atf_no_error()); +} Modified: stable/11/tests/sys/audit/network.c ============================================================================== --- stable/11/tests/sys/audit/network.c Tue Oct 2 17:07:10 2018 (r339089) +++ stable/11/tests/sys/audit/network.c Tue Oct 2 17:27:10 2018 (r339090) @@ -32,16 +32,20 @@ #include #include #include -#include #include "utils.h" +#define MAX_DATA 1024 #define SERVER_PATH "server" -static int sockfd, sockfd2; -static socklen_t len; +static int sockfd, sockfd2, connectfd; +static ssize_t data_bytes; static struct pollfd fds[1]; +static struct sockaddr_un server; static char extregex[80]; +static char data[MAX_DATA]; +static socklen_t len = sizeof(struct sockaddr_un); +static char msgbuff[MAX_DATA] = "This message does not exist"; static const char *auclass = "nt"; static const char *nosupregex = "return,failure : Address family " "not supported by protocol family"; @@ -66,11 +70,11 @@ close_sockets(int count, ...) * Assign local filesystem address to a Unix domain socket */ static void -assign_address(struct sockaddr_un *server) +assign_address(struct sockaddr_un *serveraddr) { - memset(server, 0, sizeof(*server)); - server->sun_family = AF_UNIX; - strcpy(server->sun_path, SERVER_PATH); + memset(serveraddr, 0, sizeof(*serveraddr)); + serveraddr->sun_family = AF_UNIX; + strcpy(serveraddr->sun_path, SERVER_PATH); } @@ -203,12 +207,10 @@ ATF_TC_HEAD(setsockopt_failure, tc) ATF_TC_BODY(setsockopt_failure, tc) { - int tr = 1; snprintf(extregex, sizeof(extregex), "setsockopt.*%s", invalregex); FILE *pipefd = setup(fds, auclass); /* Failure reason: Invalid socket descriptor */ - ATF_REQUIRE_EQ(-1, setsockopt(-1, SOL_SOCKET, - SO_REUSEADDR, &tr, sizeof(int))); + ATF_REQUIRE_EQ(-1, setsockopt(-1, SOL_SOCKET, 0, NULL, 0)); check_audit(fds, extregex, pipefd); } @@ -227,10 +229,7 @@ ATF_TC_HEAD(bind_success, tc) ATF_TC_BODY(bind_success, tc) { - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - /* Preliminary socket setup */ ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); /* Check the presence of AF_UNIX address path in audit record */ @@ -258,10 +257,7 @@ ATF_TC_HEAD(bind_failure, tc) ATF_TC_BODY(bind_failure, tc) { - /* Preliminary socket setup */ - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); /* Check the presence of AF_UNIX path in audit record */ snprintf(extregex, sizeof(extregex), "bind.*%s.*return,failure", SERVER_PATH); @@ -287,10 +283,7 @@ ATF_TC_HEAD(bindat_success, tc) ATF_TC_BODY(bindat_success, tc) { - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - /* Preliminary socket setup */ ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); /* Check the presence of socket descriptor in audit record */ @@ -319,10 +312,7 @@ ATF_TC_HEAD(bindat_failure, tc) ATF_TC_BODY(bindat_failure, tc) { - /* Preliminary socket setup */ - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); snprintf(extregex, sizeof(extregex), "bindat.*%s", invalregex); FILE *pipefd = setup(fds, auclass); @@ -347,10 +337,7 @@ ATF_TC_HEAD(listen_success, tc) ATF_TC_BODY(listen_success, tc) { - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - /* Preliminary socket setup */ ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); @@ -401,14 +388,9 @@ ATF_TC_HEAD(connect_success, tc) ATF_TC_BODY(connect_success, tc) { - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - - /* Setup a non-blocking server socket */ - ATF_REQUIRE((sockfd = socket(PF_UNIX, - SOCK_STREAM | SOCK_NONBLOCK, 0)) != -1); - /* Bind to the specified address and wait for connection */ + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); ATF_REQUIRE_EQ(0, listen(sockfd, 1)); @@ -442,11 +424,7 @@ ATF_TC_HEAD(connect_failure, tc) ATF_TC_BODY(connect_failure, tc) { - /* Preliminary socket setup */ - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - /* Audit record must contain AF_UNIX address path */ snprintf(extregex, sizeof(extregex), "connect.*%s.*return,failure", SERVER_PATH); @@ -472,14 +450,9 @@ ATF_TC_HEAD(connectat_success, tc) ATF_TC_BODY(connectat_success, tc) { - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - - /* Setup a non-blocking server socket */ - ATF_REQUIRE((sockfd = socket(PF_UNIX, - SOCK_STREAM | SOCK_NONBLOCK, 0)) != -1); - /* Bind to the specified address and wait for connection */ + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); ATF_REQUIRE_EQ(0, listen(sockfd, 1)); @@ -514,10 +487,7 @@ ATF_TC_HEAD(connectat_failure, tc) ATF_TC_BODY(connectat_failure, tc) { - /* Preliminary socket setup */ - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); snprintf(extregex, sizeof(extregex), "connectat.*%s", invalregex); FILE *pipefd = setup(fds, auclass); @@ -542,15 +512,9 @@ ATF_TC_HEAD(accept_success, tc) ATF_TC_BODY(accept_success, tc) { - int clientfd; - struct sockaddr_un server; assign_address(&server); - len = sizeof(struct sockaddr_un); - - /* Setup a non-blocking server socket */ - ATF_REQUIRE((sockfd = socket(PF_UNIX, - SOCK_STREAM | SOCK_NONBLOCK, 0)) != -1); - /* Bind to the specified address and wait for connection */ + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); ATF_REQUIRE_EQ(0, listen(sockfd, 1)); @@ -559,15 +523,15 @@ ATF_TC_BODY(accept_success, tc) ATF_REQUIRE_EQ(0, connect(sockfd2, (struct sockaddr *)&server, len)); FILE *pipefd = setup(fds, auclass); - ATF_REQUIRE((clientfd = accept(sockfd, NULL, &len)) != -1); + ATF_REQUIRE((connectfd = accept(sockfd, NULL, &len)) != -1); - /* Audit record must contain clientfd & sockfd */ + /* Audit record must contain connectfd & sockfd */ snprintf(extregex, sizeof(extregex), - "accept.*0x%x.*return,success,%d", sockfd, clientfd); + "accept.*0x%x.*return,success,%d", sockfd, connectfd); check_audit(fds, extregex, pipefd); /* Close all socket descriptors */ - close_sockets(3, sockfd, sockfd2, clientfd); + close_sockets(3, sockfd, sockfd2, connectfd); } ATF_TC_CLEANUP(accept_success, tc) @@ -585,14 +549,10 @@ ATF_TC_HEAD(accept_failure, tc) ATF_TC_BODY(accept_failure, tc) { - /* Preliminary socket setup */ - struct sockaddr_un client; - len = sizeof(struct sockaddr_un); snprintf(extregex, sizeof(extregex), "accept.*%s", invalregex); - FILE *pipefd = setup(fds, auclass); /* Failure reason: Invalid socket descriptor */ - ATF_REQUIRE_EQ(-1, accept(-1, (struct sockaddr *)&client, &len)); + ATF_REQUIRE_EQ(-1, accept(-1, NULL, NULL)); check_audit(fds, extregex, pipefd); } @@ -602,6 +562,252 @@ ATF_TC_CLEANUP(accept_failure, tc) } +ATF_TC_WITH_CLEANUP(send_success); +ATF_TC_HEAD(send_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "send(2) call"); +} + +ATF_TC_BODY(send_success, tc) +{ + assign_address(&server); + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); + ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); + ATF_REQUIRE_EQ(0, listen(sockfd, 1)); + + /* Set up "blocking" client and connect with non-blocking server */ + ATF_REQUIRE((sockfd2 = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); + ATF_REQUIRE_EQ(0, connect(sockfd2, (struct sockaddr *)&server, len)); + ATF_REQUIRE((connectfd = accept(sockfd, NULL, &len)) != -1); + + /* Send a sample message to the connected socket */ + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((data_bytes = + send(sockfd2, msgbuff, strlen(msgbuff), 0)) != -1); + + /* Audit record must contain sockfd2 and data_bytes */ + snprintf(extregex, sizeof(extregex), + "send.*0x%x.*return,success,%zd", sockfd2, data_bytes); + check_audit(fds, extregex, pipefd); + + /* Close all socket descriptors */ + close_sockets(3, sockfd, sockfd2, connectfd); +} + +ATF_TC_CLEANUP(send_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(send_failure); +ATF_TC_HEAD(send_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "send(2) call"); +} + +ATF_TC_BODY(send_failure, tc) +{ + snprintf(extregex, sizeof(extregex), "send.*%s", invalregex); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid socket descriptor */ + ATF_REQUIRE_EQ(-1, send(-1, NULL, 0, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(send_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(recv_success); +ATF_TC_HEAD(recv_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "recv(2) call"); +} + +ATF_TC_BODY(recv_success, tc) +{ + assign_address(&server); + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); + ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); + ATF_REQUIRE_EQ(0, listen(sockfd, 1)); + + /* Set up "blocking" client and connect with non-blocking server */ + ATF_REQUIRE((sockfd2 = socket(PF_UNIX, SOCK_STREAM, 0)) != -1); + ATF_REQUIRE_EQ(0, connect(sockfd2, (struct sockaddr *)&server, len)); + ATF_REQUIRE((connectfd = accept(sockfd, NULL, &len)) != -1); + /* Send a sample message to the connected socket */ + ATF_REQUIRE(send(sockfd2, msgbuff, strlen(msgbuff), 0) != -1); + + /* Receive data once connectfd is ready for reading */ + FILE *pipefd = setup(fds, auclass); + //ATF_REQUIRE(check_readfs(connectfd) != 0); + ATF_REQUIRE((data_bytes = recv(connectfd, data, MAX_DATA, 0)) != 0); + + /* Audit record must contain connectfd and data_bytes */ + snprintf(extregex, sizeof(extregex), + "recv.*0x%x.*return,success,%zd", connectfd, data_bytes); + check_audit(fds, extregex, pipefd); + + /* Close all socket descriptors */ + close_sockets(3, sockfd, sockfd2, connectfd); +} + +ATF_TC_CLEANUP(recv_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(recv_failure); +ATF_TC_HEAD(recv_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "recv(2) call"); +} + +ATF_TC_BODY(recv_failure, tc) +{ + snprintf(extregex, sizeof(extregex), "recv.*%s", invalregex); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid socket descriptor */ + ATF_REQUIRE_EQ(-1, recv(-1, NULL, 0, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(recv_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sendto_success); +ATF_TC_HEAD(sendto_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "sendto(2) call"); +} + +ATF_TC_BODY(sendto_success, tc) +{ + assign_address(&server); + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_DGRAM, 0)) != -1); + ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); + + /* Set up client socket to be used for sending the data */ + ATF_REQUIRE((sockfd2 = socket(PF_UNIX, SOCK_DGRAM, 0)) != -1); + + /* Send a sample message to server's address */ + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((data_bytes = sendto(sockfd2, msgbuff, + strlen(msgbuff), 0, (struct sockaddr *)&server, len)) != -1); + + /* Audit record must contain sockfd2 and data_bytes */ + snprintf(extregex, sizeof(extregex), + "sendto.*0x%x.*return,success,%zd", sockfd2, data_bytes); + check_audit(fds, extregex, pipefd); + + /* Close all socket descriptors */ + close_sockets(2, sockfd, sockfd2); +} + +ATF_TC_CLEANUP(sendto_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sendto_failure); +ATF_TC_HEAD(sendto_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "sendto(2) call"); +} + +ATF_TC_BODY(sendto_failure, tc) +{ + snprintf(extregex, sizeof(extregex), "sendto.*%s", invalregex); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid socket descriptor */ + ATF_REQUIRE_EQ(-1, sendto(-1, NULL, 0, 0, NULL, 0)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(sendto_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(recvfrom_success); +ATF_TC_HEAD(recvfrom_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "recvfrom(2) call"); +} + +ATF_TC_BODY(recvfrom_success, tc) +{ + assign_address(&server); + /* Setup a server socket and bind to the specified address */ + ATF_REQUIRE((sockfd = socket(PF_UNIX, SOCK_DGRAM, 0)) != -1); + ATF_REQUIRE_EQ(0, bind(sockfd, (struct sockaddr *)&server, len)); + + /* Set up client socket to be used for sending the data */ + ATF_REQUIRE((sockfd2 = socket(PF_UNIX, SOCK_DGRAM, 0)) != -1); + ATF_REQUIRE(sendto(sockfd2, msgbuff, strlen(msgbuff), 0, + (struct sockaddr *)&server, len) != -1); + + /* Receive data once sockfd is ready for reading */ + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE((data_bytes = recvfrom(sockfd, data, + MAX_DATA, 0, NULL, &len)) != 0); + + /* Audit record must contain sockfd and data_bytes */ + snprintf(extregex, sizeof(extregex), + "recvfrom.*0x%x.*return,success,%zd", sockfd, data_bytes); + check_audit(fds, extregex, pipefd); + + /* Close all socket descriptors */ + close_sockets(2, sockfd, sockfd2); +} + +ATF_TC_CLEANUP(recvfrom_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(recvfrom_failure); +ATF_TC_HEAD(recvfrom_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "recvfrom(2) call"); +} + +ATF_TC_BODY(recvfrom_failure, tc) +{ + snprintf(extregex, sizeof(extregex), "recvfrom.*%s", invalregex); + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid socket descriptor */ + ATF_REQUIRE_EQ(-1, recvfrom(-1, NULL, 0, 0, NULL, NULL)); + check_audit(fds, extregex, pipefd); +} + +ATF_TC_CLEANUP(recvfrom_failure, tc) +{ + cleanup(); +} + + ATF_TP_ADD_TCS(tp) { ATF_TP_ADD_TC(tp, socket_success); @@ -624,6 +830,16 @@ ATF_TP_ADD_TCS(tp) ATF_TP_ADD_TC(tp, connectat_failure); ATF_TP_ADD_TC(tp, accept_success); ATF_TP_ADD_TC(tp, accept_failure); + + ATF_TP_ADD_TC(tp, send_success); + ATF_TP_ADD_TC(tp, send_failure); + ATF_TP_ADD_TC(tp, recv_success); + ATF_TP_ADD_TC(tp, recv_failure); + + ATF_TP_ADD_TC(tp, sendto_success); + ATF_TP_ADD_TC(tp, sendto_failure); + ATF_TP_ADD_TC(tp, recvfrom_success); + ATF_TP_ADD_TC(tp, recvfrom_failure); return (atf_no_error()); } From owner-svn-src-stable-11@freebsd.org Tue Oct 2 17:38:59 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F7EA10A630C; Tue, 2 Oct 2018 17:38:59 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0FCC574B3D; Tue, 2 Oct 2018 17:38:59 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 08F432FD2; Tue, 2 Oct 2018 17:38:59 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92Hcxiw077468; Tue, 2 Oct 2018 17:38:59 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92HcxLO077467; Tue, 2 Oct 2018 17:38:59 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021738.w92HcxLO077467@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 17:38:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339092 - stable/11/tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/audit X-SVN-Commit-Revision: 339092 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 17:38:59 -0000 Author: asomers Date: Tue Oct 2 17:38:58 2018 New Revision: 339092 URL: https://svnweb.freebsd.org/changeset/base/339092 Log: MFC r335792, r336564, r336579 r335792: audit(4): add tests for several more administrative syscalls Includes ntp_adjtime, auditctl, acct, auditon, and clock_settime. Includes quotactl, mount, nmount, swapon, and swapoff in failure mode only. Success tests for those syscalls will follow. Also includes reboot(2) in failure mode only. That one can't be tested in success mode. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D15898 r336564: Separate the audit(4) tests for auditon(2)'s individual commands auditon(2) is an ioctl-like syscall with several different variants, each of which has a distinct audit event. Write separate audit(4) tests for each variant. Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D16255 r336579: audit(4): add more test cases for auditon(2) auditon(2) is an ioctl-like syscall with several different variants, each of which has a distinct audit event. This commit tests the remaining variants that weren't tested in r336564. Submitted by: aniketp X-MFC-With: 336564 Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D16381 Modified: stable/11/tests/sys/audit/administrative.c Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/audit/administrative.c ============================================================================== --- stable/11/tests/sys/audit/administrative.c Tue Oct 2 17:29:56 2018 (r339091) +++ stable/11/tests/sys/audit/administrative.c Tue Oct 2 17:38:58 2018 (r339092) @@ -27,21 +27,36 @@ #include #include +#include +#include +#include #include +#include +#include +#include +#include +#include + #include +#include #include +#include +#include #include #include "utils.h" static pid_t pid; static int filedesc; +/* Default argument for handling ENOSYS in auditon(2) functions */ +static int auditon_def = 0; static mode_t mode = 0777; static struct pollfd fds[1]; static char adregex[80]; static const char *auclass = "ad"; static const char *path = "fileforaudit"; +static const char *successreg = "fileforaudit.*return,success"; ATF_TC_WITH_CLEANUP(settimeofday_success); @@ -101,6 +116,60 @@ ATF_TC_CLEANUP(settimeofday_failure, tc) } +ATF_TC_WITH_CLEANUP(clock_settime_success); +ATF_TC_HEAD(clock_settime_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "clock_settime(2) call"); +} + +ATF_TC_BODY(clock_settime_success, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "clock_settime.*%d.*success", pid); + + struct timespec tp; + ATF_REQUIRE_EQ(0, clock_gettime(CLOCK_REALTIME, &tp)); + + FILE *pipefd = setup(fds, auclass); + /* Setting the same time as obtained by clock_gettime(2) */ + ATF_REQUIRE_EQ(0, clock_settime(CLOCK_REALTIME, &tp)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(clock_settime_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(clock_settime_failure); +ATF_TC_HEAD(clock_settime_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "clock_settime(2) call"); +} + +ATF_TC_BODY(clock_settime_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "clock_settime.*%d.*failure", pid); + + struct timespec tp; + ATF_REQUIRE_EQ(0, clock_gettime(CLOCK_MONOTONIC, &tp)); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: cannot use CLOCK_MONOTONIC to set the system time */ + ATF_REQUIRE_EQ(-1, clock_settime(CLOCK_MONOTONIC, &tp)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(clock_settime_failure, tc) +{ + cleanup(); +} + + ATF_TC_WITH_CLEANUP(adjtime_success); ATF_TC_HEAD(adjtime_success, tc) { @@ -115,7 +184,7 @@ ATF_TC_BODY(adjtime_success, tc) FILE *pipefd = setup(fds, auclass); /* We don't want to change the system time, hence NULL */ - ATF_REQUIRE_EQ(0, adjtime(NULL,NULL)); + ATF_REQUIRE_EQ(0, adjtime(NULL, NULL)); check_audit(fds, adregex, pipefd); } @@ -148,7 +217,55 @@ ATF_TC_CLEANUP(adjtime_failure, tc) } +ATF_TC_WITH_CLEANUP(ntp_adjtime_success); +ATF_TC_HEAD(ntp_adjtime_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "ntp_adjtime(2) call"); +} +ATF_TC_BODY(ntp_adjtime_success, tc) +{ + struct timex timebuff; + bzero(&timebuff, sizeof(timebuff)); + + pid = getpid(); + snprintf(adregex, sizeof(adregex), "ntp_adjtime.*%d.*success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE(ntp_adjtime(&timebuff) != -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(ntp_adjtime_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(ntp_adjtime_failure); +ATF_TC_HEAD(ntp_adjtime_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "ntp_adjtime(2) call"); +} + +ATF_TC_BODY(ntp_adjtime_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "ntp_adjtime.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(-1, ntp_adjtime(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(ntp_adjtime_failure, tc) +{ + cleanup(); +} + + ATF_TC_WITH_CLEANUP(nfs_getfh_success); ATF_TC_HEAD(nfs_getfh_success, tc) { @@ -200,6 +317,137 @@ ATF_TC_CLEANUP(nfs_getfh_failure, tc) } +ATF_TC_WITH_CLEANUP(auditctl_success); +ATF_TC_HEAD(auditctl_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditctl(2) call"); +} + +ATF_TC_BODY(auditctl_success, tc) +{ + /* File needs to exist in order to call auditctl(2) */ + ATF_REQUIRE((filedesc = open(path, O_CREAT | O_WRONLY, mode)) != -1); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditctl(path)); + check_audit(fds, successreg, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(auditctl_success, tc) +{ + /* + * auditctl(2) disables audit log at /var/audit and initiates auditing + * at the configured path. To reset this, we need to stop and start the + * auditd(8) again. Here, we check if auditd(8) was running already + * before the test started. If so, we stop and start it again. + */ + system("service auditd onestop > /dev/null 2>&1"); + if (!atf_utils_file_exists("started_auditd")) + system("service auditd onestart > /dev/null 2>&1"); +} + + +ATF_TC_WITH_CLEANUP(auditctl_failure); +ATF_TC_HEAD(auditctl_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditctl(2) call"); +} + +ATF_TC_BODY(auditctl_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "auditctl.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: file does not exist */ + ATF_REQUIRE_EQ(-1, auditctl(NULL)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditctl_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(acct_success); +ATF_TC_HEAD(acct_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "acct(2) call"); +} + +ATF_TC_BODY(acct_success, tc) +{ + int acctinfo, filedesc2; + size_t len = sizeof(acctinfo); + const char *acctname = "kern.acct_configured"; + ATF_REQUIRE_EQ(0, sysctlbyname(acctname, &acctinfo, &len, NULL, 0)); + + /* File needs to exist to start system accounting */ + ATF_REQUIRE((filedesc = open(path, O_CREAT | O_RDWR, mode)) != -1); + + /* + * acctinfo = 0: System accounting was disabled + * acctinfo = 1: System accounting was enabled + */ + if (acctinfo) { + ATF_REQUIRE((filedesc2 = open("acct_ok", O_CREAT, mode)) != -1); + close(filedesc2); + } + + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "acct.*%s.*%d.*return,success", path, pid); + + /* + * We temporarily switch the accounting record to a file at + * our own configured path in order to confirm acct(2)'s successful + * auditing. Then we set everything back to its original state. + */ + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, acct(path)); + check_audit(fds, adregex, pipefd); + close(filedesc); +} + +ATF_TC_CLEANUP(acct_success, tc) +{ + /* Reset accounting configured path */ + ATF_REQUIRE_EQ(0, system("service accounting onestop")); + if (atf_utils_file_exists("acct_ok")) { + ATF_REQUIRE_EQ(0, system("service accounting onestart")); + } + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(acct_failure); +ATF_TC_HEAD(acct_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "acct(2) call"); +} + +ATF_TC_BODY(acct_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "acct.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: File does not exist */ + ATF_REQUIRE_EQ(-1, acct(path)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(acct_failure, tc) +{ + cleanup(); +} + + ATF_TC_WITH_CLEANUP(getauid_success); ATF_TC_HEAD(getauid_success, tc) { @@ -494,16 +742,888 @@ ATF_TC_CLEANUP(setaudit_addr_failure, tc) cleanup(); } +/* + * Note: The test-case uses A_GETFSIZE as the command argument but since it is + * not an independent audit event, it will be used to check the default mode + * auditing of auditon(2) system call. + * + * Please See: sys/security/audit/audit_bsm_klib.c + * function(): au_event_t auditon_command_event() :: case A_GETFSIZE: + */ +ATF_TC_WITH_CLEANUP(auditon_default_success); +ATF_TC_HEAD(auditon_default_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call"); +} +ATF_TC_BODY(auditon_default_success, tc) +{ + au_fstat_t fsize_arg; + bzero(&fsize_arg, sizeof(au_fstat_t)); + + pid = getpid(); + snprintf(adregex, sizeof(adregex), "auditon.*%d.*return,success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETFSIZE, &fsize_arg, sizeof(fsize_arg))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_default_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_default_failure); +ATF_TC_HEAD(auditon_default_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call"); +} + +ATF_TC_BODY(auditon_default_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "auditon.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, auditon(A_GETFSIZE, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_default_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getpolicy_success); +ATF_TC_HEAD(auditon_getpolicy_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_GETPOLICY"); +} + +ATF_TC_BODY(auditon_getpolicy_success, tc) +{ + int aupolicy; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "GPOLICY command.*%d.*success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETPOLICY, &aupolicy, sizeof(aupolicy))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getpolicy_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getpolicy_failure); +ATF_TC_HEAD(auditon_getpolicy_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETPOLICY"); +} + +ATF_TC_BODY(auditon_getpolicy_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "GPOLICY command.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, auditon(A_GETPOLICY, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getpolicy_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setpolicy_success); +ATF_TC_HEAD(auditon_setpolicy_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_SETPOLICY"); +} + +ATF_TC_BODY(auditon_setpolicy_success, tc) +{ + int aupolicy; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "SPOLICY command.*%d.*success", pid); + + /* Retrieve the current auditing policy, to be used with A_SETPOLICY */ + ATF_REQUIRE_EQ(0, auditon(A_GETPOLICY, &aupolicy, sizeof(aupolicy))); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_SETPOLICY, &aupolicy, sizeof(aupolicy))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setpolicy_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setpolicy_failure); +ATF_TC_HEAD(auditon_setpolicy_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETPOLICY"); +} + +ATF_TC_BODY(auditon_setpolicy_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "SPOLICY command.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, auditon(A_SETPOLICY, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setpolicy_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getkmask_success); +ATF_TC_HEAD(auditon_getkmask_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_GETKMASK"); +} + +ATF_TC_BODY(auditon_getkmask_success, tc) +{ + pid = getpid(); + au_mask_t evmask; + snprintf(adregex, sizeof(adregex), "get kernel mask.*%d.*success", pid); + + bzero(&evmask, sizeof(evmask)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETKMASK, &evmask, sizeof(evmask))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getkmask_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getkmask_failure); +ATF_TC_HEAD(auditon_getkmask_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETKMASK"); +} + +ATF_TC_BODY(auditon_getkmask_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get kernel mask.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_mask_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_GETKMASK, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getkmask_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setkmask_success); +ATF_TC_HEAD(auditon_setkmask_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_SETKMASK"); +} + +ATF_TC_BODY(auditon_setkmask_success, tc) +{ + pid = getpid(); + au_mask_t evmask; + snprintf(adregex, sizeof(adregex), "set kernel mask.*%d.*success", pid); + + /* Retrieve the current audit mask to be used with A_SETKMASK */ + bzero(&evmask, sizeof(evmask)); + ATF_REQUIRE_EQ(0, auditon(A_GETKMASK, &evmask, sizeof(evmask))); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_SETKMASK, &evmask, sizeof(evmask))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setkmask_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setkmask_failure); +ATF_TC_HEAD(auditon_setkmask_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETKMASK"); +} + +ATF_TC_BODY(auditon_setkmask_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "set kernel mask.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_mask_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_SETKMASK, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setkmask_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getqctrl_success); +ATF_TC_HEAD(auditon_getqctrl_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_GETQCTRL"); +} + +ATF_TC_BODY(auditon_getqctrl_success, tc) +{ + pid = getpid(); + au_qctrl_t evqctrl; + snprintf(adregex, sizeof(adregex), "GQCTRL command.*%d.*success", pid); + + bzero(&evqctrl, sizeof(evqctrl)); + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETQCTRL, &evqctrl, sizeof(evqctrl))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getqctrl_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getqctrl_failure); +ATF_TC_HEAD(auditon_getqctrl_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETQCTRL"); +} + +ATF_TC_BODY(auditon_getqctrl_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "GQCTRL command.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_qctrl_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_GETQCTRL, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getqctrl_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setqctrl_success); +ATF_TC_HEAD(auditon_setqctrl_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_SETKMASK"); +} + +ATF_TC_BODY(auditon_setqctrl_success, tc) +{ + pid = getpid(); + au_qctrl_t evqctrl; + snprintf(adregex, sizeof(adregex), "SQCTRL command.*%d.*success", pid); + + /* Retrieve the current audit mask to be used with A_SETQCTRL */ + bzero(&evqctrl, sizeof(evqctrl)); + ATF_REQUIRE_EQ(0, auditon(A_GETQCTRL, &evqctrl, sizeof(evqctrl))); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_SETQCTRL, &evqctrl, sizeof(evqctrl))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setqctrl_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setqctrl_failure); +ATF_TC_HEAD(auditon_setqctrl_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETKMASK"); +} + +ATF_TC_BODY(auditon_setqctrl_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "SQCTRL command.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_qctrl_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_SETQCTRL, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setqctrl_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getclass_success); +ATF_TC_HEAD(auditon_getclass_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_GETCLASS"); +} + +ATF_TC_BODY(auditon_getclass_success, tc) +{ + pid = getpid(); + au_evclass_map_t evclass; + snprintf(adregex, sizeof(adregex), "get event class.*%d.*success", pid); + + /* Initialize evclass to get the event-class mapping for auditon(2) */ + evclass.ec_number = AUE_AUDITON; + evclass.ec_class = 0; + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETCLASS, &evclass, sizeof(evclass))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getclass_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getclass_failure); +ATF_TC_HEAD(auditon_getclass_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETCLASS"); +} + +ATF_TC_BODY(auditon_getclass_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get event class.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_evclass_map_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_GETCLASS, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getclass_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setclass_success); +ATF_TC_HEAD(auditon_setclass_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_SETCLASS"); +} + +ATF_TC_BODY(auditon_setclass_success, tc) +{ + pid = getpid(); + au_evclass_map_t evclass; + snprintf(adregex, sizeof(adregex), "set event class.*%d.*success", pid); + + /* Initialize evclass and get the event-class mapping for auditon(2) */ + evclass.ec_number = AUE_AUDITON; + evclass.ec_class = 0; + ATF_REQUIRE_EQ(0, auditon(A_GETCLASS, &evclass, sizeof(evclass))); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_SETCLASS, &evclass, sizeof(evclass))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setclass_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setclass_failure); +ATF_TC_HEAD(auditon_setclass_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETCLASS"); +} + +ATF_TC_BODY(auditon_setclass_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "set event class.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid au_evclass_map_t structure */ + ATF_REQUIRE_EQ(-1, auditon(A_SETCLASS, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setclass_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getcond_success); +ATF_TC_HEAD(auditon_getcond_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_GETCOND"); +} + +ATF_TC_BODY(auditon_getcond_success, tc) +{ + int auditcond; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get audit state.*%d.*success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, auditon(A_GETCOND, &auditcond, sizeof(auditcond))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getcond_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getcond_failure); +ATF_TC_HEAD(auditon_getcond_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETCOND"); +} + +ATF_TC_BODY(auditon_getcond_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get audit state.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, auditon(A_GETCOND, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getcond_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setcond_success); +ATF_TC_HEAD(auditon_setcond_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "auditon(2) call for cmd: A_SETCOND"); +} + +ATF_TC_BODY(auditon_setcond_success, tc) +{ + int auditcond = AUC_AUDITING; + pid = getpid(); + snprintf(adregex, sizeof(adregex), "set audit state.*%d.*success", pid); + + FILE *pipefd = setup(fds, auclass); + /* At this point auditd is running, so the audit state is AUC_AUDITING */ + ATF_REQUIRE_EQ(0, auditon(A_SETCOND, &auditcond, sizeof(auditcond))); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setcond_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setcond_failure); +ATF_TC_HEAD(auditon_setcond_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETCOND"); +} + +ATF_TC_BODY(auditon_setcond_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "set audit state.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, auditon(A_SETCOND, NULL, 0)); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setcond_failure, tc) +{ + cleanup(); +} + +/* + * Following test-cases for auditon(2) are all in failure mode only as although + * auditable, they have not been implemented and return ENOSYS whenever called. + * + * Commands: A_GETCWD A_GETCAR A_GETSTAT A_SETSTAT A_SETUMASK A_SETSMASK + */ + +ATF_TC_WITH_CLEANUP(auditon_getcwd_failure); +ATF_TC_HEAD(auditon_getcwd_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETCWD"); +} + +ATF_TC_BODY(auditon_getcwd_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get cwd.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_ERRNO(ENOSYS, auditon(A_GETCWD, &auditon_def, + sizeof(auditon_def)) == -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getcwd_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getcar_failure); +ATF_TC_HEAD(auditon_getcar_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETCAR"); +} + +ATF_TC_BODY(auditon_getcar_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), "get car.*%d.*failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_ERRNO(ENOSYS, auditon(A_GETCAR, &auditon_def, + sizeof(auditon_def)) == -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getcar_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_getstat_failure); +ATF_TC_HEAD(auditon_getstat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_GETSTAT"); +} + +ATF_TC_BODY(auditon_getstat_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "get audit statistics.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_ERRNO(ENOSYS, auditon(A_GETSTAT, &auditon_def, + sizeof(auditon_def)) == -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_getstat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setstat_failure); +ATF_TC_HEAD(auditon_setstat_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETSTAT"); +} + +ATF_TC_BODY(auditon_setstat_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "set audit statistics.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_ERRNO(ENOSYS, auditon(A_SETSTAT, &auditon_def, + sizeof(auditon_def)) == -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setstat_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setumask_failure); +ATF_TC_HEAD(auditon_setumask_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETUMASK"); +} + +ATF_TC_BODY(auditon_setumask_failure, tc) +{ + pid = getpid(); + snprintf(adregex, sizeof(adregex), + "set mask per uid.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_ERRNO(ENOSYS, auditon(A_SETUMASK, &auditon_def, + sizeof(auditon_def)) == -1); + check_audit(fds, adregex, pipefd); +} + +ATF_TC_CLEANUP(auditon_setumask_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(auditon_setsmask_failure); +ATF_TC_HEAD(auditon_setsmask_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "auditon(2) call for cmd: A_SETSMASK"); *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Tue Oct 2 17:42:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CB5B510A66B0; Tue, 2 Oct 2018 17:42:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7723F7506E; Tue, 2 Oct 2018 17:42:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6E7F2317F; Tue, 2 Oct 2018 17:42:50 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92HgoaE082484; Tue, 2 Oct 2018 17:42:50 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92HgoO7082483; Tue, 2 Oct 2018 17:42:50 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021742.w92HgoO7082483@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 17:42:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339093 - stable/11/contrib/openbsm/libauditd X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/contrib/openbsm/libauditd X-SVN-Commit-Revision: 339093 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 17:42:51 -0000 Author: asomers Date: Tue Oct 2 17:42:50 2018 New Revision: 339093 URL: https://svnweb.freebsd.org/changeset/base/339093 Log: MFC r336613: auditd(8): Log a better error when no hostname is set in audit_control Cherry-pick from https://github.com/openbsm/openbsm/commit/01ba03b Reviewed by: cem Obtained from: OpenBSM Pull Request: https://github.com/openbsm/openbsm/pull/38 Modified: stable/11/contrib/openbsm/libauditd/auditd_lib.c Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/openbsm/libauditd/auditd_lib.c ============================================================================== --- stable/11/contrib/openbsm/libauditd/auditd_lib.c Tue Oct 2 17:38:58 2018 (r339092) +++ stable/11/contrib/openbsm/libauditd/auditd_lib.c Tue Oct 2 17:42:50 2018 (r339093) @@ -255,7 +255,8 @@ auditd_set_host(void) struct auditinfo_addr aia; int error, ret = ADE_NOERR; - if (getachost(auditd_host, sizeof(auditd_host)) != 0) { + if ((getachost(auditd_host, sizeof(auditd_host)) != 0) || + ((auditd_hostlen = strlen(auditd_host)) == 0)) { ret = ADE_PARSE; /* @@ -272,7 +273,6 @@ auditd_set_host(void) ret = ADE_AUDITON; return (ret); } - auditd_hostlen = strlen(auditd_host); error = getaddrinfo(auditd_host, NULL, NULL, &res); if (error) return (ADE_GETADDR); From owner-svn-src-stable-11@freebsd.org Tue Oct 2 17:57:17 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BE12210A6C0E; Tue, 2 Oct 2018 17:57:17 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 709907567B; Tue, 2 Oct 2018 17:57:17 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6B7BC3321; Tue, 2 Oct 2018 17:57:17 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92HvHNn087792; Tue, 2 Oct 2018 17:57:17 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92HvHBJ087790; Tue, 2 Oct 2018 17:57:17 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021757.w92HvHBJ087790@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 17:57:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339094 - in stable/11: etc/mtree tests/sys tests/sys/auditpipe X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: in stable/11: etc/mtree tests/sys tests/sys/auditpipe X-SVN-Commit-Revision: 339094 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 17:57:17 -0000 Author: asomers Date: Tue Oct 2 17:57:16 2018 New Revision: 339094 URL: https://svnweb.freebsd.org/changeset/base/339094 Log: MFC r336728: Introduce test program for auditpipe(4) Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D16395 Added: stable/11/tests/sys/auditpipe/ - copied from r336728, head/tests/sys/auditpipe/ Modified: stable/11/etc/mtree/BSD.tests.dist stable/11/tests/sys/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/etc/mtree/BSD.tests.dist ============================================================================== --- stable/11/etc/mtree/BSD.tests.dist Tue Oct 2 17:42:50 2018 (r339093) +++ stable/11/etc/mtree/BSD.tests.dist Tue Oct 2 17:57:16 2018 (r339094) @@ -422,6 +422,8 @@ .. audit .. + auditpipe + .. capsicum .. fifo Modified: stable/11/tests/sys/Makefile ============================================================================== --- stable/11/tests/sys/Makefile Tue Oct 2 17:42:50 2018 (r339093) +++ stable/11/tests/sys/Makefile Tue Oct 2 17:57:16 2018 (r339094) @@ -5,6 +5,7 @@ TESTSDIR= ${TESTSBASE}/sys TESTS_SUBDIRS+= acl TESTS_SUBDIRS+= aio TESTS_SUBDIRS+= audit +TESTS_SUBDIRS+= auditpipe TESTS_SUBDIRS+= capsicum TESTS_SUBDIRS+= fifo TESTS_SUBDIRS+= file From owner-svn-src-stable-11@freebsd.org Tue Oct 2 18:12:33 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C4E5710A730D; Tue, 2 Oct 2018 18:12:33 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 78A5D75FAB; Tue, 2 Oct 2018 18:12:33 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7365439EF; Tue, 2 Oct 2018 18:12:33 +0000 (UTC) (envelope-from asomers@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92ICXQe097904; Tue, 2 Oct 2018 18:12:33 GMT (envelope-from asomers@FreeBSD.org) Received: (from asomers@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92ICX5u097903; Tue, 2 Oct 2018 18:12:33 GMT (envelope-from asomers@FreeBSD.org) Message-Id: <201810021812.w92ICX5u097903@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: asomers set sender to asomers@FreeBSD.org using -f From: Alan Somers Date: Tue, 2 Oct 2018 18:12:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339095 - stable/11/tests/sys/audit X-SVN-Group: stable-11 X-SVN-Commit-Author: asomers X-SVN-Commit-Paths: stable/11/tests/sys/audit X-SVN-Commit-Revision: 339095 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 18:12:34 -0000 Author: asomers Date: Tue Oct 2 18:12:32 2018 New Revision: 339095 URL: https://svnweb.freebsd.org/changeset/base/339095 Log: MFC r336875: audit(4): add tests for sysctl(3) and sysarch(2) Submitted by: aniketp Sponsored by: Google, Inc. (GSoC 2018) Differential Revision: https://reviews.freebsd.org/D16116 Added: stable/11/tests/sys/audit/miscellaneous.c - copied unchanged from r336875, head/tests/sys/audit/miscellaneous.c Modified: stable/11/tests/sys/audit/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/audit/Makefile ============================================================================== --- stable/11/tests/sys/audit/Makefile Tue Oct 2 17:57:16 2018 (r339094) +++ stable/11/tests/sys/audit/Makefile Tue Oct 2 18:12:32 2018 (r339095) @@ -13,6 +13,7 @@ ATF_TESTS_C+= open ATF_TESTS_C+= ioctl ATF_TESTS_C+= network ATF_TESTS_C+= administrative +ATF_TESTS_C+= miscellaneous SRCS.file-attribute-access+= file-attribute-access.c SRCS.file-attribute-access+= utils.c @@ -36,6 +37,8 @@ SRCS.network+= network.c SRCS.network+= utils.c SRCS.administrative+= administrative.c SRCS.administrative+= utils.c +SRCS.miscellaneous+= miscellaneous.c +SRCS.miscellaneous+= utils.c TEST_METADATA+= timeout="30" TEST_METADATA+= required_user="root" Copied: stable/11/tests/sys/audit/miscellaneous.c (from r336875, head/tests/sys/audit/miscellaneous.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/tests/sys/audit/miscellaneous.c Tue Oct 2 18:12:32 2018 (r339095, copy of r336875, head/tests/sys/audit/miscellaneous.c) @@ -0,0 +1,232 @@ +/*- + * Copyright (c) 2018 Aniket Pandey + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include +#include + +#include +#include + +#include +#include + +#include "utils.h" + +static pid_t pid; +static char miscreg[80]; +static struct pollfd fds[1]; +static const char *auclass = "ot"; + + +/* + * Success case of audit(2) is skipped for now as the behaviour is quite + * undeterministic. It will be added when the intermittency is resolved. + */ + + +ATF_TC_WITH_CLEANUP(audit_failure); +ATF_TC_HEAD(audit_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "audit(2) call"); +} + +ATF_TC_BODY(audit_failure, tc) +{ + pid = getpid(); + snprintf(miscreg, sizeof(miscreg), "audit.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument */ + ATF_REQUIRE_EQ(-1, audit(NULL, -1)); + check_audit(fds, miscreg, pipefd); +} + +ATF_TC_CLEANUP(audit_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sysarch_success); +ATF_TC_HEAD(sysarch_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "sysarch(2) call"); +} + +ATF_TC_BODY(sysarch_success, tc) +{ + pid = getpid(); + snprintf(miscreg, sizeof(miscreg), "sysarch.*%d.*return,success", pid); + + /* Set sysnum to the syscall corresponding to the system architecture */ +#if defined(I386_GET_IOPERM) /* i386 */ + struct i386_ioperm_args i3sysarg; + bzero(&i3sysarg, sizeof(i3sysarg)); + +#elif defined(AMD64_GET_FSBASE) /* amd64 */ + register_t amd64arg; + +#elif defined(MIPS_GET_TLS) /* MIPS */ + char *mipsarg; + +#elif defined(ARM_SYNC_ICACHE) /* ARM */ + struct arm_sync_icache_args armsysarg; + bzero(&armsysarg, sizeof(armsysarg)); + +#elif defined(SPARC_UTRAP_INSTALL) /* Sparc64 */ + struct sparc_utrap_args handler = { + .type = UT_DIVISION_BY_ZERO, + /* We don't want to change the previous handlers */ + .new_precise = (void *)UTH_NOCHANGE, + .new_deferred = (void *)UTH_NOCHANGE, + .old_precise = NULL, + .old_deferred = NULL + }; + + struct sparc_utrap_install_args sparc64arg = { + .num = ST_DIVISION_BY_ZERO, + .handlers = &handler + }; +#else + /* For PowerPC, ARM64, RISCV archs, sysarch(2) is not supported */ + atf_tc_skip("sysarch(2) is not supported for the system architecture"); +#endif + + FILE *pipefd = setup(fds, auclass); +#if defined(I386_GET_IOPERM) + ATF_REQUIRE_EQ(0, sysarch(I386_GET_IOPERM, &i3sysarg)); +#elif defined(AMD64_GET_FSBASE) + ATF_REQUIRE_EQ(0, sysarch(AMD64_GET_FSBASE, &amd64arg)); +#elif defined(MIPS_GET_TLS) + ATF_REQUIRE_EQ(0, sysarch(MIPS_GET_TLS, &mipsarg)); +#elif defined(ARM_SYNC_ICACHE) + ATF_REQUIRE_EQ(0, sysarch(ARM_SYNC_ICACHE, &armsysarg)); +#elif defined(SPARC_UTRAP_INSTALL) + ATF_REQUIRE_EQ(0, sysarch(SPARC_UTRAP_INSTALL, &sparc64arg)); +#endif + check_audit(fds, miscreg, pipefd); +} + +ATF_TC_CLEANUP(sysarch_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sysarch_failure); +ATF_TC_HEAD(sysarch_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "sysarch(2) call for any architecture"); +} + +ATF_TC_BODY(sysarch_failure, tc) +{ + pid = getpid(); + snprintf(miscreg, sizeof(miscreg), "sysarch.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid argument and Bad address */ + ATF_REQUIRE_EQ(-1, sysarch(-1, NULL)); + check_audit(fds, miscreg, pipefd); +} + +ATF_TC_CLEANUP(sysarch_failure, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sysctl_success); +ATF_TC_HEAD(sysctl_success, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of a successful " + "sysctl(3) call"); +} + +ATF_TC_BODY(sysctl_success, tc) +{ + int mib[2], maxproc; + size_t proclen; + + /* Set mib to retrieve the maximum number of allowed processes */ + mib[0] = CTL_KERN; + mib[1] = KERN_MAXPROC; + proclen = sizeof(maxproc); + + pid = getpid(); + snprintf(miscreg, sizeof(miscreg), "sysctl.*%d.*return,success", pid); + + FILE *pipefd = setup(fds, auclass); + ATF_REQUIRE_EQ(0, sysctl(mib, 2, &maxproc, &proclen, NULL, 0)); + check_audit(fds, miscreg, pipefd); +} + +ATF_TC_CLEANUP(sysctl_success, tc) +{ + cleanup(); +} + + +ATF_TC_WITH_CLEANUP(sysctl_failure); +ATF_TC_HEAD(sysctl_failure, tc) +{ + atf_tc_set_md_var(tc, "descr", "Tests the audit of an unsuccessful " + "sysctl(3) call"); +} + +ATF_TC_BODY(sysctl_failure, tc) +{ + pid = getpid(); + snprintf(miscreg, sizeof(miscreg), "sysctl.*%d.*return,failure", pid); + + FILE *pipefd = setup(fds, auclass); + /* Failure reason: Invalid arguments */ + ATF_REQUIRE_EQ(-1, sysctl(NULL, 0, NULL, NULL, NULL, 0)); + check_audit(fds, miscreg, pipefd); +} + +ATF_TC_CLEANUP(sysctl_failure, tc) +{ + cleanup(); +} + + +ATF_TP_ADD_TCS(tp) +{ + ATF_TP_ADD_TC(tp, audit_failure); + + ATF_TP_ADD_TC(tp, sysarch_success); + ATF_TP_ADD_TC(tp, sysarch_failure); + + ATF_TP_ADD_TC(tp, sysctl_success); + ATF_TP_ADD_TC(tp, sysctl_failure); + + return (atf_no_error()); +} From owner-svn-src-stable-11@freebsd.org Tue Oct 2 21:19:44 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E59D010AD556; Tue, 2 Oct 2018 21:19:43 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8FE287CE7B; Tue, 2 Oct 2018 21:19:43 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7C3F9585C; Tue, 2 Oct 2018 21:19:43 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92LJhsw091843; Tue, 2 Oct 2018 21:19:43 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92LJhKG091842; Tue, 2 Oct 2018 21:19:43 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201810022119.w92LJhKG091842@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Tue, 2 Oct 2018 21:19:43 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339096 - stable/11/usr.bin/clang/lld X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/usr.bin/clang/lld X-SVN-Commit-Revision: 339096 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 21:19:44 -0000 Author: markj Date: Tue Oct 2 21:19:42 2018 New Revision: 339096 URL: https://svnweb.freebsd.org/changeset/base/339096 Log: MFC r328810 (by emaste): ld.lld.1: miscellaneous style improvements MFC r329002 (by emaste): Update ld.lld.1 based on the version committed upstream MFC r329003 (by emaste): ld.lld.1: explain long options may use one or two dashes Modified: stable/11/usr.bin/clang/lld/ld.lld.1 Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/clang/lld/ld.lld.1 ============================================================================== --- stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 18:12:32 2018 (r339095) +++ stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 21:19:42 2018 (r339096) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd January 10, 2018 +.Dd February 7, 2018 .Dt LD.LLD 1 .Os .Sh NAME @@ -36,7 +36,7 @@ .Op Ar options .Ar objfile ... .Sh DESCRIPTION -A linker takes one or more object, archive and library files, and combines +A linker takes one or more object, archive, and library files, and combines them into an output file (an executable, a shared library, or another object file). It relocates code and data from the input files and resolves symbol @@ -47,10 +47,22 @@ is a drop-in replacement for the GNU BFD and gold link It accepts most of the same command line arguments and linker scripts as GNU linkers. .Pp -The following options are available: +Many options have both a single-letter and long form. +When using the long form options other than those beginning with the +letter +.Cm o +may be specified using either one or two dashes preceeding the option name. +Long options beginning with +.Cm o +require two dashes to avoid confusion with the +.Fl o Ar path +option. +.Pp +These options are available: .Bl -tag -width indent .It Fl -allow-multiple-definition -Allow multiple definitions. +Do not error if a symbol is defined multiple times. +The first definition will be used. .It Fl -as-needed Only set .Dv DT_NEEDED @@ -69,18 +81,56 @@ Bind defined function symbols locally. Bind defined symbols locally. .It Fl -build-id= Ns Ar value Generate a build ID note. +.Ar value +may be one of +.Cm md5 , +.Cm sha1 , +.Cm tree , +.Cm uuid , +.Cm 0x Ns Ar hex-string , +and +.Cm none . +.Cm tree +is an alias for +.Cm sha1 . +Build-IDs of type +.Cm md5 , +.Cm sha1 , +and +.Cm tree +are calculated from the object contents. .It Fl -build-id Generate a build ID note. .It Fl -color-diagnostics= Ns Ar value Use colors in diagnostics. +.Ar value +may be one of +.Cm always , +.Cm auto , +and +.Cm never . +.Cm auto +enables color if and only if output is to a terminal. .It Fl -color-diagnostics -Use colors in diagnostics. +Alias for +.Fl -color-diagnostics= Ns Cm auto . .It Fl -compress-debug-sections= Ns Ar value Compress DWARF debug sections. +.Ar value +may be +.Cm none +or +.Cm zlib . .It Fl -define-common Assign space to common symbols. -.It Fl -defsym= Ns Ar value +.It Fl -defsym= Ns Ar symbol= Ns Ar expression Define a symbol alias. +.Ar expression +may be another symbol or a linker script expression. +For example, +.Fl -defsym= Ns Cm foo= Ns Cm bar +or +.Fl -defsym= Ns Cm foo= Ns Cm bar+0x100 . .It Fl -demangle Demangle symbol names. .It Fl -disable-new-dtags @@ -95,8 +145,9 @@ Keep all symbols in the symbol table. Specify the dynamic linker to be used for a dynamically linked executable. This is recorded in an ELF segment of type .Dv PT_INTERP . -.It Fl -dynamic-list Ar value -Read a list of dynamic symbols. +.It Fl -dynamic-list Ar file +Read a list of dynamic symbols from +.Ar file . .It Fl -eh-frame-hdr Request creation of .Li .eh_frame_hdr @@ -119,8 +170,10 @@ A value of zero indicates that there is no limit. Report unresolved symbols as errors. .It Fl -exclude-libs Ar value Exclude static libraries from automatic export. -.It Fl -export-dynamic-symbol Ar value -Put a symbol in the dynamic symbol table. +.It Fl -export-dynamic-symbol Ar symbol +Include +.Ar symbol +in the dynamic symbol table. .It Fl -export-dynamic Put symbols in the dynamic symbol table. .It Fl -fatal-warnings @@ -132,10 +185,16 @@ field to the specified value. .It Fl -fini Ar symbol Specify a finalizer function. .It Fl -format= Ns Ar input-format -Change the input format of the inputs following this option. -.It Fl -full-shutdown -Perform a full shutdown instead of calling -.Fn _exit . +Specify the format of the inputs following this option. +.Ar input-format +may be one of +.Cm binary , +.Cm elf , +and +.Cm default . +.Cm default +is a synonym for +.Cm elf . .It Fl -gc-sections Enable garbage collection of unused sections. .It Fl -gdb-index @@ -143,7 +202,15 @@ Generate .Li .gdb_index section. .It Fl -hash-style Ar value -Specify hash style (sysv, gnu or both). +Specify hash style. +.Ar value +may be +.Cm sysv , +.Cm gnu , +or +.Cm both . +.Cm both +is the default. .It Fl -help Print a usage message. .It Fl -icf=all @@ -169,8 +236,9 @@ Number of LTO codegen partitions. Add a directory to the library search path. .It Fl l Ar libName Root name of library to use. -.It Fl -Map Ar value -Print a link map to the specified file. +.It Fl -Map Ar file +Print a link map to +.Ar file . .It Fl m Ar value Set target emulation. .It Fl -no-as-needed @@ -203,22 +271,40 @@ Report unresolved symbols even if the linker is creati Restores the default behavior of loading archive members. .It Fl -noinhibit-exec Retain the executable output file whenever it is still usable. -.It Fl -nopie +.It Fl -no-pie Do not create a position independent executable. .It Fl -nostdlib Only search directories specified on the command line. .It Fl -oformat Ar format -Specify the binary format for the output object file. +Specify the format for the output object file. +The only supported +.Ar format +is +.Cm binary , +which produces output with no ELF header. .It Fl -omagic Set the text and data sections to be readable and writable. -.It Fl -opt-remarks-filename Ar value -YAML output file for optimization remarks. +.It Fl -opt-remarks-filename Ar file +Write optimization remarks in YAML format to +.Ar file . .It Fl -opt-remarks-with-hotness Include hotness information in the optimization remarks file. .It Fl O Ar value Optimize output file size. +.Ar value +may be: +.Bl -tag -width indent +.It Cm O0 +Disable string merging. +.It Cm O1 +Enable string merging. +.It Cm O2 +Enable string tail merging. +.El +.Cm O1 +is the default. .It Fl o Ar path -Write the output executable, library or object to +Write the output executable, library, or object to .Ar path . If not specified, .Dv a.out @@ -245,9 +331,9 @@ The supported values are .Ar windows and .Ar posix . -.It Fl -script Ar value -Read linker script from the path -.Ar value . +.It Fl -script Ar file +Read linker script from +.Ar file . .It Fl -section-start Ar address Set address of section. .It Fl -shared @@ -266,8 +352,9 @@ in an archive. Strip all symbols. .It Fl -strip-debug Strip debugging information. -.It Fl -symbol-ordering-file Ar value -Lay out sections in the order specified by the symbol file. +.It Fl -symbol-ordering-file Ar file +Lay out sections in the order specified by +.Ar file . .It Fl -sysroot= Ns Ar value Set the system root. .It Fl -target1-abs @@ -312,8 +399,10 @@ Pruning policy for the ThinLTO cache. Number of ThinLTO jobs. .It Fl -threads Run the linker multi-threaded. -.It Fl -trace-symbol Ar value -Trace references to symbols. +This option is enabled by default. +.It Fl -trace-symbol Ar symbol +Trace references to +.Ar symbol . .It Fl -trace Print the names of the input files. .It Fl -Ttext Ar value @@ -322,18 +411,21 @@ Same as with .Li .text as the sectionname. -.It Fl -undefined Ar value -Force undefined symbol during linking. +.It Fl -undefined Ar symbol +Force +.Ar symbol +to be an undefined symbol during linking. .It Fl -unresolved-symbols= Ns Ar value Determine how to handle unresolved symbols. .It Fl -verbose Verbose mode. -.It Fl -version-script Ar value -Read a version script. +.It Fl -version-script Ar file +Read version script from +.Ar file . .It Fl V , Fl -version Display the version number and exit. .It Fl v -Display the version number, and proceed with linking if object files are +Display the version number and proceed with linking if object files are specified. .It Fl -warn-common Warn about duplicate common symbols. @@ -346,19 +438,75 @@ Use wrapper functions for symbol. .It Fl z Ar option Linker option extensions. .Bl -tag -width indent +.It Cm execstack +Make the main stack executable. +Stack permissions are recorded in the +.Dv PT_GNU_STACK +segment. +.It Cm muldefs +Do not error if a symbol is defined multiple times. +The first definition will be used. +This is a synonym for +.Fl -allow-multiple-definition. +.It Cm nocombreloc +Disable combining and sorting multiple relocation sections. +.It Cm nocopyreloc +Disable the creation of copy relocations. +.It Cm nodelete +Set the +.Dv DF_1_NODELETE +flag to indicate that the object cannot be unloaded from a process. +.It Cm nodlopen +Set the +.Dv DF_1_NOOPEN +flag to indcate that the object may not be opened by +.Xr dlopen 3 . +.It Cm norelro +Do not indicate that portions of the object shold be mapped read-only +after initial relocation processing. +The object will omit the +.Dv PT_GNU_RELRO +segment. .It Cm notext Allow relocations against read-only segments. Sets the .Dv DT_TEXTREL flag in the .Dv DYNAMIC section. +.It Cm now +Set the +.Dv DF_BIND_NOW +flag to indicate that the run-time loader should perform all relocation +processing as part of object initialization. +By default relocations may be performed on demand. +.It Cm origin +Set the +.Dv DF_ORIGIN +flag to indicate that the object requires +$ORIGIN +processing. +.It Cm retpolineplt +Emit retpoline format PLT entries as a mitigation for CVE-2017-5715. +.It Cm rodynamic +Make the +.Li .dynamic +section read-only. +The +.Dv DT_DEBUG +tag will not be emitted. +.It Cm stack-size= Ns Ar size +Set the main thread's stack size to +.Ar size . +The stack size is recorded as the size of the +.Ar size . +.Dv PT_GNU_STACK +program segment. .It Cm text Do not allow relocations against read-only segments. This is the default. +.It Cm wxneeded +Create a +.Dv PT_OPENBSD_WXNEEDED +segment. .El .El -.Sh IMPLEMENTATION NOTES -The targets supported by -.Nm -are: -elf32-i386 elf32-iamcu elf32-littlearm elf32-ntradbigmips elf32-ntradlittlemips elf32-powerpc elf32-tradbigmips elf32-tradlittlemips elf32-x86-64 elf64-amdgpu elf64-littleaarch64 elf64-powerpc elf64-tradbigmips elf64-tradlittlemips elf64-x86-64 From owner-svn-src-stable-11@freebsd.org Tue Oct 2 21:20:47 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 69C5910AD60A; Tue, 2 Oct 2018 21:20:47 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 19C927D01C; Tue, 2 Oct 2018 21:20:47 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 14B38596D; Tue, 2 Oct 2018 21:20:47 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92LKk3R091972; Tue, 2 Oct 2018 21:20:46 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92LKjtA091967; Tue, 2 Oct 2018 21:20:45 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201810022120.w92LKjtA091967@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Tue, 2 Oct 2018 21:20:45 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339097 - in stable/11: contrib/llvm/tools/lld/ELF usr.bin/clang/lld X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: in stable/11: contrib/llvm/tools/lld/ELF usr.bin/clang/lld X-SVN-Commit-Revision: 339097 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 21:20:47 -0000 Author: markj Date: Tue Oct 2 21:20:45 2018 New Revision: 339097 URL: https://svnweb.freebsd.org/changeset/base/339097 Log: MFC r338251: Add an lld option to emit PC-relative relocations for ifunc calls. Modified: stable/11/contrib/llvm/tools/lld/ELF/Config.h stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp stable/11/contrib/llvm/tools/lld/ELF/Relocations.cpp stable/11/contrib/llvm/tools/lld/ELF/Writer.cpp stable/11/usr.bin/clang/lld/ld.lld.1 Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/llvm/tools/lld/ELF/Config.h ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Config.h Tue Oct 2 21:19:42 2018 (r339096) +++ stable/11/contrib/llvm/tools/lld/ELF/Config.h Tue Oct 2 21:20:45 2018 (r339097) @@ -152,6 +152,7 @@ struct Configuration { bool ZCombreloc; bool ZExecstack; bool ZHazardplt; + bool ZIfuncnoplt; bool ZNocopyreloc; bool ZNodelete; bool ZNodlopen; Modified: stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp Tue Oct 2 21:19:42 2018 (r339096) +++ stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp Tue Oct 2 21:20:45 2018 (r339097) @@ -669,6 +669,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args Config->ZCombreloc = !hasZOption(Args, "nocombreloc"); Config->ZExecstack = hasZOption(Args, "execstack"); Config->ZHazardplt = hasZOption(Args, "hazardplt"); + Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt"); Config->ZNocopyreloc = hasZOption(Args, "nocopyreloc"); Config->ZNodelete = hasZOption(Args, "nodelete"); Config->ZNodlopen = hasZOption(Args, "nodlopen"); Modified: stable/11/contrib/llvm/tools/lld/ELF/Relocations.cpp ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Relocations.cpp Tue Oct 2 21:19:42 2018 (r339096) +++ stable/11/contrib/llvm/tools/lld/ELF/Relocations.cpp Tue Oct 2 21:20:45 2018 (r339097) @@ -374,6 +374,9 @@ static bool isStaticLinkTimeConstant(RelExpr E, RelTyp R_PPC_PLT_OPD, R_TLSDESC_CALL, R_TLSDESC_PAGE, R_HINT>(E)) return true; + if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) + return false; + // These never do, except if the entire file is position dependent or if // only the low bits are used. if (E == R_GOT || E == R_PLT || E == R_TLSDESC) @@ -921,7 +924,9 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef // Strenghten or relax a PLT access. // // GNU ifunc symbols must be accessed via PLT because their addresses - // are determined by runtime. + // are determined by runtime. If the -z ifunc-noplt option is specified, + // we permit the optimization of ifunc calls by omitting the PLT entry + // and preserving relocations at ifunc call sites. // // On the other hand, if we know that a PLT entry will be resolved within // the same ELF module, we can skip PLT access and directly jump to the @@ -929,7 +934,7 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef // all dynamic symbols that can be resolved within the executable will // actually be resolved that way at runtime, because the main exectuable // is always at the beginning of a search list. We can leverage that fact. - if (Sym.isGnuIFunc()) + if (Sym.isGnuIFunc() && !Config->ZIfuncnoplt) Expr = toPlt(Expr); else if (!Preemptible && Expr == R_GOT_PC && !isAbsoluteValue(Sym)) Expr = @@ -1031,6 +1036,16 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef // uses Elf_Rel, since in that case the written value is the addend. if (IsConstant) { Sec.Relocations.push_back({Expr, Type, Offset, Addend, &Sym}); + continue; + } + + // Preserve relocations against ifuncs if we were asked to do so. + if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) { + if (Config->IsRela) + InX::RelaDyn->addReloc({Type, &Sec, Offset, false, &Sym, Addend}); + else + // Preserve the existing addend. + InX::RelaDyn->addReloc({Type, &Sec, Offset, false, &Sym, 0}); continue; } Modified: stable/11/contrib/llvm/tools/lld/ELF/Writer.cpp ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Writer.cpp Tue Oct 2 21:19:42 2018 (r339096) +++ stable/11/contrib/llvm/tools/lld/ELF/Writer.cpp Tue Oct 2 21:20:45 2018 (r339097) @@ -1400,8 +1400,11 @@ template void Writer::finalizeSecti applySynthetic({InX::EhFrame}, [](SyntheticSection *SS) { SS->finalizeContents(); }); - for (Symbol *S : Symtab->getSymbols()) + for (Symbol *S : Symtab->getSymbols()) { S->IsPreemptible |= computeIsPreemptible(*S); + if (S->isGnuIFunc() && Config->ZIfuncnoplt) + S->ExportDynamic = true; + } // Scan relocations. This must be done after every symbol is declared so that // we can correctly decide if a dynamic relocation is needed. Modified: stable/11/usr.bin/clang/lld/ld.lld.1 ============================================================================== --- stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 21:19:42 2018 (r339096) +++ stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 21:20:45 2018 (r339097) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd February 7, 2018 +.Dd August 22, 2018 .Dt LD.LLD 1 .Os .Sh NAME @@ -443,6 +443,13 @@ Make the main stack executable. Stack permissions are recorded in the .Dv PT_GNU_STACK segment. +.It Cm ifunc-noplt +Do not emit PLT entries for GNU ifuncs. +Instead, preserve relocations for ifunc call sites so that they may +be applied by a run-time loader. +Note that this feature requires special loader support and will +generally result in application crashes when used outside of freestanding +environments. .It Cm muldefs Do not error if a symbol is defined multiple times. The first definition will be used. From owner-svn-src-stable-11@freebsd.org Tue Oct 2 22:51:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4EDDF10AFFF1; Tue, 2 Oct 2018 22:51:26 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EAC958077B; Tue, 2 Oct 2018 22:51:25 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E5CD76763; Tue, 2 Oct 2018 22:51:25 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w92MpPfY039765; Tue, 2 Oct 2018 22:51:25 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w92MpP2H039762; Tue, 2 Oct 2018 22:51:25 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201810022251.w92MpP2H039762@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Tue, 2 Oct 2018 22:51:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339100 - in stable/11: contrib/llvm/tools/lld/ELF usr.bin/clang/lld X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: in stable/11: contrib/llvm/tools/lld/ELF usr.bin/clang/lld X-SVN-Commit-Revision: 339100 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 22:51:26 -0000 Author: emaste Date: Tue Oct 2 22:51:24 2018 New Revision: 339100 URL: https://svnweb.freebsd.org/changeset/base/339100 Log: MFC r338682: lld: add -z interpose support -z interpose sets the DF_1_INTERPOSE flag, marking the object as an interposer. PR: 230604 Relnotes: Yes Sponsored by: The FreeBSD Foundation Modified: stable/11/contrib/llvm/tools/lld/ELF/Config.h stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp stable/11/contrib/llvm/tools/lld/ELF/SyntheticSections.cpp stable/11/usr.bin/clang/lld/ld.lld.1 Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/llvm/tools/lld/ELF/Config.h ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Config.h Tue Oct 2 21:40:57 2018 (r339099) +++ stable/11/contrib/llvm/tools/lld/ELF/Config.h Tue Oct 2 22:51:24 2018 (r339100) @@ -153,6 +153,7 @@ struct Configuration { bool ZExecstack; bool ZHazardplt; bool ZIfuncnoplt; + bool ZInterpose; bool ZNocopyreloc; bool ZNodelete; bool ZNodlopen; Modified: stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp Tue Oct 2 21:40:57 2018 (r339099) +++ stable/11/contrib/llvm/tools/lld/ELF/Driver.cpp Tue Oct 2 22:51:24 2018 (r339100) @@ -670,6 +670,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args Config->ZExecstack = hasZOption(Args, "execstack"); Config->ZHazardplt = hasZOption(Args, "hazardplt"); Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt"); + Config->ZInterpose = hasZOption(Args, "interpose"); Config->ZNocopyreloc = hasZOption(Args, "nocopyreloc"); Config->ZNodelete = hasZOption(Args, "nodelete"); Config->ZNodlopen = hasZOption(Args, "nodlopen"); Modified: stable/11/contrib/llvm/tools/lld/ELF/SyntheticSections.cpp ============================================================================== --- stable/11/contrib/llvm/tools/lld/ELF/SyntheticSections.cpp Tue Oct 2 21:40:57 2018 (r339099) +++ stable/11/contrib/llvm/tools/lld/ELF/SyntheticSections.cpp Tue Oct 2 22:51:24 2018 (r339100) @@ -1034,6 +1034,8 @@ template void DynamicSection::final uint32_t DtFlags1 = 0; if (Config->Bsymbolic) DtFlags |= DF_SYMBOLIC; + if (Config->ZInterpose) + DtFlags1 |= DF_1_INTERPOSE; if (Config->ZNodelete) DtFlags1 |= DF_1_NODELETE; if (Config->ZNodlopen) Modified: stable/11/usr.bin/clang/lld/ld.lld.1 ============================================================================== --- stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 21:40:57 2018 (r339099) +++ stable/11/usr.bin/clang/lld/ld.lld.1 Tue Oct 2 22:51:24 2018 (r339100) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd August 22, 2018 +.Dd September 14, 2018 .Dt LD.LLD 1 .Os .Sh NAME @@ -450,6 +450,12 @@ be applied by a run-time loader. Note that this feature requires special loader support and will generally result in application crashes when used outside of freestanding environments. +.It Cm interpose +Set the +.Dv DF_1_INTERPOSE +flag to indicate that the object is an interposer. +Runtime linkers perform symbol resolution by first searching the application, +followed by interposers, and then any other dependencies. .It Cm muldefs Do not error if a symbol is defined multiple times. The first definition will be used. From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:07:25 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C8C7610B45E6; Wed, 3 Oct 2018 02:07:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7B98385DA3; Wed, 3 Oct 2018 02:07:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 768A6107EB; Wed, 3 Oct 2018 02:07:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w9327Pjr040423; Wed, 3 Oct 2018 02:07:25 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w9327P7V040422; Wed, 3 Oct 2018 02:07:25 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030207.w9327P7V040422@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:07:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339102 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339102 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:07:26 -0000 Author: mav Date: Wed Oct 3 02:07:24 2018 New Revision: 339102 URL: https://svnweb.freebsd.org/changeset/base/339102 Log: MFC r336943: MFV r336942: 9189 Add debug to vdev_label_read_config when txg check fails illumos/illumos-gate@b6bf6e1540f30bd97b8d6e2c21d95e17841e0f23 Reviewed by: George Wilson Reviewed by: Prashanth Sreenivasa Approved by: Matt Ahrens Author: Pavel Zakharov Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Tue Oct 2 23:23:56 2018 (r339101) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 02:07:24 2018 (r339102) @@ -1725,7 +1725,8 @@ vdev_validate(vdev_t *vd) if ((label = vdev_label_read_config(vd, txg)) == NULL) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, VDEV_AUX_BAD_LABEL); - vdev_dbgmsg(vd, "vdev_validate: failed reading config"); + vdev_dbgmsg(vd, "vdev_validate: failed reading config for " + "txg %llu", (u_longlong_t)txg); return (0); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Tue Oct 2 23:23:56 2018 (r339101) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 02:07:24 2018 (r339102) @@ -546,6 +546,7 @@ vdev_label_read_config(vdev_t *vd, uint64_t txg) abd_t *vp_abd; zio_t *zio; uint64_t best_txg = 0; + uint64_t label_txg = 0; int error = 0; int flags = ZIO_FLAG_CONFIG_WRITER | ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE; @@ -571,8 +572,6 @@ retry: if (zio_wait(zio) == 0 && nvlist_unpack(vp->vp_nvlist, sizeof (vp->vp_nvlist), &label, 0) == 0) { - uint64_t label_txg = 0; - /* * Auxiliary vdevs won't have txg values in their * labels and newly added vdevs may not have been @@ -601,6 +600,15 @@ retry: if (config == NULL && !(flags & ZIO_FLAG_TRYHARD)) { flags |= ZIO_FLAG_TRYHARD; goto retry; + } + + /* + * We found a valid label but it didn't pass txg restrictions. + */ + if (config == NULL && label_txg != 0) { + vdev_dbgmsg(vd, "label discarded as txg is too large " + "(%llu > %llu)", (u_longlong_t)label_txg, + (u_longlong_t)txg); } abd_free(vp_abd); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:08:33 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 659B810B4659; Wed, 3 Oct 2018 02:08:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1AA4E85EE7; Wed, 3 Oct 2018 02:08:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 15878107EC; Wed, 3 Oct 2018 02:08:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w9328WvA040537; Wed, 3 Oct 2018 02:08:32 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w9328Wc4040534; Wed, 3 Oct 2018 02:08:32 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030208.w9328Wc4040534@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:08:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339103 - in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Commit-Revision: 339103 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:08:33 -0000 Author: mav Date: Wed Oct 3 02:08:32 2018 New Revision: 339103 URL: https://svnweb.freebsd.org/changeset/base/339103 Log: MFC r336945: MFV r336944: 9286 want refreservation=auto When a ZFS volume is created with zfs create -V (but without -s), the refreservation property is set to a value that is volsize plus the maximum size of metadata. If refreservation is ever set to another value, it is impossible to set it back to the automatically determined value. There are other cases where refreservation may be wrong. These include receiving a volume that was sent without properties and zfs clone. We need: zfs set refreservation=auto zfs clone -o refreservation=auto Each one would use the same function used by zfs create -V to determine the proper value for refreservation. illumos/illumos-gate@1c10ae76c0cb31326c320e7cef1d3f24a1f47125 Reviewed by: Allan Jude Reviewed by: Matthew Ahrens Reviewed by: John Kennedy Reviewed by: Andy Stormont Approved by: Richard Lowe Author: Mike Gerdts Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 02:07:24 2018 (r339102) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 02:08:32 2018 (r339103) @@ -28,6 +28,7 @@ .\" Copyright (c) 2016 Nexenta Systems, Inc. All Rights Reserved. .\" Copyright (c) 2014, Xin LI .\" Copyright (c) 2014-2015, The FreeBSD Foundation, All Rights Reserved. +.\" Copyright 2018 Joyent, Inc. .\" .\" $FreeBSD$ .\" @@ -1311,7 +1312,7 @@ The default value is Limits the amount of space a dataset can consume. This property enforces a hard limit on the amount of space used. This hard limit does not include space used by descendents, including file systems and snapshots. -.It Sy refreservation Ns = Ns Ar size | Cm none +.It Sy refreservation Ns = Ns Ar size | Cm none | Cm auto The minimum amount of space guaranteed to a dataset, not including its descendents. When the amount of space used is below this value, the dataset is treated as if it were taking up the amount of space specified by @@ -1327,6 +1328,18 @@ is set, a snapshot is only allowed if there is enough of this reservation to accommodate the current number of "referenced" bytes in the dataset. .Pp +If +.Sy refreservation +is set to +.Sy auto , +a volume is thick provisioned or not sparse. +.Sy refreservation Ns = Cm auto +is only supported on volumes. +See +.Sy volsize +in the Native Properties +section for more information about sparse volumes. +.Pp This property can also be referred to by its shortened column name, .Sy refreserv . .It Sy reservation Ns = Ns Ar size | Cm none @@ -1459,18 +1472,33 @@ on how the volume is used. These effects can also occu changed while it is in use (particularly when shrinking the size). Extreme care should be used when adjusting the volume size. .Pp -Though not recommended, a "sparse volume" (also known as "thin provisioning") +Though not recommended, a "sparse volume" (also known as "thin provisioned") can be created by specifying the .Fl s option to the .Qq Nm Cm create Fl V -command, or by changing the reservation after the volume has been created. A -"sparse volume" is a volume where the reservation is less then the volume size. +command, or by changing the value of the +.Sy refreservation +property, or +.Sy reservation +property on pool version 8 or earlier +.Pc +after the volume has been created. +A "sparse volume" is a volume where the value of +.Sy refreservation +is less then the size of the volume plus the space required to store its +metadata. Consequently, writes to a sparse volume can fail with .Sy ENOSPC when the pool is low on space. For a sparse volume, changes to .Sy volsize -are not reflected in the reservation. +are not reflected in the +.Sy refreservation . +A volume that is not sparse is said to be "thick provisioned". +A sparse volume can become thick provisioned by setting +.Sy refreservation +to +.Sy auto . .It Sy volmode Ns = Ns Cm default | geom | dev | none This property specifies how volumes should be exposed to the OS. Setting it to Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 02:07:24 2018 (r339102) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 02:08:32 2018 (r339103) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013, Joyent, Inc. All rights reserved. + * Copyright (c) 2018, Joyent, Inc. All rights reserved. * Copyright (c) 2011, 2016 by Delphix. All rights reserved. * Copyright (c) 2012 DEY Storage Systems, Inc. All rights reserved. * Copyright (c) 2011-2012 Pawel Jakub Dawidek. All rights reserved. @@ -1407,7 +1407,6 @@ badlabel: switch (prop) { case ZFS_PROP_RESERVATION: - case ZFS_PROP_REFRESERVATION: if (intval > volsize) { zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, "'%s' is greater than current " @@ -1418,6 +1417,17 @@ badlabel: } break; + case ZFS_PROP_REFRESERVATION: + if (intval > volsize && intval != UINT64_MAX) { + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "'%s' is greater than current " + "volume size"), propname); + (void) zfs_error(hdl, EZFS_BADPROP, + errbuf); + goto error; + } + break; + case ZFS_PROP_VOLSIZE: if (intval % blocksize != 0) { zfs_nicenum(blocksize, buf, @@ -1519,6 +1529,61 @@ zfs_add_synthetic_resv(zfs_handle_t *zhp, nvlist_t *nv return (1); } +/* + * Helper for 'zfs {set|clone} refreservation=auto'. Must be called after + * zfs_valid_proplist(), as it is what sets the UINT64_MAX sentinal value. + * Return codes must match zfs_add_synthetic_resv(). + */ +static int +zfs_fix_auto_resv(zfs_handle_t *zhp, nvlist_t *nvl) +{ + uint64_t volsize; + uint64_t resvsize; + zfs_prop_t prop; + nvlist_t *props; + + if (!ZFS_IS_VOLUME(zhp)) { + return (0); + } + + if (zfs_which_resv_prop(zhp, &prop) != 0) { + return (-1); + } + + if (prop != ZFS_PROP_REFRESERVATION) { + return (0); + } + + if (nvlist_lookup_uint64(nvl, zfs_prop_to_name(prop), &resvsize) != 0) { + /* No value being set, so it can't be "auto" */ + return (0); + } + if (resvsize != UINT64_MAX) { + /* Being set to a value other than "auto" */ + return (0); + } + + props = fnvlist_alloc(); + + fnvlist_add_uint64(props, zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE), + zfs_prop_get_int(zhp, ZFS_PROP_VOLBLOCKSIZE)); + + if (nvlist_lookup_uint64(nvl, zfs_prop_to_name(ZFS_PROP_VOLSIZE), + &volsize) != 0) { + volsize = zfs_prop_get_int(zhp, ZFS_PROP_VOLSIZE); + } + + resvsize = zvol_volsize_to_reservation(volsize, props); + fnvlist_free(props); + + (void) nvlist_remove_all(nvl, zfs_prop_to_name(prop)); + if (nvlist_add_uint64(nvl, zfs_prop_to_name(prop), resvsize) != 0) { + (void) no_memory(zhp->zfs_hdl); + return (-1); + } + return (1); +} + void zfs_setprop_error(libzfs_handle_t *hdl, zfs_prop_t prop, int err, char *errbuf) @@ -1685,6 +1750,12 @@ zfs_prop_set_list(zfs_handle_t *zhp, nvlist_t *props) goto error; } } + + if (added_resv != 1 && + (added_resv = zfs_fix_auto_resv(zhp, nvl)) == -1) { + goto error; + } + /* * Check how many properties we're setting and allocate an array to * store changelist pointers for postfix(). @@ -3715,6 +3786,7 @@ zfs_clone(zfs_handle_t *zhp, const char *target, nvlis if (props) { zfs_type_t type; + if (ZFS_IS_VOLUME(zhp)) { type = ZFS_TYPE_VOLUME; } else { @@ -3723,6 +3795,10 @@ zfs_clone(zfs_handle_t *zhp, const char *target, nvlis if ((props = zfs_valid_proplist(hdl, type, props, zoned, zhp, zhp->zpool_hdl, errbuf)) == NULL) return (-1); + if (zfs_fix_auto_resv(zhp, props) == -1) { + nvlist_free(props); + return (-1); + } } ret = lzc_clone(target, zhp->zfs_name, props); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Oct 3 02:07:24 2018 (r339102) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Oct 3 02:08:32 2018 (r339103) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013, Joyent, Inc. All rights reserved. + * Copyright (c) 2018 Joyent, Inc. * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2016 Igor Kozhukhov * Copyright (c) 2017 Datto Inc. @@ -1227,6 +1227,7 @@ zprop_parse_value(libzfs_handle_t *hdl, nvpair_t *elem const char *propname; char *value; boolean_t isnone = B_FALSE; + boolean_t isauto = B_FALSE; if (type == ZFS_TYPE_POOL) { proptype = zpool_prop_get_type(prop); @@ -1262,8 +1263,9 @@ zprop_parse_value(libzfs_handle_t *hdl, nvpair_t *elem (void) nvpair_value_string(elem, &value); if (strcmp(value, "none") == 0) { isnone = B_TRUE; - } else if (zfs_nicestrtonum(hdl, value, ivalp) - != 0) { + } else if (strcmp(value, "auto") == 0) { + isauto = B_TRUE; + } else if (zfs_nicestrtonum(hdl, value, ivalp) != 0) { goto error; } } else if (datatype == DATA_TYPE_UINT64) { @@ -1293,6 +1295,31 @@ zprop_parse_value(libzfs_handle_t *hdl, nvpair_t *elem prop == ZFS_PROP_SNAPSHOT_LIMIT)) { *ivalp = UINT64_MAX; } + + /* + * Special handling for setting 'refreservation' to 'auto'. Use + * UINT64_MAX to tell the caller to use zfs_fix_auto_resv(). + * 'auto' is only allowed on volumes. + */ + if (isauto) { + switch (prop) { + case ZFS_PROP_REFRESERVATION: + if ((type & ZFS_TYPE_VOLUME) == 0) { + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "'%s=auto' only allowed on " + "volumes"), nvpair_name(elem)); + goto error; + } + *ivalp = UINT64_MAX; + break; + default: + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "'auto' is invalid value for '%s'"), + nvpair_name(elem)); + goto error; + } + } + break; case PROP_TYPE_INDEX: From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:09:39 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 16AF810B4741; Wed, 3 Oct 2018 02:09:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BE21986048; Wed, 3 Oct 2018 02:09:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B8EC7107EF; Wed, 3 Oct 2018 02:09:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w9329cu6040643; Wed, 3 Oct 2018 02:09:38 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w9329bEr040634; Wed, 3 Oct 2018 02:09:37 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030209.w9329bEr040634@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:09:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339104 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/o... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/common/fs/... X-SVN-Commit-Revision: 339104 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:09:39 -0000 Author: mav Date: Wed Oct 3 02:09:36 2018 New Revision: 339104 URL: https://svnweb.freebsd.org/changeset/base/339104 Log: MFC r336947: MFV r336946: 9238 ZFS Spacemap Encoding V2 The current space map encoding has the following disadvantages: [1] Assuming 512 sector size each entry can represent at most 16MB for a segment. This makes the encoding very inefficient for large regions of space. [2] As vdev-wide space maps have started to be used by new features (i.e. device removal, zpool checkpoint) we've started imposing limits in the vdevs that can be used with them based on the maximum addressable offset (currently 64PB for a top-level vdev). The new remains backwards compatible with the old one. The introduced two-word entry format, besides extending the limits imposed by the single-entry layout, also includes a vdev field and some extra padding after its prefix. The extra padding after the prefix should is reserved for future usage (e.g. new prefixes for future encodings or new fields for flags). The new vdev field not only makes the space maps more self-descriptive, but also opens the doors for pool-wide space maps. One final important note is that the number of bits used for vdevs is reduced to 24 bits for blkptrs. That was decided as we don't know of any setups that use more than 16M vdevs for the time being and we wanted to fit the vdev field in the space map. In addition that gives us some extra bits in dva_t. illumos/illumos-gate@17f11284b49b98353b5119463254074fd9bc0a28 Reviewed by: Matt Ahrens Reviewed by: George Wilson Approved by: Gordon Ross Author: Serapheim Dimitropoulos Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/space_map.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:09:36 2018 (r339104) @@ -774,7 +774,6 @@ verify_spacemap_refcounts(spa_t *spa) static void dump_spacemap(objset_t *os, space_map_t *sm) { - uint64_t alloc, offset, entry; char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID", "INVALID", "INVALID", "INVALID", "INVALID" }; @@ -791,41 +790,73 @@ dump_spacemap(objset_t *os, space_map_t *sm) /* * Print out the freelist entries in both encoded and decoded form. */ - alloc = 0; - for (offset = 0; offset < space_map_length(sm); - offset += sizeof (entry)) { - uint8_t mapshift = sm->sm_shift; + uint8_t mapshift = sm->sm_shift; + int64_t alloc = 0; + uint64_t word; + for (uint64_t offset = 0; offset < space_map_length(sm); + offset += sizeof (word)) { VERIFY0(dmu_read(os, space_map_object(sm), offset, - sizeof (entry), &entry, DMU_READ_PREFETCH)); - if (SM_DEBUG_DECODE(entry)) { + sizeof (word), &word, DMU_READ_PREFETCH)); + if (sm_entry_is_debug(word)) { (void) printf("\t [%6llu] %s: txg %llu, pass %llu\n", - (u_longlong_t)(offset / sizeof (entry)), - ddata[SM_DEBUG_ACTION_DECODE(entry)], - (u_longlong_t)SM_DEBUG_TXG_DECODE(entry), - (u_longlong_t)SM_DEBUG_SYNCPASS_DECODE(entry)); + (u_longlong_t)(offset / sizeof (word)), + ddata[SM_DEBUG_ACTION_DECODE(word)], + (u_longlong_t)SM_DEBUG_TXG_DECODE(word), + (u_longlong_t)SM_DEBUG_SYNCPASS_DECODE(word)); + continue; + } + + uint8_t words; + char entry_type; + uint64_t entry_off, entry_run, entry_vdev = SM_NO_VDEVID; + + if (sm_entry_is_single_word(word)) { + entry_type = (SM_TYPE_DECODE(word) == SM_ALLOC) ? + 'A' : 'F'; + entry_off = (SM_OFFSET_DECODE(word) << mapshift) + + sm->sm_start; + entry_run = SM_RUN_DECODE(word) << mapshift; + words = 1; } else { - (void) printf("\t [%6llu] %c range:" - " %010llx-%010llx size: %06llx\n", - (u_longlong_t)(offset / sizeof (entry)), - SM_TYPE_DECODE(entry) == SM_ALLOC ? 'A' : 'F', - (u_longlong_t)((SM_OFFSET_DECODE(entry) << - mapshift) + sm->sm_start), - (u_longlong_t)((SM_OFFSET_DECODE(entry) << - mapshift) + sm->sm_start + - (SM_RUN_DECODE(entry) << mapshift)), - (u_longlong_t)(SM_RUN_DECODE(entry) << mapshift)); - if (SM_TYPE_DECODE(entry) == SM_ALLOC) - alloc += SM_RUN_DECODE(entry) << mapshift; - else - alloc -= SM_RUN_DECODE(entry) << mapshift; + /* it is a two-word entry so we read another word */ + ASSERT(sm_entry_is_double_word(word)); + + uint64_t extra_word; + offset += sizeof (extra_word); + VERIFY0(dmu_read(os, space_map_object(sm), offset, + sizeof (extra_word), &extra_word, + DMU_READ_PREFETCH)); + + ASSERT3U(offset, <=, space_map_length(sm)); + + entry_run = SM2_RUN_DECODE(word) << mapshift; + entry_vdev = SM2_VDEV_DECODE(word); + entry_type = (SM2_TYPE_DECODE(extra_word) == SM_ALLOC) ? + 'A' : 'F'; + entry_off = (SM2_OFFSET_DECODE(extra_word) << + mapshift) + sm->sm_start; + words = 2; } + + (void) printf("\t [%6llu] %c range:" + " %010llx-%010llx size: %06llx vdev: %06llu words: %u\n", + (u_longlong_t)(offset / sizeof (word)), + entry_type, (u_longlong_t)entry_off, + (u_longlong_t)(entry_off + entry_run), + (u_longlong_t)entry_run, + (u_longlong_t)entry_vdev, words); + + if (entry_type == 'A') + alloc += entry_run; + else + alloc -= entry_run; } - if (alloc != space_map_allocated(sm)) { - (void) printf("space_map_object alloc (%llu) INCONSISTENT " - "with space map summary (%llu)\n", - (u_longlong_t)space_map_allocated(sm), (u_longlong_t)alloc); + if ((uint64_t)alloc != space_map_allocated(sm)) { + (void) printf("space_map_object alloc (%lld) INCONSISTENT " + "with space map summary (%lld)\n", + (longlong_t)space_map_allocated(sm), (longlong_t)alloc); } } @@ -1155,7 +1186,7 @@ dump_ddt(ddt_t *ddt, enum ddt_type type, enum ddt_clas while ((error = ddt_object_walk(ddt, type, class, &walk, &dde)) == 0) dump_dde(ddt, &dde, walk); - ASSERT(error == ENOENT); + ASSERT3U(error, ==, ENOENT); (void) printf("\n"); } @@ -3097,15 +3128,14 @@ typedef struct checkpoint_sm_exclude_entry_arg { } checkpoint_sm_exclude_entry_arg_t; static int -checkpoint_sm_exclude_entry_cb(maptype_t type, uint64_t offset, uint64_t size, - void *arg) +checkpoint_sm_exclude_entry_cb(space_map_entry_t *sme, void *arg) { checkpoint_sm_exclude_entry_arg_t *cseea = arg; vdev_t *vd = cseea->cseea_vd; - metaslab_t *ms = vd->vdev_ms[offset >> vd->vdev_ms_shift]; - uint64_t end = offset + size; + metaslab_t *ms = vd->vdev_ms[sme->sme_offset >> vd->vdev_ms_shift]; + uint64_t end = sme->sme_offset + sme->sme_run; - ASSERT(type == SM_FREE); + ASSERT(sme->sme_type == SM_FREE); /* * Since the vdev_checkpoint_sm exists in the vdev level @@ -3123,7 +3153,7 @@ checkpoint_sm_exclude_entry_cb(maptype_t type, uint64_ * metaslab boundaries. So if needed we could add code * that handles metaslab-crossing segments in the future. */ - VERIFY3U(offset, >=, ms->ms_start); + VERIFY3U(sme->sme_offset, >=, ms->ms_start); VERIFY3U(end, <=, ms->ms_start + ms->ms_size); /* @@ -3131,10 +3161,10 @@ checkpoint_sm_exclude_entry_cb(maptype_t type, uint64_ * also verify that the entry is there to begin with. */ mutex_enter(&ms->ms_lock); - range_tree_remove(ms->ms_allocatable, offset, size); + range_tree_remove(ms->ms_allocatable, sme->sme_offset, sme->sme_run); mutex_exit(&ms->ms_lock); - cseea->cseea_checkpoint_size += size; + cseea->cseea_checkpoint_size += sme->sme_run; return (0); } @@ -4109,15 +4139,14 @@ typedef struct verify_checkpoint_sm_entry_cb_arg { #define ENTRIES_PER_PROGRESS_UPDATE 10000 static int -verify_checkpoint_sm_entry_cb(maptype_t type, uint64_t offset, uint64_t size, - void *arg) +verify_checkpoint_sm_entry_cb(space_map_entry_t *sme, void *arg) { verify_checkpoint_sm_entry_cb_arg_t *vcsec = arg; vdev_t *vd = vcsec->vcsec_vd; - metaslab_t *ms = vd->vdev_ms[offset >> vd->vdev_ms_shift]; - uint64_t end = offset + size; + metaslab_t *ms = vd->vdev_ms[sme->sme_offset >> vd->vdev_ms_shift]; + uint64_t end = sme->sme_offset + sme->sme_run; - ASSERT(type == SM_FREE); + ASSERT(sme->sme_type == SM_FREE); if ((vcsec->vcsec_entryid % ENTRIES_PER_PROGRESS_UPDATE) == 0) { (void) fprintf(stderr, @@ -4131,7 +4160,7 @@ verify_checkpoint_sm_entry_cb(maptype_t type, uint64_t /* * See comment in checkpoint_sm_exclude_entry_cb() */ - VERIFY3U(offset, >=, ms->ms_start); + VERIFY3U(sme->sme_offset, >=, ms->ms_start); VERIFY3U(end, <=, ms->ms_start + ms->ms_size); /* @@ -4140,7 +4169,7 @@ verify_checkpoint_sm_entry_cb(maptype_t type, uint64_t * their respective ms_allocateable trees should not contain them. */ mutex_enter(&ms->ms_lock); - range_tree_verify(ms->ms_allocatable, offset, size); + range_tree_verify(ms->ms_allocatable, sme->sme_offset, sme->sme_run); mutex_exit(&ms->ms_lock); return (0); @@ -4386,7 +4415,7 @@ verify_checkpoint(spa_t *spa) DMU_POOL_ZPOOL_CHECKPOINT, sizeof (uint64_t), sizeof (uberblock_t) / sizeof (uint64_t), &checkpoint); - if (error == ENOENT) { + if (error == ENOENT && !dump_opt['L']) { /* * If the feature is active but the uberblock is missing * then we must be in the middle of discarding the @@ -4409,7 +4438,7 @@ verify_checkpoint(spa_t *spa) error = 3; } - if (error == 0) + if (error == 0 && !dump_opt['L']) verify_checkpoint_blocks(spa); return (error); @@ -4514,7 +4543,7 @@ dump_zpool(spa_t *spa) if (dump_opt['h']) dump_history(spa); - if (rc == 0 && !dump_opt['L']) + if (rc == 0) rc = verify_checkpoint(spa); if (rc != 0) { Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Wed Oct 3 02:09:36 2018 (r339104) @@ -482,6 +482,24 @@ This feature becomes when the "zpool remove" command is used on a top-level vdev, and will never return to being .Sy enabled . +.It Sy spacemap_v2 +.Bl -column "READ\-ONLY COMPATIBLE" "com.delphix:spacemap_v2" +.It GUID Ta com.delphix:spacemap_v2 +.It READ\-ONLY COMPATIBLE Ta yes +.It DEPENDENCIES Ta none +.El +.Pp +This feature enables the use of the new space map encoding which +consists of two words (instead of one) whenever it is advantageous. +The new encoding allows space maps to represent large regions of +space more efficiently on-disk while also increasing their maximum +addressable offset. +.Pp +This feature becomes +.Sy active +as soon as it is enabled and will +never return to being +.Sy enabled . .It Sy large_blocks .Bl -column "READ\-ONLY COMPATIBLE" "org.open-zfs:large_block" .It GUID Ta org.open-zfs:large_block Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:09:36 2018 (r339104) @@ -195,6 +195,7 @@ extern uint64_t zfs_deadman_synctime_ms; extern int metaslab_preload_limit; extern boolean_t zfs_compressed_arc_enabled; extern boolean_t zfs_abd_scatter_enabled; +extern boolean_t zfs_force_some_double_word_sm_entries; static ztest_shared_opts_t *ztest_shared_opts; static ztest_shared_opts_t ztest_opts; @@ -6397,6 +6398,12 @@ main(int argc, char **argv) dprintf_setup(&argc, argv); zfs_deadman_synctime_ms = 300000; + /* + * As two-word space map entries may not come up often (especially + * if pool and vdev sizes are small) we want to force at least some + * of them so the feature get tested. + */ + zfs_force_some_double_word_sm_entries = B_TRUE; ztest_fd_rand = open("/dev/urandom", O_RDONLY); ASSERT3S(ztest_fd_rand, >=, 0); Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.c Wed Oct 3 02:09:36 2018 (r339104) @@ -229,6 +229,12 @@ zpool_feature_init(void) "Pool state can be checkpointed, allowing rewind later.", ZFEATURE_FLAG_READONLY_COMPAT, NULL); + zfeature_register(SPA_FEATURE_SPACEMAP_V2, + "com.delphix:spacemap_v2", "spacemap_v2", + "Space maps representing large segments are more efficient.", + ZFEATURE_FLAG_READONLY_COMPAT | ZFEATURE_FLAG_ACTIVATE_ON_ENABLE, + NULL); + static const spa_feature_t large_blocks_deps[] = { SPA_FEATURE_EXTENSIBLE_DATASET, SPA_FEATURE_NONE Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.h Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfeature_common.h Wed Oct 3 02:09:36 2018 (r339104) @@ -60,6 +60,7 @@ typedef enum spa_feature { SPA_FEATURE_DEVICE_REMOVAL, SPA_FEATURE_OBSOLETE_COUNTS, SPA_FEATURE_POOL_CHECKPOINT, + SPA_FEATURE_SPACEMAP_V2, SPA_FEATURES } spa_feature_t; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:09:36 2018 (r339104) @@ -2105,17 +2105,6 @@ metaslab_group_preload(metaslab_group_t *mg) * * 3. The on-disk size of the space map should actually decrease. * - * Checking the first condition is tricky since we don't want to walk - * the entire AVL tree calculating the estimated on-disk size. Instead we - * use the size-ordered range tree in the metaslab and calculate the - * size required to write out the largest segment in our free tree. If the - * size required to represent that segment on disk is larger than the space - * map object then we avoid condensing this map. - * - * To determine the second criterion we use a best-case estimate and assume - * each segment can be represented on-disk as a single 64-bit entry. We refer - * to this best-case estimate as the space map's minimal form. - * * Unfortunately, we cannot compute the on-disk size of the space map in this * context because we cannot accurately compute the effects of compression, etc. * Instead, we apply the heuristic described in the block comment for @@ -2126,9 +2115,6 @@ static boolean_t metaslab_should_condense(metaslab_t *msp) { space_map_t *sm = msp->ms_sm; - range_seg_t *rs; - uint64_t size, entries, segsz, object_size, optimal_size, record_size; - dmu_object_info_t doi; vdev_t *vd = msp->ms_group->mg_vd; uint64_t vdev_blocksize = 1 << vd->vdev_ashift; uint64_t current_txg = spa_syncing_txg(vd->vdev_spa); @@ -2154,34 +2140,22 @@ metaslab_should_condense(metaslab_t *msp) msp->ms_condense_checked_txg = current_txg; /* - * Use the ms_allocatable_by_size range tree, which is ordered by - * size, to obtain the largest segment in the free tree. We always - * condense metaslabs that are empty and metaslabs for which a - * condense request has been made. + * We always condense metaslabs that are empty and metaslabs for + * which a condense request has been made. */ - rs = avl_last(&msp->ms_allocatable_by_size); - if (rs == NULL || msp->ms_condense_wanted) + if (avl_is_empty(&msp->ms_allocatable_by_size) || + msp->ms_condense_wanted) return (B_TRUE); - /* - * Calculate the number of 64-bit entries this segment would - * require when written to disk. If this single segment would be - * larger on-disk than the entire current on-disk structure, then - * clearly condensing will increase the on-disk structure size. - */ - size = (rs->rs_end - rs->rs_start) >> sm->sm_shift; - entries = size / (MIN(size, SM_RUN_MAX)); - segsz = entries * sizeof (uint64_t); + uint64_t object_size = space_map_length(msp->ms_sm); + uint64_t optimal_size = space_map_estimate_optimal_size(sm, + msp->ms_allocatable, SM_NO_VDEVID); - optimal_size = - sizeof (uint64_t) * avl_numnodes(&msp->ms_allocatable->rt_root); - object_size = space_map_length(msp->ms_sm); - + dmu_object_info_t doi; dmu_object_info_from_db(sm->sm_dbuf, &doi); - record_size = MAX(doi.doi_data_block_size, vdev_blocksize); + uint64_t record_size = MAX(doi.doi_data_block_size, vdev_blocksize); - return (segsz <= object_size && - object_size >= (optimal_size * zfs_condense_pct / 100) && + return (object_size >= (optimal_size * zfs_condense_pct / 100) && object_size > zfs_metaslab_condense_block_threshold * record_size); } @@ -2256,11 +2230,11 @@ metaslab_condense(metaslab_t *msp, uint64_t txg, dmu_t * optimal, this is typically close to optimal, and much cheaper to * compute. */ - space_map_write(sm, condense_tree, SM_ALLOC, tx); + space_map_write(sm, condense_tree, SM_ALLOC, SM_NO_VDEVID, tx); range_tree_vacate(condense_tree, NULL, NULL); range_tree_destroy(condense_tree); - space_map_write(sm, msp->ms_allocatable, SM_FREE, tx); + space_map_write(sm, msp->ms_allocatable, SM_FREE, SM_NO_VDEVID, tx); mutex_enter(&msp->ms_lock); msp->ms_condensing = B_FALSE; } @@ -2372,8 +2346,10 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) metaslab_condense(msp, txg, tx); } else { mutex_exit(&msp->ms_lock); - space_map_write(msp->ms_sm, alloctree, SM_ALLOC, tx); - space_map_write(msp->ms_sm, msp->ms_freeing, SM_FREE, tx); + space_map_write(msp->ms_sm, alloctree, SM_ALLOC, + SM_NO_VDEVID, tx); + space_map_write(msp->ms_sm, msp->ms_freeing, SM_FREE, + SM_NO_VDEVID, tx); mutex_enter(&msp->ms_lock); } @@ -2388,7 +2364,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) */ mutex_exit(&msp->ms_lock); space_map_write(vd->vdev_checkpoint_sm, - msp->ms_checkpointing, SM_FREE, tx); + msp->ms_checkpointing, SM_FREE, SM_NO_VDEVID, tx); mutex_enter(&msp->ms_lock); space_map_update(vd->vdev_checkpoint_sm); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c Wed Oct 3 02:09:36 2018 (r339104) @@ -203,13 +203,12 @@ typedef struct spa_checkpoint_discard_sync_callback_ar } spa_checkpoint_discard_sync_callback_arg_t; static int -spa_checkpoint_discard_sync_callback(maptype_t type, uint64_t offset, - uint64_t size, void *arg) +spa_checkpoint_discard_sync_callback(space_map_entry_t *sme, void *arg) { spa_checkpoint_discard_sync_callback_arg_t *sdc = arg; vdev_t *vd = sdc->sdc_vd; - metaslab_t *ms = vd->vdev_ms[offset >> vd->vdev_ms_shift]; - uint64_t end = offset + size; + metaslab_t *ms = vd->vdev_ms[sme->sme_offset >> vd->vdev_ms_shift]; + uint64_t end = sme->sme_offset + sme->sme_run; if (sdc->sdc_entry_limit == 0) return (EINTR); @@ -224,8 +223,8 @@ spa_checkpoint_discard_sync_callback(maptype_t type, u * metaslab boundaries. So if needed we could add code * that handles metaslab-crossing segments in the future. */ - VERIFY3U(type, ==, SM_FREE); - VERIFY3U(offset, >=, ms->ms_start); + VERIFY3U(sme->sme_type, ==, SM_FREE); + VERIFY3U(sme->sme_offset, >=, ms->ms_start); VERIFY3U(end, <=, ms->ms_start + ms->ms_size); /* @@ -237,14 +236,15 @@ spa_checkpoint_discard_sync_callback(maptype_t type, u mutex_enter(&ms->ms_lock); if (range_tree_is_empty(ms->ms_freeing)) vdev_dirty(vd, VDD_METASLAB, ms, sdc->sdc_txg); - range_tree_add(ms->ms_freeing, offset, size); + range_tree_add(ms->ms_freeing, sme->sme_offset, sme->sme_run); mutex_exit(&ms->ms_lock); - ASSERT3U(vd->vdev_spa->spa_checkpoint_info.sci_dspace, >=, size); - ASSERT3U(vd->vdev_stat.vs_checkpoint_space, >=, size); + ASSERT3U(vd->vdev_spa->spa_checkpoint_info.sci_dspace, >=, + sme->sme_run); + ASSERT3U(vd->vdev_stat.vs_checkpoint_space, >=, sme->sme_run); - vd->vdev_spa->spa_checkpoint_info.sci_dspace -= size; - vd->vdev_stat.vs_checkpoint_space -= size; + vd->vdev_spa->spa_checkpoint_info.sci_dspace -= sme->sme_run; + vd->vdev_stat.vs_checkpoint_space -= sme->sme_run; sdc->sdc_entry_limit--; return (0); @@ -289,13 +289,14 @@ spa_checkpoint_discard_thread_sync(void *arg, dmu_tx_t * Thus, we set the maximum entries that the space map callback * will be applied to be half the entries that could fit in the * imposed memory limit. + * + * Note that since this is a conservative estimate we also + * assume the worst case scenario in our computation where each + * entry is two-word. */ uint64_t max_entry_limit = - (zfs_spa_discard_memory_limit / sizeof (uint64_t)) >> 1; + (zfs_spa_discard_memory_limit / (2 * sizeof (uint64_t))) >> 1; - uint64_t entries_in_sm = - space_map_length(vd->vdev_checkpoint_sm) / sizeof (uint64_t); - /* * Iterate from the end of the space map towards the beginning, * placing its entries on ms_freeing and removing them from the @@ -318,14 +319,15 @@ spa_checkpoint_discard_thread_sync(void *arg, dmu_tx_t spa_checkpoint_discard_sync_callback_arg_t sdc; sdc.sdc_vd = vd; sdc.sdc_txg = tx->tx_txg; - sdc.sdc_entry_limit = MIN(entries_in_sm, max_entry_limit); + sdc.sdc_entry_limit = max_entry_limit; - uint64_t entries_before = entries_in_sm; + uint64_t words_before = + space_map_length(vd->vdev_checkpoint_sm) / sizeof (uint64_t); error = space_map_incremental_destroy(vd->vdev_checkpoint_sm, spa_checkpoint_discard_sync_callback, &sdc, tx); - uint64_t entries_after = + uint64_t words_after = space_map_length(vd->vdev_checkpoint_sm) / sizeof (uint64_t); #ifdef DEBUG @@ -333,9 +335,9 @@ spa_checkpoint_discard_thread_sync(void *arg, dmu_tx_t #endif zfs_dbgmsg("discarding checkpoint: txg %llu, vdev id %d, " - "deleted %llu entries - %llu entries are left", - tx->tx_txg, vd->vdev_id, (entries_before - entries_after), - entries_after); + "deleted %llu words - %llu words are left", + tx->tx_txg, vd->vdev_id, (words_before - words_after), + words_after); if (error != EINTR) { if (error != 0) { @@ -344,15 +346,15 @@ spa_checkpoint_discard_thread_sync(void *arg, dmu_tx_t "space map of vdev %llu\n", error, vd->vdev_id); } - ASSERT0(entries_after); + ASSERT0(words_after); ASSERT0(vd->vdev_checkpoint_sm->sm_alloc); - ASSERT0(vd->vdev_checkpoint_sm->sm_length); + ASSERT0(space_map_length(vd->vdev_checkpoint_sm)); space_map_free(vd->vdev_checkpoint_sm, tx); space_map_close(vd->vdev_checkpoint_sm); vd->vdev_checkpoint_sm = NULL; - VERIFY0(zap_remove(vd->vdev_spa->spa_meta_objset, + VERIFY0(zap_remove(spa_meta_objset(vd->vdev_spa), vd->vdev_top_zap, VDEV_TOP_ZAP_POOL_CHECKPOINT_SM, tx)); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Oct 3 02:08:32 2018 (r339103) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Oct 3 02:09:36 2018 (r339104) @@ -43,68 +43,194 @@ SYSCTL_DECL(_vfs_zfs); * Note on space map block size: * * The data for a given space map can be kept on blocks of any size. - * Larger blocks entail fewer i/o operations, but they also cause the - * DMU to keep more data in-core, and also to waste more i/o bandwidth + * Larger blocks entail fewer I/O operations, but they also cause the + * DMU to keep more data in-core, and also to waste more I/O bandwidth * when only a few blocks have changed since the last transaction group. */ /* + * Enabled whenever we want to stress test the use of double-word + * space map entries. + */ +boolean_t zfs_force_some_double_word_sm_entries = B_FALSE; + +boolean_t +sm_entry_is_debug(uint64_t e) +{ + return (SM_PREFIX_DECODE(e) == SM_DEBUG_PREFIX); +} + +boolean_t +sm_entry_is_single_word(uint64_t e) +{ + uint8_t prefix = SM_PREFIX_DECODE(e); + return (prefix != SM_DEBUG_PREFIX && prefix != SM2_PREFIX); +} + +boolean_t +sm_entry_is_double_word(uint64_t e) +{ + return (SM_PREFIX_DECODE(e) == SM2_PREFIX); +} + +/* * Iterate through the space map, invoking the callback on each (non-debug) * space map entry. */ int space_map_iterate(space_map_t *sm, sm_cb_t callback, void *arg) { - uint64_t *entry, *entry_map, *entry_map_end; - uint64_t bufsize, size, offset, end; + uint64_t sm_len = space_map_length(sm); + ASSERT3U(sm->sm_blksz, !=, 0); + + dmu_prefetch(sm->sm_os, space_map_object(sm), 0, 0, sm_len, + ZIO_PRIORITY_SYNC_READ); + + uint64_t blksz = sm->sm_blksz; int error = 0; + for (uint64_t block_base = 0; block_base < sm_len && error == 0; + block_base += blksz) { + dmu_buf_t *db; + error = dmu_buf_hold(sm->sm_os, space_map_object(sm), + block_base, FTAG, &db, DMU_READ_PREFETCH); + if (error != 0) + return (error); - end = space_map_length(sm); + uint64_t *block_start = db->db_data; + uint64_t block_length = MIN(sm_len - block_base, blksz); + uint64_t *block_end = block_start + + (block_length / sizeof (uint64_t)); - bufsize = MAX(sm->sm_blksz, SPA_MINBLOCKSIZE); - entry_map = zio_buf_alloc(bufsize); + VERIFY0(P2PHASE(block_length, sizeof (uint64_t))); + VERIFY3U(block_length, !=, 0); + ASSERT3U(blksz, ==, db->db_size); - if (end > bufsize) { - dmu_prefetch(sm->sm_os, space_map_object(sm), 0, bufsize, - end - bufsize, ZIO_PRIORITY_SYNC_READ); + for (uint64_t *block_cursor = block_start; + block_cursor < block_end && error == 0; block_cursor++) { + uint64_t e = *block_cursor; + + if (sm_entry_is_debug(e)) /* Skip debug entries */ + continue; + + uint64_t raw_offset, raw_run, vdev_id; + maptype_t type; + if (sm_entry_is_single_word(e)) { + type = SM_TYPE_DECODE(e); + vdev_id = SM_NO_VDEVID; + raw_offset = SM_OFFSET_DECODE(e); + raw_run = SM_RUN_DECODE(e); + } else { + /* it is a two-word entry */ + ASSERT(sm_entry_is_double_word(e)); + raw_run = SM2_RUN_DECODE(e); + vdev_id = SM2_VDEV_DECODE(e); + + /* move on to the second word */ + block_cursor++; + e = *block_cursor; + VERIFY3P(block_cursor, <=, block_end); + + type = SM2_TYPE_DECODE(e); + raw_offset = SM2_OFFSET_DECODE(e); + } + + uint64_t entry_offset = (raw_offset << sm->sm_shift) + + sm->sm_start; + uint64_t entry_run = raw_run << sm->sm_shift; + + VERIFY0(P2PHASE(entry_offset, 1ULL << sm->sm_shift)); + VERIFY0(P2PHASE(entry_run, 1ULL << sm->sm_shift)); + ASSERT3U(entry_offset, >=, sm->sm_start); + ASSERT3U(entry_offset, <, sm->sm_start + sm->sm_size); + ASSERT3U(entry_run, <=, sm->sm_size); + ASSERT3U(entry_offset + entry_run, <=, + sm->sm_start + sm->sm_size); + + space_map_entry_t sme = { + .sme_type = type, + .sme_vdev = vdev_id, + .sme_offset = entry_offset, + .sme_run = entry_run + }; + error = callback(&sme, arg); + } + dmu_buf_rele(db, FTAG); } + return (error); +} - for (offset = 0; offset < end && error == 0; offset += bufsize) { - size = MIN(end - offset, bufsize); - VERIFY(P2PHASE(size, sizeof (uint64_t)) == 0); - VERIFY(size != 0); - ASSERT3U(sm->sm_blksz, !=, 0); +/* + * Reads the entries from the last block of the space map into + * buf in reverse order. Populates nwords with number of words + * in the last block. + * + * Refer to block comment within space_map_incremental_destroy() + * to understand why this function is needed. + */ +static int +space_map_reversed_last_block_entries(space_map_t *sm, uint64_t *buf, + uint64_t bufsz, uint64_t *nwords) +{ + int error = 0; + dmu_buf_t *db; - dprintf("object=%llu offset=%llx size=%llx\n", - space_map_object(sm), offset, size); + /* + * Find the offset of the last word in the space map and use + * that to read the last block of the space map with + * dmu_buf_hold(). + */ + uint64_t last_word_offset = + sm->sm_phys->smp_objsize - sizeof (uint64_t); + error = dmu_buf_hold(sm->sm_os, space_map_object(sm), last_word_offset, + FTAG, &db, DMU_READ_NO_PREFETCH); + if (error != 0) + return (error); - error = dmu_read(sm->sm_os, space_map_object(sm), offset, size, - entry_map, DMU_READ_PREFETCH); - if (error != 0) - break; + ASSERT3U(sm->sm_object, ==, db->db_object); + ASSERT3U(sm->sm_blksz, ==, db->db_size); + ASSERT3U(bufsz, >=, db->db_size); + ASSERT(nwords != NULL); - entry_map_end = entry_map + (size / sizeof (uint64_t)); - for (entry = entry_map; entry < entry_map_end && error == 0; - entry++) { - uint64_t e = *entry; - uint64_t offset, size; + uint64_t *words = db->db_data; + *nwords = + (sm->sm_phys->smp_objsize - db->db_offset) / sizeof (uint64_t); - if (SM_DEBUG_DECODE(e)) /* Skip debug entries */ - continue; + ASSERT3U(*nwords, <=, bufsz / sizeof (uint64_t)); - offset = (SM_OFFSET_DECODE(e) << sm->sm_shift) + - sm->sm_start; - size = SM_RUN_DECODE(e) << sm->sm_shift; + uint64_t n = *nwords; + uint64_t j = n - 1; + for (uint64_t i = 0; i < n; i++) { + uint64_t entry = words[i]; + if (sm_entry_is_double_word(entry)) { + /* + * Since we are populating the buffer backwards + * we have to be extra careful and add the two + * words of the double-word entry in the right + * order. + */ + ASSERT3U(j, >, 0); + buf[j - 1] = entry; - VERIFY0(P2PHASE(offset, 1ULL << sm->sm_shift)); - VERIFY0(P2PHASE(size, 1ULL << sm->sm_shift)); - VERIFY3U(offset, >=, sm->sm_start); - VERIFY3U(offset + size, <=, sm->sm_start + sm->sm_size); - error = callback(SM_TYPE_DECODE(e), offset, size, arg); + i++; + ASSERT3U(i, <, n); + entry = words[i]; + buf[j] = entry; + j -= 2; + } else { + ASSERT(sm_entry_is_debug(entry) || + sm_entry_is_single_word(entry)); + buf[j] = entry; + j--; } } - zio_buf_free(entry_map, bufsize); + /* + * Assert that we wrote backwards all the + * way to the beginning of the buffer. + */ + ASSERT3S(j, ==, -1); + + dmu_buf_rele(db, FTAG); return (error); } @@ -118,124 +244,122 @@ int space_map_incremental_destroy(space_map_t *sm, sm_cb_t callback, void *arg, dmu_tx_t *tx) { - uint64_t bufsize, len; - uint64_t *entry_map; - int error = 0; + uint64_t bufsz = MAX(sm->sm_blksz, SPA_MINBLOCKSIZE); + uint64_t *buf = zio_buf_alloc(bufsz); - len = space_map_length(sm); - bufsize = MAX(sm->sm_blksz, SPA_MINBLOCKSIZE); - entry_map = zio_buf_alloc(bufsize); - dmu_buf_will_dirty(sm->sm_dbuf, tx); /* - * Since we can't move the starting offset of the space map - * (e.g there are reference on-disk pointing to it), we destroy - * its entries incrementally starting from the end. + * Ideally we would want to iterate from the beginning of the + * space map to the end in incremental steps. The issue with this + * approach is that we don't have any field on-disk that points + * us where to start between each step. We could try zeroing out + * entries that we've destroyed, but this doesn't work either as + * an entry that is 0 is a valid one (ALLOC for range [0x0:0x200]). * - * The logic that follows is basically the same as the one used - * in space_map_iterate() but it traverses the space map - * backwards: + * As a result, we destroy its entries incrementally starting from + * the end after applying the callback to each of them. * - * 1] We figure out the size of the buffer that we want to use - * to read the on-disk space map entries. - * 2] We figure out the offset at the end of the space map where - * we will start reading entries into our buffer. - * 3] We read the on-disk entries into the buffer. - * 4] We iterate over the entries from end to beginning calling - * the callback function on each one. As we move from entry - * to entry we decrease the size of the space map, deleting - * effectively each entry. - * 5] If there are no more entries in the space map or the - * callback returns a value other than 0, we stop iterating - * over the space map. If there are entries remaining and - * the callback returned zero we go back to step [1]. + * The problem with this approach is that we cannot literally + * iterate through the words in the space map backwards as we + * can't distinguish two-word space map entries from their second + * word. Thus we do the following: + * + * 1] We get all the entries from the last block of the space map + * and put them into a buffer in reverse order. This way the + * last entry comes first in the buffer, the second to last is + * second, etc. + * 2] We iterate through the entries in the buffer and we apply + * the callback to each one. As we move from entry to entry we + * we decrease the size of the space map, deleting effectively + * each entry. + * 3] If there are no more entries in the space map or the callback + * returns a value other than 0, we stop iterating over the + * space map. If there are entries remaining and the callback + * returned 0, we go back to step [1]. */ - uint64_t offset = 0, size = 0; - while (len > 0 && error == 0) { - size = MIN(bufsize, len); - - VERIFY(P2PHASE(size, sizeof (uint64_t)) == 0); - VERIFY3U(size, >, 0); - ASSERT3U(sm->sm_blksz, !=, 0); - - offset = len - size; - - IMPLY(bufsize > len, offset == 0); - IMPLY(bufsize == len, offset == 0); - IMPLY(bufsize < len, offset > 0); - - - EQUIV(size == len, offset == 0); - IMPLY(size < len, bufsize < len); - - dprintf("object=%llu offset=%llx size=%llx\n", - space_map_object(sm), offset, size); - - error = dmu_read(sm->sm_os, space_map_object(sm), - offset, size, entry_map, DMU_READ_PREFETCH); + int error = 0; + while (space_map_length(sm) > 0 && error == 0) { + uint64_t nwords = 0; + error = space_map_reversed_last_block_entries(sm, buf, bufsz, + &nwords); if (error != 0) break; - uint64_t num_entries = size / sizeof (uint64_t); + ASSERT3U(nwords, <=, bufsz / sizeof (uint64_t)); - ASSERT3U(num_entries, >, 0); + for (uint64_t i = 0; i < nwords; i++) { + uint64_t e = buf[i]; - while (num_entries > 0) { - uint64_t e, entry_offset, entry_size; + if (sm_entry_is_debug(e)) { + sm->sm_phys->smp_objsize -= sizeof (uint64_t); + space_map_update(sm); + continue; + } + + int words = 1; + uint64_t raw_offset, raw_run, vdev_id; maptype_t type; + if (sm_entry_is_single_word(e)) { + type = SM_TYPE_DECODE(e); + vdev_id = SM_NO_VDEVID; + raw_offset = SM_OFFSET_DECODE(e); + raw_run = SM_RUN_DECODE(e); + } else { + ASSERT(sm_entry_is_double_word(e)); + words = 2; - e = entry_map[num_entries - 1]; + raw_run = SM2_RUN_DECODE(e); + vdev_id = SM2_VDEV_DECODE(e); - ASSERT3U(num_entries, >, 0); - ASSERT0(error); + /* move to the second word */ + i++; + e = buf[i]; - if (SM_DEBUG_DECODE(e)) { - sm->sm_phys->smp_objsize -= sizeof (uint64_t); - space_map_update(sm); - len -= sizeof (uint64_t); - num_entries--; - continue; + ASSERT3P(i, <=, nwords); + + type = SM2_TYPE_DECODE(e); + raw_offset = SM2_OFFSET_DECODE(e); } - type = SM_TYPE_DECODE(e); - entry_offset = (SM_OFFSET_DECODE(e) << sm->sm_shift) + - sm->sm_start; - entry_size = SM_RUN_DECODE(e) << sm->sm_shift; + uint64_t entry_offset = + (raw_offset << sm->sm_shift) + sm->sm_start; + uint64_t entry_run = raw_run << sm->sm_shift; VERIFY0(P2PHASE(entry_offset, 1ULL << sm->sm_shift)); - VERIFY0(P2PHASE(entry_size, 1ULL << sm->sm_shift)); + VERIFY0(P2PHASE(entry_run, 1ULL << sm->sm_shift)); VERIFY3U(entry_offset, >=, sm->sm_start); - VERIFY3U(entry_offset + entry_size, <=, + VERIFY3U(entry_offset, <, sm->sm_start + sm->sm_size); + VERIFY3U(entry_run, <=, sm->sm_size); + VERIFY3U(entry_offset + entry_run, <=, sm->sm_start + sm->sm_size); - error = callback(type, entry_offset, entry_size, arg); + space_map_entry_t sme = { + .sme_type = type, + .sme_vdev = vdev_id, + .sme_offset = entry_offset, + .sme_run = entry_run + }; + error = callback(&sme, arg); if (error != 0) break; if (type == SM_ALLOC) - sm->sm_phys->smp_alloc -= entry_size; + sm->sm_phys->smp_alloc -= entry_run; else - sm->sm_phys->smp_alloc += entry_size; - - sm->sm_phys->smp_objsize -= sizeof (uint64_t); + sm->sm_phys->smp_alloc += entry_run; + sm->sm_phys->smp_objsize -= words * sizeof (uint64_t); space_map_update(sm); - len -= sizeof (uint64_t); - num_entries--; } - IMPLY(error == 0, num_entries == 0); - EQUIV(offset == 0 && error == 0, len == 0 && num_entries == 0); } - if (len == 0) { + if (space_map_length(sm) == 0) { ASSERT0(error); - ASSERT0(offset); - ASSERT0(sm->sm_length); ASSERT0(sm->sm_phys->smp_objsize); ASSERT0(sm->sm_alloc); } - zio_buf_free(entry_map, bufsize); + zio_buf_free(buf, bufsz); return (error); } @@ -246,16 +370,15 @@ typedef struct space_map_load_arg { } space_map_load_arg_t; static int -space_map_load_callback(maptype_t type, uint64_t offset, uint64_t size, - void *arg) +space_map_load_callback(space_map_entry_t *sme, void *arg) { space_map_load_arg_t *smla = arg; - if (type == smla->smla_type) { - VERIFY3U(range_tree_space(smla->smla_rt) + size, <=, + if (sme->sme_type == smla->smla_type) { + VERIFY3U(range_tree_space(smla->smla_rt) + sme->sme_run, <=, smla->smla_sm->sm_size); - range_tree_add(smla->smla_rt, offset, size); + range_tree_add(smla->smla_rt, sme->sme_offset, sme->sme_run); } else { - range_tree_remove(smla->smla_rt, offset, size); + range_tree_remove(smla->smla_rt, sme->sme_offset, sme->sme_run); } return (0); @@ -367,43 +490,239 @@ space_map_histogram_add(space_map_t *sm, range_tree_t } } -uint64_t *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:10:24 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E276110B4822; Wed, 3 Oct 2018 02:10:23 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 98C52861AD; Wed, 3 Oct 2018 02:10:23 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 936FF107F5; Wed, 3 Oct 2018 02:10:23 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932ANeu040757; Wed, 3 Oct 2018 02:10:23 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932ANJ7040754; Wed, 3 Oct 2018 02:10:23 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030210.w932ANJ7040754@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:10:23 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339105 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339105 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:10:24 -0000 Author: mav Date: Wed Oct 3 02:10:23 2018 New Revision: 339105 URL: https://svnweb.freebsd.org/changeset/base/339105 Log: MFC r336949: MFV r336948: 9112 Improve allocation performance on high-end systems On high-end systems running async sequential write workloads, especially NUMA systems with flash or NVMe storage, one significant performance bottleneck is selecting a metaslab to do allocations from. This process can be parallelized, providing significant performance increases for these workloads. illumos/illumos-gate@f78cdc34af236a6199dd9e21376f4a46348c0d56 Reviewed by: Matthew Ahrens Reviewed by: George Wilson Reviewed by: Serapheim Dimitropoulos Reviewed by: Alexander Motin Approved by: Gordon Ross Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:09:36 2018 (r339104) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:10:23 2018 (r339105) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2015 by Delphix. All rights reserved. + * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -275,6 +275,8 @@ static uint64_t metaslab_weight(metaslab_t *); static void metaslab_set_fragmentation(metaslab_t *); static void metaslab_free_impl(vdev_t *, uint64_t, uint64_t, boolean_t); static void metaslab_check_free_impl(vdev_t *, uint64_t, uint64_t); +static void metaslab_passivate(metaslab_t *msp, uint64_t weight); +static uint64_t metaslab_weight_from_range_tree(metaslab_t *msp); kmem_cache_t *metaslab_alloc_trace_cache; @@ -294,7 +296,12 @@ metaslab_class_create(spa_t *spa, metaslab_ops_t *ops) mc->mc_rotor = NULL; mc->mc_ops = ops; mutex_init(&mc->mc_lock, NULL, MUTEX_DEFAULT, NULL); - refcount_create_tracked(&mc->mc_alloc_slots); + mc->mc_alloc_slots = kmem_zalloc(spa->spa_alloc_count * + sizeof (refcount_t), KM_SLEEP); + mc->mc_alloc_max_slots = kmem_zalloc(spa->spa_alloc_count * + sizeof (uint64_t), KM_SLEEP); + for (int i = 0; i < spa->spa_alloc_count; i++) + refcount_create_tracked(&mc->mc_alloc_slots[i]); return (mc); } @@ -308,7 +315,12 @@ metaslab_class_destroy(metaslab_class_t *mc) ASSERT(mc->mc_space == 0); ASSERT(mc->mc_dspace == 0); - refcount_destroy(&mc->mc_alloc_slots); + for (int i = 0; i < mc->mc_spa->spa_alloc_count; i++) + refcount_destroy(&mc->mc_alloc_slots[i]); + kmem_free(mc->mc_alloc_slots, mc->mc_spa->spa_alloc_count * + sizeof (refcount_t)); + kmem_free(mc->mc_alloc_max_slots, mc->mc_spa->spa_alloc_count * + sizeof (uint64_t)); mutex_destroy(&mc->mc_lock); kmem_free(mc, sizeof (metaslab_class_t)); } @@ -532,6 +544,30 @@ metaslab_compare(const void *x1, const void *x2) const metaslab_t *m1 = x1; const metaslab_t *m2 = x2; + int sort1 = 0; + int sort2 = 0; + if (m1->ms_allocator != -1 && m1->ms_primary) + sort1 = 1; + else if (m1->ms_allocator != -1 && !m1->ms_primary) + sort1 = 2; + if (m2->ms_allocator != -1 && m2->ms_primary) + sort2 = 1; + else if (m2->ms_allocator != -1 && !m2->ms_primary) + sort2 = 2; + + /* + * Sort inactive metaslabs first, then primaries, then secondaries. When + * selecting a metaslab to allocate from, an allocator first tries its + * primary, then secondary active metaslab. If it doesn't have active + * metaslabs, or can't allocate from them, it searches for an inactive + * metaslab to activate. If it can't find a suitable one, it will steal + * a primary or secondary metaslab from another allocator. + */ + if (sort1 < sort2) + return (-1); + if (sort1 > sort2) + return (1); + if (m1->ms_weight < m2->ms_weight) return (1); if (m1->ms_weight > m2->ms_weight) @@ -683,12 +719,16 @@ metaslab_group_alloc_update(metaslab_group_t *mg) } metaslab_group_t * -metaslab_group_create(metaslab_class_t *mc, vdev_t *vd) +metaslab_group_create(metaslab_class_t *mc, vdev_t *vd, int allocators) { metaslab_group_t *mg; mg = kmem_zalloc(sizeof (metaslab_group_t), KM_SLEEP); mutex_init(&mg->mg_lock, NULL, MUTEX_DEFAULT, NULL); + mg->mg_primaries = kmem_zalloc(allocators * sizeof (metaslab_t *), + KM_SLEEP); + mg->mg_secondaries = kmem_zalloc(allocators * sizeof (metaslab_t *), + KM_SLEEP); avl_create(&mg->mg_metaslab_tree, metaslab_compare, sizeof (metaslab_t), offsetof(struct metaslab, ms_group_node)); mg->mg_vd = vd; @@ -696,8 +736,17 @@ metaslab_group_create(metaslab_class_t *mc, vdev_t *vd mg->mg_activation_count = 0; mg->mg_initialized = B_FALSE; mg->mg_no_free_space = B_TRUE; - refcount_create_tracked(&mg->mg_alloc_queue_depth); + mg->mg_allocators = allocators; + mg->mg_alloc_queue_depth = kmem_zalloc(allocators * sizeof (refcount_t), + KM_SLEEP); + mg->mg_cur_max_alloc_queue_depth = kmem_zalloc(allocators * + sizeof (uint64_t), KM_SLEEP); + for (int i = 0; i < allocators; i++) { + refcount_create_tracked(&mg->mg_alloc_queue_depth[i]); + mg->mg_cur_max_alloc_queue_depth[i] = 0; + } + mg->mg_taskq = taskq_create("metaslab_group_taskq", metaslab_load_pct, minclsyspri, 10, INT_MAX, TASKQ_THREADS_CPU_PCT); @@ -718,8 +767,20 @@ metaslab_group_destroy(metaslab_group_t *mg) taskq_destroy(mg->mg_taskq); avl_destroy(&mg->mg_metaslab_tree); + kmem_free(mg->mg_primaries, mg->mg_allocators * sizeof (metaslab_t *)); + kmem_free(mg->mg_secondaries, mg->mg_allocators * + sizeof (metaslab_t *)); mutex_destroy(&mg->mg_lock); - refcount_destroy(&mg->mg_alloc_queue_depth); + + for (int i = 0; i < mg->mg_allocators; i++) { + refcount_destroy(&mg->mg_alloc_queue_depth[i]); + mg->mg_cur_max_alloc_queue_depth[i] = 0; + } + kmem_free(mg->mg_alloc_queue_depth, mg->mg_allocators * + sizeof (refcount_t)); + kmem_free(mg->mg_cur_max_alloc_queue_depth, mg->mg_allocators * + sizeof (uint64_t)); + kmem_free(mg, sizeof (metaslab_group_t)); } @@ -799,6 +860,22 @@ metaslab_group_passivate(metaslab_group_t *mg) taskq_wait(mg->mg_taskq); spa_config_enter(spa, locks & ~(SCL_ZIO - 1), spa, RW_WRITER); metaslab_group_alloc_update(mg); + for (int i = 0; i < mg->mg_allocators; i++) { + metaslab_t *msp = mg->mg_primaries[i]; + if (msp != NULL) { + mutex_enter(&msp->ms_lock); + metaslab_passivate(msp, + metaslab_weight_from_range_tree(msp)); + mutex_exit(&msp->ms_lock); + } + msp = mg->mg_secondaries[i]; + if (msp != NULL) { + mutex_enter(&msp->ms_lock); + metaslab_passivate(msp, + metaslab_weight_from_range_tree(msp)); + mutex_exit(&msp->ms_lock); + } + } mgprev = mg->mg_prev; mgnext = mg->mg_next; @@ -940,6 +1017,17 @@ metaslab_group_remove(metaslab_group_t *mg, metaslab_t } static void +metaslab_group_sort_impl(metaslab_group_t *mg, metaslab_t *msp, uint64_t weight) +{ + ASSERT(MUTEX_HELD(&mg->mg_lock)); + ASSERT(msp->ms_group == mg); + avl_remove(&mg->mg_metaslab_tree, msp); + msp->ms_weight = weight; + avl_add(&mg->mg_metaslab_tree, msp); + +} + +static void metaslab_group_sort(metaslab_group_t *mg, metaslab_t *msp, uint64_t weight) { /* @@ -950,10 +1038,7 @@ metaslab_group_sort(metaslab_group_t *mg, metaslab_t * ASSERT(MUTEX_HELD(&msp->ms_lock)); mutex_enter(&mg->mg_lock); - ASSERT(msp->ms_group == mg); - avl_remove(&mg->mg_metaslab_tree, msp); - msp->ms_weight = weight; - avl_add(&mg->mg_metaslab_tree, msp); + metaslab_group_sort_impl(mg, msp, weight); mutex_exit(&mg->mg_lock); } @@ -1001,7 +1086,7 @@ metaslab_group_fragmentation(metaslab_group_t *mg) */ static boolean_t metaslab_group_allocatable(metaslab_group_t *mg, metaslab_group_t *rotor, - uint64_t psize) + uint64_t psize, int allocator) { spa_t *spa = mg->mg_vd->vdev_spa; metaslab_class_t *mc = mg->mg_class; @@ -1030,7 +1115,7 @@ metaslab_group_allocatable(metaslab_group_t *mg, metas if (mg->mg_allocatable) { metaslab_group_t *mgp; int64_t qdepth; - uint64_t qmax = mg->mg_max_alloc_queue_depth; + uint64_t qmax = mg->mg_cur_max_alloc_queue_depth[allocator]; if (!mc->mc_alloc_throttle_enabled) return (B_TRUE); @@ -1042,7 +1127,7 @@ metaslab_group_allocatable(metaslab_group_t *mg, metas if (mg->mg_no_free_space) return (B_FALSE); - qdepth = refcount_count(&mg->mg_alloc_queue_depth); + qdepth = refcount_count(&mg->mg_alloc_queue_depth[allocator]); /* * If this metaslab group is below its qmax or it's @@ -1061,9 +1146,10 @@ metaslab_group_allocatable(metaslab_group_t *mg, metas * groups at the same time when we make this check. */ for (mgp = mg->mg_next; mgp != rotor; mgp = mgp->mg_next) { - qmax = mgp->mg_max_alloc_queue_depth; + qmax = mgp->mg_cur_max_alloc_queue_depth[allocator]; - qdepth = refcount_count(&mgp->mg_alloc_queue_depth); + qdepth = refcount_count( + &mgp->mg_alloc_queue_depth[allocator]); /* * If there is another metaslab group that @@ -1471,6 +1557,8 @@ metaslab_init(metaslab_group_t *mg, uint64_t id, uint6 ms->ms_id = id; ms->ms_start = id << vd->vdev_ms_shift; ms->ms_size = 1ULL << vd->vdev_ms_shift; + ms->ms_allocator = -1; + ms->ms_new = B_TRUE; /* * We only open space map objects that already exist. All others @@ -1567,6 +1655,7 @@ metaslab_fini(metaslab_t *msp) cv_destroy(&msp->ms_load_cv); mutex_destroy(&msp->ms_lock); mutex_destroy(&msp->ms_sync_lock); + ASSERT3U(msp->ms_allocator, ==, -1); kmem_free(msp, sizeof (metaslab_t)); } @@ -1963,19 +2052,59 @@ metaslab_weight(metaslab_t *msp) } static int -metaslab_activate(metaslab_t *msp, uint64_t activation_weight) +metaslab_activate_allocator(metaslab_group_t *mg, metaslab_t *msp, + int allocator, uint64_t activation_weight) { + /* + * If we're activating for the claim code, we don't want to actually + * set the metaslab up for a specific allocator. + */ + if (activation_weight == METASLAB_WEIGHT_CLAIM) + return (0); + metaslab_t **arr = (activation_weight == METASLAB_WEIGHT_PRIMARY ? + mg->mg_primaries : mg->mg_secondaries); + ASSERT(MUTEX_HELD(&msp->ms_lock)); + mutex_enter(&mg->mg_lock); + if (arr[allocator] != NULL) { + mutex_exit(&mg->mg_lock); + return (EEXIST); + } + arr[allocator] = msp; + ASSERT3S(msp->ms_allocator, ==, -1); + msp->ms_allocator = allocator; + msp->ms_primary = (activation_weight == METASLAB_WEIGHT_PRIMARY); + mutex_exit(&mg->mg_lock); + + return (0); +} + +static int +metaslab_activate(metaslab_t *msp, int allocator, uint64_t activation_weight) +{ + ASSERT(MUTEX_HELD(&msp->ms_lock)); + if ((msp->ms_weight & METASLAB_ACTIVE_MASK) == 0) { + int error = 0; metaslab_load_wait(msp); if (!msp->ms_loaded) { - int error = metaslab_load(msp); - if (error) { + if ((error = metaslab_load(msp)) != 0) { metaslab_group_sort(msp->ms_group, msp, 0); return (error); } } + if ((msp->ms_weight & METASLAB_ACTIVE_MASK) != 0) { + /* + * The metaslab was activated for another allocator + * while we were waiting, we should reselect. + */ + return (EBUSY); + } + if ((error = metaslab_activate_allocator(msp->ms_group, msp, + allocator, activation_weight)) != 0) { + return (error); + } msp->ms_activation_weight = msp->ms_weight; metaslab_group_sort(msp->ms_group, msp, @@ -1988,6 +2117,34 @@ metaslab_activate(metaslab_t *msp, uint64_t activation } static void +metaslab_passivate_allocator(metaslab_group_t *mg, metaslab_t *msp, + uint64_t weight) +{ + ASSERT(MUTEX_HELD(&msp->ms_lock)); + if (msp->ms_weight & METASLAB_WEIGHT_CLAIM) { + metaslab_group_sort(mg, msp, weight); + return; + } + + mutex_enter(&mg->mg_lock); + ASSERT3P(msp->ms_group, ==, mg); + if (msp->ms_primary) { + ASSERT3U(0, <=, msp->ms_allocator); + ASSERT3U(msp->ms_allocator, <, mg->mg_allocators); + ASSERT3P(mg->mg_primaries[msp->ms_allocator], ==, msp); + ASSERT(msp->ms_weight & METASLAB_WEIGHT_PRIMARY); + mg->mg_primaries[msp->ms_allocator] = NULL; + } else { + ASSERT(msp->ms_weight & METASLAB_WEIGHT_SECONDARY); + ASSERT3P(mg->mg_secondaries[msp->ms_allocator], ==, msp); + mg->mg_secondaries[msp->ms_allocator] = NULL; + } + msp->ms_allocator = -1; + metaslab_group_sort_impl(mg, msp, weight); + mutex_exit(&mg->mg_lock); +} + +static void metaslab_passivate(metaslab_t *msp, uint64_t weight) { uint64_t size = weight & ~METASLAB_WEIGHT_TYPE; @@ -2002,7 +2159,7 @@ metaslab_passivate(metaslab_t *msp, uint64_t weight) ASSERT0(weight & METASLAB_ACTIVE_MASK); msp->ms_activation_weight = 0; - metaslab_group_sort(msp->ms_group, msp, weight); + metaslab_passivate_allocator(msp->ms_group, msp, weight); ASSERT((msp->ms_weight & METASLAB_ACTIVE_MASK) == 0); } @@ -2556,11 +2713,18 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) vdev_dirty(vd, VDD_METASLAB, msp, txg + 1); } + if (msp->ms_new) { + msp->ms_new = B_FALSE; + mutex_enter(&mg->mg_lock); + mg->mg_ms_ready++; + mutex_exit(&mg->mg_lock); + } /* * Calculate the new weights before unloading any metaslabs. * This will give us the most accurate weighting. */ - metaslab_group_sort(mg, msp, metaslab_weight(msp)); + metaslab_group_sort(mg, msp, metaslab_weight(msp) | + (msp->ms_weight & METASLAB_ACTIVE_MASK)); /* * If the metaslab is loaded and we've not tried to load or allocate @@ -2572,6 +2736,10 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) VERIFY0(range_tree_space( msp->ms_allocating[(txg + t) & TXG_MASK])); } + if (msp->ms_allocator != -1) { + metaslab_passivate(msp, msp->ms_weight & + ~METASLAB_ACTIVE_MASK); + } if (!metaslab_debug_unload) metaslab_unload(msp); @@ -2665,7 +2833,8 @@ metaslab_alloc_trace_fini(void) */ static void metaslab_trace_add(zio_alloc_list_t *zal, metaslab_group_t *mg, - metaslab_t *msp, uint64_t psize, uint32_t dva_id, uint64_t offset) + metaslab_t *msp, uint64_t psize, uint32_t dva_id, uint64_t offset, + int allocator) { if (!metaslab_trace_enabled) return; @@ -2698,6 +2867,7 @@ metaslab_trace_add(zio_alloc_list_t *zal, metaslab_gro mat->mat_dva_id = dva_id; mat->mat_offset = offset; mat->mat_weight = 0; + mat->mat_allocator = allocator; if (msp != NULL) mat->mat_weight = msp->ms_weight; @@ -2738,35 +2908,56 @@ metaslab_trace_fini(zio_alloc_list_t *zal) */ static void -metaslab_group_alloc_increment(spa_t *spa, uint64_t vdev, void *tag, int flags) +metaslab_group_alloc_increment(spa_t *spa, uint64_t vdev, void *tag, int flags, + int allocator) { if (!(flags & METASLAB_ASYNC_ALLOC) || - flags & METASLAB_DONT_THROTTLE) + (flags & METASLAB_DONT_THROTTLE)) return; metaslab_group_t *mg = vdev_lookup_top(spa, vdev)->vdev_mg; if (!mg->mg_class->mc_alloc_throttle_enabled) return; - (void) refcount_add(&mg->mg_alloc_queue_depth, tag); + (void) refcount_add(&mg->mg_alloc_queue_depth[allocator], tag); } +static void +metaslab_group_increment_qdepth(metaslab_group_t *mg, int allocator) +{ + uint64_t max = mg->mg_max_alloc_queue_depth; + uint64_t cur = mg->mg_cur_max_alloc_queue_depth[allocator]; + while (cur < max) { + if (atomic_cas_64(&mg->mg_cur_max_alloc_queue_depth[allocator], + cur, cur + 1) == cur) { + atomic_inc_64( + &mg->mg_class->mc_alloc_max_slots[allocator]); + return; + } + cur = mg->mg_cur_max_alloc_queue_depth[allocator]; + } +} + void -metaslab_group_alloc_decrement(spa_t *spa, uint64_t vdev, void *tag, int flags) +metaslab_group_alloc_decrement(spa_t *spa, uint64_t vdev, void *tag, int flags, + int allocator, boolean_t io_complete) { if (!(flags & METASLAB_ASYNC_ALLOC) || - flags & METASLAB_DONT_THROTTLE) + (flags & METASLAB_DONT_THROTTLE)) return; metaslab_group_t *mg = vdev_lookup_top(spa, vdev)->vdev_mg; if (!mg->mg_class->mc_alloc_throttle_enabled) return; - (void) refcount_remove(&mg->mg_alloc_queue_depth, tag); + (void) refcount_remove(&mg->mg_alloc_queue_depth[allocator], tag); + if (io_complete) + metaslab_group_increment_qdepth(mg, allocator); } void -metaslab_group_alloc_verify(spa_t *spa, const blkptr_t *bp, void *tag) +metaslab_group_alloc_verify(spa_t *spa, const blkptr_t *bp, void *tag, + int allocator) { #ifdef ZFS_DEBUG const dva_t *dva = bp->blk_dva; @@ -2775,7 +2966,8 @@ metaslab_group_alloc_verify(spa_t *spa, const blkptr_t for (int d = 0; d < ndvas; d++) { uint64_t vdev = DVA_GET_VDEV(&dva[d]); metaslab_group_t *mg = vdev_lookup_top(spa, vdev)->vdev_mg; - VERIFY(refcount_not_held(&mg->mg_alloc_queue_depth, tag)); + VERIFY(refcount_not_held(&mg->mg_alloc_queue_depth[allocator], + tag)); } #endif } @@ -2817,91 +3009,146 @@ metaslab_block_alloc(metaslab_t *msp, uint64_t size, u return (start); } +/* + * Find the metaslab with the highest weight that is less than what we've + * already tried. In the common case, this means that we will examine each + * metaslab at most once. Note that concurrent callers could reorder metaslabs + * by activation/passivation once we have dropped the mg_lock. If a metaslab is + * activated by another thread, and we fail to allocate from the metaslab we + * have selected, we may not try the newly-activated metaslab, and instead + * activate another metaslab. This is not optimal, but generally does not cause + * any problems (a possible exception being if every metaslab is completely full + * except for the the newly-activated metaslab which we fail to examine). + */ +static metaslab_t * +find_valid_metaslab(metaslab_group_t *mg, uint64_t activation_weight, + dva_t *dva, int d, uint64_t min_distance, uint64_t asize, int allocator, + zio_alloc_list_t *zal, metaslab_t *search, boolean_t *was_active) +{ + avl_index_t idx; + avl_tree_t *t = &mg->mg_metaslab_tree; + metaslab_t *msp = avl_find(t, search, &idx); + if (msp == NULL) + msp = avl_nearest(t, idx, AVL_AFTER); + + for (; msp != NULL; msp = AVL_NEXT(t, msp)) { + int i; + if (!metaslab_should_allocate(msp, asize)) { + metaslab_trace_add(zal, mg, msp, asize, d, + TRACE_TOO_SMALL, allocator); + continue; + } + + /* + * If the selected metaslab is condensing, skip it. + */ + if (msp->ms_condensing) + continue; + + *was_active = msp->ms_allocator != -1; + /* + * If we're activating as primary, this is our first allocation + * from this disk, so we don't need to check how close we are. + * If the metaslab under consideration was already active, + * we're getting desperate enough to steal another allocator's + * metaslab, so we still don't care about distances. + */ + if (activation_weight == METASLAB_WEIGHT_PRIMARY || *was_active) + break; + + uint64_t target_distance = min_distance + + (space_map_allocated(msp->ms_sm) != 0 ? 0 : + min_distance >> 1); + + for (i = 0; i < d; i++) { + if (metaslab_distance(msp, &dva[i]) < target_distance) + break; + } + if (i == d) + break; + } + + if (msp != NULL) { + search->ms_weight = msp->ms_weight; + search->ms_start = msp->ms_start + 1; + search->ms_allocator = msp->ms_allocator; + search->ms_primary = msp->ms_primary; + } + return (msp); +} + +/* ARGSUSED */ static uint64_t metaslab_group_alloc_normal(metaslab_group_t *mg, zio_alloc_list_t *zal, - uint64_t asize, uint64_t txg, uint64_t min_distance, dva_t *dva, int d) + uint64_t asize, uint64_t txg, uint64_t min_distance, dva_t *dva, int d, + int allocator) { metaslab_t *msp = NULL; uint64_t offset = -1ULL; uint64_t activation_weight; - uint64_t target_distance; - int i; + boolean_t tertiary = B_FALSE; activation_weight = METASLAB_WEIGHT_PRIMARY; - for (i = 0; i < d; i++) { - if (DVA_GET_VDEV(&dva[i]) == mg->mg_vd->vdev_id) { + for (int i = 0; i < d; i++) { + if (activation_weight == METASLAB_WEIGHT_PRIMARY && + DVA_GET_VDEV(&dva[i]) == mg->mg_vd->vdev_id) { activation_weight = METASLAB_WEIGHT_SECONDARY; + } else if (activation_weight == METASLAB_WEIGHT_SECONDARY && + DVA_GET_VDEV(&dva[i]) == mg->mg_vd->vdev_id) { + tertiary = B_TRUE; break; } } + /* + * If we don't have enough metaslabs active to fill the entire array, we + * just use the 0th slot. + */ + if (mg->mg_ms_ready < mg->mg_allocators * 2) { + tertiary = B_FALSE; + allocator = 0; + } + + ASSERT3U(mg->mg_vd->vdev_ms_count, >=, 2); + metaslab_t *search = kmem_alloc(sizeof (*search), KM_SLEEP); search->ms_weight = UINT64_MAX; search->ms_start = 0; + /* + * At the end of the metaslab tree are the already-active metaslabs, + * first the primaries, then the secondaries. When we resume searching + * through the tree, we need to consider ms_allocator and ms_primary so + * we start in the location right after where we left off, and don't + * accidentally loop forever considering the same metaslabs. + */ + search->ms_allocator = -1; + search->ms_primary = B_TRUE; for (;;) { - boolean_t was_active; - avl_tree_t *t = &mg->mg_metaslab_tree; - avl_index_t idx; + boolean_t was_active = B_FALSE; mutex_enter(&mg->mg_lock); - /* - * Find the metaslab with the highest weight that is less - * than what we've already tried. In the common case, this - * means that we will examine each metaslab at most once. - * Note that concurrent callers could reorder metaslabs - * by activation/passivation once we have dropped the mg_lock. - * If a metaslab is activated by another thread, and we fail - * to allocate from the metaslab we have selected, we may - * not try the newly-activated metaslab, and instead activate - * another metaslab. This is not optimal, but generally - * does not cause any problems (a possible exception being - * if every metaslab is completely full except for the - * the newly-activated metaslab which we fail to examine). - */ - msp = avl_find(t, search, &idx); - if (msp == NULL) - msp = avl_nearest(t, idx, AVL_AFTER); - for (; msp != NULL; msp = AVL_NEXT(t, msp)) { - - if (!metaslab_should_allocate(msp, asize)) { - metaslab_trace_add(zal, mg, msp, asize, d, - TRACE_TOO_SMALL); - continue; - } - - /* - * If the selected metaslab is condensing, skip it. - */ - if (msp->ms_condensing) - continue; - - was_active = msp->ms_weight & METASLAB_ACTIVE_MASK; - if (activation_weight == METASLAB_WEIGHT_PRIMARY) - break; - - target_distance = min_distance + - (space_map_allocated(msp->ms_sm) != 0 ? 0 : - min_distance >> 1); - - for (i = 0; i < d; i++) { - if (metaslab_distance(msp, &dva[i]) < - target_distance) - break; - } - if (i == d) - break; + if (activation_weight == METASLAB_WEIGHT_PRIMARY && + mg->mg_primaries[allocator] != NULL) { + msp = mg->mg_primaries[allocator]; + was_active = B_TRUE; + } else if (activation_weight == METASLAB_WEIGHT_SECONDARY && + mg->mg_secondaries[allocator] != NULL && !tertiary) { + msp = mg->mg_secondaries[allocator]; + was_active = B_TRUE; + } else { + msp = find_valid_metaslab(mg, activation_weight, dva, d, + min_distance, asize, allocator, zal, search, + &was_active); } + mutex_exit(&mg->mg_lock); if (msp == NULL) { kmem_free(search, sizeof (*search)); return (-1ULL); } - search->ms_weight = msp->ms_weight; - search->ms_start = msp->ms_start + 1; mutex_enter(&msp->ms_lock); - /* * Ensure that the metaslab we have selected is still * capable of handling our request. It's possible that @@ -2915,18 +3162,32 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ continue; } - if ((msp->ms_weight & METASLAB_WEIGHT_SECONDARY) && - activation_weight == METASLAB_WEIGHT_PRIMARY) { - metaslab_passivate(msp, - msp->ms_weight & ~METASLAB_ACTIVE_MASK); + /* + * If the metaslab is freshly activated for an allocator that + * isn't the one we're allocating from, or if it's a primary and + * we're seeking a secondary (or vice versa), we go back and + * select a new metaslab. + */ + if (!was_active && (msp->ms_weight & METASLAB_ACTIVE_MASK) && + (msp->ms_allocator != -1) && + (msp->ms_allocator != allocator || ((activation_weight == + METASLAB_WEIGHT_PRIMARY) != msp->ms_primary))) { mutex_exit(&msp->ms_lock); continue; } - if (metaslab_activate(msp, activation_weight) != 0) { + if (msp->ms_weight & METASLAB_WEIGHT_CLAIM) { + metaslab_passivate(msp, msp->ms_weight & + ~METASLAB_WEIGHT_CLAIM); mutex_exit(&msp->ms_lock); continue; } + + if (metaslab_activate(msp, allocator, activation_weight) != 0) { + mutex_exit(&msp->ms_lock); + continue; + } + msp->ms_selected_txg = txg; /* @@ -2939,7 +3200,7 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ if (!metaslab_should_allocate(msp, asize)) { /* Passivate this metaslab and select a new one. */ metaslab_trace_add(zal, mg, msp, asize, d, - TRACE_TOO_SMALL); + TRACE_TOO_SMALL, allocator); goto next; } @@ -2950,13 +3211,15 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ */ if (msp->ms_condensing) { metaslab_trace_add(zal, mg, msp, asize, d, - TRACE_CONDENSING); + TRACE_CONDENSING, allocator); + metaslab_passivate(msp, msp->ms_weight & + ~METASLAB_ACTIVE_MASK); mutex_exit(&msp->ms_lock); continue; } offset = metaslab_block_alloc(msp, asize, txg); - metaslab_trace_add(zal, mg, msp, asize, d, offset); + metaslab_trace_add(zal, mg, msp, asize, d, offset, allocator); if (offset != -1ULL) { /* Proactively passivate the metaslab, if needed */ @@ -3012,19 +3275,20 @@ next: static uint64_t metaslab_group_alloc(metaslab_group_t *mg, zio_alloc_list_t *zal, - uint64_t asize, uint64_t txg, uint64_t min_distance, dva_t *dva, int d) + uint64_t asize, uint64_t txg, uint64_t min_distance, dva_t *dva, int d, + int allocator) { uint64_t offset; ASSERT(mg->mg_initialized); offset = metaslab_group_alloc_normal(mg, zal, asize, txg, - min_distance, dva, d); + min_distance, dva, d, allocator); mutex_enter(&mg->mg_lock); if (offset == -1ULL) { mg->mg_failed_allocations++; metaslab_trace_add(zal, mg, NULL, asize, d, - TRACE_GROUP_FAILURE); + TRACE_GROUP_FAILURE, allocator); if (asize == SPA_GANGBLOCKSIZE) { /* * This metaslab group was unable to allocate @@ -3059,7 +3323,7 @@ int ditto_same_vdev_distance_shift = 3; int metaslab_alloc_dva(spa_t *spa, metaslab_class_t *mc, uint64_t psize, dva_t *dva, int d, dva_t *hintdva, uint64_t txg, int flags, - zio_alloc_list_t *zal) + zio_alloc_list_t *zal, int allocator) { metaslab_group_t *mg, *rotor; vdev_t *vd; @@ -3071,7 +3335,8 @@ metaslab_alloc_dva(spa_t *spa, metaslab_class_t *mc, u * For testing, make some blocks above a certain size be gang blocks. */ if (psize >= metaslab_force_ganging && (ddi_get_lbolt() & 3) == 0) { - metaslab_trace_add(zal, NULL, NULL, psize, d, TRACE_FORCE_GANG); + metaslab_trace_add(zal, NULL, NULL, psize, d, TRACE_FORCE_GANG, + allocator); return (SET_ERROR(ENOSPC)); } @@ -3157,12 +3422,12 @@ top: */ if (allocatable && !GANG_ALLOCATION(flags) && !try_hard) { allocatable = metaslab_group_allocatable(mg, rotor, - psize); + psize, allocator); } if (!allocatable) { metaslab_trace_add(zal, mg, NULL, psize, d, - TRACE_NOT_ALLOCATABLE); + TRACE_NOT_ALLOCATABLE, allocator); goto next; } @@ -3177,7 +3442,7 @@ top: vd->vdev_state < VDEV_STATE_HEALTHY) && d == 0 && !try_hard && vd->vdev_children == 0) { metaslab_trace_add(zal, mg, NULL, psize, d, - TRACE_VDEV_ERROR); + TRACE_VDEV_ERROR, allocator); goto next; } @@ -3201,7 +3466,7 @@ top: ASSERT(P2PHASE(asize, 1ULL << vd->vdev_ashift) == 0); uint64_t offset = metaslab_group_alloc(mg, zal, asize, txg, - distance, dva, d); + distance, dva, d, allocator); if (offset != -1ULL) { /* @@ -3264,7 +3529,7 @@ next: bzero(&dva[d], sizeof (dva_t)); - metaslab_trace_add(zal, rotor, NULL, psize, d, TRACE_ENOSPC); + metaslab_trace_add(zal, rotor, NULL, psize, d, TRACE_ENOSPC, allocator); return (SET_ERROR(ENOSPC)); } @@ -3565,18 +3830,20 @@ metaslab_free_dva(spa_t *spa, const dva_t *dva, boolea * the reservation. */ boolean_t -metaslab_class_throttle_reserve(metaslab_class_t *mc, int slots, zio_t *zio, - int flags) +metaslab_class_throttle_reserve(metaslab_class_t *mc, int slots, int allocator, + zio_t *zio, int flags) { uint64_t available_slots = 0; boolean_t slot_reserved = B_FALSE; + uint64_t max = mc->mc_alloc_max_slots[allocator]; ASSERT(mc->mc_alloc_throttle_enabled); mutex_enter(&mc->mc_lock); - uint64_t reserved_slots = refcount_count(&mc->mc_alloc_slots); - if (reserved_slots < mc->mc_alloc_max_slots) - available_slots = mc->mc_alloc_max_slots - reserved_slots; + uint64_t reserved_slots = + refcount_count(&mc->mc_alloc_slots[allocator]); + if (reserved_slots < max) + available_slots = max - reserved_slots; if (slots <= available_slots || GANG_ALLOCATION(flags)) { /* @@ -3584,7 +3851,9 @@ metaslab_class_throttle_reserve(metaslab_class_t *mc, * them individually when an I/O completes. */ for (int d = 0; d < slots; d++) { - reserved_slots = refcount_add(&mc->mc_alloc_slots, zio); + reserved_slots = + refcount_add(&mc->mc_alloc_slots[allocator], + zio); } zio->io_flags |= ZIO_FLAG_IO_ALLOCATING; slot_reserved = B_TRUE; @@ -3595,12 +3864,14 @@ metaslab_class_throttle_reserve(metaslab_class_t *mc, } void -metaslab_class_throttle_unreserve(metaslab_class_t *mc, int slots, zio_t *zio) +metaslab_class_throttle_unreserve(metaslab_class_t *mc, int slots, + int allocator, zio_t *zio) { ASSERT(mc->mc_alloc_throttle_enabled); mutex_enter(&mc->mc_lock); for (int d = 0; d < slots; d++) { - (void) refcount_remove(&mc->mc_alloc_slots, zio); + (void) refcount_remove(&mc->mc_alloc_slots[allocator], + zio); } mutex_exit(&mc->mc_lock); } @@ -3622,7 +3893,13 @@ metaslab_claim_concrete(vdev_t *vd, uint64_t offset, u mutex_enter(&msp->ms_lock); if ((txg != 0 && spa_writeable(spa)) || !msp->ms_loaded) - error = metaslab_activate(msp, METASLAB_WEIGHT_SECONDARY); + error = metaslab_activate(msp, 0, METASLAB_WEIGHT_CLAIM); + /* + * No need to fail in that case; someone else has activated the + * metaslab, but that doesn't preclude us from using it. + */ + if (error == EBUSY) + error = 0; if (error == 0 && !range_tree_contains(msp->ms_allocatable, offset, size)) @@ -3727,7 +4004,7 @@ metaslab_claim_dva(spa_t *spa, const dva_t *dva, uint6 int metaslab_alloc(spa_t *spa, metaslab_class_t *mc, uint64_t psize, blkptr_t *bp, int ndvas, uint64_t txg, blkptr_t *hintbp, int flags, - zio_alloc_list_t *zal, zio_t *zio) + zio_alloc_list_t *zal, zio_t *zio, int allocator) { dva_t *dva = bp->blk_dva; dva_t *hintdva = hintbp->blk_dva; @@ -3750,12 +4027,13 @@ metaslab_alloc(spa_t *spa, metaslab_class_t *mc, uint6 for (int d = 0; d < ndvas; d++) { error = metaslab_alloc_dva(spa, mc, psize, dva, d, hintdva, - txg, flags, zal); + txg, flags, zal, allocator); if (error != 0) { for (d--; d >= 0; d--) { metaslab_unalloc_dva(spa, &dva[d], txg); metaslab_group_alloc_decrement(spa, - DVA_GET_VDEV(&dva[d]), zio, flags); + DVA_GET_VDEV(&dva[d]), zio, flags, + allocator, B_FALSE); bzero(&dva[d], sizeof (dva_t)); } spa_config_exit(spa, SCL_ALLOC, FTAG); @@ -3766,7 +4044,7 @@ metaslab_alloc(spa_t *spa, metaslab_class_t *mc, uint6 * based on the newly allocated dva. */ metaslab_group_alloc_increment(spa, - DVA_GET_VDEV(&dva[d]), zio, flags); + DVA_GET_VDEV(&dva[d]), zio, flags, allocator); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:09:36 2018 (r339104) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:10:23 2018 (r339105) @@ -7777,9 +7777,11 @@ spa_sync(spa_t *spa, uint64_t txg) spa->spa_syncing_txg = txg; spa->spa_sync_pass = 0; - mutex_enter(&spa->spa_alloc_lock); - VERIFY0(avl_numnodes(&spa->spa_alloc_tree)); - mutex_exit(&spa->spa_alloc_lock); + for (int i = 0; i < spa->spa_alloc_count; i++) { + mutex_enter(&spa->spa_alloc_locks[i]); + VERIFY0(avl_numnodes(&spa->spa_alloc_trees[i])); + mutex_exit(&spa->spa_alloc_locks[i]); + } /* * If there are any pending vdev state changes, convert them @@ -7845,7 +7847,7 @@ spa_sync(spa_t *spa, uint64_t txg) * The max queue depth will not change in the middle of syncing * out this txg. */ - uint64_t queue_depth_total = 0; + uint64_t slots_per_allocator = 0; for (int c = 0; c < rvd->vdev_children; c++) { vdev_t *tvd = rvd->vdev_child[c]; metaslab_group_t *mg = tvd->vdev_mg; @@ -7859,18 +7861,23 @@ spa_sync(spa_t *spa, uint64_t txg) * allocations look at mg_max_alloc_queue_depth, and async * allocations all happen from spa_sync(). */ - ASSERT0(refcount_count(&mg->mg_alloc_queue_depth)); + for (int i = 0; i < spa->spa_alloc_count; i++) + ASSERT0(refcount_count(&(mg->mg_alloc_queue_depth[i]))); mg->mg_max_alloc_queue_depth = max_queue_depth; - queue_depth_total += mg->mg_max_alloc_queue_depth; + + for (int i = 0; i < spa->spa_alloc_count; i++) { + mg->mg_cur_max_alloc_queue_depth[i] = + zfs_vdev_def_queue_depth; + } + slots_per_allocator += zfs_vdev_def_queue_depth; } metaslab_class_t *mc = spa_normal_class(spa); - ASSERT0(refcount_count(&mc->mc_alloc_slots)); - mc->mc_alloc_max_slots = queue_depth_total; + for (int i = 0; i < spa->spa_alloc_count; i++) { + ASSERT0(refcount_count(&mc->mc_alloc_slots[i])); + mc->mc_alloc_max_slots[i] = slots_per_allocator; + } mc->mc_alloc_throttle_enabled = zio_dva_throttle_enabled; - ASSERT3U(mc->mc_alloc_max_slots, <=, - max_queue_depth * rvd->vdev_children); - for (int c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; vdev_indirect_state_sync_verify(vd); @@ -8053,9 +8060,11 @@ spa_sync(spa_t *spa, uint64_t txg) dsl_pool_sync_done(dp, txg); - mutex_enter(&spa->spa_alloc_lock); - VERIFY0(avl_numnodes(&spa->spa_alloc_tree)); - mutex_exit(&spa->spa_alloc_lock); + for (int i = 0; i < spa->spa_alloc_count; i++) { *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:11:09 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CD64510B48E5; Wed, 3 Oct 2018 02:11:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8146B86366; Wed, 3 Oct 2018 02:11:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7B8F510818; Wed, 3 Oct 2018 02:11:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932B8Fn040883; Wed, 3 Oct 2018 02:11:08 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932B5HN040866; Wed, 3 Oct 2018 02:11:05 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030211.w932B5HN040866@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:11:05 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339106 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris... X-SVN-Commit-Revision: 339106 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:11:09 -0000 Author: mav Date: Wed Oct 3 02:11:05 2018 New Revision: 339106 URL: https://svnweb.freebsd.org/changeset/base/339106 Log: MFC r336951: MFV r336950: 9290 device removal reduces redundancy of mirrors Mirrors are supposed to provide redundancy in the face of whole-disk failure and silent damage (e.g. some data on disk is not right, but ZFS hasn't detected the whole device as being broken). However, the current device removal implementation bypasses some of the mirror's redundancy. illumos/illumos-gate@3a4b1be953ee5601bab748afa07c26ed4996cde6 Reviewed by: George Wilson Reviewed by: Prashanth Sreenivasa Reviewed by: Sara Hartse Reviewed by: Serapheim Dimitropoulos Reviewed by: Brian Behlendorf Reviewed by: Tim Chase Approved by: Richard Lowe Author: Matthew Ahrens Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:11:05 2018 (r339106) @@ -3033,7 +3033,7 @@ zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb) spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); spa_vdev_removal_t *svr = spa->spa_vdev_removal; - vdev_t *vd = svr->svr_vdev; + vdev_t *vd = vdev_lookup_top(spa, svr->svr_vdev_id); vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping; for (uint64_t msi = 0; msi < vd->vdev_ms_count; msi++) { @@ -3049,13 +3049,17 @@ zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb) svr->svr_allocd_segs, SM_ALLOC)); /* - * Clear everything past what has been synced, - * because we have not allocated mappings for it yet. + * Clear everything past what has been synced unless + * it's past the spacemap, because we have not allocated + * mappings for it yet. */ - range_tree_clear(svr->svr_allocd_segs, - vdev_indirect_mapping_max_offset(vim), - msp->ms_sm->sm_start + msp->ms_sm->sm_size - - vdev_indirect_mapping_max_offset(vim)); + uint64_t vim_max_offset = + vdev_indirect_mapping_max_offset(vim); + uint64_t sm_end = msp->ms_sm->sm_start + + msp->ms_sm->sm_size; + if (sm_end > vim_max_offset) + range_tree_clear(svr->svr_allocd_segs, + vim_max_offset, sm_end - vim_max_offset); } zcb->zcb_removing_size += Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:11:05 2018 (r339106) @@ -438,6 +438,7 @@ static ztest_ds_t *ztest_ds; static kmutex_t ztest_vdev_lock; static kmutex_t ztest_checkpoint_lock; +static boolean_t ztest_device_removal_active = B_FALSE; /* * The ztest_name_lock protects the pool and dataset namespace used by @@ -2882,7 +2883,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id) * value. Don't bother trying to attach while we are in the middle * of removal. */ - if (spa->spa_vdev_removal != NULL) { + if (ztest_device_removal_active) { spa_config_exit(spa, SCL_ALL, FTAG); mutex_exit(&ztest_vdev_lock); return; @@ -3057,16 +3058,49 @@ ztest_device_removal(ztest_ds_t *zd, uint64_t id) spa_t *spa = ztest_spa; vdev_t *vd; uint64_t guid; + int error; mutex_enter(&ztest_vdev_lock); + if (ztest_device_removal_active) { + mutex_exit(&ztest_vdev_lock); + return; + } + + /* + * Remove a random top-level vdev and wait for removal to finish. + */ spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER); vd = vdev_lookup_top(spa, ztest_random_vdev_top(spa, B_FALSE)); guid = vd->vdev_guid; spa_config_exit(spa, SCL_VDEV, FTAG); - (void) spa_vdev_remove(spa, guid, B_FALSE); + error = spa_vdev_remove(spa, guid, B_FALSE); + if (error == 0) { + ztest_device_removal_active = B_TRUE; + mutex_exit(&ztest_vdev_lock); + while (spa->spa_vdev_removal != NULL) + txg_wait_synced(spa_get_dsl(spa), 0); + } else { + mutex_exit(&ztest_vdev_lock); + return; + } + + /* + * The pool needs to be scrubbed after completing device removal. + * Failure to do so may result in checksum errors due to the + * strategy employed by ztest_fault_inject() when selecting which + * offset are redundant and can be damaged. + */ + error = spa_scan(spa, POOL_SCAN_SCRUB); + if (error == 0) { + while (dsl_scan_scrubbing(spa_get_dsl(spa))) + txg_wait_synced(spa_get_dsl(spa), 0); + } + + mutex_enter(&ztest_vdev_lock); + ztest_device_removal_active = B_FALSE; mutex_exit(&ztest_vdev_lock); } @@ -3205,7 +3239,7 @@ ztest_vdev_LUN_growth(ztest_ds_t *zd, uint64_t id) * that the metaslab_class space increased (because it decreases * when the device removal completes). */ - if (spa->spa_vdev_removal != NULL) { + if (ztest_device_removal_active) { spa_config_exit(spa, SCL_STATE, spa); mutex_exit(&ztest_vdev_lock); mutex_exit(&ztest_checkpoint_lock); @@ -4986,6 +5020,18 @@ ztest_fault_inject(ztest_ds_t *zd, uint64_t id) boolean_t islog = B_FALSE; mutex_enter(&ztest_vdev_lock); + + /* + * Device removal is in progress, fault injection must be disabled + * until it completes and the pool is scrubbed. The fault injection + * strategy for damaging blocks does not take in to account evacuated + * blocks which may have already been damaged. + */ + if (ztest_device_removal_active) { + mutex_exit(&ztest_vdev_lock); + return; + } + maxfaults = MAXFAULTS(); leaves = MAX(zs->zs_mirrors, 1) * ztest_opts.zo_raidz; mirror_save = zs->zs_mirrors; @@ -5330,6 +5376,12 @@ void ztest_scrub(ztest_ds_t *zd, uint64_t id) { spa_t *spa = ztest_spa; + + /* + * Scrub in progress by device removal. + */ + if (ztest_device_removal_active) + return; (void) spa_scan(spa, POOL_SCAN_SCRUB); (void) poll(NULL, 0, 100); /* wait a moment, then force a restart */ Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Oct 3 02:11:05 2018 (r339106) @@ -2836,7 +2836,7 @@ zpool_vdev_attach(zpool_handle_t *zhp, case EBUSY: zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, "%s is busy, " - "or pool has removing/removed vdevs"), + "or device removal is in progress"), new_disk); (void) zfs_error(hdl, EZFS_BADDEV, msg); break; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 02:11:05 2018 (r339106) @@ -2959,6 +2959,16 @@ dsl_scan_need_resilver(spa_t *spa, const dva_t *dva, s { vdev_t *vd; + if (vd->vdev_ops == &vdev_indirect_ops) { + /* + * The indirect vdev can point to multiple + * vdevs. For simplicity, always create + * the resilver zio_t. zio_vdev_io_start() + * will bypass the child resilver i/o's if + * they are on vdevs that don't have DTL's. + */ + return (B_TRUE); + } if (DVA_GET_GANG(dva)) { /* * Gang members may be spread across multiple Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:11:05 2018 (r339106) @@ -3596,7 +3596,7 @@ metaslab_free_impl(vdev_t *vd, uint64_t offset, uint64 return; if (spa->spa_vdev_removal != NULL && - spa->spa_vdev_removal->svr_vdev == vd && + spa->spa_vdev_removal->svr_vdev_id == vd->vdev_id && vdev_is_concrete(vd)) { /* * Note: we check if the vdev is concrete because when Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:11:05 2018 (r339106) @@ -5838,8 +5838,7 @@ spa_vdev_add(spa_t *spa, nvlist_t *nvroot) for (int c = 0; c < vd->vdev_children; c++) { tvd = vd->vdev_child[c]; if (spa->spa_vdev_removal != NULL && - tvd->vdev_ashift != - spa->spa_vdev_removal->svr_vdev->vdev_ashift) { + tvd->vdev_ashift != spa->spa_max_ashift) { return (spa_vdev_exit(spa, vd, txg, EINVAL)); } /* Fail if top level vdev is raidz */ @@ -5955,10 +5954,8 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *n return (spa_vdev_exit(spa, NULL, txg, error)); } - if (spa->spa_vdev_removal != NULL || - spa->spa_removing_phys.sr_prev_indirect_vdev != -1) { + if (spa->spa_vdev_removal != NULL) return (spa_vdev_exit(spa, NULL, txg, EBUSY)); - } if (oldvd == NULL) return (spa_vdev_exit(spa, NULL, txg, ENODEV)); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:11:05 2018 (r339106) @@ -1882,9 +1882,12 @@ spa_update_dspace(spa_t *spa) * allocated twice (on the old device and the new * device). */ - vdev_t *vd = spa->spa_vdev_removal->svr_vdev; + spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER); + vdev_t *vd = + vdev_lookup_top(spa, spa->spa_vdev_removal->svr_vdev_id); spa->spa_dspace -= spa_deflate(spa) ? vd->vdev_stat.vs_dspace : vd->vdev_stat.vs_space; + spa_config_exit(spa, SCL_VDEV, FTAG); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h Wed Oct 3 02:11:05 2018 (r339106) @@ -30,7 +30,7 @@ extern "C" { #endif typedef struct spa_vdev_removal { - vdev_t *svr_vdev; + uint64_t svr_vdev_id; uint64_t svr_max_offset_to_sync[TXG_SIZE]; /* Thread performing a vdev removal. */ kthread_t *svr_thread; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Oct 3 02:11:05 2018 (r339106) @@ -587,7 +587,7 @@ extern zio_t *zio_vdev_child_io(zio_t *zio, blkptr_t * zio_done_func_t *done, void *priv); extern zio_t *zio_vdev_delegated_io(vdev_t *vd, uint64_t offset, - struct abd *data, uint64_t size, int type, zio_priority_t priority, + struct abd *data, uint64_t size, zio_type_t type, zio_priority_t priority, enum zio_flag flags, zio_done_func_t *done, void *priv); extern void zio_vdev_io_bypass(zio_t *zio); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 02:11:05 2018 (r339106) @@ -988,6 +988,32 @@ vdev_top_transfer(vdev_t *svd, vdev_t *tvd) svd->vdev_stat.vs_space = 0; svd->vdev_stat.vs_dspace = 0; + /* + * State which may be set on a top-level vdev that's in the + * process of being removed. + */ + ASSERT0(tvd->vdev_indirect_config.vic_births_object); + ASSERT0(tvd->vdev_indirect_config.vic_mapping_object); + ASSERT3U(tvd->vdev_indirect_config.vic_prev_indirect_vdev, ==, -1ULL); + ASSERT3P(tvd->vdev_indirect_mapping, ==, NULL); + ASSERT3P(tvd->vdev_indirect_births, ==, NULL); + ASSERT3P(tvd->vdev_obsolete_sm, ==, NULL); + ASSERT0(tvd->vdev_removing); + tvd->vdev_removing = svd->vdev_removing; + tvd->vdev_indirect_config = svd->vdev_indirect_config; + tvd->vdev_indirect_mapping = svd->vdev_indirect_mapping; + tvd->vdev_indirect_births = svd->vdev_indirect_births; + range_tree_swap(&svd->vdev_obsolete_segments, + &tvd->vdev_obsolete_segments); + tvd->vdev_obsolete_sm = svd->vdev_obsolete_sm; + svd->vdev_indirect_config.vic_mapping_object = 0; + svd->vdev_indirect_config.vic_births_object = 0; + svd->vdev_indirect_config.vic_prev_indirect_vdev = -1ULL; + svd->vdev_indirect_mapping = NULL; + svd->vdev_indirect_births = NULL; + svd->vdev_obsolete_sm = NULL; + svd->vdev_removing = 0; + for (t = 0; t < TXG_SIZE; t++) { while ((msp = txg_list_remove(&svd->vdev_ms_list, t)) != NULL) (void) txg_list_add(&tvd->vdev_ms_list, msp, t); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c Wed Oct 3 02:11:05 2018 (r339106) @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -46,10 +47,11 @@ * "vdev_remap" operation that executes a callback on each contiguous * segment of the new location. This function is used in multiple ways: * - * - reads and repair writes to this device use the callback to create - * a child io for each mapped segment. + * - i/os to this vdev use the callback to determine where the + * data is now located, and issue child i/os for each segment's new + * location. * - * - frees and claims to this device use the callback to free or claim + * - frees and claims to this vdev use the callback to free or claim * each mapped segment. (Note that we don't actually need to claim * log blocks on indirect vdevs, because we don't allocate to * removing vdevs. However, zdb uses zio_claim() for its leak @@ -204,6 +206,94 @@ uint64_t zfs_condense_min_mapping_bytes = 128 * 1024; int zfs_condense_indirect_commit_entry_delay_ticks = 0; /* + * If a split block contains more than this many segments, consider it too + * computationally expensive to check all (2^num_segments) possible + * combinations. Instead, try at most 2^_segments_max randomly-selected + * combinations. + * + * This is reasonable if only a few segment copies are damaged and the + * majority of segment copies are good. This allows all the segment copies to + * participate fairly in the reconstruction and prevents the repeated use of + * one bad copy. + */ +int zfs_reconstruct_indirect_segments_max = 10; + +/* + * The indirect_child_t represents the vdev that we will read from, when we + * need to read all copies of the data (e.g. for scrub or reconstruction). + * For plain (non-mirror) top-level vdevs (i.e. is_vdev is not a mirror), + * ic_vdev is the same as is_vdev. However, for mirror top-level vdevs, + * ic_vdev is a child of the mirror. + */ +typedef struct indirect_child { + abd_t *ic_data; + vdev_t *ic_vdev; +} indirect_child_t; + +/* + * The indirect_split_t represents one mapped segment of an i/o to the + * indirect vdev. For non-split (contiguously-mapped) blocks, there will be + * only one indirect_split_t, with is_split_offset==0 and is_size==io_size. + * For split blocks, there will be several of these. + */ +typedef struct indirect_split { + list_node_t is_node; /* link on iv_splits */ + + /* + * is_split_offset is the offset into the i/o. + * This is the sum of the previous splits' is_size's. + */ + uint64_t is_split_offset; + + vdev_t *is_vdev; /* top-level vdev */ + uint64_t is_target_offset; /* offset on is_vdev */ + uint64_t is_size; + int is_children; /* number of entries in is_child[] */ + + /* + * is_good_child is the child that we are currently using to + * attempt reconstruction. + */ + int is_good_child; + + indirect_child_t is_child[1]; /* variable-length */ +} indirect_split_t; + +/* + * The indirect_vsd_t is associated with each i/o to the indirect vdev. + * It is the "Vdev-Specific Data" in the zio_t's io_vsd. + */ +typedef struct indirect_vsd { + boolean_t iv_split_block; + boolean_t iv_reconstruct; + + list_t iv_splits; /* list of indirect_split_t's */ +} indirect_vsd_t; + +static void +vdev_indirect_map_free(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + + indirect_split_t *is; + while ((is = list_head(&iv->iv_splits)) != NULL) { + for (int c = 0; c < is->is_children; c++) { + indirect_child_t *ic = &is->is_child[c]; + if (ic->ic_data != NULL) + abd_free(ic->ic_data); + } + list_remove(&iv->iv_splits, is); + kmem_free(is, + offsetof(indirect_split_t, is_child[is->is_children])); + } + kmem_free(iv, sizeof (*iv)); +} + +static const zio_vsd_ops_t vdev_indirect_vsd_ops = { + vdev_indirect_map_free, + zio_vsd_default_cksum_report +}; +/* * Mark the given offset and size as being obsolete. */ void @@ -818,12 +908,6 @@ vdev_indirect_close(vdev_t *vd) } /* ARGSUSED */ -static void -vdev_indirect_io_done(zio_t *zio) -{ -} - -/* ARGSUSED */ static int vdev_indirect_open(vdev_t *vd, uint64_t *psize, uint64_t *max_psize, uint64_t *logical_ashift, uint64_t *physical_ashift) @@ -1067,39 +1151,473 @@ vdev_indirect_child_io_done(zio_t *zio) abd_put(zio->io_abd); } +/* + * This is a callback for vdev_indirect_remap() which allocates an + * indirect_split_t for each split segment and adds it to iv_splits. + */ static void -vdev_indirect_io_start_cb(uint64_t split_offset, vdev_t *vd, uint64_t offset, +vdev_indirect_gather_splits(uint64_t split_offset, vdev_t *vd, uint64_t offset, uint64_t size, void *arg) { zio_t *zio = arg; + indirect_vsd_t *iv = zio->io_vsd; ASSERT3P(vd, !=, NULL); if (vd->vdev_ops == &vdev_indirect_ops) return; - zio_nowait(zio_vdev_child_io(zio, NULL, vd, offset, - abd_get_offset(zio->io_abd, split_offset), - size, zio->io_type, zio->io_priority, - 0, vdev_indirect_child_io_done, zio)); + int n = 1; + if (vd->vdev_ops == &vdev_mirror_ops) + n = vd->vdev_children; + + indirect_split_t *is = + kmem_zalloc(offsetof(indirect_split_t, is_child[n]), KM_SLEEP); + + is->is_children = n; + is->is_size = size; + is->is_split_offset = split_offset; + is->is_target_offset = offset; + is->is_vdev = vd; + + /* + * Note that we only consider multiple copies of the data for + * *mirror* vdevs. We don't for "replacing" or "spare" vdevs, even + * though they use the same ops as mirror, because there's only one + * "good" copy under the replacing/spare. + */ + if (vd->vdev_ops == &vdev_mirror_ops) { + for (int i = 0; i < n; i++) { + is->is_child[i].ic_vdev = vd->vdev_child[i]; + } + } else { + is->is_child[0].ic_vdev = vd; + } + + list_insert_tail(&iv->iv_splits, is); } static void +vdev_indirect_read_split_done(zio_t *zio) +{ + indirect_child_t *ic = zio->io_private; + + if (zio->io_error != 0) { + /* + * Clear ic_data to indicate that we do not have data for this + * child. + */ + abd_free(ic->ic_data); + ic->ic_data = NULL; + } +} + +/* + * Issue reads for all copies (mirror children) of all splits. + */ +static void +vdev_indirect_read_all(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + for (int i = 0; i < is->is_children; i++) { + indirect_child_t *ic = &is->is_child[i]; + + if (!vdev_readable(ic->ic_vdev)) + continue; + + /* + * Note, we may read from a child whose DTL + * indicates that the data may not be present here. + * While this might result in a few i/os that will + * likely return incorrect data, it simplifies the + * code since we can treat scrub and resilver + * identically. (The incorrect data will be + * detected and ignored when we verify the + * checksum.) + */ + + ic->ic_data = abd_alloc_sametype(zio->io_abd, + is->is_size); + + zio_nowait(zio_vdev_child_io(zio, NULL, + ic->ic_vdev, is->is_target_offset, ic->ic_data, + is->is_size, zio->io_type, zio->io_priority, 0, + vdev_indirect_read_split_done, ic)); + } + } + iv->iv_reconstruct = B_TRUE; +} + +static void vdev_indirect_io_start(zio_t *zio) { spa_t *spa = zio->io_spa; + indirect_vsd_t *iv = kmem_zalloc(sizeof (*iv), KM_SLEEP); + list_create(&iv->iv_splits, + sizeof (indirect_split_t), offsetof(indirect_split_t, is_node)); + zio->io_vsd = iv; + zio->io_vsd_ops = &vdev_indirect_vsd_ops; + ASSERT(spa_config_held(spa, SCL_ALL, RW_READER) != 0); if (zio->io_type != ZIO_TYPE_READ) { ASSERT3U(zio->io_type, ==, ZIO_TYPE_WRITE); - ASSERT((zio->io_flags & - (ZIO_FLAG_SELF_HEAL | ZIO_FLAG_INDUCE_DAMAGE)) != 0); + /* + * Note: this code can handle other kinds of writes, + * but we don't expect them. + */ + ASSERT((zio->io_flags & (ZIO_FLAG_SELF_HEAL | + ZIO_FLAG_RESILVER | ZIO_FLAG_INDUCE_DAMAGE)) != 0); } vdev_indirect_remap(zio->io_vd, zio->io_offset, zio->io_size, - vdev_indirect_io_start_cb, zio); + vdev_indirect_gather_splits, zio); + indirect_split_t *first = list_head(&iv->iv_splits); + if (first->is_size == zio->io_size) { + /* + * This is not a split block; we are pointing to the entire + * data, which will checksum the same as the original data. + * Pass the BP down so that the child i/o can verify the + * checksum, and try a different location if available + * (e.g. on a mirror). + * + * While this special case could be handled the same as the + * general (split block) case, doing it this way ensures + * that the vast majority of blocks on indirect vdevs + * (which are not split) are handled identically to blocks + * on non-indirect vdevs. This allows us to be less strict + * about performance in the general (but rare) case. + */ + ASSERT0(first->is_split_offset); + ASSERT3P(list_next(&iv->iv_splits, first), ==, NULL); + zio_nowait(zio_vdev_child_io(zio, zio->io_bp, + first->is_vdev, first->is_target_offset, + abd_get_offset(zio->io_abd, 0), + zio->io_size, zio->io_type, zio->io_priority, 0, + vdev_indirect_child_io_done, zio)); + } else { + iv->iv_split_block = B_TRUE; + if (zio->io_flags & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER)) { + /* + * Read all copies. Note that for simplicity, + * we don't bother consulting the DTL in the + * resilver case. + */ + vdev_indirect_read_all(zio); + } else { + /* + * Read one copy of each split segment, from the + * top-level vdev. Since we don't know the + * checksum of each split individually, the child + * zio can't ensure that we get the right data. + * E.g. if it's a mirror, it will just read from a + * random (healthy) leaf vdev. We have to verify + * the checksum in vdev_indirect_io_done(). + */ + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + zio_nowait(zio_vdev_child_io(zio, NULL, + is->is_vdev, is->is_target_offset, + abd_get_offset(zio->io_abd, + is->is_split_offset), + is->is_size, zio->io_type, + zio->io_priority, 0, + vdev_indirect_child_io_done, zio)); + } + } + } + zio_execute(zio); +} + +/* + * Report a checksum error for a child. + */ +static void +vdev_indirect_checksum_error(zio_t *zio, + indirect_split_t *is, indirect_child_t *ic) +{ + vdev_t *vd = ic->ic_vdev; + + if (zio->io_flags & ZIO_FLAG_SPECULATIVE) + return; + + mutex_enter(&vd->vdev_stat_lock); + vd->vdev_stat.vs_checksum_errors++; + mutex_exit(&vd->vdev_stat_lock); + + zio_bad_cksum_t zbc = { 0 }; + void *bad_buf = abd_borrow_buf_copy(ic->ic_data, is->is_size); + abd_t *good_abd = is->is_child[is->is_good_child].ic_data; + void *good_buf = abd_borrow_buf_copy(good_abd, is->is_size); + zfs_ereport_post_checksum(zio->io_spa, vd, zio, + is->is_target_offset, is->is_size, good_buf, bad_buf, &zbc); + abd_return_buf(ic->ic_data, bad_buf, is->is_size); + abd_return_buf(good_abd, good_buf, is->is_size); +} + +/* + * Issue repair i/os for any incorrect copies. We do this by comparing + * each split segment's correct data (is_good_child's ic_data) with each + * other copy of the data. If they differ, then we overwrite the bad data + * with the good copy. Note that we do this without regard for the DTL's, + * which simplifies this code and also issues the optimal number of writes + * (based on which copies actually read bad data, as opposed to which we + * think might be wrong). For the same reason, we always use + * ZIO_FLAG_SELF_HEAL, to bypass the DTL check in zio_vdev_io_start(). + */ +static void +vdev_indirect_repair(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + + enum zio_flag flags = ZIO_FLAG_IO_REPAIR; + + if (!(zio->io_flags & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER))) + flags |= ZIO_FLAG_SELF_HEAL; + + if (!spa_writeable(zio->io_spa)) + return; + + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + indirect_child_t *good_child = &is->is_child[is->is_good_child]; + + for (int c = 0; c < is->is_children; c++) { + indirect_child_t *ic = &is->is_child[c]; + if (ic == good_child) + continue; + if (ic->ic_data == NULL) + continue; + if (abd_cmp(good_child->ic_data, ic->ic_data, + is->is_size) == 0) + continue; + + zio_nowait(zio_vdev_child_io(zio, NULL, + ic->ic_vdev, is->is_target_offset, + good_child->ic_data, is->is_size, + ZIO_TYPE_WRITE, ZIO_PRIORITY_ASYNC_WRITE, + ZIO_FLAG_IO_REPAIR | ZIO_FLAG_SELF_HEAL, + NULL, NULL)); + + vdev_indirect_checksum_error(zio, is, ic); + } + } +} + +/* + * Report checksum errors on all children that we read from. + */ +static void +vdev_indirect_all_checksum_errors(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + + if (zio->io_flags & ZIO_FLAG_SPECULATIVE) + return; + + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + for (int c = 0; c < is->is_children; c++) { + indirect_child_t *ic = &is->is_child[c]; + + if (ic->ic_data == NULL) + continue; + + vdev_t *vd = ic->ic_vdev; + + mutex_enter(&vd->vdev_stat_lock); + vd->vdev_stat.vs_checksum_errors++; + mutex_exit(&vd->vdev_stat_lock); + + zfs_ereport_post_checksum(zio->io_spa, vd, zio, + is->is_target_offset, is->is_size, + NULL, NULL, NULL); + } + } +} + +/* + * This function is called when we have read all copies of the data and need + * to try to find a combination of copies that gives us the right checksum. + * + * If we pointed to any mirror vdevs, this effectively does the job of the + * mirror. The mirror vdev code can't do its own job because we don't know + * the checksum of each split segment individually. We have to try every + * combination of copies of split segments, until we find one that checksums + * correctly. (Or until we have tried all combinations, or have tried + * 2^zfs_reconstruct_indirect_segments_max combinations. In these cases we + * set io_error to ECKSUM to propagate the error up to the user.) + * + * For example, if we have 3 segments in the split, + * and each points to a 2-way mirror, we will have the following pieces of + * data: + * + * | mirror child + * split | [0] [1] + * ======|===================== + * A | data_A_0 data_A_1 + * B | data_B_0 data_B_1 + * C | data_C_0 data_C_1 + * + * We will try the following (mirror children)^(number of splits) (2^3=8) + * combinations, which is similar to bitwise-little-endian counting in + * binary. In general each "digit" corresponds to a split segment, and the + * base of each digit is is_children, which can be different for each + * digit. + * + * "low bit" "high bit" + * v v + * data_A_0 data_B_0 data_C_0 + * data_A_1 data_B_0 data_C_0 + * data_A_0 data_B_1 data_C_0 + * data_A_1 data_B_1 data_C_0 + * data_A_0 data_B_0 data_C_1 + * data_A_1 data_B_0 data_C_1 + * data_A_0 data_B_1 data_C_1 + * data_A_1 data_B_1 data_C_1 + * + * Note that the split segments may be on the same or different top-level + * vdevs. In either case, we try lots of combinations (see + * zfs_reconstruct_indirect_segments_max). This ensures that if a mirror has + * small silent errors on all of its children, we can still reconstruct the + * correct data, as long as those errors are at sufficiently-separated + * offsets (specifically, separated by the largest block size - default of + * 128KB, but up to 16MB). + */ +static void +vdev_indirect_reconstruct_io_done(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + uint64_t attempts = 0; + uint64_t attempts_max = 1ULL << zfs_reconstruct_indirect_segments_max; + int segments = 0; + + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) + segments++; + + for (;;) { + /* copy data from splits to main zio */ + int ret; + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + + /* + * If this child failed, its ic_data will be NULL. + * Skip this combination. + */ + if (is->is_child[is->is_good_child].ic_data == NULL) { + ret = EIO; + goto next; + } + + abd_copy_off(zio->io_abd, + is->is_child[is->is_good_child].ic_data, + is->is_split_offset, 0, is->is_size); + } + + /* See if this checksum matches. */ + zio_bad_cksum_t zbc; + ret = zio_checksum_error(zio, &zbc); + if (ret == 0) { + /* Found a matching checksum. Issue repair i/os. */ + vdev_indirect_repair(zio); + zio_checksum_verified(zio); + return; + } + + /* + * Checksum failed; try a different combination of split + * children. + */ + boolean_t more; +next: + more = B_FALSE; + if (segments <= zfs_reconstruct_indirect_segments_max) { + /* + * There are relatively few segments, so + * deterministically check all combinations. We do + * this by by adding one to the first split's + * good_child. If it overflows, then "carry over" to + * the next split (like counting in base is_children, + * but each digit can have a different base). + */ + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + is->is_good_child++; + if (is->is_good_child < is->is_children) { + more = B_TRUE; + break; + } + is->is_good_child = 0; + } + } else if (++attempts < attempts_max) { + /* + * There are too many combinations to try all of them + * in a reasonable amount of time, so try a fixed + * number of random combinations, after which we'll + * consider the block unrecoverable. + */ + for (indirect_split_t *is = list_head(&iv->iv_splits); + is != NULL; is = list_next(&iv->iv_splits, is)) { + is->is_good_child = + spa_get_random(is->is_children); + } + more = B_TRUE; + } + if (!more) { + /* All combinations failed. */ + zio->io_error = ret; + vdev_indirect_all_checksum_errors(zio); + zio_checksum_verified(zio); + return; + } + } +} + +static void +vdev_indirect_io_done(zio_t *zio) +{ + indirect_vsd_t *iv = zio->io_vsd; + + if (iv->iv_reconstruct) { + /* + * We have read all copies of the data (e.g. from mirrors), + * either because this was a scrub/resilver, or because the + * one-copy read didn't checksum correctly. + */ + vdev_indirect_reconstruct_io_done(zio); + return; + } + + if (!iv->iv_split_block) { + /* + * This was not a split block, so we passed the BP down, + * and the checksum was handled by the (one) child zio. + */ + return; + } + + zio_bad_cksum_t zbc; + int ret = zio_checksum_error(zio, &zbc); + if (ret == 0) { + zio_checksum_verified(zio); + return; + } + + /* + * The checksum didn't match. Read all copies of all splits, and + * then we will try to reconstruct. The next time + * vdev_indirect_io_done() is called, iv_reconstruct will be set. + */ + vdev_indirect_read_all(zio); + + zio_vdev_io_redone(zio); } vdev_ops_t vdev_indirect_ops = { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c Wed Oct 3 02:11:05 2018 (r339106) @@ -24,7 +24,7 @@ */ /* - * Copyright (c) 2012, 2015 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. */ #include @@ -516,13 +516,16 @@ vdev_mirror_io_start(zio_t *zio) } if (zio->io_type == ZIO_TYPE_READ) { - if ((zio->io_flags & ZIO_FLAG_SCRUB) && !mm->mm_resilvering && + if (zio->io_bp != NULL && + (zio->io_flags & ZIO_FLAG_SCRUB) && !mm->mm_resilvering && mm->mm_children > 1) { /* - * For scrubbing reads we need to allocate a read - * buffer for each child and issue reads to all - * children. If any child succeeds, it will copy its - * data into zio->io_data in vdev_mirror_scrub_done. + * For scrubbing reads (if we can verify the + * checksum here, as indicated by io_bp being + * non-NULL) we need to allocate a read buffer for + * each child and issue reads to all children. If + * any child succeeds, it will copy its data into + * zio->io_data in vdev_mirror_scrub_done. */ for (c = 0; c < mm->mm_children; c++) { mc = &mm->mm_child[c]; @@ -677,7 +680,21 @@ vdev_mirror_io_done(zio_t *zio) if (mc->mc_error == 0) { if (mc->mc_tried) continue; + /* + * We didn't try this child. We need to + * repair it if: + * 1. it's a scrub (in which case we have + * tried everything that was healthy) + * - or - + * 2. it's an indirect vdev (in which case + * it could point to any other vdev, which + * might have a bad DTL) + * - or - + * 3. the DTL indicates that this data is + * missing from this vdev + */ if (!(zio->io_flags & ZIO_FLAG_SCRUB) && + mc->mc_vd->vdev_ops != &vdev_indirect_ops && !vdev_dtl_contains(mc->mc_vd, DTL_PARTIAL, zio->io_txg, 1)) continue; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c Wed Oct 3 02:10:23 2018 (r339105) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c Wed Oct 3 02:11:05 2018 (r339106) @@ -83,18 +83,12 @@ typedef struct vdev_copy_arg { kmutex_t vca_lock; } vdev_copy_arg_t; -typedef struct vdev_copy_seg_arg { - vdev_copy_arg_t *vcsa_copy_arg; - uint64_t vcsa_txg; - dva_t *vcsa_dest_dva; - blkptr_t *vcsa_dest_bp; -} vdev_copy_seg_arg_t; - /* - * The maximum amount of allowed data we're allowed to copy from a device - * at a time when removing it. + * The maximum amount of memory we can use for outstanding i/o while + * doing a device removal. This determines how much i/o we can have + * in flight concurrently. */ -int zfs_remove_max_copy_bytes = 8 * 1024 * 1024; +int zfs_remove_max_copy_bytes = 64 * 1024 * 1024; /* * The largest contiguous segment that we will attempt to allocate when @@ -176,7 +170,7 @@ spa_vdev_removal_create(vdev_t *vd) mutex_init(&svr->svr_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&svr->svr_cv, NULL, CV_DEFAULT, NULL); svr->svr_allocd_segs = range_tree_create(NULL, NULL); - svr->svr_vdev = vd; + svr->svr_vdev_id = vd->vdev_id; for (int i = 0; i < TXG_SIZE; i++) { svr->svr_frees[i] = range_tree_create(NULL, NULL); @@ -218,9 +212,10 @@ spa_vdev_removal_destroy(spa_vdev_removal_t *svr) static void vdev_remove_initiate_sync(void *arg, dmu_tx_t *tx) { - vdev_t *vd = arg; + int vdev_id = (uintptr_t)arg; + spa_t *spa = dmu_tx_pool(tx)->dp_spa; + vdev_t *vd = vdev_lookup_top(spa, vdev_id); vdev_indirect_config_t *vic = &vd->vdev_indirect_config; - spa_t *spa = vd->vdev_spa; objset_t *mos = spa->spa_dsl_pool->dp_meta_objset; spa_vdev_removal_t *svr = NULL; *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:12:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C78D610B4B0B; Wed, 3 Oct 2018 02:12:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7CFD4865CB; Wed, 3 Oct 2018 02:12:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 77D1B1097E; Wed, 3 Oct 2018 02:12:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932CPgF044910; Wed, 3 Oct 2018 02:12:25 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932CP3C044909; Wed, 3 Oct 2018 02:12:25 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030212.w932CP3C044909@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:12:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339107 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339107 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:12:26 -0000 Author: mav Date: Wed Oct 3 02:12:24 2018 New Revision: 339107 URL: https://svnweb.freebsd.org/changeset/base/339107 Log: MFC r336954: MFV r336952: 9192 explicitly pass good_writes to vdev_uberblock/label_sync Currently vdev_label_sync and vdev_uberblock_sync take a zio_t and assume that its io_private is a pointer to the good_writes count. They should instead accept this argument explicitly. illumos/illumos-gate@a3b5583021b7b45676bf1f0cc68adf7a97900b56 Reviewed by: Pavel Zakharov Reviewed by: George Wilson Approved by: Richard Lowe Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 02:11:05 2018 (r339106) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 02:12:24 2018 (r339107) @@ -1175,10 +1175,13 @@ vdev_uberblock_sync_done(zio_t *zio) * Write the uberblock to all labels of all leaves of the specified vdev. */ static void -vdev_uberblock_sync(zio_t *zio, uberblock_t *ub, vdev_t *vd, int flags) +vdev_uberblock_sync(zio_t *zio, uint64_t *good_writes, + uberblock_t *ub, vdev_t *vd, int flags) { - for (uint64_t c = 0; c < vd->vdev_children; c++) - vdev_uberblock_sync(zio, ub, vd->vdev_child[c], flags); + for (uint64_t c = 0; c < vd->vdev_children; c++) { + vdev_uberblock_sync(zio, good_writes, + ub, vd->vdev_child[c], flags); + } if (!vd->vdev_ops->vdev_op_leaf) return; @@ -1196,7 +1199,7 @@ vdev_uberblock_sync(zio_t *zio, uberblock_t *ub, vdev_ for (int l = 0; l < VDEV_LABELS; l++) vdev_label_write(zio, vd, l, ub_abd, VDEV_UBERBLOCK_OFFSET(vd, n), VDEV_UBERBLOCK_SIZE(vd), - vdev_uberblock_sync_done, zio->io_private, + vdev_uberblock_sync_done, good_writes, flags | ZIO_FLAG_DONT_PROPAGATE); abd_free(ub_abd); @@ -1210,10 +1213,10 @@ vdev_uberblock_sync_list(vdev_t **svd, int svdcount, u zio_t *zio; uint64_t good_writes = 0; - zio = zio_root(spa, NULL, &good_writes, flags); + zio = zio_root(spa, NULL, NULL, flags); for (int v = 0; v < svdcount; v++) - vdev_uberblock_sync(zio, ub, svd[v], flags); + vdev_uberblock_sync(zio, &good_writes, ub, svd[v], flags); (void) zio_wait(zio); @@ -1274,7 +1277,8 @@ vdev_label_sync_ignore_done(zio_t *zio) * Write all even or odd labels to all leaves of the specified vdev. */ static void -vdev_label_sync(zio_t *zio, vdev_t *vd, int l, uint64_t txg, int flags) +vdev_label_sync(zio_t *zio, uint64_t *good_writes, + vdev_t *vd, int l, uint64_t txg, int flags) { nvlist_t *label; vdev_phys_t *vp; @@ -1282,8 +1286,10 @@ vdev_label_sync(zio_t *zio, vdev_t *vd, int l, uint64_ char *buf; size_t buflen; - for (int c = 0; c < vd->vdev_children; c++) - vdev_label_sync(zio, vd->vdev_child[c], l, txg, flags); + for (int c = 0; c < vd->vdev_children; c++) { + vdev_label_sync(zio, good_writes, + vd->vdev_child[c], l, txg, flags); + } if (!vd->vdev_ops->vdev_op_leaf) return; @@ -1308,7 +1314,7 @@ vdev_label_sync(zio_t *zio, vdev_t *vd, int l, uint64_ vdev_label_write(zio, vd, l, vp_abd, offsetof(vdev_label_t, vl_vdev_phys), sizeof (vdev_phys_t), - vdev_label_sync_done, zio->io_private, + vdev_label_sync_done, good_writes, flags | ZIO_FLAG_DONT_PROPAGATE); } } @@ -1340,7 +1346,7 @@ vdev_label_sync_list(spa_t *spa, int l, uint64_t txg, (vd->vdev_islog || vd->vdev_aux != NULL) ? vdev_label_sync_ignore_done : vdev_label_sync_top_done, good_writes, flags); - vdev_label_sync(vio, vd, l, txg, flags); + vdev_label_sync(vio, good_writes, vd, l, txg, flags); zio_nowait(vio); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:13:07 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 079F310B4B84; Wed, 3 Oct 2018 02:13:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AFD3D8683E; Wed, 3 Oct 2018 02:13:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id AABDC1098A; Wed, 3 Oct 2018 02:13:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932D6vL045704; Wed, 3 Oct 2018 02:13:06 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932D5Wp045694; Wed, 3 Oct 2018 02:13:05 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030213.w932D5Wp045694@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:13:05 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339108 - in stable/11: cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Commit-Revision: 339108 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:13:07 -0000 Author: mav Date: Wed Oct 3 02:13:04 2018 New Revision: 339108 URL: https://svnweb.freebsd.org/changeset/base/339108 Log: MFC r336956: MFV r336955: 9236 nuke spa_dbgmsg We should use zfs_dbgmsg instead of spa_dbgmsg. Or at least, metaslab_condense() should call zfs_dbgmsg because it's important and rare enough to always log. It's possible that the message in zio_dva_allocate() would be too high-frequency for zfs_dbgmsg. illumos/illumos-gate@21f7c81cc1156e9202ce3412d3ecaa697c3b2222 Reviewed by: Serapheim Dimitropoulos Reviewed by: Pavel Zakharov Reviewed by: George Wilson Reviewed by: Richard Elling Approved by: Richard Lowe Author: Matthew Ahrens Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_debug.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:13:04 2018 (r339108) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2012 Martin Matuska . All rights reserved. * Copyright (c) 2013 Steven Hartland. All rights reserved. @@ -5921,7 +5921,6 @@ ztest_run(ztest_shared_t *zs) */ kernel_init(FREAD | FWRITE); VERIFY0(spa_open(ztest_opts.zo_pool, &spa, FTAG)); - spa->spa_debug = B_TRUE; metaslab_preload_limit = ztest_random(20) + 1; ztest_spa = spa; @@ -6078,7 +6077,6 @@ ztest_freeze(void) kernel_init(FREAD | FWRITE); VERIFY3U(0, ==, spa_open(ztest_opts.zo_pool, &spa, FTAG)); VERIFY3U(0, ==, ztest_dataset_open(0)); - spa->spa_debug = B_TRUE; ztest_spa = spa; /* @@ -6149,7 +6147,6 @@ ztest_freeze(void) VERIFY3U(0, ==, ztest_dataset_open(0)); ztest_dataset_close(0); - spa->spa_debug = B_TRUE; ztest_spa = spa; txg_wait_synced(spa_get_dsl(spa), 0); ztest_reguid(NULL, 0); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:13:04 2018 (r339108) @@ -1747,7 +1747,7 @@ metaslab_set_fragmentation(metaslab_t *msp) if (spa_writeable(spa) && txg < spa_final_dirty_txg(spa)) { msp->ms_condense_wanted = B_TRUE; vdev_dirty(vd, VDD_METASLAB, msp, txg + 1); - spa_dbgmsg(spa, "txg %llu, requesting force condense: " + zfs_dbgmsg("txg %llu, requesting force condense: " "ms_id %llu, vdev_id %llu", txg, msp->ms_id, vd->vdev_id); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:13:04 2018 (r339108) @@ -252,7 +252,7 @@ int spa_mode_global; * Everything except dprintf, spa, and indirect_remap is on by default * in debug builds. */ -int zfs_flags = ~(ZFS_DEBUG_DPRINTF | ZFS_DEBUG_SPA | ZFS_DEBUG_INDIRECT_REMAP); +int zfs_flags = ~(ZFS_DEBUG_DPRINTF | ZFS_DEBUG_INDIRECT_REMAP); #else int zfs_flags = 0; #endif @@ -825,8 +825,6 @@ spa_add(const char *name, nvlist_t *config, const char KM_SLEEP) == 0); } - spa->spa_debug = ((zfs_flags & ZFS_DEBUG_SPA) != 0); - spa->spa_min_ashift = INT_MAX; spa->spa_max_ashift = 0; @@ -2282,12 +2280,6 @@ spa_scan_get_stats(spa_t *spa, pool_scan_stat_t *ps) ps->pss_pass_scrub_spent_paused = spa->spa_scan_pass_scrub_spent_paused; return (0); -} - -boolean_t -spa_debug_enabled(spa_t *spa) -{ - return (spa->spa_debug); } int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Oct 3 02:13:04 2018 (r339108) @@ -947,13 +947,6 @@ _NOTE(CONSTCOND) } while (0) #define dprintf_bp(bp, fmt, ...) #endif -extern boolean_t spa_debug_enabled(spa_t *spa); -#define spa_dbgmsg(spa, ...) \ -{ \ - if (spa_debug_enabled(spa)) \ - zfs_dbgmsg(__VA_ARGS__); \ -} - extern int spa_mode_global; /* mode, e.g. FREAD | FWRITE */ #ifdef __cplusplus Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Oct 3 02:13:04 2018 (r339108) @@ -332,7 +332,6 @@ struct spa { kcondvar_t spa_suspend_cv; /* notification of resume */ uint8_t spa_suspended; /* pool is suspended */ uint8_t spa_claiming; /* pool is doing zil_claim() */ - boolean_t spa_debug; /* debug enabled? */ boolean_t spa_is_root; /* pool is root */ int spa_minref; /* num refs when first opened */ int spa_mode; /* FREAD | FWRITE */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_debug.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_debug.h Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_debug.h Wed Oct 3 02:13:04 2018 (r339108) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. */ #ifndef _SYS_ZFS_DEBUG_H @@ -57,7 +57,7 @@ extern boolean_t zfs_free_leak_on_eio; #define ZFS_DEBUG_DNODE_VERIFY (1 << 2) #define ZFS_DEBUG_SNAPNAMES (1 << 3) #define ZFS_DEBUG_MODIFY (1 << 4) -#define ZFS_DEBUG_SPA (1 << 5) +/* 1<<5 was previously used, try not to reuse */ #define ZFS_DEBUG_ZIO_FREE (1 << 6) #define ZFS_DEBUG_HISTOGRAM_VERIFY (1 << 7) #define ZFS_DEBUG_METASLAB_VERIFY (1 << 8) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 02:12:24 2018 (r339107) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 02:13:04 2018 (r339108) @@ -3037,7 +3037,7 @@ zio_dva_allocate(zio_t *zio) &zio->io_alloc_list, zio, zio->io_allocator); if (error != 0) { - spa_dbgmsg(spa, "%s: metaslab allocation failure: zio %p, " + zfs_dbgmsg("%s: metaslab allocation failure: zio %p, " "size %llu, error %d", spa_name(spa), zio, zio->io_size, error); if (error == ENOSPC && zio->io_size > SPA_MINBLOCKSIZE) From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:13:45 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DE2DF10B4C05; Wed, 3 Oct 2018 02:13:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8DFB786988; Wed, 3 Oct 2018 02:13:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 88A4D1098C; Wed, 3 Oct 2018 02:13:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932Dimi045788; Wed, 3 Oct 2018 02:13:44 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932Dg5g045781; Wed, 3 Oct 2018 02:13:42 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030213.w932Dg5g045781@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:13:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339109 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339109 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:13:45 -0000 Author: mav Date: Wed Oct 3 02:13:42 2018 New Revision: 339109 URL: https://svnweb.freebsd.org/changeset/base/339109 Log: MFC r336959: MFV r336958: 9337 zfs get all is slow due to uncached metadata This project's goal is to make read-heavy channel programs and zfs(1m) administrative commands faster by caching all the metadata that they will need in the dbuf layer. This will prevent the data from being evicted, so that any future call to i.e. zfs get all won't have to go to disk (very much). illumos/illumos-gate@adb52d9262f45a04318fc6e188fe2b7f59d989a5 Reviewed by: Prakash Surya Reviewed by: George Wilson Reviewed by: Thomas Caputi Approved by: Richard Lowe Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:13:42 2018 (r339109) @@ -49,6 +49,7 @@ #include #include #include +#include uint_t zfs_dbuf_evict_key; @@ -74,24 +75,58 @@ static kcondvar_t dbuf_evict_cv; static boolean_t dbuf_evict_thread_exit; /* - * LRU cache of dbufs. The dbuf cache maintains a list of dbufs that - * are not currently held but have been recently released. These dbufs - * are not eligible for arc eviction until they are aged out of the cache. - * Dbufs are added to the dbuf cache once the last hold is released. If a - * dbuf is later accessed and still exists in the dbuf cache, then it will - * be removed from the cache and later re-added to the head of the cache. - * Dbufs that are aged out of the cache will be immediately destroyed and - * become eligible for arc eviction. + * There are two dbuf caches; each dbuf can only be in one of them at a time. + * + * 1. Cache of metadata dbufs, to help make read-heavy administrative commands + * from /sbin/zfs run faster. The "metadata cache" specifically stores dbufs + * that represent the metadata that describes filesystems/snapshots/ + * bookmarks/properties/etc. We only evict from this cache when we export a + * pool, to short-circuit as much I/O as possible for all administrative + * commands that need the metadata. There is no eviction policy for this + * cache, because we try to only include types in it which would occupy a + * very small amount of space per object but create a large impact on the + * performance of these commands. Instead, after it reaches a maximum size + * (which should only happen on very small memory systems with a very large + * number of filesystem objects), we stop taking new dbufs into the + * metadata cache, instead putting them in the normal dbuf cache. + * + * 2. LRU cache of dbufs. The "dbuf cache" maintains a list of dbufs that + * are not currently held but have been recently released. These dbufs + * are not eligible for arc eviction until they are aged out of the cache. + * Dbufs that are aged out of the cache will be immediately destroyed and + * become eligible for arc eviction. + * + * Dbufs are added to these caches once the last hold is released. If a dbuf is + * later accessed and still exists in the dbuf cache, then it will be removed + * from the cache and later re-added to the head of the cache. + * + * If a given dbuf meets the requirements for the metadata cache, it will go + * there, otherwise it will be considered for the generic LRU dbuf cache. The + * caches and the refcounts tracking their sizes are stored in an array indexed + * by those caches' matching enum values (from dbuf_cached_state_t). */ -static multilist_t *dbuf_cache; -static refcount_t dbuf_cache_size; -uint64_t dbuf_cache_max_bytes = 0; +typedef struct dbuf_cache { + multilist_t *cache; + refcount_t size; +} dbuf_cache_t; +dbuf_cache_t dbuf_caches[DB_CACHE_MAX]; -/* Set the default size of the dbuf cache to log2 fraction of arc size. */ +/* Size limits for the caches */ +uint64_t dbuf_cache_max_bytes = 0; +uint64_t dbuf_metadata_cache_max_bytes = 0; +/* Set the default sizes of the caches to log2 fraction of arc size */ int dbuf_cache_shift = 5; +int dbuf_metadata_cache_shift = 6; /* - * The dbuf cache uses a three-stage eviction policy: + * For diagnostic purposes, this is incremented whenever we can't add + * something to the metadata cache because it's full, and instead put + * the data in the regular dbuf cache. + */ +uint64_t dbuf_metadata_cache_overflow; + +/* + * The LRU dbuf cache uses a three-stage eviction policy: * - A low water marker designates when the dbuf eviction thread * should stop evicting from the dbuf cache. * - When we reach the maximum size (aka mid water mark), we @@ -404,6 +439,41 @@ dbuf_is_metadata(dmu_buf_impl_t *db) } /* + * This returns whether this dbuf should be stored in the metadata cache, which + * is based on whether it's from one of the dnode types that store data related + * to traversing dataset hierarchies. + */ +static boolean_t +dbuf_include_in_metadata_cache(dmu_buf_impl_t *db) +{ + DB_DNODE_ENTER(db); + dmu_object_type_t type = DB_DNODE(db)->dn_type; + DB_DNODE_EXIT(db); + + /* Check if this dbuf is one of the types we care about */ + if (DMU_OT_IS_METADATA_CACHED(type)) { + /* If we hit this, then we set something up wrong in dmu_ot */ + ASSERT(DMU_OT_IS_METADATA(type)); + + /* + * Sanity check for small-memory systems: don't allocate too + * much memory for this purpose. + */ + if (refcount_count(&dbuf_caches[DB_DBUF_METADATA_CACHE].size) > + dbuf_metadata_cache_max_bytes) { + dbuf_metadata_cache_overflow++; + DTRACE_PROBE1(dbuf__metadata__cache__overflow, + dmu_buf_impl_t *, db); + return (B_FALSE); + } + + return (B_TRUE); + } + + return (B_FALSE); +} + +/* * This function *must* return indices evenly distributed between all * sublists of the multilist. This is needed due to how the dbuf eviction * code is laid out; dbuf_evict_thread() assumes dbufs are evenly @@ -438,7 +508,7 @@ dbuf_cache_above_hiwater(void) uint64_t dbuf_cache_hiwater_bytes = (dbuf_cache_max_bytes * dbuf_cache_hiwater_pct) / 100; - return (refcount_count(&dbuf_cache_size) > + return (refcount_count(&dbuf_caches[DB_DBUF_CACHE].size) > dbuf_cache_max_bytes + dbuf_cache_hiwater_bytes); } @@ -448,7 +518,7 @@ dbuf_cache_above_lowater(void) uint64_t dbuf_cache_lowater_bytes = (dbuf_cache_max_bytes * dbuf_cache_lowater_pct) / 100; - return (refcount_count(&dbuf_cache_size) > + return (refcount_count(&dbuf_caches[DB_DBUF_CACHE].size) > dbuf_cache_max_bytes - dbuf_cache_lowater_bytes); } @@ -458,8 +528,9 @@ dbuf_cache_above_lowater(void) static void dbuf_evict_one(void) { - int idx = multilist_get_random_index(dbuf_cache); - multilist_sublist_t *mls = multilist_sublist_lock(dbuf_cache, idx); + int idx = multilist_get_random_index(dbuf_caches[DB_DBUF_CACHE].cache); + multilist_sublist_t *mls = multilist_sublist_lock( + dbuf_caches[DB_DBUF_CACHE].cache, idx); ASSERT(!MUTEX_HELD(&dbuf_evict_lock)); @@ -482,8 +553,10 @@ dbuf_evict_one(void) if (db != NULL) { multilist_sublist_remove(mls, db); multilist_sublist_unlock(mls); - (void) refcount_remove_many(&dbuf_cache_size, + (void) refcount_remove_many(&dbuf_caches[DB_DBUF_CACHE].size, db->db.db_size, db); + ASSERT3U(db->db_caching_status, ==, DB_DBUF_CACHE); + db->db_caching_status = DB_NO_CACHE; dbuf_destroy(db); } else { multilist_sublist_unlock(mls); @@ -570,7 +643,8 @@ dbuf_evict_notify(void) * because it's OK to occasionally make the wrong decision here, * and grabbing the lock results in massive lock contention. */ - if (refcount_count(&dbuf_cache_size) > dbuf_cache_max_bytes) { + if (refcount_count(&dbuf_caches[DB_DBUF_CACHE].size) > + dbuf_cache_max_bytes) { if (dbuf_cache_above_hiwater()) dbuf_evict_one(); cv_signal(&dbuf_evict_cv); @@ -610,15 +684,21 @@ retry: mutex_init(&h->hash_mutexes[i], NULL, MUTEX_DEFAULT, NULL); /* - * Setup the parameters for the dbuf cache. We set the size of the - * dbuf cache to 1/32nd (default) of the size of the ARC. If the value - * has been set in /etc/system and it's not greater than the size of - * the ARC, then we honor that value. + * Setup the parameters for the dbuf caches. We set the sizes of the + * dbuf cache and the metadata cache to 1/32nd and 1/16th (default) + * of the size of the ARC, respectively. If the values are set in + * /etc/system and they're not greater than the size of the ARC, then + * we honor that value. */ if (dbuf_cache_max_bytes == 0 || dbuf_cache_max_bytes >= arc_max_bytes()) { dbuf_cache_max_bytes = arc_max_bytes() >> dbuf_cache_shift; } + if (dbuf_metadata_cache_max_bytes == 0 || + dbuf_metadata_cache_max_bytes >= arc_max_bytes()) { + dbuf_metadata_cache_max_bytes = + arc_max_bytes() >> dbuf_metadata_cache_shift; + } /* * All entries are queued via taskq_dispatch_ent(), so min/maxalloc @@ -626,10 +706,13 @@ retry: */ dbu_evict_taskq = taskq_create("dbu_evict", 1, minclsyspri, 0, 0, 0); - dbuf_cache = multilist_create(sizeof (dmu_buf_impl_t), - offsetof(dmu_buf_impl_t, db_cache_link), - dbuf_cache_multilist_index_func); - refcount_create(&dbuf_cache_size); + for (dbuf_cached_state_t dcs = 0; dcs < DB_CACHE_MAX; dcs++) { + dbuf_caches[dcs].cache = + multilist_create(sizeof (dmu_buf_impl_t), + offsetof(dmu_buf_impl_t, db_cache_link), + dbuf_cache_multilist_index_func); + refcount_create(&dbuf_caches[dcs].size); + } tsd_create(&zfs_dbuf_evict_key, NULL); dbuf_evict_thread_exit = B_FALSE; @@ -663,8 +746,10 @@ dbuf_fini(void) mutex_destroy(&dbuf_evict_lock); cv_destroy(&dbuf_evict_cv); - refcount_destroy(&dbuf_cache_size); - multilist_destroy(dbuf_cache); + for (dbuf_cached_state_t dcs = 0; dcs < DB_CACHE_MAX; dcs++) { + refcount_destroy(&dbuf_caches[dcs].size); + multilist_destroy(dbuf_caches[dcs].cache); + } } /* @@ -2051,9 +2136,15 @@ dbuf_destroy(dmu_buf_impl_t *db) dbuf_clear_data(db); if (multilist_link_active(&db->db_cache_link)) { - multilist_remove(dbuf_cache, db); - (void) refcount_remove_many(&dbuf_cache_size, + ASSERT(db->db_caching_status == DB_DBUF_CACHE || + db->db_caching_status == DB_DBUF_METADATA_CACHE); + + multilist_remove(dbuf_caches[db->db_caching_status].cache, db); + (void) refcount_remove_many( + &dbuf_caches[db->db_caching_status].size, db->db.db_size, db); + + db->db_caching_status = DB_NO_CACHE; } ASSERT(db->db_state == DB_UNCACHED || db->db_state == DB_NOFILL); @@ -2107,6 +2198,7 @@ dbuf_destroy(dmu_buf_impl_t *db) ASSERT(db->db_hash_next == NULL); ASSERT(db->db_blkptr == NULL); ASSERT(db->db_data_pending == NULL); + ASSERT3U(db->db_caching_status, ==, DB_NO_CACHE); ASSERT(!multilist_link_active(&db->db_cache_link)); kmem_cache_free(dbuf_kmem_cache, db); @@ -2245,6 +2337,7 @@ dbuf_create(dnode_t *dn, uint8_t level, uint64_t blkid ASSERT3U(db->db.db_size, >=, dn->dn_bonuslen); db->db.db_offset = DMU_BONUS_BLKID; db->db_state = DB_UNCACHED; + db->db_caching_status = DB_NO_CACHE; /* the bonus dbuf is not placed in the hash table */ arc_space_consume(sizeof (dmu_buf_impl_t), ARC_SPACE_OTHER); return (db); @@ -2277,6 +2370,7 @@ dbuf_create(dnode_t *dn, uint8_t level, uint64_t blkid avl_add(&dn->dn_dbufs, db); db->db_state = DB_UNCACHED; + db->db_caching_status = DB_NO_CACHE; mutex_exit(&dn->dn_dbufs_mtx); arc_space_consume(sizeof (dmu_buf_impl_t), ARC_SPACE_OTHER); @@ -2619,9 +2713,15 @@ top: if (multilist_link_active(&db->db_cache_link)) { ASSERT(refcount_is_zero(&db->db_holds)); - multilist_remove(dbuf_cache, db); - (void) refcount_remove_many(&dbuf_cache_size, + ASSERT(db->db_caching_status == DB_DBUF_CACHE || + db->db_caching_status == DB_DBUF_METADATA_CACHE); + + multilist_remove(dbuf_caches[db->db_caching_status].cache, db); + (void) refcount_remove_many( + &dbuf_caches[db->db_caching_status].size, db->db.db_size, db); + + db->db_caching_status = DB_NO_CACHE; } (void) refcount_add(&db->db_holds, tag); DBUF_VERIFY(db); @@ -2838,12 +2938,22 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag) db->db_pending_evict) { dbuf_destroy(db); } else if (!multilist_link_active(&db->db_cache_link)) { - multilist_insert(dbuf_cache, db); - (void) refcount_add_many(&dbuf_cache_size, + ASSERT3U(db->db_caching_status, ==, + DB_NO_CACHE); + + dbuf_cached_state_t dcs = + dbuf_include_in_metadata_cache(db) ? + DB_DBUF_METADATA_CACHE : DB_DBUF_CACHE; + db->db_caching_status = dcs; + + multilist_insert(dbuf_caches[dcs].cache, db); + (void) refcount_add_many(&dbuf_caches[dcs].size, db->db.db_size, db); mutex_exit(&db->db_mtx); - dbuf_evict_notify(); + if (db->db_caching_status == DB_DBUF_CACHE) { + dbuf_evict_notify(); + } } if (do_arc_evict) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Oct 3 02:13:42 2018 (r339109) @@ -79,60 +79,60 @@ SYSCTL_INT(_vfs_zfs, OID_AUTO, per_txg_dirty_frees_per int zfs_object_remap_one_indirect_delay_ticks = 0; const dmu_object_type_info_t dmu_ot[DMU_OT_NUMTYPES] = { - { DMU_BSWAP_UINT8, TRUE, "unallocated" }, - { DMU_BSWAP_ZAP, TRUE, "object directory" }, - { DMU_BSWAP_UINT64, TRUE, "object array" }, - { DMU_BSWAP_UINT8, TRUE, "packed nvlist" }, - { DMU_BSWAP_UINT64, TRUE, "packed nvlist size" }, - { DMU_BSWAP_UINT64, TRUE, "bpobj" }, - { DMU_BSWAP_UINT64, TRUE, "bpobj header" }, - { DMU_BSWAP_UINT64, TRUE, "SPA space map header" }, - { DMU_BSWAP_UINT64, TRUE, "SPA space map" }, - { DMU_BSWAP_UINT64, TRUE, "ZIL intent log" }, - { DMU_BSWAP_DNODE, TRUE, "DMU dnode" }, - { DMU_BSWAP_OBJSET, TRUE, "DMU objset" }, - { DMU_BSWAP_UINT64, TRUE, "DSL directory" }, - { DMU_BSWAP_ZAP, TRUE, "DSL directory child map"}, - { DMU_BSWAP_ZAP, TRUE, "DSL dataset snap map" }, - { DMU_BSWAP_ZAP, TRUE, "DSL props" }, - { DMU_BSWAP_UINT64, TRUE, "DSL dataset" }, - { DMU_BSWAP_ZNODE, TRUE, "ZFS znode" }, - { DMU_BSWAP_OLDACL, TRUE, "ZFS V0 ACL" }, - { DMU_BSWAP_UINT8, FALSE, "ZFS plain file" }, - { DMU_BSWAP_ZAP, TRUE, "ZFS directory" }, - { DMU_BSWAP_ZAP, TRUE, "ZFS master node" }, - { DMU_BSWAP_ZAP, TRUE, "ZFS delete queue" }, - { DMU_BSWAP_UINT8, FALSE, "zvol object" }, - { DMU_BSWAP_ZAP, TRUE, "zvol prop" }, - { DMU_BSWAP_UINT8, FALSE, "other uint8[]" }, - { DMU_BSWAP_UINT64, FALSE, "other uint64[]" }, - { DMU_BSWAP_ZAP, TRUE, "other ZAP" }, - { DMU_BSWAP_ZAP, TRUE, "persistent error log" }, - { DMU_BSWAP_UINT8, TRUE, "SPA history" }, - { DMU_BSWAP_UINT64, TRUE, "SPA history offsets" }, - { DMU_BSWAP_ZAP, TRUE, "Pool properties" }, - { DMU_BSWAP_ZAP, TRUE, "DSL permissions" }, - { DMU_BSWAP_ACL, TRUE, "ZFS ACL" }, - { DMU_BSWAP_UINT8, TRUE, "ZFS SYSACL" }, - { DMU_BSWAP_UINT8, TRUE, "FUID table" }, - { DMU_BSWAP_UINT64, TRUE, "FUID table size" }, - { DMU_BSWAP_ZAP, TRUE, "DSL dataset next clones"}, - { DMU_BSWAP_ZAP, TRUE, "scan work queue" }, - { DMU_BSWAP_ZAP, TRUE, "ZFS user/group used" }, - { DMU_BSWAP_ZAP, TRUE, "ZFS user/group quota" }, - { DMU_BSWAP_ZAP, TRUE, "snapshot refcount tags"}, - { DMU_BSWAP_ZAP, TRUE, "DDT ZAP algorithm" }, - { DMU_BSWAP_ZAP, TRUE, "DDT statistics" }, - { DMU_BSWAP_UINT8, TRUE, "System attributes" }, - { DMU_BSWAP_ZAP, TRUE, "SA master node" }, - { DMU_BSWAP_ZAP, TRUE, "SA attr registration" }, - { DMU_BSWAP_ZAP, TRUE, "SA attr layouts" }, - { DMU_BSWAP_ZAP, TRUE, "scan translations" }, - { DMU_BSWAP_UINT8, FALSE, "deduplicated block" }, - { DMU_BSWAP_ZAP, TRUE, "DSL deadlist map" }, - { DMU_BSWAP_UINT64, TRUE, "DSL deadlist map hdr" }, - { DMU_BSWAP_ZAP, TRUE, "DSL dir clones" }, - { DMU_BSWAP_UINT64, TRUE, "bpobj subobj" } + { DMU_BSWAP_UINT8, TRUE, FALSE, "unallocated" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "object directory" }, + { DMU_BSWAP_UINT64, TRUE, TRUE, "object array" }, + { DMU_BSWAP_UINT8, TRUE, FALSE, "packed nvlist" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "packed nvlist size" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "bpobj" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "bpobj header" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "SPA space map header" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "SPA space map" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "ZIL intent log" }, + { DMU_BSWAP_DNODE, TRUE, FALSE, "DMU dnode" }, + { DMU_BSWAP_OBJSET, TRUE, TRUE, "DMU objset" }, + { DMU_BSWAP_UINT64, TRUE, TRUE, "DSL directory" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL directory child map" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL dataset snap map" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL props" }, + { DMU_BSWAP_UINT64, TRUE, TRUE, "DSL dataset" }, + { DMU_BSWAP_ZNODE, TRUE, FALSE, "ZFS znode" }, + { DMU_BSWAP_OLDACL, TRUE, FALSE, "ZFS V0 ACL" }, + { DMU_BSWAP_UINT8, FALSE, FALSE, "ZFS plain file" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "ZFS directory" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "ZFS master node" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "ZFS delete queue" }, + { DMU_BSWAP_UINT8, FALSE, FALSE, "zvol object" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "zvol prop" }, + { DMU_BSWAP_UINT8, FALSE, FALSE, "other uint8[]" }, + { DMU_BSWAP_UINT64, FALSE, FALSE, "other uint64[]" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "other ZAP" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "persistent error log" }, + { DMU_BSWAP_UINT8, TRUE, FALSE, "SPA history" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "SPA history offsets" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "Pool properties" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL permissions" }, + { DMU_BSWAP_ACL, TRUE, FALSE, "ZFS ACL" }, + { DMU_BSWAP_UINT8, TRUE, FALSE, "ZFS SYSACL" }, + { DMU_BSWAP_UINT8, TRUE, FALSE, "FUID table" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "FUID table size" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL dataset next clones" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "scan work queue" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "ZFS user/group used" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "ZFS user/group quota" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "snapshot refcount tags" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "DDT ZAP algorithm" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "DDT statistics" }, + { DMU_BSWAP_UINT8, TRUE, FALSE, "System attributes" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "SA master node" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "SA attr registration" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "SA attr layouts" }, + { DMU_BSWAP_ZAP, TRUE, FALSE, "scan translations" }, + { DMU_BSWAP_UINT8, FALSE, FALSE, "deduplicated block" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL deadlist map" }, + { DMU_BSWAP_UINT64, TRUE, TRUE, "DSL deadlist map hdr" }, + { DMU_BSWAP_ZAP, TRUE, TRUE, "DSL dir clones" }, + { DMU_BSWAP_UINT64, TRUE, FALSE, "bpobj subobj" } }; const dmu_object_byteswap_info_t dmu_ot_byteswap[DMU_BSWAP_NUMFUNCS] = { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Oct 3 02:13:42 2018 (r339109) @@ -498,6 +498,14 @@ dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, bl os->os_primary_cache = ZFS_CACHE_ALL; os->os_secondary_cache = ZFS_CACHE_ALL; } + /* + * These properties will be filled in by the logic in zfs_get_zplprop() + * when they are queried for the first time. + */ + os->os_version = OBJSET_PROP_UNINITIALIZED; + os->os_normalization = OBJSET_PROP_UNINITIALIZED; + os->os_utf8only = OBJSET_PROP_UNINITIALIZED; + os->os_casesensitivity = OBJSET_PROP_UNINITIALIZED; if (ds == NULL || !ds->ds_is_snapshot) os->os_zil_header = os->os_phys->os_zil_header; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Wed Oct 3 02:13:42 2018 (r339109) @@ -83,6 +83,13 @@ typedef enum dbuf_states { DB_EVICTING } dbuf_states_t; +typedef enum dbuf_cached_state { + DB_NO_CACHE = -1, + DB_DBUF_CACHE, + DB_DBUF_METADATA_CACHE, + DB_CACHE_MAX +} dbuf_cached_state_t; + struct dnode; struct dmu_tx; @@ -229,10 +236,11 @@ typedef struct dmu_buf_impl { */ avl_node_t db_link; - /* - * Link in dbuf_cache. - */ + /* Link in dbuf_cache or dbuf_metadata_cache */ multilist_node_t db_cache_link; + + /* Tells us which dbuf cache this dbuf is in, if any */ + dbuf_cached_state_t db_caching_status; /* Data which is unique to data (leaf) blocks: */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 02:13:42 2018 (r339109) @@ -109,7 +109,8 @@ typedef enum dmu_object_byteswap { /* * Defines a uint8_t object type. Object types specify if the data * in the object is metadata (boolean) and how to byteswap the data - * (dmu_object_byteswap_t). + * (dmu_object_byteswap_t). All of the types created by this method + * are cached in the dbuf metadata cache. */ #define DMU_OT(byteswap, metadata) \ (DMU_OT_NEWTYPE | \ @@ -124,6 +125,9 @@ typedef enum dmu_object_byteswap { ((ot) & DMU_OT_METADATA) : \ dmu_ot[(ot)].ot_metadata) +#define DMU_OT_IS_METADATA_CACHED(ot) (((ot) & DMU_OT_NEWTYPE) ? \ + B_TRUE : dmu_ot[(ot)].ot_dbuf_metadata_cache) + /* * These object types use bp_fill != 1 for their L0 bp's. Therefore they can't * have their data embedded (i.e. use a BP_IS_EMBEDDED() bp), because bp_fill @@ -810,6 +814,7 @@ typedef void arc_byteswap_func_t(void *buf, size_t siz typedef struct dmu_object_type_info { dmu_object_byteswap_t ot_byteswap; boolean_t ot_metadata; + boolean_t ot_dbuf_metadata_cache; char *ot_name; } dmu_object_type_info_t; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Oct 3 02:13:42 2018 (r339109) @@ -39,6 +39,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -69,6 +70,7 @@ typedef struct objset_phys { dnode_phys_t os_groupused_dnode; } objset_phys_t; +#define OBJSET_PROP_UNINITIALIZED ((uint64_t)-1) struct objset { /* Immutable: */ struct dsl_dataset *os_dsl_dataset; @@ -100,6 +102,16 @@ struct objset { zfs_sync_type_t os_sync; zfs_redundant_metadata_type_t os_redundant_metadata; int os_recordsize; + /* + * The next four values are used as a cache of whatever's on disk, and + * are initialized the first time these properties are queried. Before + * being initialized with their real values, their values are + * OBJSET_PROP_UNINITIALIZED. + */ + uint64_t os_version; + uint64_t os_normalization; + uint64_t os_utf8only; + uint64_t os_casesensitivity; /* * Pointer is constant; the blkptr it points to is protected by Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Oct 3 02:13:04 2018 (r339108) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Oct 3 02:13:42 2018 (r339109) @@ -2645,6 +2645,7 @@ zfs_set_version(zfsvfs_t *zfsvfs, uint64_t newvers) dmu_tx_commit(tx); zfsvfs->z_version = newvers; + os->os_version = newvers; zfs_set_fuid_feature(zfsvfs); @@ -2657,17 +2658,47 @@ zfs_set_version(zfsvfs_t *zfsvfs, uint64_t newvers) int zfs_get_zplprop(objset_t *os, zfs_prop_t prop, uint64_t *value) { - const char *pname; - int error = ENOENT; + uint64_t *cached_copy = NULL; /* - * Look up the file system's value for the property. For the - * version property, we look up a slightly different string. + * Figure out where in the objset_t the cached copy would live, if it + * is available for the requested property. */ - if (prop == ZFS_PROP_VERSION) + if (os != NULL) { + switch (prop) { + case ZFS_PROP_VERSION: + cached_copy = &os->os_version; + break; + case ZFS_PROP_NORMALIZE: + cached_copy = &os->os_normalization; + break; + case ZFS_PROP_UTF8ONLY: + cached_copy = &os->os_utf8only; + break; + case ZFS_PROP_CASE: + cached_copy = &os->os_casesensitivity; + break; + default: + break; + } + } + if (cached_copy != NULL && *cached_copy != OBJSET_PROP_UNINITIALIZED) { + *value = *cached_copy; + return (0); + } + + /* + * If the property wasn't cached, look up the file system's value for + * the property. For the version property, we look up a slightly + * different string. + */ + const char *pname; + int error = ENOENT; + if (prop == ZFS_PROP_VERSION) { pname = ZPL_VERSION_STR; - else + } else { pname = zfs_prop_to_name(prop); + } if (os != NULL) { ASSERT3U(os->os_phys->os_type, ==, DMU_OST_ZFS); @@ -2692,6 +2723,15 @@ zfs_get_zplprop(objset_t *os, zfs_prop_t prop, uint64_ } error = 0; } + + /* + * If one of the methods for getting the property value above worked, + * copy it into the objset_t's cache. + */ + if (error == 0 && cached_copy != NULL) { + *cached_copy = *value; + } + return (error); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:14:39 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B93AE10B4C9A; Wed, 3 Oct 2018 02:14:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6F72086AE2; Wed, 3 Oct 2018 02:14:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6A7611098D; Wed, 3 Oct 2018 02:14:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932EdsL045878; Wed, 3 Oct 2018 02:14:39 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932Edk0045877; Wed, 3 Oct 2018 02:14:39 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030214.w932Edk0045877@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:14:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339110 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339110 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:14:39 -0000 Author: mav Date: Wed Oct 3 02:14:38 2018 New Revision: 339110 URL: https://svnweb.freebsd.org/changeset/base/339110 Log: MFC r336961: MFV r336960: 9256 zfs send space estimation off by > 10% on some datasets illumos/illummos-gate@df477c0afa111b5205c872dab36dbfde391656de Reviewed by: Matt Ahrens Reviewed by: John Kennedy Approved by: Richard Lowe Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 02:13:42 2018 (r339109) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 02:14:38 2018 (r339110) @@ -76,6 +76,11 @@ TUNABLE_INT("vfs.zfs.send_set_freerecords_bit", &zfs_s static char *dmu_recv_tag = "dmu_recv_tag"; const char *recv_clone_name = "%recv"; +/* + * Use this to override the recordsize calculation for fast zfs send estimates. + */ +uint64_t zfs_override_estimate_recordsize = 0; + #define BP_SPAN(datablkszsec, indblkshift, level) \ (((uint64_t)datablkszsec) << (SPA_MINBLOCKSHIFT + \ (level) * (indblkshift - SPA_BLKPTRSHIFT))) @@ -1131,7 +1136,7 @@ static int dmu_adjust_send_estimate_for_indirects(dsl_dataset_t *ds, uint64_t uncompressed, uint64_t compressed, boolean_t stream_compressed, uint64_t *sizep) { - int err; + int err = 0; uint64_t size; /* * Assume that space (both on-disk and in-stream) is dominated by @@ -1144,7 +1149,9 @@ dmu_adjust_send_estimate_for_indirects(dsl_dataset_t * VERIFY0(dmu_objset_from_ds(ds, &os)); /* Assume all (uncompressed) blocks are recordsize. */ - if (os->os_phys->os_type == DMU_OST_ZVOL) { + if (zfs_override_estimate_recordsize != 0) { + recordsize = zfs_override_estimate_recordsize; + } else if (os->os_phys->os_type == DMU_OST_ZVOL) { err = dsl_prop_get_int_ds(ds, zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE), &recordsize); } else { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:16:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B363810B4D5D; Wed, 3 Oct 2018 02:16:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 62A7086C54; Wed, 3 Oct 2018 02:16:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4EED41098E; Wed, 3 Oct 2018 02:16:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932GPpt046024; Wed, 3 Oct 2018 02:16:25 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932GM99046011; Wed, 3 Oct 2018 02:16:22 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030216.w932GM99046011@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:16:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339111 - in stable/11: cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/common ... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zpool cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/common sys/cddl/contrib/opensola... X-SVN-Commit-Revision: 339111 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:16:26 -0000 Author: mav Date: Wed Oct 3 02:16:22 2018 New Revision: 339111 URL: https://svnweb.freebsd.org/changeset/base/339111 Log: MFC r337007: MFV r336991, r337001: 9102 zfs should be able to initialize storage devices The first access to a disk block can incur a performance penalty on some platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that volumes be "thick provisioned", where supported by the platform (VMware). Thick provisioning is time consuming and often is ignored. If the thick provision step is omitted, customers will see suboptimal performance until we have written to all parts of the LUN. ZFS should be able to initialize any unused storage to remove any first-write penalty that exists. illumos/illumos-gate@094e47e980b0796b94b1b8f51f462a64d246e516 Reviewed by: John Wren Kennedy Reviewed by: Matthew Ahrens Reviewed by: Pavel Zakharov Reviewed by: Prakash Surya Approved by: Richard Lowe Author: George Wilson Added: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_initialize.h - copied unchanged from r337007, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_initialize.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_initialize.c - copied unchanged from r337007, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_initialize.c Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool.8 stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_priority.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h stable/11/sys/conf/files Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool.8 Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool.8 Wed Oct 3 02:16:22 2018 (r339111) @@ -121,6 +121,11 @@ .Ar pool | id .Op Ar newpool .Nm +.Cm initialize +.Op Fl cs +.Ar pool +.Op Ar device Ns ... +.Nm .Cm iostat .Op Fl T Cm d Ns | Ns Cm u .Op Fl v @@ -1434,6 +1439,32 @@ mounting option is enabled. In this case, the checkpointed state of the pool is opened and an administrator can see how the pool would look like if they were to fully rewind. +.El +.It Xo +.Nm +.Cm initialize +.Op Fl cs +.Ar pool +.Op Ar device Ns ... +.Xc +Begins initializing by writing to all unallocated regions on the specified +devices, or all eligible devices in the pool if no individual devices are +specified. +Only leaf data or log devices may be initialized. +.Bl -tag -width Ds +.It Fl c, -cancel +Cancel initializing on the specified devices, or all eligible devices if none +are specified. +If one or more target devices are invalid or are not currently being +initialized, the command will fail and no cancellation will occur on any device. +.It Fl s -suspend +Suspend initializing on the specified devices, or all eligible devices if none +are specified. +If one or more target devices are invalid or are not currently being +initialized, the command will fail and no suspension will occur on any device. +Initializing can then be resumed by running +.Nm zpool Cm initialize +with no flags on the relevant target devices. .El .It Xo .Nm Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool_main.c Wed Oct 3 02:16:22 2018 (r339111) @@ -87,6 +87,7 @@ static int zpool_do_detach(int, char **); static int zpool_do_replace(int, char **); static int zpool_do_split(int, char **); +static int zpool_do_initialize(int, char **); static int zpool_do_scrub(int, char **); static int zpool_do_import(int, char **); @@ -136,6 +137,7 @@ typedef enum { HELP_ONLINE, HELP_REPLACE, HELP_REMOVE, + HELP_INITIALIZE, HELP_SCRUB, HELP_STATUS, HELP_UPGRADE, @@ -187,6 +189,7 @@ static zpool_command_t command_table[] = { { "replace", zpool_do_replace, HELP_REPLACE }, { "split", zpool_do_split, HELP_SPLIT }, { NULL }, + { "initialize", zpool_do_initialize, HELP_INITIALIZE }, { "scrub", zpool_do_scrub, HELP_SCRUB }, { NULL }, { "import", zpool_do_import, HELP_IMPORT }, @@ -261,6 +264,8 @@ get_usage(zpool_help_t idx) return (gettext("\tremove [-nps] ...\n")); case HELP_REOPEN: return (gettext("\treopen \n")); + case HELP_INITIALIZE: + return (gettext("\tinitialize [-cs] [ ...]\n")); case HELP_SCRUB: return (gettext("\tscrub [-s | -p] ...\n")); case HELP_STATUS: @@ -1650,6 +1655,43 @@ print_status_config(zpool_handle_t *zhp, const char *n "resilvering" : "repairing"); } + if ((vs->vs_initialize_state == VDEV_INITIALIZE_ACTIVE || + vs->vs_initialize_state == VDEV_INITIALIZE_SUSPENDED || + vs->vs_initialize_state == VDEV_INITIALIZE_COMPLETE) && + !vs->vs_scan_removing) { + char zbuf[1024]; + char tbuf[256]; + struct tm zaction_ts; + + time_t t = vs->vs_initialize_action_time; + int initialize_pct = 100; + if (vs->vs_initialize_state != VDEV_INITIALIZE_COMPLETE) { + initialize_pct = (vs->vs_initialize_bytes_done * 100 / + (vs->vs_initialize_bytes_est + 1)); + } + + (void) localtime_r(&t, &zaction_ts); + (void) strftime(tbuf, sizeof (tbuf), "%c", &zaction_ts); + + switch (vs->vs_initialize_state) { + case VDEV_INITIALIZE_SUSPENDED: + (void) snprintf(zbuf, sizeof (zbuf), + ", suspended, started at %s", tbuf); + break; + case VDEV_INITIALIZE_ACTIVE: + (void) snprintf(zbuf, sizeof (zbuf), + ", started at %s", tbuf); + break; + case VDEV_INITIALIZE_COMPLETE: + (void) snprintf(zbuf, sizeof (zbuf), + ", completed at %s", tbuf); + break; + } + + (void) printf(gettext(" (%d%% initialized%s)"), + initialize_pct, zbuf); + } + (void) printf("\n"); for (c = 0; c < children; c++) { @@ -4236,6 +4278,119 @@ zpool_do_scrub(int argc, char **argv) } return (for_each_pool(argc, argv, B_TRUE, NULL, scrub_callback, &cb)); +} + +static void +zpool_collect_leaves(zpool_handle_t *zhp, nvlist_t *nvroot, nvlist_t *res) +{ + uint_t children = 0; + nvlist_t **child; + uint_t i; + + (void) nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_CHILDREN, + &child, &children); + + if (children == 0) { + char *path = zpool_vdev_name(g_zfs, zhp, nvroot, B_FALSE); + fnvlist_add_boolean(res, path); + free(path); + return; + } + + for (i = 0; i < children; i++) { + zpool_collect_leaves(zhp, child[i], res); + } +} + +/* + * zpool initialize [-cs] [ ...] + * Initialize all unused blocks in the specified vdevs, or all vdevs in the pool + * if none specified. + * + * -c Cancel. Ends active initializing. + * -s Suspend. Initializing can then be restarted with no flags. + */ +int +zpool_do_initialize(int argc, char **argv) +{ + int c; + char *poolname; + zpool_handle_t *zhp; + nvlist_t *vdevs; + int err = 0; + + struct option long_options[] = { + {"cancel", no_argument, NULL, 'c'}, + {"suspend", no_argument, NULL, 's'}, + {0, 0, 0, 0} + }; + + pool_initialize_func_t cmd_type = POOL_INITIALIZE_DO; + while ((c = getopt_long(argc, argv, "cs", long_options, NULL)) != -1) { + switch (c) { + case 'c': + if (cmd_type != POOL_INITIALIZE_DO) { + (void) fprintf(stderr, gettext("-c cannot be " + "combined with other options\n")); + usage(B_FALSE); + } + cmd_type = POOL_INITIALIZE_CANCEL; + break; + case 's': + if (cmd_type != POOL_INITIALIZE_DO) { + (void) fprintf(stderr, gettext("-s cannot be " + "combined with other options\n")); + usage(B_FALSE); + } + cmd_type = POOL_INITIALIZE_SUSPEND; + break; + case '?': + if (optopt != 0) { + (void) fprintf(stderr, + gettext("invalid option '%c'\n"), optopt); + } else { + (void) fprintf(stderr, + gettext("invalid option '%s'\n"), + argv[optind - 1]); + } + usage(B_FALSE); + } + } + + argc -= optind; + argv += optind; + + if (argc < 1) { + (void) fprintf(stderr, gettext("missing pool name argument\n")); + usage(B_FALSE); + return (-1); + } + + poolname = argv[0]; + zhp = zpool_open(g_zfs, poolname); + if (zhp == NULL) + return (-1); + + vdevs = fnvlist_alloc(); + if (argc == 1) { + /* no individual leaf vdevs specified, so add them all */ + nvlist_t *config = zpool_get_config(zhp, NULL); + nvlist_t *nvroot = fnvlist_lookup_nvlist(config, + ZPOOL_CONFIG_VDEV_TREE); + zpool_collect_leaves(zhp, nvroot, vdevs); + } else { + int i; + for (i = 1; i < argc; i++) { + fnvlist_add_boolean(vdevs, argv[i]); + } + } + + err = zpool_initialize(zhp, cmd_type, vdevs); + + fnvlist_free(vdevs); + zpool_close(zhp); + + return (err); } typedef struct status_cbdata { Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Oct 3 02:16:22 2018 (r339111) @@ -104,6 +104,7 @@ #include #include #include +#include #include #include #include @@ -348,6 +349,7 @@ ztest_func_t ztest_spa_upgrade; ztest_func_t ztest_device_removal; ztest_func_t ztest_remap_blocks; ztest_func_t ztest_spa_checkpoint_create_discard; +ztest_func_t ztest_initialize; uint64_t zopt_always = 0ULL * NANOSEC; /* all the time */ uint64_t zopt_incessant = 1ULL * NANOSEC / 10; /* every 1/10 second */ @@ -391,7 +393,8 @@ ztest_info_t ztest_info[] = { &ztest_opts.zo_vdevtime }, { ztest_device_removal, 1, &zopt_sometimes }, { ztest_remap_blocks, 1, &zopt_sometimes }, - { ztest_spa_checkpoint_create_discard, 1, &zopt_rarely } + { ztest_spa_checkpoint_create_discard, 1, &zopt_rarely }, + { ztest_initialize, 1, &zopt_sometimes } }; #define ZTEST_FUNCS (sizeof (ztest_info) / sizeof (ztest_info_t)) @@ -5469,6 +5472,97 @@ ztest_spa_rename(ztest_ds_t *zd, uint64_t id) umem_free(newname, strlen(newname) + 1); rw_exit(&ztest_name_lock); +} + +static vdev_t * +ztest_random_concrete_vdev_leaf(vdev_t *vd) +{ + if (vd == NULL) + return (NULL); + + if (vd->vdev_children == 0) + return (vd); + + vdev_t *eligible[vd->vdev_children]; + int eligible_idx = 0, i; + for (i = 0; i < vd->vdev_children; i++) { + vdev_t *cvd = vd->vdev_child[i]; + if (cvd->vdev_top->vdev_removing) + continue; + if (cvd->vdev_children > 0 || + (vdev_is_concrete(cvd) && !cvd->vdev_detached)) { + eligible[eligible_idx++] = cvd; + } + } + VERIFY(eligible_idx > 0); + + uint64_t child_no = ztest_random(eligible_idx); + return (ztest_random_concrete_vdev_leaf(eligible[child_no])); +} + +/* ARGSUSED */ +void +ztest_initialize(ztest_ds_t *zd, uint64_t id) +{ + spa_t *spa = ztest_spa; + int error = 0; + + mutex_enter(&ztest_vdev_lock); + + spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER); + + /* Random leaf vdev */ + vdev_t *rand_vd = ztest_random_concrete_vdev_leaf(spa->spa_root_vdev); + if (rand_vd == NULL) { + spa_config_exit(spa, SCL_VDEV, FTAG); + mutex_exit(&ztest_vdev_lock); + return; + } + + /* + * The random vdev we've selected may change as soon as we + * drop the spa_config_lock. We create local copies of things + * we're interested in. + */ + uint64_t guid = rand_vd->vdev_guid; + char *path = strdup(rand_vd->vdev_path); + boolean_t active = rand_vd->vdev_initialize_thread != NULL; + + zfs_dbgmsg("vd %p, guid %llu", rand_vd, guid); + spa_config_exit(spa, SCL_VDEV, FTAG); + + uint64_t cmd = ztest_random(POOL_INITIALIZE_FUNCS); + error = spa_vdev_initialize(spa, guid, cmd); + switch (cmd) { + case POOL_INITIALIZE_CANCEL: + if (ztest_opts.zo_verbose >= 4) { + (void) printf("Cancel initialize %s", path); + if (!active) + (void) printf(" failed (no initialize active)"); + (void) printf("\n"); + } + break; + case POOL_INITIALIZE_DO: + if (ztest_opts.zo_verbose >= 4) { + (void) printf("Start initialize %s", path); + if (active && error == 0) + (void) printf(" failed (already active)"); + else if (error != 0) + (void) printf(" failed (error %d)", error); + (void) printf("\n"); + } + break; + case POOL_INITIALIZE_SUSPEND: + if (ztest_opts.zo_verbose >= 4) { + (void) printf("Suspend initialize %s", path); + if (!active) + (void) printf(" failed (no initialize active)"); + (void) printf("\n"); + } + break; + } + free(path); + mutex_exit(&ztest_vdev_lock); } /* Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Oct 3 02:16:22 2018 (r339111) @@ -137,6 +137,9 @@ typedef enum zfs_error { EZFS_NO_CHECKPOINT, /* pool has no checkpoint */ EZFS_DEVRM_IN_PROGRESS, /* a device is currently being removed */ EZFS_VDEV_TOO_BIG, /* a device is too big to be used */ + EZFS_TOOMANY, /* argument list too long */ + EZFS_INITIALIZING, /* currently initializing */ + EZFS_NO_INITIALIZE, /* no active initialize */ EZFS_UNKNOWN } zfs_error_t; @@ -262,6 +265,8 @@ typedef struct splitflags { * Functions to manipulate pool and vdev state */ extern int zpool_scan(zpool_handle_t *, pool_scan_func_t, pool_scrub_cmd_t); +extern int zpool_initialize(zpool_handle_t *, pool_initialize_func_t, + nvlist_t *); extern int zpool_clear(zpool_handle_t *, const char *, nvlist_t *); extern int zpool_reguid(zpool_handle_t *); extern int zpool_reopen(zpool_handle_t *); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Oct 3 02:16:22 2018 (r339111) @@ -1981,6 +1981,100 @@ zpool_scan(zpool_handle_t *zhp, pool_scan_func_t func, } } +static int +xlate_init_err(int err) +{ + switch (err) { + case ENODEV: + return (EZFS_NODEVICE); + case EINVAL: + case EROFS: + return (EZFS_BADDEV); + case EBUSY: + return (EZFS_INITIALIZING); + case ESRCH: + return (EZFS_NO_INITIALIZE); + } + return (err); +} + +/* + * Begin, suspend, or cancel the initialization (initializing of all free + * blocks) for the given vdevs in the given pool. + */ +int +zpool_initialize(zpool_handle_t *zhp, pool_initialize_func_t cmd_type, + nvlist_t *vds) +{ + char msg[1024]; + libzfs_handle_t *hdl = zhp->zpool_hdl; + + nvlist_t *errlist; + + /* translate vdev names to guids */ + nvlist_t *vdev_guids = fnvlist_alloc(); + nvlist_t *guids_to_paths = fnvlist_alloc(); + boolean_t spare, cache; + nvlist_t *tgt; + nvpair_t *elem; + + for (elem = nvlist_next_nvpair(vds, NULL); elem != NULL; + elem = nvlist_next_nvpair(vds, elem)) { + char *vd_path = nvpair_name(elem); + tgt = zpool_find_vdev(zhp, vd_path, &spare, &cache, NULL); + + if ((tgt == NULL) || cache || spare) { + (void) snprintf(msg, sizeof (msg), + dgettext(TEXT_DOMAIN, "cannot initialize '%s'"), + vd_path); + int err = (tgt == NULL) ? EZFS_NODEVICE : + (spare ? EZFS_ISSPARE : EZFS_ISL2CACHE); + fnvlist_free(vdev_guids); + fnvlist_free(guids_to_paths); + return (zfs_error(hdl, err, msg)); + } + + uint64_t guid = fnvlist_lookup_uint64(tgt, ZPOOL_CONFIG_GUID); + fnvlist_add_uint64(vdev_guids, vd_path, guid); + + (void) snprintf(msg, sizeof (msg), "%llu", guid); + fnvlist_add_string(guids_to_paths, msg, vd_path); + } + + int err = lzc_initialize(zhp->zpool_name, cmd_type, vdev_guids, + &errlist); + fnvlist_free(vdev_guids); + + if (err == 0) { + fnvlist_free(guids_to_paths); + return (0); + } + + nvlist_t *vd_errlist = NULL; + if (errlist != NULL) { + vd_errlist = fnvlist_lookup_nvlist(errlist, + ZPOOL_INITIALIZE_VDEVS); + } + + (void) snprintf(msg, sizeof (msg), + dgettext(TEXT_DOMAIN, "operation failed")); + + for (elem = nvlist_next_nvpair(vd_errlist, NULL); elem != NULL; + elem = nvlist_next_nvpair(vd_errlist, elem)) { + int64_t vd_error = xlate_init_err(fnvpair_value_int64(elem)); + char *path = fnvlist_lookup_string(guids_to_paths, + nvpair_name(elem)); + (void) zfs_error_fmt(hdl, vd_error, "cannot initialize '%s'", + path); + } + + fnvlist_free(guids_to_paths); + if (vd_errlist != NULL) + return (-1); + + return (zpool_standard_error(hdl, err, msg)); +} + #ifdef illumos /* * This provides a very minimal check whether a given string is likely a Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Oct 3 02:16:22 2018 (r339111) @@ -254,6 +254,13 @@ libzfs_error_description(libzfs_handle_t *hdl) return (dgettext(TEXT_DOMAIN, "device removal in progress")); case EZFS_VDEV_TOO_BIG: return (dgettext(TEXT_DOMAIN, "device exceeds supported size")); + case EZFS_TOOMANY: + return (dgettext(TEXT_DOMAIN, "argument list too long")); + case EZFS_INITIALIZING: + return (dgettext(TEXT_DOMAIN, "currently initializing")); + case EZFS_NO_INITIALIZE: + return (dgettext(TEXT_DOMAIN, "there is no active " + "initialization")); case EZFS_UNKNOWN: return (dgettext(TEXT_DOMAIN, "unknown error")); default: Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Oct 3 02:16:22 2018 (r339111) @@ -1085,3 +1085,40 @@ lzc_channel_program_nosync(const char *pool, const cha return (lzc_channel_program_impl(pool, program, B_FALSE, timeout, memlimit, argnvl, outnvl)); } + +/* + * Changes initializing state. + * + * vdevs should be a list of (, guid) where guid is a uint64 vdev GUID. + * The key is ignored. + * + * If there are errors related to vdev arguments, per-vdev errors are returned + * in an nvlist with the key "vdevs". Each error is a (guid, errno) pair where + * guid is stringified with PRIu64, and errno is one of the following as + * an int64_t: + * - ENODEV if the device was not found + * - EINVAL if the devices is not a leaf or is not concrete (e.g. missing) + * - EROFS if the device is not writeable + * - EBUSY start requested but the device is already being initialized + * - ESRCH cancel/suspend requested but device is not being initialized + * + * If the errlist is empty, then return value will be: + * - EINVAL if one or more arguments was invalid + * - Other spa_open failures + * - 0 if the operation succeeded + */ +int +lzc_initialize(const char *poolname, pool_initialize_func_t cmd_type, + nvlist_t *vdevs, nvlist_t **errlist) +{ + int error; + nvlist_t *args = fnvlist_alloc(); + fnvlist_add_uint64(args, ZPOOL_INITIALIZE_COMMAND, (uint64_t)cmd_type); + fnvlist_add_nvlist(args, ZPOOL_INITIALIZE_VDEVS, vdevs); + + error = lzc_ioctl(ZFS_IOC_POOL_INITIALIZE, poolname, args, errlist); + + fnvlist_free(args); + + return (error); +} Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Oct 3 02:16:22 2018 (r339111) @@ -31,7 +31,9 @@ #include #include #include +#include + #ifdef __cplusplus extern "C" { #endif @@ -56,6 +58,8 @@ int lzc_destroy_snaps(nvlist_t *, boolean_t, nvlist_t int lzc_bookmark(nvlist_t *, nvlist_t **); int lzc_get_bookmarks(const char *, nvlist_t *, nvlist_t **); int lzc_destroy_bookmarks(nvlist_t *, nvlist_t **); +int lzc_initialize(const char *, pool_initialize_func_t, nvlist_t *, + nvlist_t **); int lzc_snaprange_space(const char *, const char *, uint64_t *); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Wed Oct 3 02:16:22 2018 (r339111) @@ -124,6 +124,7 @@ ZFS_COMMON_OBJS += \ vdev_indirect.o \ vdev_indirect_births.o \ vdev_indirect_mapping.o \ + vdev_initialize.o \ vdev_label.o \ vdev_mirror.o \ vdev_missing.o \ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 02:16:22 2018 (r339111) @@ -725,6 +725,8 @@ metaslab_group_create(metaslab_class_t *mc, vdev_t *vd mg = kmem_zalloc(sizeof (metaslab_group_t), KM_SLEEP); mutex_init(&mg->mg_lock, NULL, MUTEX_DEFAULT, NULL); + mutex_init(&mg->mg_ms_initialize_lock, NULL, MUTEX_DEFAULT, NULL); + cv_init(&mg->mg_ms_initialize_cv, NULL, CV_DEFAULT, NULL); mg->mg_primaries = kmem_zalloc(allocators * sizeof (metaslab_t *), KM_SLEEP); mg->mg_secondaries = kmem_zalloc(allocators * sizeof (metaslab_t *), @@ -771,6 +773,8 @@ metaslab_group_destroy(metaslab_group_t *mg) kmem_free(mg->mg_secondaries, mg->mg_allocators * sizeof (metaslab_t *)); mutex_destroy(&mg->mg_lock); + mutex_destroy(&mg->mg_ms_initialize_lock); + cv_destroy(&mg->mg_ms_initialize_cv); for (int i = 0; i < mg->mg_allocators; i++) { refcount_destroy(&mg->mg_alloc_queue_depth[i]); @@ -1554,6 +1558,7 @@ metaslab_init(metaslab_group_t *mg, uint64_t id, uint6 mutex_init(&ms->ms_lock, NULL, MUTEX_DEFAULT, NULL); mutex_init(&ms->ms_sync_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&ms->ms_load_cv, NULL, CV_DEFAULT, NULL); + ms->ms_id = id; ms->ms_start = id << vd->vdev_ms_shift; ms->ms_size = 1ULL << vd->vdev_ms_shift; @@ -2731,6 +2736,7 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) * from it in 'metaslab_unload_delay' txgs, then unload it. */ if (msp->ms_loaded && + msp->ms_initializing == 0 && msp->ms_selected_txg + metaslab_unload_delay < txg) { for (int t = 1; t < TXG_CONCURRENT_STATES; t++) { VERIFY0(range_tree_space( @@ -2980,6 +2986,7 @@ metaslab_block_alloc(metaslab_t *msp, uint64_t size, u metaslab_class_t *mc = msp->ms_group->mg_class; VERIFY(!msp->ms_condensing); + VERIFY0(msp->ms_initializing); start = mc->mc_ops->msop_alloc(msp, size); if (start != -1ULL) { @@ -3040,9 +3047,10 @@ find_valid_metaslab(metaslab_group_t *mg, uint64_t act } /* - * If the selected metaslab is condensing, skip it. + * If the selected metaslab is condensing or being + * initialized, skip it. */ - if (msp->ms_condensing) + if (msp->ms_condensing || msp->ms_initializing > 0) continue; *was_active = msp->ms_allocator != -1; @@ -3207,11 +3215,20 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ /* * If this metaslab is currently condensing then pick again as * we can't manipulate this metaslab until it's committed - * to disk. + * to disk. If this metaslab is being initialized, we shouldn't + * allocate from it since the allocated region might be + * overwritten after allocation. */ if (msp->ms_condensing) { metaslab_trace_add(zal, mg, msp, asize, d, TRACE_CONDENSING, allocator); + metaslab_passivate(msp, msp->ms_weight & + ~METASLAB_ACTIVE_MASK); + mutex_exit(&msp->ms_lock); + continue; + } else if (msp->ms_initializing > 0) { + metaslab_trace_add(zal, mg, msp, asize, d, + TRACE_INITIALIZING, allocator); metaslab_passivate(msp, msp->ms_weight & ~METASLAB_ACTIVE_MASK); mutex_exit(&msp->ms_lock); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 02:16:22 2018 (r339111) @@ -55,6 +55,7 @@ #include #include #include +#include #include #include #include @@ -443,8 +444,9 @@ spa_prop_get(spa_t *spa, nvlist_t **nvp) dp = spa_get_dsl(spa); dsl_pool_config_enter(dp, FTAG); - if (err = dsl_dataset_hold_obj(dp, - za.za_first_integer, FTAG, &ds)) { + err = dsl_dataset_hold_obj(dp, + za.za_first_integer, FTAG, &ds); + if (err != 0) { dsl_pool_config_exit(dp, FTAG); break; } @@ -599,7 +601,8 @@ spa_prop_validate(spa_t *spa, nvlist_t *props) break; } - if (error = dmu_objset_hold(strval, FTAG, &os)) + error = dmu_objset_hold(strval, FTAG, &os); + if (error != 0) break; /* @@ -1215,8 +1218,10 @@ spa_activate(spa_t *spa, int mode) */ trim_thread_create(spa); - for (size_t i = 0; i < TXG_SIZE; i++) - spa->spa_txg_zio[i] = zio_root(spa, NULL, NULL, 0); + for (size_t i = 0; i < TXG_SIZE; i++) { + spa->spa_txg_zio[i] = zio_root(spa, NULL, NULL, + ZIO_FLAG_CANFAIL); + } list_create(&spa->spa_config_dirty_list, sizeof (vdev_t), offsetof(vdev_t, vdev_config_dirty_node)); @@ -1388,6 +1393,11 @@ spa_unload(spa_t *spa) */ spa_async_suspend(spa); + if (spa->spa_root_vdev) { + vdev_initialize_stop_all(spa->spa_root_vdev, + VDEV_INITIALIZE_ACTIVE); + } + /* * Stop syncing. */ @@ -1403,10 +1413,10 @@ spa_unload(spa_t *spa) * calling taskq_wait(mg_taskq). */ if (spa->spa_root_vdev != NULL) { - spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER); + spa_config_enter(spa, SCL_ALL, spa, RW_WRITER); for (int c = 0; c < spa->spa_root_vdev->vdev_children; c++) vdev_metaslab_fini(spa->spa_root_vdev->vdev_child[c]); - spa_config_exit(spa, SCL_ALL, FTAG); + spa_config_exit(spa, SCL_ALL, spa); } /* @@ -1440,7 +1450,7 @@ spa_unload(spa_t *spa) bpobj_close(&spa->spa_deferred_bpobj); - spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER); + spa_config_enter(spa, SCL_ALL, spa, RW_WRITER); /* * Close all vdevs. @@ -1502,7 +1512,7 @@ spa_unload(spa_t *spa) spa->spa_comment = NULL; } - spa_config_exit(spa, SCL_ALL, FTAG); + spa_config_exit(spa, SCL_ALL, spa); } /* @@ -3954,6 +3964,10 @@ spa_load_impl(spa_t *spa, spa_import_type_t type, char spa_restart_removal(spa); spa_spawn_aux_threads(spa); + + spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); + vdev_initialize_restart(spa->spa_root_vdev); + spa_config_exit(spa, SCL_CONFIG, FTAG); } spa_load_note(spa, "LOADED"); @@ -5676,6 +5690,7 @@ spa_export_common(char *pool, int new_state, nvlist_t * in which case we can modify its state. */ if (spa->spa_state != POOL_STATE_UNINITIALIZED && spa->spa_sync_on) { + /* * Objsets may be open only because they're dirty, so we * have to force it to sync before checking spa_refcnt. @@ -5710,6 +5725,18 @@ spa_export_common(char *pool, int new_state, nvlist_t } /* + * We're about to export or destroy this pool. Make sure + * we stop all initializtion activity here before we + * set the spa_final_txg. This will ensure that all + * dirty data resulting from the initialization is + * committed to disk before we unload the pool. + */ + if (spa->spa_root_vdev != NULL) { + vdev_initialize_stop_all(spa->spa_root_vdev, + VDEV_INITIALIZE_ACTIVE); + } + + /* * We want this to be reflected on every label, * so mark them all dirty. spa_unload() will do the * final sync that pushes these changes out. @@ -6399,6 +6426,86 @@ spa_vdev_detach(spa_t *spa, uint64_t guid, uint64_t pg return (error); } +int +spa_vdev_initialize(spa_t *spa, uint64_t guid, uint64_t cmd_type) +{ + /* + * We hold the namespace lock through the whole function + * to prevent any changes to the pool while we're starting or + * stopping initialization. The config and state locks are held so that + * we can properly assess the vdev state before we commit to + * the initializing operation. + */ + mutex_enter(&spa_namespace_lock); + spa_config_enter(spa, SCL_CONFIG | SCL_STATE, FTAG, RW_READER); + + /* Look up vdev and ensure it's a leaf. */ + vdev_t *vd = spa_lookup_by_guid(spa, guid, B_FALSE); + if (vd == NULL || vd->vdev_detached) { + spa_config_exit(spa, SCL_CONFIG | SCL_STATE, FTAG); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(ENODEV)); + } else if (!vd->vdev_ops->vdev_op_leaf || !vdev_is_concrete(vd)) { + spa_config_exit(spa, SCL_CONFIG | SCL_STATE, FTAG); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(EINVAL)); + } else if (!vdev_writeable(vd)) { + spa_config_exit(spa, SCL_CONFIG | SCL_STATE, FTAG); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(EROFS)); + } + mutex_enter(&vd->vdev_initialize_lock); + spa_config_exit(spa, SCL_CONFIG | SCL_STATE, FTAG); + + /* + * When we activate an initialize action we check to see + * if the vdev_initialize_thread is NULL. We do this instead + * of using the vdev_initialize_state since there might be + * a previous initialization process which has completed but + * the thread is not exited. + */ + if (cmd_type == POOL_INITIALIZE_DO && + (vd->vdev_initialize_thread != NULL || + vd->vdev_top->vdev_removing)) { + mutex_exit(&vd->vdev_initialize_lock); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(EBUSY)); + } else if (cmd_type == POOL_INITIALIZE_CANCEL && + (vd->vdev_initialize_state != VDEV_INITIALIZE_ACTIVE && + vd->vdev_initialize_state != VDEV_INITIALIZE_SUSPENDED)) { + mutex_exit(&vd->vdev_initialize_lock); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(ESRCH)); + } else if (cmd_type == POOL_INITIALIZE_SUSPEND && + vd->vdev_initialize_state != VDEV_INITIALIZE_ACTIVE) { + mutex_exit(&vd->vdev_initialize_lock); + mutex_exit(&spa_namespace_lock); + return (SET_ERROR(ESRCH)); + } + + switch (cmd_type) { + case POOL_INITIALIZE_DO: + vdev_initialize(vd); + break; + case POOL_INITIALIZE_CANCEL: + vdev_initialize_stop(vd, VDEV_INITIALIZE_CANCELED); + break; + case POOL_INITIALIZE_SUSPEND: + vdev_initialize_stop(vd, VDEV_INITIALIZE_SUSPENDED); + break; + default: + panic("invalid cmd_type %llu", (unsigned long long)cmd_type); + } + mutex_exit(&vd->vdev_initialize_lock); + + /* Sync out the initializing state */ + txg_wait_synced(spa->spa_dsl_pool, 0); + mutex_exit(&spa_namespace_lock); + + return (0); +} + + /* * Split a set of devices from their mirrors, and create a new pool from them. */ @@ -6606,6 +6713,19 @@ spa_vdev_split_mirror(spa_t *spa, char *newname, nvlis spa_activate(newspa, spa_mode_global); spa_async_suspend(newspa); + for (c = 0; c < children; c++) { + if (vml[c] != NULL) { + /* + * Temporarily stop the initializing activity. We set + * the state to ACTIVE so that we know to resume + * the initializing once the split has completed. + */ + mutex_enter(&vml[c]->vdev_initialize_lock); + vdev_initialize_stop(vml[c], VDEV_INITIALIZE_ACTIVE); + mutex_exit(&vml[c]->vdev_initialize_lock); + } + } + #ifndef illumos /* mark that we are creating new spa by splitting */ newspa->spa_splitting_newspa = B_TRUE; @@ -6700,6 +6820,10 @@ out: if (vml[c] != NULL) vml[c]->vdev_offline = B_FALSE; } + + /* restart initializing disks as necessary */ + spa_async_request(spa, SPA_ASYNC_INITIALIZE_RESTART); + vdev_reopen(spa->spa_root_vdev); nvlist_free(spa->spa_config_splitting); @@ -7064,6 +7188,14 @@ spa_async_thread(void *arg) if (tasks & SPA_ASYNC_RESILVER) dsl_resilver_restart(spa->spa_dsl_pool, 0); + if (tasks & SPA_ASYNC_INITIALIZE_RESTART) { + mutex_enter(&spa_namespace_lock); + spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); + vdev_initialize_restart(spa->spa_root_vdev); + spa_config_exit(spa, SCL_CONFIG, FTAG); + mutex_exit(&spa_namespace_lock); + } + /* * Let the world know that we're done. */ @@ -7763,8 +7895,9 @@ spa_sync(spa_t *spa, uint64_t txg) * Wait for i/os issued in open context that need to complete * before this txg syncs. */ - VERIFY0(zio_wait(spa->spa_txg_zio[txg & TXG_MASK])); - spa->spa_txg_zio[txg & TXG_MASK] = zio_root(spa, NULL, NULL, 0); + (void) zio_wait(spa->spa_txg_zio[txg & TXG_MASK]); + spa->spa_txg_zio[txg & TXG_MASK] = zio_root(spa, NULL, NULL, + ZIO_FLAG_CANFAIL); /* * Lock out configuration changes. @@ -8066,7 +8199,8 @@ spa_sync(spa_t *spa, uint64_t txg) /* * Update usable space statistics. */ - while (vd = txg_list_remove(&spa->spa_vdev_txg_list, TXG_CLEAN(txg))) + while ((vd = txg_list_remove(&spa->spa_vdev_txg_list, TXG_CLEAN(txg))) + != NULL) vdev_sync_done(vd, txg); spa_update_dspace(spa); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 02:16:22 2018 (r339111) @@ -41,6 +41,7 @@ #include #include #include +#include #include #include #include @@ -1317,6 +1318,12 @@ spa_vdev_config_exit(spa_t *spa, vdev_t *vd, uint64_t if (vd != NULL) { ASSERT(!vd->vdev_detached || vd->vdev_dtl_sm == NULL); + if (vd->vdev_ops->vdev_op_leaf) { + mutex_enter(&vd->vdev_initialize_lock); + vdev_initialize_stop(vd, VDEV_INITIALIZE_CANCELED); + mutex_exit(&vd->vdev_initialize_lock); + } + spa_config_enter(spa, SCL_ALL, spa, RW_WRITER); vdev_free(vd); spa_config_exit(spa, SCL_ALL, spa); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Wed Oct 3 02:14:38 2018 (r339110) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Wed Oct 3 02:16:22 2018 (r339111) @@ -68,7 +68,8 @@ typedef enum trace_alloc_type { TRACE_GROUP_FAILURE = -5ULL, TRACE_ENOSPC = -6ULL, TRACE_CONDENSING = -7ULL, - TRACE_VDEV_ERROR = -8ULL + TRACE_VDEV_ERROR = -8ULL, + TRACE_INITIALIZING = -9ULL } trace_alloc_type_t; #define METASLAB_WEIGHT_PRIMARY (1ULL << 63) @@ -271,6 +272,11 @@ struct metaslab_group { uint64_t mg_failed_allocations; uint64_t mg_fragmentation; uint64_t mg_histogram[RANGE_TREE_HISTOGRAM_SIZE]; + + int mg_ms_initializing; + boolean_t mg_initialize_updating; + kmutex_t mg_ms_initialize_lock; + kcondvar_t mg_ms_initialize_cv; }; /* @@ -360,6 +366,8 @@ struct metaslab { boolean_t ms_condensing; /* condensing? */ boolean_t ms_condense_wanted; uint64_t ms_condense_checked_txg; + + uint64_t ms_initializing; /* leaves initializing this ms */ *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:18:18 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1AD3610B4E18; Wed, 3 Oct 2018 02:18:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B392E86DB7; Wed, 3 Oct 2018 02:18:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 85AB41098F; Wed, 3 Oct 2018 02:18:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932IHr4046154; Wed, 3 Oct 2018 02:18:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932IGum046151; Wed, 3 Oct 2018 02:18:16 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030218.w932IGum046151@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:18:16 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339112 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339112 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:18:18 -0000 Author: mav Date: Wed Oct 3 02:18:16 2018 New Revision: 339112 URL: https://svnweb.freebsd.org/changeset/base/339112 Log: MFC r337017: MFV r337014: 9421 zdb should detect and print out the number of "leaked" objects 9422 zfs diff and zdb should explicitly mark objects that are on the deleted queue illumos/illumos-gate@20b5dafb425396adaebd0267d29e1026fc4dc413 Reviewed by: Matt Ahrens Reviewed by: Pavel Zakharov Approved by: Matt Ahrens Author: Paul Dagnelie Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_diff.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:16:22 2018 (r339111) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 02:18:16 2018 (r339112) @@ -108,6 +108,7 @@ static uint64_t *zopt_object = NULL; static unsigned zopt_objects = 0; static libzfs_handle_t *g_zfs; static uint64_t max_inflight = 1000; +static int leaked_objects = 0; static void snprintf_blkptr_compact(char *, size_t, const blkptr_t *); @@ -1988,9 +1989,12 @@ dump_znode(objset_t *os, uint64_t object, void *data, if (dump_opt['d'] > 4) { error = zfs_obj_to_path(os, object, path, sizeof (path)); - if (error != 0) { + if (error == ESTALE) { + (void) snprintf(path, sizeof (path), "on delete queue"); + } else if (error != 0) { + leaked_objects++; (void) snprintf(path, sizeof (path), - "\?\?\?", (u_longlong_t)object); + "path not found, possibly leaked"); } (void) printf("\tpath %s\n", path); } @@ -2320,6 +2324,12 @@ dump_dir(objset_t *os) } ASSERT3U(object_count, ==, usedobjs); + + if (leaked_objects != 0) { + (void) printf("%d potentially leaked objects detected\n", + leaked_objects); + leaked_objects = 0; + } } static void @@ -5405,5 +5415,5 @@ main(int argc, char **argv) libzfs_fini(g_zfs); kernel_fini(); - return (0); + return (error); } Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_diff.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_diff.c Wed Oct 3 02:16:22 2018 (r339111) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_diff.c Wed Oct 3 02:18:16 2018 (r339112) @@ -22,7 +22,7 @@ /* * Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2015 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2015 by Delphix. All rights reserved. + * Copyright (c) 2015, 2017 by Delphix. All rights reserved. * Copyright 2016 Joyent, Inc. * Copyright 2016 Igor Kozhukhov */ @@ -101,7 +101,10 @@ get_stats_for_obj(differ_info_t *di, const char *dsnam return (0); } - if (di->zerr == EPERM) { + if (di->zerr == ESTALE) { + (void) snprintf(pn, maxlen, "(on_delete_queue)"); + return (0); + } else if (di->zerr == EPERM) { (void) snprintf(di->errbuf, sizeof (di->errbuf), dgettext(TEXT_DOMAIN, "The sys_config privilege or diff delegated permission " Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Wed Oct 3 02:16:22 2018 (r339111) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Wed Oct 3 02:18:16 2018 (r339112) @@ -2102,6 +2102,17 @@ zfs_obj_to_path_impl(objset_t *osp, uint64_t obj, sa_h *path = '\0'; sa_hdl = hdl; + uint64_t deleteq_obj; + VERIFY0(zap_lookup(osp, MASTER_NODE_OBJ, + ZFS_UNLINKED_SET, sizeof (uint64_t), 1, &deleteq_obj)); + error = zap_lookup_int(osp, deleteq_obj, obj); + if (error == 0) { + return (ESTALE); + } else if (error != ENOENT) { + return (error); + } + error = 0; + for (;;) { uint64_t pobj; char component[MAXNAMELEN + 2]; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:19:18 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0CDBE10B4EBC; Wed, 3 Oct 2018 02:19:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B7E2086F1D; Wed, 3 Oct 2018 02:19:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B2C0510992; Wed, 3 Oct 2018 02:19:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932JHMf046248; Wed, 3 Oct 2018 02:19:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932JHiG046247; Wed, 3 Oct 2018 02:19:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030219.w932JHiG046247@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:19:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339113 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339113 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:19:18 -0000 Author: mav Date: Wed Oct 3 02:19:17 2018 New Revision: 339113 URL: https://svnweb.freebsd.org/changeset/base/339113 Log: MFC r337021: MFV r337020:9443 panic when scrub a v10 pool illumos/illumos-gate@bb1f424574ac8e08069d0ba993c2a41ffe796794 Reviewed by: Serapheim Dimitropoulos Reviewed by: George Wilson Reviewed by: Andriy Gapon Reviewed by: Igor Kozhukhov Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 02:18:16 2018 (r339112) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 02:19:17 2018 (r339113) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2017 by Delphix. All rights reserved. + * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright 2016 Gary Mills * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2017 Joyent, Inc. @@ -2164,7 +2164,8 @@ dsl_scan_visitds(dsl_scan_t *scn, uint64_t dsobj, dmu_ * block-sharing rules don't apply to it. */ if (DSL_SCAN_IS_SCRUB_RESILVER(scn) && !dsl_dataset_is_snapshot(ds) && - ds->ds_dir != dp->dp_origin_snap->ds_dir) { + (dp->dp_origin_snap == NULL || + ds->ds_dir != dp->dp_origin_snap->ds_dir)) { objset_t *os; if (dmu_objset_from_ds(ds, &os) != 0) { goto out; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:48:33 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BFF2110B576A; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6320288328; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3F77810E7A; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932mWR7061497; Wed, 3 Oct 2018 02:48:32 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932mVoU061494; Wed, 3 Oct 2018 02:48:31 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030248.w932mVoU061494@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:48:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339114 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339114 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:48:33 -0000 Author: mav Date: Wed Oct 3 02:48:31 2018 New Revision: 339114 URL: https://svnweb.freebsd.org/changeset/base/339114 Log: MFC r337025: MFV r337022: 9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error This assertion (VERIFY) failure was reported when reading a block. Turns out the problem is that if we get an i/o error (ECKSUM in this case), and there are multiple concurrent ARC reads of the same block (from different clones), then the ARC will put multiple buf's on the same ANON hdr, which isn't supposed to happen, and then causes a panic when we try to arc_buf_destroy() the buf. illumos/illumos-gate@fa98e487a9619b7902f218663be219e787a57dad Reviewed by: George Wilson Reviewed by: Paul Dagnelie Reviewed by: Pavel Zakharov Approved by: Matt Ahrens Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 02:48:31 2018 (r339114) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2018, Joyent, Inc. - * Copyright (c) 2011, 2017 by Delphix. All rights reserved. + * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright (c) 2014 by Saso Kiselkov. All rights reserved. * Copyright 2017 Nexenta Systems, Inc. All rights reserved. */ @@ -5177,12 +5177,13 @@ arc_getbuf_func(zio_t *zio, const zbookmark_phys_t *zb arc_buf_t *buf, void *arg) { arc_buf_t **bufp = arg; - if (buf == NULL) { + ASSERT(zio == NULL || zio->io_error != 0); *bufp = NULL; } else { + ASSERT(zio == NULL || zio->io_error == 0); *bufp = buf; - ASSERT(buf->b_data); + ASSERT(buf->b_data != NULL); } } @@ -5210,6 +5211,7 @@ arc_read_done(zio_t *zio) arc_callback_t *callback_list; arc_callback_t *acb; boolean_t freeable = B_FALSE; + boolean_t no_zio_error = (zio->io_error == 0); /* * The hdr was inserted into hash-table and removed from lists @@ -5235,7 +5237,7 @@ arc_read_done(zio_t *zio) ASSERT3P(hash_lock, !=, NULL); } - if (zio->io_error == 0) { + if (no_zio_error) { /* byteswap if necessary */ if (BP_SHOULD_BYTESWAP(zio->io_bp)) { if (BP_GET_LEVEL(zio->io_bp) > 0) { @@ -5256,8 +5258,7 @@ arc_read_done(zio_t *zio) callback_list = hdr->b_l1hdr.b_acb; ASSERT3P(callback_list, !=, NULL); - if (hash_lock && zio->io_error == 0 && - hdr->b_l1hdr.b_state == arc_anon) { + if (hash_lock && no_zio_error && hdr->b_l1hdr.b_state == arc_anon) { /* * Only call arc_access on anonymous buffers. This is because * if we've issued an I/O for an evicted buffer, we've already @@ -5280,20 +5281,38 @@ arc_read_done(zio_t *zio) callback_cnt++; - if (zio->io_error != 0) - continue; - - int error = arc_buf_alloc_impl(hdr, acb->acb_private, - acb->acb_compressed, - B_TRUE, &acb->acb_buf); - if (error != 0) { - arc_buf_destroy(acb->acb_buf, acb->acb_private); - acb->acb_buf = NULL; + if (no_zio_error) { + int error = arc_buf_alloc_impl(hdr, acb->acb_private, + acb->acb_compressed, zio->io_error == 0, + &acb->acb_buf); + if (error != 0) { + /* + * Decompression failed. Set io_error + * so that when we call acb_done (below), + * we will indicate that the read failed. + * Note that in the unusual case where one + * callback is compressed and another + * uncompressed, we will mark all of them + * as failed, even though the uncompressed + * one can't actually fail. In this case, + * the hdr will not be anonymous, because + * if there are multiple callbacks, it's + * because multiple threads found the same + * arc buf in the hash table. + */ + zio->io_error = error; + } } - - if (zio->io_error == 0) - zio->io_error = error; } + /* + * If there are multiple callbacks, we must have the hash lock, + * because the only way for multiple threads to find this hdr is + * in the hash table. This ensures that if there are multiple + * callbacks, the hdr is not anonymous. If it were anonymous, + * we couldn't use arc_buf_destroy() in the error case below. + */ + ASSERT(callback_cnt < 2 || hash_lock != NULL); + hdr->b_l1hdr.b_acb = NULL; arc_hdr_clear_flags(hdr, ARC_FLAG_IO_IN_PROGRESS); if (callback_cnt == 0) { @@ -5305,7 +5324,7 @@ arc_read_done(zio_t *zio) ASSERT(refcount_is_zero(&hdr->b_l1hdr.b_refcnt) || callback_list != NULL); - if (zio->io_error == 0) { + if (no_zio_error) { arc_hdr_verify(hdr, zio->io_bp); } else { arc_hdr_set_flags(hdr, ARC_FLAG_IO_ERROR); @@ -5338,7 +5357,16 @@ arc_read_done(zio_t *zio) /* execute each callback and free its structure */ while ((acb = callback_list) != NULL) { - if (acb->acb_done) { + if (acb->acb_done != NULL) { + if (zio->io_error != 0 && acb->acb_buf != NULL) { + /* + * If arc_buf_alloc_impl() fails during + * decompression, the buf will still be + * allocated, and needs to be freed here. + */ + arc_buf_destroy(acb->acb_buf, acb->acb_private); + acb->acb_buf = NULL; + } acb->acb_done(zio, &zio->io_bookmark, zio->io_bp, acb->acb_buf, acb->acb_private); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:48:31 2018 (r339114) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2017 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -1000,8 +1000,15 @@ dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, ASSERT(refcount_count(&db->db_holds) > 0); ASSERT(db->db_buf == NULL); ASSERT(db->db.db_data == NULL); - if (db->db_level == 0 && db->db_freed_in_flight) { - /* we were freed in flight; disregard any error */ + if (buf == NULL) { + /* i/o error */ + ASSERT(zio == NULL || zio->io_error != 0); + ASSERT(db->db_blkid != DMU_BONUS_BLKID); + ASSERT3P(db->db_buf, ==, NULL); + db->db_state = DB_UNCACHED; + } else if (db->db_level == 0 && db->db_freed_in_flight) { + /* freed in flight */ + ASSERT(zio == NULL || zio->io_error == 0); if (buf == NULL) { buf = arc_alloc_buf(db->db_objset->os_spa, db, DBUF_GET_BUFC_TYPE(db), db->db.db_size); @@ -1012,13 +1019,11 @@ dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, db->db_freed_in_flight = FALSE; dbuf_set_data(db, buf); db->db_state = DB_CACHED; - } else if (buf != NULL) { + } else { + /* success */ + ASSERT(zio == NULL || zio->io_error == 0); dbuf_set_data(db, buf); db->db_state = DB_CACHED; - } else { - ASSERT(db->db_blkid != DMU_BONUS_BLKID); - ASSERT3P(db->db_buf, ==, NULL); - db->db_state = DB_UNCACHED; } cv_broadcast(&db->db_changed); dbuf_rele_and_unlock(db, NULL); @@ -2431,6 +2436,13 @@ dbuf_prefetch_indirect_done(zio_t *zio, const zbookmar ASSERT3S(dpa->dpa_zb.zb_level, <, dpa->dpa_curlevel); ASSERT3S(dpa->dpa_curlevel, >, 0); + + if (abuf == NULL) { + ASSERT(zio == NULL || zio->io_error != 0); + kmem_free(dpa, sizeof (*dpa)); + return; + } + ASSERT(zio == NULL || zio->io_error == 0); /* * The dpa_dnode is only valid if we are called with a NULL Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Wed Oct 3 02:48:31 2018 (r339114) @@ -25,7 +25,7 @@ */ /* * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. - * Copyright (c) 2013, 2016 by Delphix. All rights reserved. + * Copyright (c) 2013, 2018 by Delphix. All rights reserved. */ #include @@ -56,6 +56,12 @@ static zcomp_stats_t zcomp_stats = { kstat_t *zcomp_ksp; /* + * If nonzero, every 1/X decompression attempts will fail, simulating + * an undetected memory error. + */ +uint64_t zio_decompress_fail_fraction = 0; + +/* * Compression vectors. */ zio_compress_info_t zio_compress_table[ZIO_COMPRESS_FUNCTIONS] = { @@ -171,6 +177,16 @@ zio_decompress_data(enum zio_compress c, abd_t *src, v void *tmp = abd_borrow_buf_copy(src, s_len); int ret = zio_decompress_data_buf(c, tmp, dst, s_len, d_len); abd_return_buf(src, tmp, s_len); + + /* + * Decompression shouldn't fail, because we've already verifyied + * the checksum. However, for extra protection (e.g. against bitflips + * in non-ECC RAM), we handle this error (and test it). + */ + ASSERT0(ret); + if (zio_decompress_fail_fraction != 0 && + spa_get_random(zio_decompress_fail_fraction) == 0) + ret = SET_ERROR(EINVAL); return (ret); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:49:25 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73FAD10B57DF; Wed, 3 Oct 2018 02:49:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2A30388475; Wed, 3 Oct 2018 02:49:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 24F6110E7B; Wed, 3 Oct 2018 02:49:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932nP5T061594; Wed, 3 Oct 2018 02:49:25 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932nOlP061590; Wed, 3 Oct 2018 02:49:24 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030249.w932nOlP061590@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:49:24 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339115 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339115 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:49:25 -0000 Author: mav Date: Wed Oct 3 02:49:24 2018 New Revision: 339115 URL: https://svnweb.freebsd.org/changeset/base/339115 Log: MFC r337028: MFV r337027: 9328 zap code can take advantage of c99 9329 panic in zap_leaf_lookup() due to concurrent zapification illumos/illumos-gate@bf26014c5541b6119f34e0d95294b7f2eb105ac2 Reviewed by: Steve Gonczi Reviewed by: George Wilson Reviewed by: Pavel Zakharov Reviewed by: Brad Lewis Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 02:48:31 2018 (r339114) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 02:49:24 2018 (r339115) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013, 2015 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. * Copyright 2014 HybridCluster. All rights reserved. */ @@ -204,12 +204,18 @@ dmu_object_zapify(objset_t *mos, uint64_t object, dmu_ } ASSERT3U(dn->dn_type, ==, old_type); ASSERT0(dn->dn_maxblkid); + + /* + * We must initialize the ZAP data before changing the type, + * so that concurrent calls to *_is_zapified() can determine if + * the object has been completely zapified by checking the type. + */ + mzap_create_impl(mos, object, 0, 0, tx); + dn->dn_next_type[tx->tx_txg & TXG_MASK] = dn->dn_type = DMU_OTN_ZAP_METADATA; dnode_setdirty(dn, tx); dnode_rele(dn, FTAG); - - mzap_create_impl(mos, object, 0, 0, tx); spa_feature_incr(dmu_objset_spa(mos), SPA_FEATURE_EXTENSIBLE_DATASET, tx); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c Wed Oct 3 02:48:31 2018 (r339114) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c Wed Oct 3 02:49:24 2018 (r339115) @@ -58,10 +58,8 @@ static uint64_t zap_allocate_blocks(zap_t *zap, int nb void fzap_byteswap(void *vbuf, size_t size) { - uint64_t block_type; + uint64_t block_type = *(uint64_t *)vbuf; - block_type = *(uint64_t *)vbuf; - if (block_type == ZBT_LEAF || block_type == BSWAP_64(ZBT_LEAF)) zap_leaf_byteswap(vbuf, size); else { @@ -73,11 +71,6 @@ fzap_byteswap(void *vbuf, size_t size) void fzap_upgrade(zap_t *zap, dmu_tx_t *tx, zap_flags_t flags) { - dmu_buf_t *db; - zap_leaf_t *l; - int i; - zap_phys_t *zp; - ASSERT(RW_WRITE_HELD(&zap->zap_rwlock)); zap->zap_ismicro = FALSE; @@ -87,7 +80,7 @@ fzap_upgrade(zap_t *zap, dmu_tx_t *tx, zap_flags_t fla mutex_init(&zap->zap_f.zap_num_entries_mtx, 0, 0, 0); zap->zap_f.zap_block_shift = highbit64(zap->zap_dbuf->db_size) - 1; - zp = zap_f_phys(zap); + zap_phys_t *zp = zap_f_phys(zap); /* * explicitly zero it since it might be coming from an * initialized microzap @@ -106,17 +99,18 @@ fzap_upgrade(zap_t *zap, dmu_tx_t *tx, zap_flags_t fla zp->zap_flags = flags; /* block 1 will be the first leaf */ - for (i = 0; i < (1<zap_ptrtbl.zt_shift); i++) + for (int i = 0; i < (1<zap_ptrtbl.zt_shift); i++) ZAP_EMBEDDED_PTRTBL_ENT(zap, i) = 1; /* * set up block 1 - the first leaf */ - VERIFY(0 == dmu_buf_hold(zap->zap_objset, zap->zap_object, + dmu_buf_t *db; + VERIFY0(dmu_buf_hold(zap->zap_objset, zap->zap_object, 1<l_dbuf = db; zap_leaf_init(l, zp->zap_normflags != 0); @@ -146,9 +140,7 @@ zap_table_grow(zap_t *zap, zap_table_phys_t *tbl, void (*transfer_func)(const uint64_t *src, uint64_t *dst, int n), dmu_tx_t *tx) { - uint64_t b, newblk; - dmu_buf_t *db_old, *db_new; - int err; + uint64_t newblk; int bs = FZAP_BLOCK_SHIFT(zap); int hepb = 1<<(bs-4); /* hepb = half the number of entries in a block */ @@ -172,21 +164,23 @@ zap_table_grow(zap_t *zap, zap_table_phys_t *tbl, * Copy the ptrtbl from the old to new location. */ - b = tbl->zt_blks_copied; - err = dmu_buf_hold(zap->zap_objset, zap->zap_object, + uint64_t b = tbl->zt_blks_copied; + dmu_buf_t *db_old; + int err = dmu_buf_hold(zap->zap_objset, zap->zap_object, (tbl->zt_blk + b) << bs, FTAG, &db_old, DMU_READ_NO_PREFETCH); - if (err) + if (err != 0) return (err); /* first half of entries in old[b] go to new[2*b+0] */ - VERIFY(0 == dmu_buf_hold(zap->zap_objset, zap->zap_object, + dmu_buf_t *db_new; + VERIFY0(dmu_buf_hold(zap->zap_objset, zap->zap_object, (newblk + 2*b+0) << bs, FTAG, &db_new, DMU_READ_NO_PREFETCH)); dmu_buf_will_dirty(db_new, tx); transfer_func(db_old->db_data, db_new->db_data, hepb); dmu_buf_rele(db_new, FTAG); /* second half of entries in old[b] go to new[2*b+1] */ - VERIFY(0 == dmu_buf_hold(zap->zap_objset, zap->zap_object, + VERIFY0(dmu_buf_hold(zap->zap_objset, zap->zap_object, (newblk + 2*b+1) << bs, FTAG, &db_new, DMU_READ_NO_PREFETCH)); dmu_buf_will_dirty(db_new, tx); transfer_func((uint64_t *)db_old->db_data + hepb, @@ -221,22 +215,20 @@ static int zap_table_store(zap_t *zap, zap_table_phys_t *tbl, uint64_t idx, uint64_t val, dmu_tx_t *tx) { - int err; - uint64_t blk, off; int bs = FZAP_BLOCK_SHIFT(zap); - dmu_buf_t *db; ASSERT(RW_LOCK_HELD(&zap->zap_rwlock)); ASSERT(tbl->zt_blk != 0); dprintf("storing %llx at index %llx\n", val, idx); - blk = idx >> (bs-3); - off = idx & ((1<<(bs-3))-1); + uint64_t blk = idx >> (bs-3); + uint64_t off = idx & ((1<<(bs-3))-1); - err = dmu_buf_hold(zap->zap_objset, zap->zap_object, + dmu_buf_t *db; + int err = dmu_buf_hold(zap->zap_objset, zap->zap_object, (tbl->zt_blk + blk) << bs, FTAG, &db, DMU_READ_NO_PREFETCH); - if (err) + if (err != 0) return (err); dmu_buf_will_dirty(db, tx); @@ -249,7 +241,7 @@ zap_table_store(zap_t *zap, zap_table_phys_t *tbl, uin err = dmu_buf_hold(zap->zap_objset, zap->zap_object, (tbl->zt_nextblk + blk2) << bs, FTAG, &db2, DMU_READ_NO_PREFETCH); - if (err) { + if (err != 0) { dmu_buf_rele(db, FTAG); return (err); } @@ -268,27 +260,24 @@ zap_table_store(zap_t *zap, zap_table_phys_t *tbl, uin static int zap_table_load(zap_t *zap, zap_table_phys_t *tbl, uint64_t idx, uint64_t *valp) { - uint64_t blk, off; - int err; - dmu_buf_t *db; - dnode_t *dn; int bs = FZAP_BLOCK_SHIFT(zap); ASSERT(RW_LOCK_HELD(&zap->zap_rwlock)); - blk = idx >> (bs-3); - off = idx & ((1<<(bs-3))-1); + uint64_t blk = idx >> (bs-3); + uint64_t off = idx & ((1<<(bs-3))-1); /* * Note: this is equivalent to dmu_buf_hold(), but we use * _dnode_enter / _by_dnode because it's faster because we don't * have to hold the dnode. */ - dn = dmu_buf_dnode_enter(zap->zap_dbuf); - err = dmu_buf_hold_by_dnode(dn, + dnode_t *dn = dmu_buf_dnode_enter(zap->zap_dbuf); + dmu_buf_t *db; + int err = dmu_buf_hold_by_dnode(dn, (tbl->zt_blk + blk) << bs, FTAG, &db, DMU_READ_NO_PREFETCH); dmu_buf_dnode_exit(zap->zap_dbuf); - if (err) + if (err != 0) return (err); *valp = ((uint64_t *)db->db_data)[off]; dmu_buf_rele(db, FTAG); @@ -319,11 +308,10 @@ zap_table_load(zap_t *zap, zap_table_phys_t *tbl, uint static void zap_ptrtbl_transfer(const uint64_t *src, uint64_t *dst, int n) { - int i; - for (i = 0; i < n; i++) { + for (int i = 0; i < n; i++) { uint64_t lb = src[i]; - dst[2*i+0] = lb; - dst[2*i+1] = lb; + dst[2 * i + 0] = lb; + dst[2 * i + 1] = lb; } } @@ -345,19 +333,16 @@ zap_grow_ptrtbl(zap_t *zap, dmu_tx_t *tx) * stored in the header block). Give it its own entire * block, which will double the size of the ptrtbl. */ - uint64_t newblk; - dmu_buf_t *db_new; - int err; - ASSERT3U(zap_f_phys(zap)->zap_ptrtbl.zt_shift, ==, ZAP_EMBEDDED_PTRTBL_SHIFT(zap)); ASSERT0(zap_f_phys(zap)->zap_ptrtbl.zt_blk); - newblk = zap_allocate_blocks(zap, 1); - err = dmu_buf_hold(zap->zap_objset, zap->zap_object, + uint64_t newblk = zap_allocate_blocks(zap, 1); + dmu_buf_t *db_new; + int err = dmu_buf_hold(zap->zap_objset, zap->zap_object, newblk << FZAP_BLOCK_SHIFT(zap), FTAG, &db_new, DMU_READ_NO_PREFETCH); - if (err) + if (err != 0) return (err); dmu_buf_will_dirty(db_new, tx); zap_ptrtbl_transfer(&ZAP_EMBEDDED_PTRTBL_ENT(zap, 0), @@ -392,9 +377,8 @@ zap_increment_num_entries(zap_t *zap, int delta, dmu_t static uint64_t zap_allocate_blocks(zap_t *zap, int nblocks) { - uint64_t newblk; ASSERT(RW_WRITE_HELD(&zap->zap_rwlock)); - newblk = zap_f_phys(zap)->zap_freeblk; + uint64_t newblk = zap_f_phys(zap)->zap_freeblk; zap_f_phys(zap)->zap_freeblk += nblocks; return (newblk); } @@ -411,7 +395,6 @@ zap_leaf_evict_sync(void *dbu) static zap_leaf_t * zap_create_leaf(zap_t *zap, dmu_tx_t *tx) { - void *winner; zap_leaf_t *l = kmem_zalloc(sizeof (zap_leaf_t), KM_SLEEP); ASSERT(RW_WRITE_HELD(&zap->zap_rwlock)); @@ -421,12 +404,11 @@ zap_create_leaf(zap_t *zap, dmu_tx_t *tx) l->l_blkid = zap_allocate_blocks(zap, 1); l->l_dbuf = NULL; - VERIFY(0 == dmu_buf_hold(zap->zap_objset, zap->zap_object, + VERIFY0(dmu_buf_hold(zap->zap_objset, zap->zap_object, l->l_blkid << FZAP_BLOCK_SHIFT(zap), NULL, &l->l_dbuf, DMU_READ_NO_PREFETCH)); dmu_buf_init_user(&l->l_dbu, zap_leaf_evict_sync, NULL, &l->l_dbuf); - winner = dmu_buf_set_user(l->l_dbuf, &l->l_dbu); - ASSERT(winner == NULL); + VERIFY3P(NULL, ==, dmu_buf_set_user(l->l_dbuf, &l->l_dbu)); dmu_buf_will_dirty(l->l_dbuf, tx); zap_leaf_init(l, zap->zap_normflags != 0); @@ -460,11 +442,9 @@ zap_put_leaf(zap_leaf_t *l) static zap_leaf_t * zap_open_leaf(uint64_t blkid, dmu_buf_t *db) { - zap_leaf_t *l, *winner; - ASSERT(blkid != 0); - l = kmem_zalloc(sizeof (zap_leaf_t), KM_SLEEP); + zap_leaf_t *l = kmem_zalloc(sizeof (zap_leaf_t), KM_SLEEP); rw_init(&l->l_rwlock, 0, 0, 0); rw_enter(&l->l_rwlock, RW_WRITER); l->l_blkid = blkid; @@ -472,7 +452,7 @@ zap_open_leaf(uint64_t blkid, dmu_buf_t *db) l->l_dbuf = db; dmu_buf_init_user(&l->l_dbu, zap_leaf_evict_sync, NULL, &l->l_dbuf); - winner = dmu_buf_set_user(db, &l->l_dbu); + zap_leaf_t *winner = dmu_buf_set_user(db, &l->l_dbu); rw_exit(&l->l_rwlock); if (winner != NULL) { @@ -510,17 +490,15 @@ zap_get_leaf_byblk(zap_t *zap, uint64_t blkid, dmu_tx_ zap_leaf_t **lp) { dmu_buf_t *db; - zap_leaf_t *l; - int bs = FZAP_BLOCK_SHIFT(zap); - int err; ASSERT(RW_LOCK_HELD(&zap->zap_rwlock)); + int bs = FZAP_BLOCK_SHIFT(zap); dnode_t *dn = dmu_buf_dnode_enter(zap->zap_dbuf); - err = dmu_buf_hold_by_dnode(dn, + int err = dmu_buf_hold_by_dnode(dn, blkid << bs, NULL, &db, DMU_READ_NO_PREFETCH); dmu_buf_dnode_exit(zap->zap_dbuf); - if (err) + if (err != 0) return (err); ASSERT3U(db->db_object, ==, zap->zap_object); @@ -528,7 +506,7 @@ zap_get_leaf_byblk(zap_t *zap, uint64_t blkid, dmu_tx_ ASSERT3U(db->db_size, ==, 1 << bs); ASSERT(blkid != 0); - l = dmu_buf_get_user(db); + zap_leaf_t *l = dmu_buf_get_user(db); if (l == NULL) l = zap_open_leaf(blkid, db); @@ -583,8 +561,7 @@ zap_set_idx_to_blk(zap_t *zap, uint64_t idx, uint64_t static int zap_deref_leaf(zap_t *zap, uint64_t h, dmu_tx_t *tx, krw_t lt, zap_leaf_t **lp) { - uint64_t idx, blk; - int err; + uint64_t blk; ASSERT(zap->zap_dbuf == NULL || zap_f_phys(zap) == zap->zap_dbuf->db_data); @@ -596,8 +573,8 @@ zap_deref_leaf(zap_t *zap, uint64_t h, dmu_tx_t *tx, k return (SET_ERROR(EIO)); } - idx = ZAP_HASH_IDX(h, zap_f_phys(zap)->zap_ptrtbl.zt_shift); - err = zap_idx_to_blk(zap, idx, &blk); + uint64_t idx = ZAP_HASH_IDX(h, zap_f_phys(zap)->zap_ptrtbl.zt_shift); + int err = zap_idx_to_blk(zap, idx, &blk); if (err != 0) return (err); err = zap_get_leaf_byblk(zap, blk, tx, lt, lp); @@ -614,9 +591,7 @@ zap_expand_leaf(zap_name_t *zn, zap_leaf_t *l, { zap_t *zap = zn->zn_zap; uint64_t hash = zn->zn_hash; - zap_leaf_t *nl; - int prefix_diff, i, err; - uint64_t sibling; + int err; int old_prefix_len = zap_leaf_phys(l)->l_hdr.lh_prefix_len; ASSERT3U(old_prefix_len, <=, zap_f_phys(zap)->zap_ptrtbl.zt_shift); @@ -636,19 +611,19 @@ zap_expand_leaf(zap_name_t *zn, zap_leaf_t *l, err = zap_lockdir(os, object, tx, RW_WRITER, FALSE, FALSE, tag, &zn->zn_zap); zap = zn->zn_zap; - if (err) + if (err != 0) return (err); ASSERT(!zap->zap_ismicro); while (old_prefix_len == zap_f_phys(zap)->zap_ptrtbl.zt_shift) { err = zap_grow_ptrtbl(zap, tx); - if (err) + if (err != 0) return (err); } err = zap_deref_leaf(zap, hash, tx, RW_WRITER, &l); - if (err) + if (err != 0) return (err); if (zap_leaf_phys(l)->l_hdr.lh_prefix_len != old_prefix_len) { @@ -662,25 +637,26 @@ zap_expand_leaf(zap_name_t *zn, zap_leaf_t *l, ASSERT3U(ZAP_HASH_IDX(hash, old_prefix_len), ==, zap_leaf_phys(l)->l_hdr.lh_prefix); - prefix_diff = zap_f_phys(zap)->zap_ptrtbl.zt_shift - + int prefix_diff = zap_f_phys(zap)->zap_ptrtbl.zt_shift - (old_prefix_len + 1); - sibling = (ZAP_HASH_IDX(hash, old_prefix_len + 1) | 1) << prefix_diff; + uint64_t sibling = + (ZAP_HASH_IDX(hash, old_prefix_len + 1) | 1) << prefix_diff; /* check for i/o errors before doing zap_leaf_split */ - for (i = 0; i < (1ULL<l_blkid); } - nl = zap_create_leaf(zap, tx); + zap_leaf_t *nl = zap_create_leaf(zap, tx); zap_leaf_split(l, nl, zap->zap_normflags != 0); /* set sibling pointers */ - for (i = 0; i < (1ULL << prefix_diff); i++) { - err = zap_set_idx_to_blk(zap, sibling+i, nl->l_blkid, tx); + for (int i = 0; i < (1ULL << prefix_diff); i++) { + err = zap_set_idx_to_blk(zap, sibling + i, nl->l_blkid, tx); ASSERT0(err); /* we checked for i/o errors above */ } @@ -708,8 +684,6 @@ zap_put_leaf_maybe_grow_ptrtbl(zap_name_t *zn, zap_lea zap_put_leaf(l); if (leaffull || zap_f_phys(zap)->zap_ptrtbl.zt_nextblk) { - int err; - /* * We are in the middle of growing the pointer table, or * this leaf will soon make us grow it. @@ -719,10 +693,10 @@ zap_put_leaf_maybe_grow_ptrtbl(zap_name_t *zn, zap_lea uint64_t zapobj = zap->zap_object; zap_unlockdir(zap, tag); - err = zap_lockdir(os, zapobj, tx, + int err = zap_lockdir(os, zapobj, tx, RW_WRITER, FALSE, FALSE, tag, &zn->zn_zap); zap = zn->zn_zap; - if (err) + if (err != 0) return; } @@ -763,9 +737,8 @@ fzap_checksize(uint64_t integer_size, uint64_t num_int static int fzap_check(zap_name_t *zn, uint64_t integer_size, uint64_t num_integers) { - int err; - - if ((err = fzap_checkname(zn)) != 0) + int err = fzap_checkname(zn); + if (err != 0) return (err); return (fzap_checksize(integer_size, num_integers)); } @@ -779,10 +752,10 @@ fzap_lookup(zap_name_t *zn, char *realname, int rn_len, boolean_t *ncp) { zap_leaf_t *l; - int err; zap_entry_handle_t zeh; - if ((err = fzap_checkname(zn)) != 0) + int err = fzap_checkname(zn); + if (err != 0) return (err); err = zap_deref_leaf(zn->zn_zap, zn->zn_hash, NULL, RW_READER, &l); @@ -870,7 +843,8 @@ fzap_update(zap_name_t *zn, void *tag, dmu_tx_t *tx) { zap_leaf_t *l; - int err, create; + int err; + boolean_t create; zap_entry_handle_t zeh; zap_t *zap = zn->zn_zap; @@ -923,9 +897,9 @@ fzap_length(zap_name_t *zn, if (err != 0) goto out; - if (integer_size) + if (integer_size != 0) *integer_size = zeh.zeh_integer_size; - if (num_integers) + if (num_integers != 0) *num_integers = zeh.zeh_num_integers; out: zap_put_leaf(l); @@ -954,15 +928,14 @@ fzap_remove(zap_name_t *zn, dmu_tx_t *tx) void fzap_prefetch(zap_name_t *zn) { - uint64_t idx, blk; + uint64_t blk; zap_t *zap = zn->zn_zap; - int bs; - idx = ZAP_HASH_IDX(zn->zn_hash, + uint64_t idx = ZAP_HASH_IDX(zn->zn_hash, zap_f_phys(zap)->zap_ptrtbl.zt_shift); if (zap_idx_to_blk(zap, idx, &blk) != 0) return; - bs = FZAP_BLOCK_SHIFT(zap); + int bs = FZAP_BLOCK_SHIFT(zap); dmu_prefetch(zap->zap_objset, zap->zap_object, 0, blk << bs, 1 << bs, ZIO_PRIORITY_SYNC_READ); } @@ -975,9 +948,8 @@ uint64_t zap_create_link(objset_t *os, dmu_object_type_t ot, uint64_t parent_obj, const char *name, dmu_tx_t *tx) { - uint64_t new_obj; - - VERIFY((new_obj = zap_create(os, ot, DMU_OT_NONE, 0, tx)) > 0); + uint64_t new_obj = zap_create(os, ot, DMU_OT_NONE, 0, tx); + VERIFY(new_obj != 0); VERIFY0(zap_add(os, parent_obj, name, sizeof (uint64_t), 1, &new_obj, tx)); @@ -989,13 +961,12 @@ zap_value_search(objset_t *os, uint64_t zapobj, uint64 char *name) { zap_cursor_t zc; - zap_attribute_t *za; int err; if (mask == 0) mask = -1ULL; - za = kmem_alloc(sizeof (zap_attribute_t), KM_SLEEP); + zap_attribute_t *za = kmem_alloc(sizeof (*za), KM_SLEEP); for (zap_cursor_init(&zc, os, zapobj); (err = zap_cursor_retrieve(&zc, za)) == 0; zap_cursor_advance(&zc)) { @@ -1005,7 +976,7 @@ zap_value_search(objset_t *os, uint64_t zapobj, uint64 } } zap_cursor_fini(&zc); - kmem_free(za, sizeof (zap_attribute_t)); + kmem_free(za, sizeof (*za)); return (err); } @@ -1013,23 +984,23 @@ int zap_join(objset_t *os, uint64_t fromobj, uint64_t intoobj, dmu_tx_t *tx) { zap_cursor_t zc; - zap_attribute_t za; - int err; + int err = 0; - err = 0; + zap_attribute_t *za = kmem_alloc(sizeof (*za), KM_SLEEP); for (zap_cursor_init(&zc, os, fromobj); - zap_cursor_retrieve(&zc, &za) == 0; + zap_cursor_retrieve(&zc, za) == 0; (void) zap_cursor_advance(&zc)) { - if (za.za_integer_length != 8 || za.za_num_integers != 1) { + if (za->za_integer_length != 8 || za->za_num_integers != 1) { err = SET_ERROR(EINVAL); break; } - err = zap_add(os, intoobj, za.za_name, - 8, 1, &za.za_first_integer, tx); - if (err) + err = zap_add(os, intoobj, za->za_name, + 8, 1, &za->za_first_integer, tx); + if (err != 0) break; } zap_cursor_fini(&zc); + kmem_free(za, sizeof (*za)); return (err); } @@ -1038,23 +1009,23 @@ zap_join_key(objset_t *os, uint64_t fromobj, uint64_t uint64_t value, dmu_tx_t *tx) { zap_cursor_t zc; - zap_attribute_t za; - int err; + int err = 0; - err = 0; + zap_attribute_t *za = kmem_alloc(sizeof (*za), KM_SLEEP); for (zap_cursor_init(&zc, os, fromobj); - zap_cursor_retrieve(&zc, &za) == 0; + zap_cursor_retrieve(&zc, za) == 0; (void) zap_cursor_advance(&zc)) { - if (za.za_integer_length != 8 || za.za_num_integers != 1) { + if (za->za_integer_length != 8 || za->za_num_integers != 1) { err = SET_ERROR(EINVAL); break; } - err = zap_add(os, intoobj, za.za_name, + err = zap_add(os, intoobj, za->za_name, 8, 1, &value, tx); if (err != 0) break; } zap_cursor_fini(&zc); + kmem_free(za, sizeof (*za)); return (err); } @@ -1063,29 +1034,29 @@ zap_join_increment(objset_t *os, uint64_t fromobj, uin dmu_tx_t *tx) { zap_cursor_t zc; - zap_attribute_t za; - int err; + int err = 0; - err = 0; + zap_attribute_t *za = kmem_alloc(sizeof (*za), KM_SLEEP); for (zap_cursor_init(&zc, os, fromobj); - zap_cursor_retrieve(&zc, &za) == 0; + zap_cursor_retrieve(&zc, za) == 0; (void) zap_cursor_advance(&zc)) { uint64_t delta = 0; - if (za.za_integer_length != 8 || za.za_num_integers != 1) { + if (za->za_integer_length != 8 || za->za_num_integers != 1) { err = SET_ERROR(EINVAL); break; } - err = zap_lookup(os, intoobj, za.za_name, 8, 1, &delta); + err = zap_lookup(os, intoobj, za->za_name, 8, 1, &delta); if (err != 0 && err != ENOENT) break; - delta += za.za_first_integer; - err = zap_update(os, intoobj, za.za_name, 8, 1, &delta, tx); - if (err) + delta += za->za_first_integer; + err = zap_update(os, intoobj, za->za_name, 8, 1, &delta, tx); + if (err != 0) break; } zap_cursor_fini(&zc); + kmem_free(za, sizeof (*za)); return (err); } @@ -1150,12 +1121,11 @@ zap_increment(objset_t *os, uint64_t obj, const char * dmu_tx_t *tx) { uint64_t value = 0; - int err; if (delta == 0) return (0); - err = zap_lookup(os, obj, name, 8, 1, &value); + int err = zap_lookup(os, obj, name, 8, 1, &value); if (err != 0 && err != ENOENT) return (err); value += delta; @@ -1253,7 +1223,6 @@ again: static void zap_stats_ptrtbl(zap_t *zap, uint64_t *tbl, int len, zap_stats_t *zs) { - int i, err; uint64_t lastblk = 0; /* @@ -1261,14 +1230,14 @@ zap_stats_ptrtbl(zap_t *zap, uint64_t *tbl, int len, z * can hold, then it'll be accounted for more than once, since * we won't have lastblk. */ - for (i = 0; i < len; i++) { + for (int i = 0; i < len; i++) { zap_leaf_t *l; if (tbl[i] == lastblk) continue; lastblk = tbl[i]; - err = zap_get_leaf_byblk(zap, tbl[i], NULL, RW_READER, &l); + int err = zap_get_leaf_byblk(zap, tbl[i], NULL, RW_READER, &l); if (err == 0) { zap_leaf_stats(zap, l, zs); zap_put_leaf(l); @@ -1333,14 +1302,12 @@ fzap_get_stats(zap_t *zap, zap_stats_t *zs) zap_stats_ptrtbl(zap, &ZAP_EMBEDDED_PTRTBL_ENT(zap, 0), 1 << ZAP_EMBEDDED_PTRTBL_SHIFT(zap), zs); } else { - int b; - dmu_prefetch(zap->zap_objset, zap->zap_object, 0, zap_f_phys(zap)->zap_ptrtbl.zt_blk << bs, zap_f_phys(zap)->zap_ptrtbl.zt_numblks << bs, ZIO_PRIORITY_SYNC_READ); - for (b = 0; b < zap_f_phys(zap)->zap_ptrtbl.zt_numblks; + for (int b = 0; b < zap_f_phys(zap)->zap_ptrtbl.zt_numblks; b++) { dmu_buf_t *db; int err; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c Wed Oct 3 02:48:31 2018 (r339114) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c Wed Oct 3 02:49:24 2018 (r339115) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013, 2015 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. * Copyright 2017 Nexenta Systems, Inc. */ @@ -107,7 +107,6 @@ ldv(int len, const void *addr) void zap_leaf_byteswap(zap_leaf_phys_t *buf, int size) { - int i; zap_leaf_t l; dmu_buf_t l_dbuf; @@ -123,10 +122,10 @@ zap_leaf_byteswap(zap_leaf_phys_t *buf, int size) buf->l_hdr.lh_prefix_len = BSWAP_16(buf->l_hdr.lh_prefix_len); buf->l_hdr.lh_freelist = BSWAP_16(buf->l_hdr.lh_freelist); - for (i = 0; i < ZAP_LEAF_HASH_NUMENTRIES(&l); i++) + for (int i = 0; i < ZAP_LEAF_HASH_NUMENTRIES(&l); i++) buf->l_hash[i] = BSWAP_16(buf->l_hash[i]); - for (i = 0; i < ZAP_LEAF_NUMCHUNKS(&l); i++) { + for (int i = 0; i < ZAP_LEAF_NUMCHUNKS(&l); i++) { zap_leaf_chunk_t *lc = &ZAP_LEAF_CHUNK(&l, i); struct zap_leaf_entry *le; @@ -162,14 +161,12 @@ zap_leaf_byteswap(zap_leaf_phys_t *buf, int size) void zap_leaf_init(zap_leaf_t *l, boolean_t sort) { - int i; - l->l_bs = highbit64(l->l_dbuf->db_size) - 1; zap_memset(&zap_leaf_phys(l)->l_hdr, 0, sizeof (struct zap_leaf_header)); zap_memset(zap_leaf_phys(l)->l_hash, CHAIN_END, 2*ZAP_LEAF_HASH_NUMENTRIES(l)); - for (i = 0; i < ZAP_LEAF_NUMCHUNKS(l); i++) { + for (int i = 0; i < ZAP_LEAF_NUMCHUNKS(l); i++) { ZAP_LEAF_CHUNK(l, i).l_free.lf_type = ZAP_CHUNK_FREE; ZAP_LEAF_CHUNK(l, i).l_free.lf_next = i+1; } @@ -188,11 +185,9 @@ zap_leaf_init(zap_leaf_t *l, boolean_t sort) static uint16_t zap_leaf_chunk_alloc(zap_leaf_t *l) { - int chunk; - ASSERT(zap_leaf_phys(l)->l_hdr.lh_nfree > 0); - chunk = zap_leaf_phys(l)->l_hdr.lh_freelist; + int chunk = zap_leaf_phys(l)->l_hdr.lh_freelist; ASSERT3U(chunk, <, ZAP_LEAF_NUMCHUNKS(l)); ASSERT3U(ZAP_LEAF_CHUNK(l, chunk).l_free.lf_type, ==, ZAP_CHUNK_FREE); @@ -232,7 +227,7 @@ zap_leaf_array_create(zap_leaf_t *l, const char *buf, uint16_t *chunkp = &chunk_head; int byten = 0; uint64_t value = 0; - int shift = (integer_size-1)*8; + int shift = (integer_size - 1) * 8; int len = num_integers; ASSERT3U(num_integers * integer_size, <, MAX_ARRAY_BYTES); @@ -240,10 +235,9 @@ zap_leaf_array_create(zap_leaf_t *l, const char *buf, while (len > 0) { uint16_t chunk = zap_leaf_chunk_alloc(l); struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, chunk).l_array; - int i; la->la_type = ZAP_CHUNK_ARRAY; - for (i = 0; i < ZAP_LEAF_ARRAY_BYTES; i++) { + for (int i = 0; i < ZAP_LEAF_ARRAY_BYTES; i++) { if (byten == 0) value = ldv(integer_size, buf); la->la_array[i] = value >> shift; @@ -321,10 +315,9 @@ zap_leaf_array_read(zap_leaf_t *l, uint16_t chunk, while (len > 0) { struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, chunk).l_array; - int i; ASSERT3U(chunk, <, ZAP_LEAF_NUMCHUNKS(l)); - for (i = 0; i < ZAP_LEAF_ARRAY_BYTES && len > 0; i++) { + for (int i = 0; i < ZAP_LEAF_ARRAY_BYTES && len > 0; i++) { value = (value << 8) | la->la_array[i]; byten++; if (byten == array_int_len) { @@ -347,16 +340,13 @@ zap_leaf_array_match(zap_leaf_t *l, zap_name_t *zn, int bseen = 0; if (zap_getflags(zn->zn_zap) & ZAP_FLAG_UINT64_KEY) { - uint64_t *thiskey; - boolean_t match; - + uint64_t *thiskey = + kmem_alloc(array_numints * sizeof (*thiskey), KM_SLEEP); ASSERT(zn->zn_key_intlen == sizeof (*thiskey)); - thiskey = kmem_alloc(array_numints * sizeof (*thiskey), - KM_SLEEP); zap_leaf_array_read(l, chunk, sizeof (*thiskey), array_numints, sizeof (*thiskey), array_numints, thiskey); - match = bcmp(thiskey, zn->zn_key_orig, + boolean_t match = bcmp(thiskey, zn->zn_key_orig, array_numints * sizeof (*thiskey)) == 0; kmem_free(thiskey, array_numints * sizeof (*thiskey)); return (match); @@ -365,11 +355,10 @@ zap_leaf_array_match(zap_leaf_t *l, zap_name_t *zn, ASSERT(zn->zn_key_intlen == 1); if (zn->zn_matchtype & MT_NORMALIZE) { char *thisname = kmem_alloc(array_numints, KM_SLEEP); - boolean_t match; zap_leaf_array_read(l, chunk, sizeof (char), array_numints, sizeof (char), array_numints, thisname); - match = zap_match(zn, thisname); + boolean_t match = zap_match(zn, thisname); kmem_free(thisname, array_numints); return (match); } @@ -400,12 +389,11 @@ zap_leaf_array_match(zap_leaf_t *l, zap_name_t *zn, int zap_leaf_lookup(zap_leaf_t *l, zap_name_t *zn, zap_entry_handle_t *zeh) { - uint16_t *chunkp; struct zap_leaf_entry *le; ASSERT3U(zap_leaf_phys(l)->l_hdr.lh_magic, ==, ZAP_LEAF_MAGIC); - for (chunkp = LEAF_HASH_ENTPTR(l, zn->zn_hash); + for (uint16_t *chunkp = LEAF_HASH_ENTPTR(l, zn->zn_hash); *chunkp != CHAIN_END; chunkp = &le->le_next) { uint16_t chunk = *chunkp; le = ZAP_LEAF_ENTRY(l, chunk); @@ -446,17 +434,15 @@ int zap_leaf_lookup_closest(zap_leaf_t *l, uint64_t h, uint32_t cd, zap_entry_handle_t *zeh) { - uint16_t chunk; uint64_t besth = -1ULL; uint32_t bestcd = -1U; uint16_t bestlh = ZAP_LEAF_HASH_NUMENTRIES(l)-1; - uint16_t lh; struct zap_leaf_entry *le; ASSERT3U(zap_leaf_phys(l)->l_hdr.lh_magic, ==, ZAP_LEAF_MAGIC); - for (lh = LEAF_HASH(l, h); lh <= bestlh; lh++) { - for (chunk = zap_leaf_phys(l)->l_hash[lh]; + for (uint16_t lh = LEAF_HASH(l, h); lh <= bestlh; lh++) { + for (uint16_t chunk = zap_leaf_phys(l)->l_hash[lh]; chunk != CHAIN_END; chunk = le->le_next) { le = ZAP_LEAF_ENTRY(l, chunk); @@ -529,11 +515,10 @@ int zap_entry_update(zap_entry_handle_t *zeh, uint8_t integer_size, uint64_t num_integers, const void *buf) { - int delta_chunks; zap_leaf_t *l = zeh->zeh_leaf; struct zap_leaf_entry *le = ZAP_LEAF_ENTRY(l, *zeh->zeh_chunkp); - delta_chunks = ZAP_LEAF_ARRAY_NCHUNKS(num_integers * integer_size) - + int delta_chunks = ZAP_LEAF_ARRAY_NCHUNKS(num_integers * integer_size) - ZAP_LEAF_ARRAY_NCHUNKS(le->le_value_numints * le->le_value_intlen); if ((int)zap_leaf_phys(l)->l_hdr.lh_nfree < delta_chunks) @@ -550,14 +535,12 @@ zap_entry_update(zap_entry_handle_t *zeh, void zap_entry_remove(zap_entry_handle_t *zeh) { - uint16_t entry_chunk; - struct zap_leaf_entry *le; zap_leaf_t *l = zeh->zeh_leaf; ASSERT3P(zeh->zeh_chunkp, !=, &zeh->zeh_fakechunk); - entry_chunk = *zeh->zeh_chunkp; - le = ZAP_LEAF_ENTRY(l, entry_chunk); + uint16_t entry_chunk = *zeh->zeh_chunkp; + struct zap_leaf_entry *le = ZAP_LEAF_ENTRY(l, entry_chunk); ASSERT3U(le->le_type, ==, ZAP_CHUNK_ENTRY); zap_leaf_array_free(l, &le->le_name_chunk); @@ -575,15 +558,12 @@ zap_entry_create(zap_leaf_t *l, zap_name_t *zn, uint32 zap_entry_handle_t *zeh) { uint16_t chunk; - uint16_t *chunkp; struct zap_leaf_entry *le; - uint64_t valuelen; - int numchunks; uint64_t h = zn->zn_hash; - valuelen = integer_size * num_integers; + uint64_t valuelen = integer_size * num_integers; - numchunks = 1 + ZAP_LEAF_ARRAY_NCHUNKS(zn->zn_key_orig_numints * + int numchunks = 1 + ZAP_LEAF_ARRAY_NCHUNKS(zn->zn_key_orig_numints * zn->zn_key_intlen) + ZAP_LEAF_ARRAY_NCHUNKS(valuelen); if (numchunks > ZAP_LEAF_NUMCHUNKS(l)) return (E2BIG); @@ -645,7 +625,7 @@ zap_entry_create(zap_leaf_t *l, zap_name_t *zn, uint32 /* link it into the hash chain */ /* XXX if we did the search above, we could just use that */ - chunkp = zap_leaf_rehash_entry(l, chunk); + uint16_t *chunkp = zap_leaf_rehash_entry(l, chunk); zap_leaf_phys(l)->l_hdr.lh_nentries++; @@ -673,14 +653,13 @@ boolean_t zap_entry_normalization_conflict(zap_entry_handle_t *zeh, zap_name_t *zn, const char *name, zap_t *zap) { - uint64_t chunk; struct zap_leaf_entry *le; boolean_t allocdzn = B_FALSE; if (zap->zap_normflags == 0) return (B_FALSE); - for (chunk = *LEAF_HASH_ENTPTR(zeh->zeh_leaf, zeh->zeh_hash); + for (uint16_t chunk = *LEAF_HASH_ENTPTR(zeh->zeh_leaf, zeh->zeh_hash); chunk != CHAIN_END; chunk = le->le_next) { le = ZAP_LEAF_ENTRY(zeh->zeh_leaf, chunk); if (le->le_hash != zeh->zeh_hash) @@ -763,14 +742,11 @@ zap_leaf_transfer_array(zap_leaf_t *l, uint16_t chunk, static void zap_leaf_transfer_entry(zap_leaf_t *l, int entry, zap_leaf_t *nl) { - struct zap_leaf_entry *le, *nle; - uint16_t chunk; - - le = ZAP_LEAF_ENTRY(l, entry); + struct zap_leaf_entry *le = ZAP_LEAF_ENTRY(l, entry); ASSERT3U(le->le_type, ==, ZAP_CHUNK_ENTRY); - chunk = zap_leaf_chunk_alloc(nl); - nle = ZAP_LEAF_ENTRY(nl, chunk); + uint16_t chunk = zap_leaf_chunk_alloc(nl); + struct zap_leaf_entry *nle = ZAP_LEAF_ENTRY(nl, chunk); *nle = *le; /* structure assignment */ (void) zap_leaf_rehash_entry(nl, chunk); @@ -791,7 +767,6 @@ zap_leaf_transfer_entry(zap_leaf_t *l, int entry, zap_ void zap_leaf_split(zap_leaf_t *l, zap_leaf_t *nl, boolean_t sort) { - int i; int bit = 64 - 1 - zap_leaf_phys(l)->l_hdr.lh_prefix_len; /* set new prefix and prefix_len */ @@ -818,7 +793,7 @@ zap_leaf_split(zap_leaf_t *l, zap_leaf_t *nl, boolean_ * but this accesses memory more sequentially, and when we're * called, the block is usually pretty full. */ - for (i = 0; i < ZAP_LEAF_NUMCHUNKS(l); i++) { + for (int i = 0; i < ZAP_LEAF_NUMCHUNKS(l); i++) { struct zap_leaf_entry *le = ZAP_LEAF_ENTRY(l, i); if (le->le_type != ZAP_CHUNK_ENTRY) continue; @@ -833,9 +808,7 @@ zap_leaf_split(zap_leaf_t *l, zap_leaf_t *nl, boolean_ void zap_leaf_stats(zap_t *zap, zap_leaf_t *l, zap_stats_t *zs) { - int i, n; - - n = zap_f_phys(zap)->zap_ptrtbl.zt_shift - + int n = zap_f_phys(zap)->zap_ptrtbl.zt_shift - zap_leaf_phys(l)->l_hdr.lh_prefix_len; n = MIN(n, ZAP_HISTOGRAM_SIZE-1); zs->zs_leafs_with_2n_pointers[n]++; @@ -851,7 +824,7 @@ zap_leaf_stats(zap_t *zap, zap_leaf_t *l, zap_stats_t n = MIN(n, ZAP_HISTOGRAM_SIZE-1); zs->zs_blocks_n_tenths_full[n]++; - for (i = 0; i < ZAP_LEAF_HASH_NUMENTRIES(l); i++) { + for (int i = 0; i < ZAP_LEAF_HASH_NUMENTRIES(l); i++) { int nentries = 0; int chunk = zap_leaf_phys(l)->l_hash[i]; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Oct 3 02:48:31 2018 (r339114) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Oct 3 02:49:24 2018 (r339115) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] * Copyright 2017 Nexenta Systems, Inc. @@ -89,22 +89,20 @@ zap_hash(zap_name_t *zn) ASSERT(zfs_crc64_table[128] == ZFS_CRC64_POLY); if (zap_getflags(zap) & ZAP_FLAG_UINT64_KEY) { - int i; *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:50:08 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CCF5110B5861; Wed, 3 Oct 2018 02:50:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3BEA7885D0; Wed, 3 Oct 2018 02:50:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 36E8210E7F; Wed, 3 Oct 2018 02:50:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932o86R061706; Wed, 3 Oct 2018 02:50:08 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932o8Dm061705; Wed, 3 Oct 2018 02:50:08 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030250.w932o8Dm061705@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:50:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339116 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339116 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:50:09 -0000 Author: mav Date: Wed Oct 3 02:50:07 2018 New Revision: 339116 URL: https://svnweb.freebsd.org/changeset/base/339116 Log: MFC r337030: MFV r337029: 9426 metaslab size can exceed offset addressable by spacemap metaslab size can exceed offset addressable by spacemap. The vdev can address up to 2^63 * SPA_MAXBLOCKSIZE (512). A metaslab can address up to 2^47 * 2^vdev_ashift. Therefore we may need to increase the number of metaslabs so that the maximum metaslab size is capped at the amount that can be addressed by the spacemap. This should happen in vdev_metaslab_set_size(). illumos/illumos-gate@b4bf0cf0458759c67920a031021a9d96cd683cfe Reviewed by: Paul Dagnelie Reviewed by: Matt Ahrens Approved by: Dan McDonald Author: Don Brady Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 02:49:24 2018 (r339115) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 02:50:07 2018 (r339116) @@ -163,24 +163,30 @@ static vdev_ops_t *vdev_ops_table[] = { }; -/* maximum number of metaslabs per top-level vdev */ +/* target number of metaslabs per top-level vdev */ int vdev_max_ms_count = 200; SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, max_ms_count, CTLFLAG_RDTUN, &vdev_max_ms_count, 0, "Maximum number of metaslabs per top-level vdev"); -/* minimum amount of metaslabs per top-level vdev */ +/* minimum number of metaslabs per top-level vdev */ int vdev_min_ms_count = 16; SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, min_ms_count, CTLFLAG_RDTUN, &vdev_min_ms_count, 0, "Minimum number of metaslabs per top-level vdev"); -/* see comment in vdev_metaslab_set_size() */ +/* practical upper limit of total metaslabs per top-level vdev */ +int vdev_ms_count_limit = 1ULL << 17; + +/* lower limit for metaslab size (512M) */ int vdev_default_ms_shift = 29; SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, default_ms_shift, CTLFLAG_RDTUN, &vdev_default_ms_shift, 0, "Shift between vdev size and number of metaslabs"); +/* upper limit for metaslab size (256G) */ +int vdev_max_ms_shift = 38; + boolean_t vdev_validate_skip = B_FALSE; /* @@ -2167,34 +2173,53 @@ void vdev_metaslab_set_size(vdev_t *vd) { uint64_t asize = vd->vdev_asize; - uint64_t ms_shift = 0; + uint64_t ms_count = asize >> vdev_default_ms_shift; + uint64_t ms_shift; /* - * For vdevs that are bigger than 8G the metaslab size varies in - * a way that the number of metaslabs increases in powers of two, - * linearly in terms of vdev_asize, starting from 16 metaslabs. - * So for vdev_asize of 8G we get 16 metaslabs, for 16G, we get 32, - * and so on, until we hit the maximum metaslab count limit - * [vdev_max_ms_count] from which point the metaslab count stays - * the same. + * There are two dimensions to the metaslab sizing calculation: + * the size of the metaslab and the count of metaslabs per vdev. + * In general, we aim for vdev_max_ms_count (200) metaslabs. The + * range of the dimensions are as follows: + * + * 2^29 <= ms_size <= 2^38 + * 16 <= ms_count <= 131,072 + * + * On the lower end of vdev sizes, we aim for metaslabs sizes of + * at least 512MB (2^29) to minimize fragmentation effects when + * testing with smaller devices. However, the count constraint + * of at least 16 metaslabs will override this minimum size goal. + * + * On the upper end of vdev sizes, we aim for a maximum metaslab + * size of 256GB. However, we will cap the total count to 2^17 + * metaslabs to keep our memory footprint in check. + * + * The net effect of applying above constrains is summarized below. + * + * vdev size metaslab count + * -------------|----------------- + * < 8GB ~16 + * 8GB - 100GB one per 512MB + * 100GB - 50TB ~200 + * 50TB - 32PB one per 256GB + * > 32PB ~131,072 + * ------------------------------- */ - ms_shift = vdev_default_ms_shift; - if ((asize >> ms_shift) < vdev_min_ms_count) { - /* - * For devices that are less than 8G we want to have - * exactly 16 metaslabs. We don't want less as integer - * division rounds down, so less metaslabs mean more - * wasted space. We don't want more as these vdevs are - * small and in the likely event that we are running - * out of space, the SPA will have a hard time finding - * space due to fragmentation. - */ + if (ms_count < vdev_min_ms_count) ms_shift = highbit64(asize / vdev_min_ms_count); - ms_shift = MAX(ms_shift, SPA_MAXBLOCKSHIFT); - - } else if ((asize >> ms_shift) > vdev_max_ms_count) { + else if (ms_count > vdev_max_ms_count) ms_shift = highbit64(asize / vdev_max_ms_count); + else + ms_shift = vdev_default_ms_shift; + + if (ms_shift < SPA_MAXBLOCKSHIFT) { + ms_shift = SPA_MAXBLOCKSHIFT; + } else if (ms_shift > vdev_max_ms_shift) { + ms_shift = vdev_max_ms_shift; + /* cap the total count to constrain memory footprint */ + if ((asize >> ms_shift) > vdev_ms_count_limit) + ms_shift = highbit64(asize / vdev_ms_count_limit); } vd->vdev_ms_shift = ms_shift; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:51:15 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 22ADD10B5A78; Wed, 3 Oct 2018 02:51:15 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CCFD5888E3; Wed, 3 Oct 2018 02:51:14 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C7E4810EA8; Wed, 3 Oct 2018 02:51:14 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932pEOp062593; Wed, 3 Oct 2018 02:51:14 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932pDdo062588; Wed, 3 Oct 2018 02:51:13 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030251.w932pDdo062588@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:51:13 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339117 - in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Commit-Revision: 339117 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:51:15 -0000 Author: mav Date: Wed Oct 3 02:51:13 2018 New Revision: 339117 URL: https://svnweb.freebsd.org/changeset/base/339117 Log: MFC r337063: MFV r316926: 7955 libshare needs to initialize only those datasets being modified by the consumer illumos/illumos-gate@8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3 https://github.com/illumos/illumos-gate/commit/8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3 https://www.illumos.org/issues/7955 Libshare currently initializes all available filesystems when doing any libshare operation. This requires iterating through all the filesystem multiple times, which is a huge performance problem for sharing and unsharing operations. Reviewed by: Steve Gonczi Reviewed by: Sebastien Roy Reviewed by: Matthew Ahrens Reviewed by: George Wilson Reviewed by: Pavel Zakharov Reviewed by: Yuri Pankov Approved by: Gordon Ross Author: Daniel Hoffman For FreeBSD this is practically a NOP, just a diff reduction. Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_changelist.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Oct 3 02:50:07 2018 (r339116) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Oct 3 02:51:13 2018 (r339117) @@ -72,6 +72,7 @@ #include #include #include +#include #endif #include "zfs_iter.h" @@ -6221,6 +6222,17 @@ share_mount(int op, int argc, char **argv) return (0); qsort(dslist, count, sizeof (void *), libzfs_dataset_cmp); +#ifdef illumos + sa_init_selective_arg_t sharearg; + sharearg.zhandle_arr = dslist; + sharearg.zhandle_len = count; + if ((ret = zfs_init_libshare_arg(zfs_get_handle(dslist[0]), + SA_INIT_SHARE_API_SELECTIVE, &sharearg)) != SA_OK) { + (void) fprintf(stderr, + gettext("Could not initialize libshare, %d"), ret); + return (ret); + } +#endif for (i = 0; i < count; i++) { if (verbose) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Oct 3 02:50:07 2018 (r339116) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Oct 3 02:51:13 2018 (r339117) @@ -843,6 +843,17 @@ extern int zmount(const char *, const char *, int, cha #endif extern int zfs_remap_indirects(libzfs_handle_t *hdl, const char *); +/* Allow consumers to initialize libshare externally for optimal performance */ +extern int zfs_init_libshare_arg(libzfs_handle_t *, int, void *); +/* + * For most consumers, zfs_init_libshare_arg is sufficient on its own, and + * zfs_uninit_libshare is unnecessary. zfs_uninit_libshare should only be called + * if the caller has already initialized libshare for one set of zfs handles, + * and wishes to share or unshare filesystems outside of that set. In that case, + * the caller should uninitialize libshare, and then re-initialize it with the + * new handles being shared or unshared. + */ +extern void zfs_uninit_libshare(libzfs_handle_t *); #ifdef __cplusplus } #endif Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_changelist.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_changelist.c Wed Oct 3 02:50:07 2018 (r339116) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_changelist.c Wed Oct 3 02:51:13 2018 (r339117) @@ -26,7 +26,7 @@ * Portions Copyright 2007 Ramprakash Jelari * Copyright (c) 2011 Pawel Jakub Dawidek . * All rights reserved. - * Copyright (c) 2014, 2015 by Delphix. All rights reserved. + * Copyright (c) 2014, 2016 by Delphix. All rights reserved. * Copyright 2016 Igor Kozhukhov */ @@ -166,6 +166,11 @@ changelist_postfix(prop_changelist_t *clp) char shareopts[ZFS_MAXPROPLEN]; int errors = 0; libzfs_handle_t *hdl; +#ifdef illumos + size_t num_datasets = 0, i; + zfs_handle_t **zhandle_arr; + sa_init_selective_arg_t sharearg; +#endif /* * If we're changing the mountpoint, attempt to destroy the underlying @@ -192,8 +197,33 @@ changelist_postfix(prop_changelist_t *clp) hdl = cn->cn_handle->zfs_hdl; assert(hdl != NULL); zfs_uninit_libshare(hdl); - } +#ifdef illumos + /* + * For efficiencies sake, we initialize libshare for only a few + * shares (the ones affected here). Future initializations in + * this process should just use the cached initialization. + */ + for (cn = uu_list_last(clp->cl_list); cn != NULL; + cn = uu_list_prev(clp->cl_list, cn)) { + num_datasets++; + } + + zhandle_arr = zfs_alloc(hdl, + num_datasets * sizeof (zfs_handle_t *)); + for (i = 0, cn = uu_list_last(clp->cl_list); cn != NULL; + cn = uu_list_prev(clp->cl_list, cn)) { + zhandle_arr[i++] = cn->cn_handle; + zfs_refresh_properties(cn->cn_handle); + } + assert(i == num_datasets); + sharearg.zhandle_arr = zhandle_arr; + sharearg.zhandle_len = num_datasets; + errors = zfs_init_libshare_arg(hdl, SA_INIT_SHARE_API_SELECTIVE, + &sharearg); + free(zhandle_arr); +#endif + } /* * We walk the datasets in reverse, because we want to mount any parent * datasets before mounting the children. We walk all datasets even if @@ -218,7 +248,9 @@ changelist_postfix(prop_changelist_t *clp) continue; cn->cn_needpost = B_FALSE; +#ifndef illumos zfs_refresh_properties(cn->cn_handle); +#endif if (ZFS_IS_VOLUME(cn->cn_handle)) continue; Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h Wed Oct 3 02:50:07 2018 (r339116) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h Wed Oct 3 02:51:13 2018 (r339117) @@ -207,7 +207,6 @@ void namespace_clear(libzfs_handle_t *); */ extern int zfs_init_libshare(libzfs_handle_t *, int); -extern void zfs_uninit_libshare(libzfs_handle_t *); extern int zfs_parse_options(char *, zfs_share_proto_t); extern int zfs_unshare_proto(zfs_handle_t *, Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c Wed Oct 3 02:50:07 2018 (r339116) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c Wed Oct 3 02:51:13 2018 (r339117) @@ -579,6 +579,7 @@ zfs_is_shared_smb(zfs_handle_t *zhp, char **where) #ifdef illumos static sa_handle_t (*_sa_init)(int); +static sa_handle_t (*_sa_init_arg)(int, void *); static void (*_sa_fini)(sa_handle_t); static sa_share_t (*_sa_find_share)(sa_handle_t, char *); static int (*_sa_enable_share)(sa_share_t, char *); @@ -620,6 +621,8 @@ _zfs_init_libshare(void) if ((libshare = dlopen(path, RTLD_LAZY | RTLD_GLOBAL)) != NULL) { _sa_init = (sa_handle_t (*)(int))dlsym(libshare, "sa_init"); + _sa_init_arg = (sa_handle_t (*)(int, void *))dlsym(libshare, + "sa_init_arg"); _sa_fini = (void (*)(sa_handle_t))dlsym(libshare, "sa_fini"); _sa_find_share = (sa_share_t (*)(sa_handle_t, char *)) dlsym(libshare, "sa_find_share"); @@ -639,14 +642,15 @@ _zfs_init_libshare(void) char *, char *))dlsym(libshare, "sa_zfs_process_share"); _sa_update_sharetab_ts = (void (*)(sa_handle_t)) dlsym(libshare, "sa_update_sharetab_ts"); - if (_sa_init == NULL || _sa_fini == NULL || - _sa_find_share == NULL || _sa_enable_share == NULL || - _sa_disable_share == NULL || _sa_errorstr == NULL || - _sa_parse_legacy_options == NULL || + if (_sa_init == NULL || _sa_init_arg == NULL || + _sa_fini == NULL || _sa_find_share == NULL || + _sa_enable_share == NULL || _sa_disable_share == NULL || + _sa_errorstr == NULL || _sa_parse_legacy_options == NULL || _sa_needs_refresh == NULL || _sa_get_zfs_handle == NULL || _sa_zfs_process_share == NULL || _sa_update_sharetab_ts == NULL) { _sa_init = NULL; + _sa_init_arg = NULL; _sa_fini = NULL; _sa_disable_share = NULL; _sa_enable_share = NULL; @@ -670,8 +674,8 @@ _zfs_init_libshare(void) * service value is which part(s) of the API to initialize and is a * direct map to the libshare sa_init(service) interface. */ -int -zfs_init_libshare(libzfs_handle_t *zhandle, int service) +static int +zfs_init_libshare_impl(libzfs_handle_t *zhandle, int service, void *arg) { #ifdef illumos /* @@ -694,11 +698,11 @@ zfs_init_libshare(libzfs_handle_t *zhandle, int servic if (_sa_needs_refresh != NULL && _sa_needs_refresh(zhandle->libzfs_sharehdl)) { zfs_uninit_libshare(zhandle); - zhandle->libzfs_sharehdl = _sa_init(service); + zhandle->libzfs_sharehdl = _sa_init_arg(service, arg); } if (zhandle && zhandle->libzfs_sharehdl == NULL) - zhandle->libzfs_sharehdl = _sa_init(service); + zhandle->libzfs_sharehdl = _sa_init_arg(service, arg); if (zhandle->libzfs_sharehdl == NULL) return (SA_NO_MEMORY); @@ -706,7 +710,19 @@ zfs_init_libshare(libzfs_handle_t *zhandle, int servic return (SA_OK); } +int +zfs_init_libshare(libzfs_handle_t *zhandle, int service) +{ + return (zfs_init_libshare_impl(zhandle, service, NULL)); +} +int +zfs_init_libshare_arg(libzfs_handle_t *zhandle, int service, void *arg) +{ + return (zfs_init_libshare_impl(zhandle, service, arg)); +} + + /* * zfs_uninit_libshare(zhandle) * @@ -817,9 +833,9 @@ zfs_share_proto(zfs_handle_t *zhp, zfs_share_proto_t * ZFS_MAXPROPLEN, B_FALSE) != 0 || strcmp(shareopts, "off") == 0) continue; - #ifdef illumos - ret = zfs_init_libshare(hdl, SA_INIT_SHARE_API); + ret = zfs_init_libshare_arg(hdl, SA_INIT_ONE_SHARE_FROM_HANDLE, + zhp); if (ret != SA_OK) { (void) zfs_error_fmt(hdl, EZFS_SHARENFSFAILED, dgettext(TEXT_DOMAIN, "cannot share '%s': %s"), @@ -930,6 +946,7 @@ unshare_one(libzfs_handle_t *hdl, const char *name, co sa_share_t share; int err; char *mntpt; + /* * Mountpoint could get trashed if libshare calls getmntany * which it does during API initialization, so strdup the @@ -937,8 +954,14 @@ unshare_one(libzfs_handle_t *hdl, const char *name, co */ mntpt = zfs_strdup(hdl, mountpoint); - /* make sure libshare initialized */ - if ((err = zfs_init_libshare(hdl, SA_INIT_SHARE_API)) != SA_OK) { + /* + * make sure libshare initialized, initialize everything because we + * don't know what other unsharing may happen later. Functions up the + * stack are allowed to initialize instead a subset of shares at the + * time the set is known. + */ + if ((err = zfs_init_libshare_arg(hdl, SA_INIT_ONE_SHARE_FROM_NAME, + (void *)name)) != SA_OK) { free(mntpt); /* don't need the copy anymore */ return (zfs_error_fmt(hdl, proto_table[proto].p_unshare_err, dgettext(TEXT_DOMAIN, "cannot unshare '%s': %s"), @@ -1289,6 +1312,9 @@ zpool_disable_datasets(zpool_handle_t *zhp, boolean_t int i; int ret = -1; int flags = (force ? MS_FORCE : 0); +#ifdef illumos + sa_init_selective_arg_t sharearg; +#endif namelen = strlen(zhp->zpool_name); @@ -1363,6 +1389,14 @@ zpool_disable_datasets(zpool_handle_t *zhp, boolean_t * At this point, we have the entire list of filesystems, so sort it by * mountpoint. */ +#ifdef illumos + sharearg.zhandle_arr = datasets; + sharearg.zhandle_len = used; + ret = zfs_init_libshare_arg(hdl, SA_INIT_SHARE_API_SELECTIVE, + &sharearg); + if (ret != 0) + goto out; +#endif qsort(mountpoints, used, sizeof (char *), mountpoint_compare); /* From owner-svn-src-stable-11@freebsd.org Wed Oct 3 02:52:51 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0FD6510B5B62; Wed, 3 Oct 2018 02:52:51 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id ACDA488B4D; Wed, 3 Oct 2018 02:52:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A4FC311012; Wed, 3 Oct 2018 02:52:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932qogT066530; Wed, 3 Oct 2018 02:52:50 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932qlIf066516; Wed, 3 Oct 2018 02:52:47 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030252.w932qlIf066516@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:52:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339118 - in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/lib/libzfs cddl/lib/libzfs_core cddl/lib/libzpool cddl/sbin/zfs cddl/sbin/zpool cddl/usr.bin/zinject cddl/usr.bin... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/lib/libzfs cddl/lib/libzfs_core cddl/lib/libzpool cddl/sbin/zfs cddl/sbin/zpool cddl/usr.bin/zinject cddl/usr.bin/zstreamdump cddl/usr.bin... X-SVN-Commit-Revision: 339118 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:52:51 -0000 Author: mav Date: Wed Oct 3 02:52:47 2018 New Revision: 339118 URL: https://svnweb.freebsd.org/changeset/base/339118 Log: MFC r337160: Do not blindly include illumos kernel headers instead of user-space. It is not needed now, and I doubt it much helped at all, creating more confusions then good. Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/cddl/lib/libzfs/Makefile stable/11/cddl/lib/libzfs_core/Makefile stable/11/cddl/lib/libzpool/Makefile stable/11/cddl/sbin/zfs/Makefile stable/11/cddl/sbin/zpool/Makefile stable/11/cddl/usr.bin/zinject/Makefile stable/11/cddl/usr.bin/zstreamdump/Makefile stable/11/cddl/usr.bin/ztest/Makefile stable/11/cddl/usr.sbin/zdb/Makefile stable/11/cddl/usr.sbin/zfsd/Makefile.common stable/11/cddl/usr.sbin/zhack/Makefile stable/11/lib/libprocstat/zfs/Makefile stable/11/usr.sbin/fstyp/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 02:52:47 2018 (r339118) @@ -50,7 +50,9 @@ #include #include #include +#ifdef illumos #include +#endif #include #include Modified: stable/11/cddl/lib/libzfs/Makefile ============================================================================== --- stable/11/cddl/lib/libzfs/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/lib/libzfs/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -49,7 +49,6 @@ CFLAGS+= -I${SRCTOP}/cddl/compat/opensolaris/lib/libum CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libnvpair Modified: stable/11/cddl/lib/libzfs_core/Makefile ============================================================================== --- stable/11/cddl/lib/libzfs_core/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/lib/libzfs_core/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -26,7 +26,6 @@ CFLAGS+= -I${SRCTOP}/cddl/compat/opensolaris/lib/libum CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libnvpair Modified: stable/11/cddl/lib/libzpool/Makefile ============================================================================== --- stable/11/cddl/lib/libzpool/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/lib/libzpool/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -47,7 +47,6 @@ CFLAGS+= -I${SRCTOP}/sys/cddl/compat/opensolaris CFLAGS+= -I${SRCTOP}/cddl/compat/opensolaris/include CFLAGS+= -I${SRCTOP}/cddl/compat/opensolaris/lib/libumem CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs Modified: stable/11/cddl/sbin/zfs/Makefile ============================================================================== --- stable/11/cddl/sbin/zfs/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/sbin/zfs/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -19,7 +19,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libu CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libnvpair CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs LIBADD= jail nvpair uutil zfs_core zfs Modified: stable/11/cddl/sbin/zpool/Makefile ============================================================================== --- stable/11/cddl/sbin/zpool/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/sbin/zpool/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -23,7 +23,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libn CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/cmd/stat/common Modified: stable/11/cddl/usr.bin/zinject/Makefile ============================================================================== --- stable/11/cddl/usr.bin/zinject/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.bin/zinject/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -15,7 +15,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libz CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libnvpair CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs/ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head Modified: stable/11/cddl/usr.bin/zstreamdump/Makefile ============================================================================== --- stable/11/cddl/usr.bin/zstreamdump/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.bin/zstreamdump/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -13,7 +13,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libz CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libnvpair CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head Modified: stable/11/cddl/usr.bin/ztest/Makefile ============================================================================== --- stable/11/cddl/usr.bin/ztest/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.bin/ztest/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -14,7 +14,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libn CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libcmdutils CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head Modified: stable/11/cddl/usr.sbin/zdb/Makefile ============================================================================== --- stable/11/cddl/usr.sbin/zdb/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.sbin/zdb/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -20,7 +20,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libz CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head Modified: stable/11/cddl/usr.sbin/zfsd/Makefile.common ============================================================================== --- stable/11/cddl/usr.sbin/zfsd/Makefile.common Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.sbin/zfsd/Makefile.common Wed Oct 3 02:52:47 2018 (r339118) @@ -28,7 +28,6 @@ INCFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/li INCFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs INCFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common INCFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs -INCFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS= -g -DNEED_SOLARIS_BOOLEAN ${INCFLAGS} Modified: stable/11/cddl/usr.sbin/zhack/Makefile ============================================================================== --- stable/11/cddl/usr.sbin/zhack/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/cddl/usr.sbin/zhack/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -18,7 +18,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libz CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head Modified: stable/11/lib/libprocstat/zfs/Makefile ============================================================================== --- stable/11/lib/libprocstat/zfs/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/lib/libprocstat/zfs/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -13,7 +13,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libz CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/common/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head CFLAGS+= -I${.CURDIR:H} CFLAGS+= -DNEED_SOLARIS_BOOLEAN Modified: stable/11/usr.sbin/fstyp/Makefile ============================================================================== --- stable/11/usr.sbin/fstyp/Makefile Wed Oct 3 02:51:13 2018 (r339117) +++ stable/11/usr.sbin/fstyp/Makefile Wed Oct 3 02:52:47 2018 (r339118) @@ -29,7 +29,6 @@ CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libn CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/lib/libzpool/common CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/fs/zfs CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common -CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head .endif From owner-svn-src-stable-11@freebsd.org Wed Oct 3 03:13:55 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0F09F10B6160; Wed, 3 Oct 2018 03:13:55 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A16A9896DD; Wed, 3 Oct 2018 03:13:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 96C4211423; Wed, 3 Oct 2018 03:13:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w933DsNA077590; Wed, 3 Oct 2018 03:13:54 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w933DsPd077588; Wed, 3 Oct 2018 03:13:54 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030313.w933DsPd077588@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 03:13:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339119 - in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/cddl/contrib/opensolaris: cmd/zfs lib/libzfs/common X-SVN-Commit-Revision: 339119 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 03:13:55 -0000 Author: mav Date: Wed Oct 3 03:13:53 2018 New Revision: 339119 URL: https://svnweb.freebsd.org/changeset/base/339119 Log: MFC r337163: MFV r337161: 9512 zfs remap poolname@snapname coredumps Only filesystems and volumes are valid "zfs remap" parameters: when passed a snapshot name zfs_remap_indirects() does not handle the EINVAL returned from libzfs_core, which results in failing an assertion and consequently crashing. illumos/illumos-gate@0b2e8253986c5c761129b58cfdac46d204903de1 Reviewed by: Matthew Ahrens Reviewed by: John Wren Kennedy Reviewed by: Sara Hartse Approved by: Matt Ahrens Author: loli10K Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Oct 3 02:52:47 2018 (r339118) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Oct 3 03:13:53 2018 (r339119) @@ -7038,11 +7038,28 @@ zfs_do_diff(int argc, char **argv) return (err != 0); } +/* + * zfs remap + * + * Remap the indirect blocks in the given fileystem or volume. + */ static int zfs_do_remap(int argc, char **argv) { const char *fsname; int err = 0; + int c; + + /* check options */ + while ((c = getopt(argc, argv, "")) != -1) { + switch (c) { + case '?': + (void) fprintf(stderr, + gettext("invalid option '%c'\n"), optopt); + usage(B_FALSE); + } + } + if (argc != 2) { (void) fprintf(stderr, gettext("wrong number of arguments\n")); usage(B_FALSE); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 02:52:47 2018 (r339118) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 03:13:53 2018 (r339119) @@ -3917,12 +3917,24 @@ zfs_remap_indirects(libzfs_handle_t *hdl, const char * char errbuf[1024]; (void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN, - "cannot remap filesystem '%s' "), fs); + "cannot remap dataset '%s'"), fs); err = lzc_remap(fs); if (err != 0) { - (void) zfs_standard_error(hdl, err, errbuf); + switch (err) { + case ENOTSUP: + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "pool must be upgraded")); + (void) zfs_error(hdl, EZFS_BADVERSION, errbuf); + break; + case EINVAL: + (void) zfs_error(hdl, EZFS_BADTYPE, errbuf); + break; + default: + (void) zfs_standard_error(hdl, err, errbuf); + break; + } } return (err); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 03:14:42 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CCACC10B61B5; Wed, 3 Oct 2018 03:14:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 809C089820; Wed, 3 Oct 2018 03:14:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7B5EF11426; Wed, 3 Oct 2018 03:14:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w933Ef6g077679; Wed, 3 Oct 2018 03:14:41 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w933Ee9K077676; Wed, 3 Oct 2018 03:14:40 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030314.w933Ee9K077676@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 03:14:40 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339120 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339120 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 03:14:42 -0000 Author: mav Date: Wed Oct 3 03:14:40 2018 New Revision: 339120 URL: https://svnweb.freebsd.org/changeset/base/339120 Log: MFC r337169: MFV r337167: 9442 decrease indirect block size of spacemaps Updates to indirect blocks of spacemaps can contribute significantly to write inflation. Therefore we want to reduce the indirect block size of spacemaps from 128K to 16K. illumos/illumos-gate@221813c13b43ef48330b03725e00edee85108cf1 Reviewed by: Serapheim Dimitropoulos Reviewed by: George Wilson Reviewed by: Albert Lee Reviewed by: Igor Kozhukhov Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 03:13:53 2018 (r339119) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 03:14:40 2018 (r339120) @@ -32,7 +32,8 @@ #include uint64_t -dmu_object_alloc(objset_t *os, dmu_object_type_t ot, int blocksize, +dmu_object_alloc_ibs(objset_t *os, dmu_object_type_t ot, int blocksize, + int indirect_blockshift, dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx) { uint64_t object; @@ -92,13 +93,22 @@ dmu_object_alloc(objset_t *os, dmu_object_type_t ot, i os->os_obj_next = object - 1; } - dnode_allocate(dn, ot, blocksize, 0, bonustype, bonuslen, tx); + dnode_allocate(dn, ot, blocksize, indirect_blockshift, + bonustype, bonuslen, tx); mutex_exit(&os->os_obj_lock); dmu_tx_add_new_object(tx, dn); dnode_rele(dn, FTAG); return (object); +} + +uint64_t +dmu_object_alloc(objset_t *os, dmu_object_type_t ot, int blocksize, + dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx) +{ + return (dmu_object_alloc_ibs(os, ot, blocksize, 0, + bonustype, bonuslen, tx)); } int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Oct 3 03:13:53 2018 (r339119) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Oct 3 03:14:40 2018 (r339120) @@ -54,6 +54,17 @@ SYSCTL_DECL(_vfs_zfs); */ boolean_t zfs_force_some_double_word_sm_entries = B_FALSE; +/* + * Override the default indirect block size of 128K, instead using 16K for + * spacemaps (2^14 bytes). This dramatically reduces write inflation since + * appending to a spacemap typically has to write one data block (4KB) and one + * or two indirect blocks (16K-32K, rather than 128K). + */ +int space_map_ibs = 14; + +SYSCTL_INT(_vfs_zfs, OID_AUTO, space_map_ibs, CTLFLAG_RWTUN, + &space_map_ibs, 0, "Space map indirect block shift"); + boolean_t sm_entry_is_debug(uint64_t e) { @@ -676,8 +687,8 @@ space_map_write_impl(space_map_t *sm, range_tree_t *rt * * [1] The feature is enabled. * [2] The offset or run is too big for a single-word entry, - * or the vdev_id is set (meaning not equal to - * SM_NO_VDEVID). + * or the vdev_id is set (meaning not equal to + * SM_NO_VDEVID). * * Note that for purposes of testing we've added the case that * we write two-word entries occasionally when the feature is @@ -836,7 +847,8 @@ space_map_truncate(space_map_t *sm, int blocksize, dmu */ if ((spa_feature_is_enabled(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM) && doi.doi_bonus_size != sizeof (space_map_phys_t)) || - doi.doi_data_block_size != blocksize) { + doi.doi_data_block_size != blocksize || + doi.doi_metadata_block_size != 1 << space_map_ibs) { zfs_dbgmsg("txg %llu, spa %s, sm %p, reallocating " "object[%llu]: old bonus %u, old blocksz %u", dmu_tx_get_txg(tx), spa_name(spa), sm, sm->sm_object, @@ -892,8 +904,8 @@ space_map_alloc(objset_t *os, int blocksize, dmu_tx_t bonuslen = SPACE_MAP_SIZE_V0; } - object = dmu_object_alloc(os, DMU_OT_SPACE_MAP, blocksize, - DMU_OT_SPACE_MAP_HEADER, bonuslen, tx); + object = dmu_object_alloc_ibs(os, DMU_OT_SPACE_MAP, blocksize, + space_map_ibs, DMU_OT_SPACE_MAP_HEADER, bonuslen, tx); return (object); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 03:13:53 2018 (r339119) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 03:14:40 2018 (r339120) @@ -356,6 +356,9 @@ typedef struct dmu_buf { */ uint64_t dmu_object_alloc(objset_t *os, dmu_object_type_t ot, int blocksize, dmu_object_type_t bonus_type, int bonus_len, dmu_tx_t *tx); +uint64_t dmu_object_alloc_ibs(objset_t *os, dmu_object_type_t ot, int blocksize, + int indirect_blockshift, + dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx); int dmu_object_claim(objset_t *os, uint64_t object, dmu_object_type_t ot, int blocksize, dmu_object_type_t bonus_type, int bonus_len, dmu_tx_t *tx); int dmu_object_reclaim(objset_t *os, uint64_t object, dmu_object_type_t ot, From owner-svn-src-stable-11@freebsd.org Wed Oct 3 11:34:29 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E274E10C03D3; Wed, 3 Oct 2018 11:34:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 95C5E76CF0; Wed, 3 Oct 2018 11:34:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 8BFEE16610; Wed, 3 Oct 2018 11:34:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93BYSVV032663; Wed, 3 Oct 2018 11:34:28 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93BYSLK032662; Wed, 3 Oct 2018 11:34:28 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031134.w93BYSLK032662@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 11:34:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339122 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339122 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 11:34:29 -0000 Author: kib Date: Wed Oct 3 11:34:28 2018 New Revision: 339122 URL: https://svnweb.freebsd.org/changeset/base/339122 Log: MFC r338955: When doing lm_add(), check for duplicates. Modified: stable/11/libexec/rtld-elf/libmap.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/libmap.c ============================================================================== --- stable/11/libexec/rtld-elf/libmap.c Wed Oct 3 07:35:16 2018 (r339121) +++ stable/11/libexec/rtld-elf/libmap.c Wed Oct 3 11:34:28 2018 (r339122) @@ -350,6 +350,7 @@ lm_add(const char *p, const char *f, const char *t) { struct lm_list *lml; struct lm *lm; + const char *t1; if (p == NULL) p = "$DEFAULT$"; @@ -359,11 +360,14 @@ lm_add(const char *p, const char *f, const char *t) if ((lml = lmp_find(p)) == NULL) lml = lmp_init(xstrdup(p)); - lm = xmalloc(sizeof(struct lm)); - lm->f = xstrdup(f); - lm->t = xstrdup(t); - TAILQ_INSERT_HEAD(lml, lm, lm_link); - lm_count++; + t1 = lml_find(lml, f); + if (t1 == NULL || strcmp(t1, t) != 0) { + lm = xmalloc(sizeof(struct lm)); + lm->f = xstrdup(f); + lm->t = xstrdup(t); + TAILQ_INSERT_HEAD(lml, lm, lm_link); + lm_count++; + } } char * From owner-svn-src-stable-11@freebsd.org Wed Oct 3 12:47:55 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0321E10C2103; Wed, 3 Oct 2018 12:47:55 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A105979289; Wed, 3 Oct 2018 12:47:54 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 984B717179; Wed, 3 Oct 2018 12:47:54 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93ClsFh068256; Wed, 3 Oct 2018 12:47:54 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Cls22068255; Wed, 3 Oct 2018 12:47:54 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201810031247.w93Cls22068255@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Wed, 3 Oct 2018 12:47:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339123 - stable/11/sbin/ipfw X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sbin/ipfw X-SVN-Commit-Revision: 339123 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 12:47:55 -0000 Author: ae Date: Wed Oct 3 12:47:54 2018 New Revision: 339123 URL: https://svnweb.freebsd.org/changeset/base/339123 Log: MFC r338947: Add "src-ip" or "dst-ip" keyword to the output, when we are printing the rest of rule options. Reported by: lev Modified: stable/11/sbin/ipfw/ipfw2.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sbin/ipfw/ipfw2.c ============================================================================== --- stable/11/sbin/ipfw/ipfw2.c Wed Oct 3 11:34:28 2018 (r339122) +++ stable/11/sbin/ipfw/ipfw2.c Wed Oct 3 12:47:54 2018 (r339123) @@ -1466,19 +1466,31 @@ print_instruction(struct buf_pr *bp, const struct form case O_IP_SRC_MASK: case O_IP_SRC_ME: case O_IP_SRC_SET: + if (state->flags & HAVE_SRCIP) + bprintf(bp, " src-ip"); + print_ip(bp, fo, insntod(cmd, ip)); + break; case O_IP_DST: case O_IP_DST_LOOKUP: case O_IP_DST_MASK: case O_IP_DST_ME: case O_IP_DST_SET: + if (state->flags & HAVE_DSTIP) + bprintf(bp, " dst-ip"); print_ip(bp, fo, insntod(cmd, ip)); break; case O_IP6_SRC: case O_IP6_SRC_MASK: case O_IP6_SRC_ME: + if (state->flags & HAVE_SRCIP) + bprintf(bp, " src-ip6"); + print_ip6(bp, insntod(cmd, ip6)); + break; case O_IP6_DST: case O_IP6_DST_MASK: case O_IP6_DST_ME: + if (state->flags & HAVE_DSTIP) + bprintf(bp, " dst-ip6"); print_ip6(bp, insntod(cmd, ip6)); break; case O_FLOW6ID: From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:43:19 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0264110C528A; Wed, 3 Oct 2018 14:43:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9D1F27D9F6; Wed, 3 Oct 2018 14:43:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 943B618516; Wed, 3 Oct 2018 14:43:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EhIHu029794; Wed, 3 Oct 2018 14:43:18 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EhIOC029791; Wed, 3 Oct 2018 14:43:18 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031443.w93EhIOC029791@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:43:18 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339125 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339125 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:43:19 -0000 Author: mav Date: Wed Oct 3 14:43:17 2018 New Revision: 339125 URL: https://svnweb.freebsd.org/changeset/base/339125 Log: MFC r337172, MFV r337171: 9464 txg_kick() fails to see that we are quiescing, forcing transactions to their next stages without leaving them accumulate changes Ideally we would like txg_kick() to get triggered only when we are sure that we are not syncing AND not quiescing any txg. This way we can kick an open TXG to the quiescing state when we are sure that there is nothing going on and we would benefit from the different states running concurrently. illumos/illumos-gate@fa41d87de9ec9000964c605eb01d6dc19e4a1abe Reviewed by: Matt Ahrens Reviewed by: Brad Lewis Reviewed by: Andriy Gapon Approved by: Dan McDonald Author: Serapheim Dimitropoulos Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Oct 3 14:20:43 2018 (r339124) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Oct 3 14:43:17 2018 (r339125) @@ -1091,7 +1091,12 @@ dmu_tx_wait(dmu_tx_t *tx) mutex_exit(&dn->dn_mtx); tx->tx_needassign_txh = NULL; } else { - txg_wait_open(tx->tx_pool, tx->tx_lasttried_txg + 1); + /* + * If we have a lot of dirty data just wait until we sync + * out a TXG at which point we'll hopefully have synced + * a portion of the changes. + */ + txg_wait_synced(dp, spa_last_synced_txg(spa) + 1); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h Wed Oct 3 14:20:43 2018 (r339124) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h Wed Oct 3 14:43:17 2018 (r339125) @@ -25,7 +25,7 @@ */ /* - * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. */ #ifndef _SYS_TXG_IMPL_H @@ -92,6 +92,7 @@ typedef struct tx_state { kmutex_t tx_sync_lock; /* protects the rest of this struct */ uint64_t tx_open_txg; /* currently open txg id */ + uint64_t tx_quiescing_txg; /* currently quiescing txg id */ uint64_t tx_quiesced_txg; /* quiesced txg waiting for sync */ uint64_t tx_syncing_txg; /* currently syncing txg id */ uint64_t tx_synced_txg; /* last synced txg id */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Oct 3 14:20:43 2018 (r339124) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Oct 3 14:43:17 2018 (r339125) @@ -450,6 +450,30 @@ txg_dispatch_callbacks(dsl_pool_t *dp, uint64_t txg) } } +static boolean_t +txg_is_syncing(dsl_pool_t *dp) +{ + tx_state_t *tx = &dp->dp_tx; + ASSERT(MUTEX_HELD(&tx->tx_sync_lock)); + return (tx->tx_syncing_txg != 0); +} + +static boolean_t +txg_is_quiescing(dsl_pool_t *dp) +{ + tx_state_t *tx = &dp->dp_tx; + ASSERT(MUTEX_HELD(&tx->tx_sync_lock)); + return (tx->tx_quiescing_txg != 0); +} + +static boolean_t +txg_has_quiesced_to_sync(dsl_pool_t *dp) +{ + tx_state_t *tx = &dp->dp_tx; + ASSERT(MUTEX_HELD(&tx->tx_sync_lock)); + return (tx->tx_quiesced_txg != 0); +} + static void txg_sync_thread(void *arg) { @@ -476,7 +500,7 @@ txg_sync_thread(void *arg) while (!dsl_scan_active(dp->dp_scan) && !tx->tx_exiting && timer > 0 && tx->tx_synced_txg >= tx->tx_sync_txg_waiting && - tx->tx_quiesced_txg == 0 && + !txg_has_quiesced_to_sync(dp) && dp->dp_dirty_total < zfs_dirty_data_sync) { dprintf("waiting; tx_synced=%llu waiting=%llu dp=%p\n", tx->tx_synced_txg, tx->tx_sync_txg_waiting, dp); @@ -489,7 +513,7 @@ txg_sync_thread(void *arg) * Wait until the quiesce thread hands off a txg to us, * prompting it to do so if necessary. */ - while (!tx->tx_exiting && tx->tx_quiesced_txg == 0) { + while (!tx->tx_exiting && !txg_has_quiesced_to_sync(dp)) { if (tx->tx_quiesce_txg_waiting < tx->tx_open_txg+1) tx->tx_quiesce_txg_waiting = tx->tx_open_txg+1; cv_broadcast(&tx->tx_quiesce_more_cv); @@ -504,6 +528,7 @@ txg_sync_thread(void *arg) * us. This may cause the quiescing thread to now be * able to quiesce another txg, so we must signal it. */ + ASSERT(tx->tx_quiesced_txg != 0); txg = tx->tx_quiesced_txg; tx->tx_quiesced_txg = 0; tx->tx_syncing_txg = txg; @@ -552,7 +577,7 @@ txg_quiesce_thread(void *arg) */ while (!tx->tx_exiting && (tx->tx_open_txg >= tx->tx_quiesce_txg_waiting || - tx->tx_quiesced_txg != 0)) + txg_has_quiesced_to_sync(dp))) txg_thread_wait(tx, &cpr, &tx->tx_quiesce_more_cv, 0); if (tx->tx_exiting) @@ -562,6 +587,8 @@ txg_quiesce_thread(void *arg) dprintf("txg=%llu quiesce_txg=%llu sync_txg=%llu\n", txg, tx->tx_quiesce_txg_waiting, tx->tx_sync_txg_waiting); + tx->tx_quiescing_txg = txg; + mutex_exit(&tx->tx_sync_lock); txg_quiesce(dp, txg); mutex_enter(&tx->tx_sync_lock); @@ -570,6 +597,7 @@ txg_quiesce_thread(void *arg) * Hand this txg off to the sync thread. */ dprintf("quiesce done, handing off txg %llu\n", txg); + tx->tx_quiescing_txg = 0; tx->tx_quiesced_txg = txg; DTRACE_PROBE2(txg__quiesced, dsl_pool_t *, dp, uint64_t, txg); cv_broadcast(&tx->tx_sync_more_cv); @@ -667,7 +695,8 @@ txg_kick(dsl_pool_t *dp) ASSERT(!dsl_pool_config_held(dp)); mutex_enter(&tx->tx_sync_lock); - if (tx->tx_syncing_txg == 0 && + if (!txg_is_syncing(dp) && + !txg_is_quiescing(dp) && tx->tx_quiesce_txg_waiting <= tx->tx_open_txg && tx->tx_sync_txg_waiting <= tx->tx_synced_txg && tx->tx_quiesced_txg <= tx->tx_synced_txg) { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:44:17 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7D2EC10C52E5; Wed, 3 Oct 2018 14:44:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 32AA57DB31; Wed, 3 Oct 2018 14:44:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 296CD18518; Wed, 3 Oct 2018 14:44:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EiHXa029909; Wed, 3 Oct 2018 14:44:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EiGW1029907; Wed, 3 Oct 2018 14:44:16 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031444.w93EiGW1029907@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:44:16 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339126 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339126 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:44:17 -0000 Author: mav Date: Wed Oct 3 14:44:16 2018 New Revision: 339126 URL: https://svnweb.freebsd.org/changeset/base/339126 Log: MFC r337177: MFV r337175: 9487 Free objects when receiving full stream as clone All objects after the last written or freed object are not supposed to exist after receiving the stream. We should free them accordingly, as if a freeobjects record for them had been included in the stream. zfsonlinux/zfs@48fbb9ddbf2281911560dfbc2821aa8b74127315 illumos/illumos-gate@7864b8192b8d30471fa2240466d516292e5765b8 Reviewed by: Matthew Ahrens Approved by: Dan McDonald Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 14:43:17 2018 (r339125) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 14:44:16 2018 (r339126) @@ -1795,6 +1795,7 @@ dmu_recv_begin(char *tofs, char *tosnap, dmu_replay_re drc->drc_force = force; drc->drc_resumable = resumable; drc->drc_cred = CRED(); + drc->drc_clone = (origin != NULL); if (drc->drc_drrb->drr_magic == BSWAP_64(DMU_BACKUP_MAGIC)) { drc->drc_byteswap = B_TRUE; @@ -1855,7 +1856,9 @@ struct receive_writer_arg { /* A map from guid to dataset to help handle dedup'd streams. */ avl_tree_t *guid_to_ds_map; boolean_t resumable; - uint64_t last_object, last_offset; + uint64_t last_object; + uint64_t last_offset; + uint64_t max_object; /* highest object ID referenced in stream */ uint64_t bytes_read; /* bytes read when current record created */ }; @@ -2152,6 +2155,9 @@ receive_object(struct receive_writer_arg *rwa, struct return (SET_ERROR(EINVAL)); object = err == 0 ? drro->drr_object : DMU_NEW_OBJECT; + if (drro->drr_object > rwa->max_object) + rwa->max_object = drro->drr_object; + /* * If we are losing blkptrs or changing the block size this must * be a new file instance. We must clear out the previous file @@ -2247,6 +2253,9 @@ receive_freeobjects(struct receive_writer_arg *rwa, err = dmu_free_long_object(rwa->os, obj); if (err != 0) return (err); + + if (obj > rwa->max_object) + rwa->max_object = obj; } if (next_err != ESRCH) return (next_err); @@ -2276,6 +2285,9 @@ receive_write(struct receive_writer_arg *rwa, struct d rwa->last_object = drrw->drr_object; rwa->last_offset = drrw->drr_offset; + if (rwa->last_object > rwa->max_object) + rwa->max_object = rwa->last_object; + if (dmu_object_info(rwa->os, drrw->drr_object, NULL) != 0) return (SET_ERROR(EINVAL)); @@ -2352,6 +2364,9 @@ receive_write_byref(struct receive_writer_arg *rwa, ref_os = rwa->os; } + if (drrwbr->drr_object > rwa->max_object) + rwa->max_object = drrwbr->drr_object; + err = dmu_buf_hold(ref_os, drrwbr->drr_refobject, drrwbr->drr_refoffset, FTAG, &dbp, DMU_READ_PREFETCH); if (err != 0) @@ -2394,6 +2409,9 @@ receive_write_embedded(struct receive_writer_arg *rwa, if (drrwe->drr_compression >= ZIO_COMPRESS_FUNCTIONS) return (EINVAL); + if (drrwe->drr_object > rwa->max_object) + rwa->max_object = drrwe->drr_object; + tx = dmu_tx_create(rwa->os); dmu_tx_hold_write(tx, drrwe->drr_object, @@ -2430,6 +2448,9 @@ receive_spill(struct receive_writer_arg *rwa, struct d if (dmu_object_info(rwa->os, drrs->drr_object, NULL) != 0) return (SET_ERROR(EINVAL)); + if (drrs->drr_object > rwa->max_object) + rwa->max_object = drrs->drr_object; + VERIFY0(dmu_bonus_hold(rwa->os, drrs->drr_object, FTAG, &db)); if ((err = dmu_spill_hold_by_bonus(db, FTAG, &db_spill)) != 0) { dmu_buf_rele(db, FTAG); @@ -2474,6 +2495,9 @@ receive_free(struct receive_writer_arg *rwa, struct dr if (dmu_object_info(rwa->os, drrf->drr_object, NULL) != 0) return (SET_ERROR(EINVAL)); + if (drrf->drr_object > rwa->max_object) + rwa->max_object = drrf->drr_object; + err = dmu_free_long_range(rwa->os, drrf->drr_object, drrf->drr_offset, drrf->drr_length); @@ -3092,6 +3116,41 @@ dmu_recv_stream(dmu_recv_cookie_t *drc, struct file *f cv_wait(&rwa.cv, &rwa.mutex); } mutex_exit(&rwa.mutex); + + /* + * If we are receiving a full stream as a clone, all object IDs which + * are greater than the maximum ID referenced in the stream are + * by definition unused and must be freed. Note that it's possible that + * we've resumed this send and the first record we received was the END + * record. In that case, max_object would be 0, but we shouldn't start + * freeing all objects from there; instead we should start from the + * resumeobj. + */ + if (drc->drc_clone && drc->drc_drrb->drr_fromguid == 0) { + uint64_t obj; + if (nvlist_lookup_uint64(begin_nvl, "resume_object", &obj) != 0) + obj = 0; + if (rwa.max_object > obj) + obj = rwa.max_object; + obj++; + int free_err = 0; + int next_err = 0; + + while (next_err == 0) { + free_err = dmu_free_long_object(rwa.os, obj); + if (free_err != 0 && free_err != ENOENT) + break; + + next_err = dmu_object_next(rwa.os, &obj, FALSE, 0); + } + + if (err == 0) { + if (free_err != 0 && free_err != ENOENT) + err = free_err; + else if (next_err != ESRCH) + err = next_err; + } + } cv_destroy(&rwa.cv); mutex_destroy(&rwa.mutex); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h Wed Oct 3 14:43:17 2018 (r339125) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h Wed Oct 3 14:44:16 2018 (r339126) @@ -70,6 +70,7 @@ typedef struct dmu_recv_cookie { boolean_t drc_byteswap; boolean_t drc_force; boolean_t drc_resumable; + boolean_t drc_clone; struct avl_tree *drc_guid_to_ds_map; zio_cksum_t drc_cksum; uint64_t drc_newsnapobj; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:45:49 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 055E810C53C7; Wed, 3 Oct 2018 14:45:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B01A57DD13; Wed, 3 Oct 2018 14:45:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id AB09818530; Wed, 3 Oct 2018 14:45:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Ejm2u030046; Wed, 3 Oct 2018 14:45:48 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Ejm5s030045; Wed, 3 Oct 2018 14:45:48 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031445.w93Ejm5s030045@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:45:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339127 - stable/11/cddl/contrib/opensolaris/cmd/zdb X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/cmd/zdb X-SVN-Commit-Revision: 339127 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:45:49 -0000 Author: mav Date: Wed Oct 3 14:45:48 2018 New Revision: 339127 URL: https://svnweb.freebsd.org/changeset/base/339127 Log: MFC r337179: 9523 Large alloc in zdb can cause trouble 16MB alloc in zdb_embedded_block() can cause cores in certain situations (clang, gcc55). OsX commit: https://github.com/openzfsonosx/zfs/commit/ced236a5da6e72ea7bf6d2919fe14e17cffe10f1 FreeBSD commit: https://svnweb.freebsd.org/base?view=revision&revision=326150 illumos/illumos-gate@03a4c2f4bfaca30115963b76445279b36468a614 Reviewed by: Igor Kozhukhov Reviewed by: Andriy Gapon Reviewed by: Matthew Ahrens Approved by: Dan McDonald Author: Jorgen Lundman This is an update for r326150 (by avg), where this change comes from. Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 14:44:16 2018 (r339126) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Oct 3 14:45:48 2018 (r339127) @@ -4950,19 +4950,18 @@ zdb_embedded_block(char *thing) words + 8, words + 9, words + 10, words + 11, words + 12, words + 13, words + 14, words + 15); if (err != 16) { - (void) printf("invalid input format\n"); + (void) fprintf(stderr, "invalid input format\n"); exit(1); } ASSERT3U(BPE_GET_LSIZE(&bp), <=, SPA_MAXBLOCKSIZE); buf = malloc(SPA_MAXBLOCKSIZE); if (buf == NULL) { - (void) fprintf(stderr, "%s: failed to allocate %llu bytes\n", - __func__, SPA_MAXBLOCKSIZE); + (void) fprintf(stderr, "out of memory\n"); exit(1); } err = decode_embedded_bp(&bp, buf, BPE_GET_LSIZE(&bp)); if (err != 0) { - (void) printf("decode failed: %u\n", err); + (void) fprintf(stderr, "decode failed: %u\n", err); free(buf); exit(1); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:46:26 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3D25510C5443; Wed, 3 Oct 2018 14:46:26 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E67BB7DE8A; Wed, 3 Oct 2018 14:46:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E146318531; Wed, 3 Oct 2018 14:46:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EkPs9030129; Wed, 3 Oct 2018 14:46:25 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EkP7h030126; Wed, 3 Oct 2018 14:46:25 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031446.w93EkP7h030126@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:46:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339128 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339128 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:46:26 -0000 Author: mav Date: Wed Oct 3 14:46:25 2018 New Revision: 339128 URL: https://svnweb.freebsd.org/changeset/base/339128 Log: MFC r337181: 9539 Make zvol operations use _by_dnode routines Continues what was started in 7801 add more by-dnode routines by fully converting zvols to avoid unnecessary dnode_hold() calls. This saves a small amount of CPU time and slightly improves latencies of operations on zvols. illumos/illumos-gate@8dfe5547fbf0979fc1065a8b6fddc1e940a7cf4f Reviewed by: Matthew Ahrens Reviewed by: Brian Behlendorf Reviewed by: Rick McNeal Approved by: Dan McDonald Author: Richard Yao Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Oct 3 14:45:48 2018 (r339127) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Oct 3 14:46:25 2018 (r339128) @@ -449,7 +449,7 @@ dmu_spill_hold_by_bonus(dmu_buf_t *bonus, void *tag, d * and can induce severe lock contention when writing to several files * whose dnodes are in the same block. */ -static int +int dmu_buf_hold_array_by_dnode(dnode_t *dn, uint64_t offset, uint64_t length, boolean_t read, void *tag, int *numbufsp, dmu_buf_t ***dbpp, uint32_t flags) { @@ -1321,7 +1321,7 @@ xuio_stat_wbuf_nocopy(void) } #ifdef _KERNEL -static int +int dmu_read_uio_dnode(dnode_t *dn, uio_t *uio, uint64_t size) { dmu_buf_t **dbp; @@ -1437,7 +1437,7 @@ dmu_read_uio(objset_t *os, uint64_t object, uio_t *uio return (err); } -static int +int dmu_write_uio_dnode(dnode_t *dn, uio_t *uio, uint64_t size, dmu_tx_t *tx) { dmu_buf_t **dbp; @@ -1873,22 +1873,17 @@ dmu_return_arcbuf(arc_buf_t *buf) * dmu_write(). */ void -dmu_assign_arcbuf(dmu_buf_t *handle, uint64_t offset, arc_buf_t *buf, +dmu_assign_arcbuf_dnode(dnode_t *dn, uint64_t offset, arc_buf_t *buf, dmu_tx_t *tx) { - dmu_buf_impl_t *dbuf = (dmu_buf_impl_t *)handle; - dnode_t *dn; dmu_buf_impl_t *db; uint32_t blksz = (uint32_t)arc_buf_lsize(buf); uint64_t blkid; - DB_DNODE_ENTER(dbuf); - dn = DB_DNODE(dbuf); rw_enter(&dn->dn_struct_rwlock, RW_READER); blkid = dbuf_whichblock(dn, 0, offset); VERIFY((db = dbuf_hold(dn, blkid, FTAG)) != NULL); rw_exit(&dn->dn_struct_rwlock); - DB_DNODE_EXIT(dbuf); /* * We can only assign if the offset is aligned, the arc buf is the @@ -1916,17 +1911,25 @@ dmu_assign_arcbuf(dmu_buf_t *handle, uint64_t offset, ASSERT3U(arc_get_compression(buf), ==, ZIO_COMPRESS_OFF); ASSERT(!(buf->b_flags & ARC_BUF_FLAG_COMPRESSED)); - DB_DNODE_ENTER(dbuf); - dn = DB_DNODE(dbuf); os = dn->dn_objset; object = dn->dn_object; - DB_DNODE_EXIT(dbuf); dbuf_rele(db, FTAG); dmu_write(os, object, offset, blksz, buf->b_data, tx); dmu_return_arcbuf(buf); XUIOSTAT_BUMP(xuiostat_wbuf_copied); } +} + +void +dmu_assign_arcbuf(dmu_buf_t *handle, uint64_t offset, arc_buf_t *buf, + dmu_tx_t *tx) +{ + dmu_buf_impl_t *dbuf = (dmu_buf_impl_t *)handle; + + DB_DNODE_ENTER(dbuf); + dmu_assign_arcbuf_dnode(DB_DNODE(dbuf), offset, buf, tx); + DB_DNODE_EXIT(dbuf); } typedef struct { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 14:45:48 2018 (r339127) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Oct 3 14:46:25 2018 (r339128) @@ -519,6 +519,9 @@ uint64_t dmu_buf_refcount(dmu_buf_t *db); int dmu_buf_hold_array_by_bonus(dmu_buf_t *db, uint64_t offset, uint64_t length, boolean_t read, void *tag, int *numbufsp, dmu_buf_t ***dbpp); +int dmu_buf_hold_array_by_dnode(dnode_t *dn, uint64_t offset, uint64_t length, + boolean_t read, void *tag, int *numbufsp, dmu_buf_t ***dbpp, + uint32_t flags); void dmu_buf_rele_array(dmu_buf_t **, int numbufs, void *tag); typedef void dmu_buf_evict_func_t(void *user_ptr); @@ -757,10 +760,13 @@ void dmu_prealloc(objset_t *os, uint64_t object, uint6 dmu_tx_t *tx); int dmu_read_uio(objset_t *os, uint64_t object, struct uio *uio, uint64_t size); int dmu_read_uio_dbuf(dmu_buf_t *zdb, struct uio *uio, uint64_t size); +int dmu_read_uio_dnode(dnode_t *dn, struct uio *uio, uint64_t size); int dmu_write_uio(objset_t *os, uint64_t object, struct uio *uio, uint64_t size, dmu_tx_t *tx); int dmu_write_uio_dbuf(dmu_buf_t *zdb, struct uio *uio, uint64_t size, dmu_tx_t *tx); +int dmu_write_uio_dnode(dnode_t *dn, struct uio *uio, uint64_t size, + dmu_tx_t *tx); #ifdef _KERNEL #ifdef illumos int dmu_write_pages(objset_t *os, uint64_t object, uint64_t offset, @@ -774,6 +780,8 @@ int dmu_read_pages(objset_t *os, uint64_t object, vm_p #endif struct arc_buf *dmu_request_arcbuf(dmu_buf_t *handle, int size); void dmu_return_arcbuf(struct arc_buf *buf); +void dmu_assign_arcbuf_dnode(dnode_t *handle, uint64_t offset, + struct arc_buf *buf, dmu_tx_t *tx); void dmu_assign_arcbuf(dmu_buf_t *handle, uint64_t offset, struct arc_buf *buf, dmu_tx_t *tx); int dmu_xuio_init(struct xuio *uio, int niov); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Wed Oct 3 14:45:48 2018 (r339127) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Wed Oct 3 14:46:25 2018 (r339128) @@ -174,7 +174,7 @@ typedef struct zvol_state { zilog_t *zv_zilog; /* ZIL handle */ list_t zv_extents; /* List of extents for dump */ znode_t zv_znode; /* for range locking */ - dmu_buf_t *zv_dbuf; /* bonus handle */ + dnode_t *zv_dn; /* dnode hold */ #ifndef illumos int zv_state; int zv_volmode; /* Provide GEOM or cdev */ @@ -868,7 +868,7 @@ zvol_first_open(zvol_state_t *zv) } zv->zv_volblocksize = doi.doi_data_block_size; - error = dmu_bonus_hold(os, ZVOL_OBJ, zvol_tag, &zv->zv_dbuf); + error = dnode_hold(os, ZVOL_OBJ, zvol_tag, &zv->zv_dn); if (error) { dmu_objset_disown(os, zvol_tag); return (error); @@ -893,8 +893,8 @@ zvol_last_close(zvol_state_t *zv) zil_close(zv->zv_zilog); zv->zv_zilog = NULL; - dmu_buf_rele(zv->zv_dbuf, zvol_tag); - zv->zv_dbuf = NULL; + dnode_rele(zv->zv_dn, zvol_tag); + zv->zv_dn = NULL; /* * Evict cached data @@ -1342,8 +1342,6 @@ static int zvol_get_data(void *arg, lr_write_t *lr, char *buf, struct lwb *lwb, zio_t *zio) { zvol_state_t *zv = arg; - objset_t *os = zv->zv_objset; - uint64_t object = ZVOL_OBJ; uint64_t offset = lr->lr_offset; uint64_t size = lr->lr_length; /* length of user data */ dmu_buf_t *db; @@ -1367,7 +1365,7 @@ zvol_get_data(void *arg, lr_write_t *lr, char *buf, st if (buf != NULL) { /* immediate write */ zgd->zgd_rl = zfs_range_lock(&zv->zv_znode, offset, size, RL_READER); - error = dmu_read(os, object, offset, size, buf, + error = dmu_read_by_dnode(zv->zv_dn, offset, size, buf, DMU_READ_NO_PREFETCH); } else { /* indirect write */ /* @@ -1380,7 +1378,7 @@ zvol_get_data(void *arg, lr_write_t *lr, char *buf, st offset = P2ALIGN(offset, size); zgd->zgd_rl = zfs_range_lock(&zv->zv_znode, offset, size, RL_READER); - error = dmu_buf_hold(os, object, offset, zgd, &db, + error = dmu_buf_hold_by_dnode(zv->zv_dn, offset, zgd, &db, DMU_READ_NO_PREFETCH); if (error == 0) { blkptr_t *bp = &lr->lr_blkptr; @@ -1451,8 +1449,8 @@ zvol_log_write(zvol_state_t *zv, dmu_tx_t *tx, offset_ itx = zil_itx_create(TX_WRITE, sizeof (*lr) + (wr_state == WR_COPIED ? len : 0)); lr = (lr_write_t *)&itx->itx_lr; - if (wr_state == WR_COPIED && dmu_read(zv->zv_objset, - ZVOL_OBJ, off, len, lr + 1, DMU_READ_NO_PREFETCH) != 0) { + if (wr_state == WR_COPIED && dmu_read_by_dnode(zv->zv_dn, + off, len, lr + 1, DMU_READ_NO_PREFETCH) != 0) { zil_itx_destroy(itx); itx = zil_itx_create(TX_WRITE, sizeof (*lr)); lr = (lr_write_t *)&itx->itx_lr; @@ -1874,7 +1872,7 @@ zvol_read(struct cdev *dev, struct uio *uio, int iofla if (bytes > volsize - uio->uio_loffset) bytes = volsize - uio->uio_loffset; - error = dmu_read_uio_dbuf(zv->zv_dbuf, uio, bytes); + error = dmu_read_uio_dnode(zv->zv_dn, uio, bytes); if (error) { /* convert checksum errors into IO errors */ if (error == ECKSUM) @@ -1946,7 +1944,7 @@ zvol_write(struct cdev *dev, struct uio *uio, int iofl dmu_tx_abort(tx); break; } - error = dmu_write_uio_dbuf(zv->zv_dbuf, uio, bytes, tx); + error = dmu_write_uio_dnode(zv->zv_dn, uio, bytes, tx); if (error == 0) zvol_log_write(zv, tx, off, bytes, sync); dmu_tx_commit(tx); @@ -2028,7 +2026,7 @@ zvol_getefi(void *arg, int flag, uint64_t vs, uint8_t int zvol_get_volume_params(minor_t minor, uint64_t *blksize, uint64_t *max_xfer_len, void **minor_hdl, void **objset_hdl, void **zil_hdl, - void **rl_hdl, void **bonus_hdl) + void **rl_hdl, void **dnode_hdl) { zvol_state_t *zv; @@ -2039,7 +2037,7 @@ zvol_get_volume_params(minor_t minor, uint64_t *blksiz return (SET_ERROR(ENXIO)); ASSERT(blksize && max_xfer_len && minor_hdl && - objset_hdl && zil_hdl && rl_hdl && bonus_hdl); + objset_hdl && zil_hdl && rl_hdl && dnode_hdl); *blksize = zv->zv_volblocksize; *max_xfer_len = (uint64_t)zvol_maxphys; @@ -2047,7 +2045,7 @@ zvol_get_volume_params(minor_t minor, uint64_t *blksiz *objset_hdl = zv->zv_objset; *zil_hdl = zv->zv_zilog; *rl_hdl = &zv->zv_znode; - *bonus_hdl = zv->zv_dbuf; + *dnode_hdl = zv->zv_dn; return (0); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:47:31 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 36DC210C5505; Wed, 3 Oct 2018 14:47:31 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DC69D7E029; Wed, 3 Oct 2018 14:47:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id D750218536; Wed, 3 Oct 2018 14:47:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93ElUKa030230; Wed, 3 Oct 2018 14:47:30 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93ElTit030224; Wed, 3 Oct 2018 14:47:29 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031447.w93ElTit030224@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:47:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339129 - in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339129 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:47:31 -0000 Author: mav Date: Wed Oct 3 14:47:29 2018 New Revision: 339129 URL: https://svnweb.freebsd.org/changeset/base/339129 Log: MFC r337183: MFV r337182: 9330 stack overflow when creating a deeply nested dataset Datasets that are deeply nested (~100 levels) are impractical. We just put a limit of 50 levels to newly created datasets. Existing datasets should work without a problem. illumos/illumos-gate@5ac95da7d61660aa299c287a39277cb0372be959 Reviewed by: John Kennedy Reviewed by: Matt Ahrens Approved by: Garrett D'Amore Author: Serapheim Dimitropoulos Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 14:47:29 2018 (r339129) @@ -320,7 +320,8 @@ namespace. For example: .Pp where the maximum length of a dataset name is .Dv MAXNAMELEN -(256 bytes). +(256 bytes) +and the maximum amount of nesting allowed in a path is 50 levels deep. .Pp A dataset can be one of the following: .Bl -hang -width 12n Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 14:47:29 2018 (r339129) @@ -3449,8 +3449,22 @@ zfs_create_ancestors(libzfs_handle_t *hdl, const char { int prefix; char *path_copy; + char errbuf[1024]; int rc = 0; + (void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN, + "cannot create '%s'"), path); + + /* + * Check that we are not passing the nesting limit + * before we start creating any ancestors. + */ + if (dataset_nestcheck(path) != 0) { + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "maximum name nesting depth exceeded")); + return (zfs_error(hdl, EZFS_INVALIDNAME, errbuf)); + } + if (check_parents(hdl, path, NULL, B_TRUE, &prefix) != 0) return (-1); @@ -3486,6 +3500,12 @@ zfs_create(libzfs_handle_t *hdl, const char *path, zfs if (!zfs_validate_name(hdl, path, type, B_TRUE)) return (zfs_error(hdl, EZFS_INVALIDNAME, errbuf)); + if (dataset_nestcheck(path) != 0) { + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "maximum name nesting depth exceeded")); + return (zfs_error(hdl, EZFS_INVALIDNAME, errbuf)); + } + /* validate parents exist */ if (check_parents(hdl, path, &zoned, B_FALSE, NULL) != 0) return (-1); @@ -4286,6 +4306,7 @@ zfs_rename(zfs_handle_t *zhp, const char *source, cons errbuf)); } } + if (!zfs_validate_name(hdl, target, zhp->zfs_type, B_TRUE)) return (zfs_error(hdl, EZFS_INVALIDNAME, errbuf)); } else { Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c Wed Oct 3 14:47:29 2018 (r339129) @@ -23,7 +23,7 @@ * Use is subject to license terms. */ /* - * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. */ /* @@ -34,8 +34,6 @@ * name is invalid. In the kernel, we only care whether it's valid or not. * Each routine therefore takes a 'namecheck_err_t' which describes exactly why * the name failed to validate. - * - * Each function returns 0 on success, -1 on error. */ #if defined(_KERNEL) @@ -50,6 +48,14 @@ #include "zfs_namecheck.h" #include "zfs_deleg.h" +/* + * Deeply nested datasets can overflow the stack, so we put a limit + * in the amount of nesting a path can have. zfs_max_dataset_nesting + * can be tuned temporarily to fix existing datasets that exceed our + * predefined limit. + */ +int zfs_max_dataset_nesting = 50; + static int valid_char(char c) { @@ -60,10 +66,35 @@ valid_char(char c) } /* + * Looks at a path and returns its level of nesting (depth). + */ +int +get_dataset_depth(const char *path) +{ + const char *loc = path; + int nesting = 0; + + /* + * Keep track of nesting until you hit the end of the + * path or found the snapshot/bookmark seperator. + */ + for (int i = 0; loc[i] != '\0' && + loc[i] != '@' && + loc[i] != '#'; i++) { + if (loc[i] == '/') + nesting++; + } + + return (nesting); +} + +/* * Snapshot names must be made up of alphanumeric characters plus the following * characters: * - * [-_.: ] + * [-_.: ] + * + * Returns 0 on success, -1 on error. */ int zfs_component_namecheck(const char *path, namecheck_err_t *why, char *what) @@ -99,6 +130,8 @@ zfs_component_namecheck(const char *path, namecheck_er * Permissions set name must start with the letter '@' followed by the * same character restrictions as snapshot names, except that the name * cannot exceed 64 characters. + * + * Returns 0 on success, -1 on error. */ int permset_namecheck(const char *path, namecheck_err_t *why, char *what) @@ -121,28 +154,40 @@ permset_namecheck(const char *path, namecheck_err_t *w } /* + * Dataset paths should not be deeper than zfs_max_dataset_nesting + * in terms of nesting. + * + * Returns 0 on success, -1 on error. + */ +int +dataset_nestcheck(const char *path) +{ + return ((get_dataset_depth(path) < zfs_max_dataset_nesting) ? 0 : -1); +} + +/* * Entity names must be of the following form: * - * [component/]*[component][(@|#)component]? + * [component/]*[component][(@|#)component]? * * Where each component is made up of alphanumeric characters plus the following * characters: * - * [-_.:%] + * [-_.:%] * * We allow '%' here as we use that character internally to create unique * names for temporary clones (for online recv). + * + * Returns 0 on success, -1 on error. */ int entity_namecheck(const char *path, namecheck_err_t *why, char *what) { - const char *start, *end; - int found_delim; + const char *end; /* * Make sure the name is not too long. */ - if (strlen(path) >= ZFS_MAX_DATASET_NAME_LEN) { if (why) *why = NAME_ERR_TOOLONG; @@ -162,8 +207,8 @@ entity_namecheck(const char *path, namecheck_err_t *wh return (-1); } - start = path; - found_delim = 0; + const char *start = path; + boolean_t found_delim = B_FALSE; for (;;) { /* Find the end of this component */ end = start; @@ -198,7 +243,7 @@ entity_namecheck(const char *path, namecheck_err_t *wh return (-1); } - found_delim = 1; + found_delim = B_TRUE; } /* Zero-length components are not allowed */ @@ -250,6 +295,8 @@ dataset_namecheck(const char *path, namecheck_err_t *w * mountpoint names must be of the following form: * * /[component][/]*[component][/] + * + * Returns 0 on success, -1 on error. */ int mountpoint_namecheck(const char *path, namecheck_err_t *why) @@ -294,6 +341,8 @@ mountpoint_namecheck(const char *path, namecheck_err_t * dataset names, with the additional restriction that the pool name must begin * with a letter. The pool names 'raidz' and 'mirror' are also reserved names * that cannot be used. + * + * Returns 0 on success, -1 on error. */ int pool_namecheck(const char *pool, namecheck_err_t *why, char *what) Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h Wed Oct 3 14:47:29 2018 (r339129) @@ -23,7 +23,7 @@ * Use is subject to license terms. */ /* - * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. */ #ifndef _ZFS_NAMECHECK_H @@ -48,9 +48,13 @@ typedef enum { #define ZFS_PERMSET_MAXLEN 64 +extern int zfs_max_dataset_nesting; + +int get_dataset_depth(const char *); int pool_namecheck(const char *, namecheck_err_t *, char *); int entity_namecheck(const char *, namecheck_err_t *, char *); int dataset_namecheck(const char *, namecheck_err_t *, char *); +int dataset_nestcheck(const char *); int mountpoint_namecheck(const char *, namecheck_err_t *); int zfs_component_namecheck(const char *, namecheck_err_t *, char *); int permset_namecheck(const char *, namecheck_err_t *, char *); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Oct 3 14:47:29 2018 (r339129) @@ -54,6 +54,7 @@ #include #include #include +#include "zfs_namecheck.h" /* * Needed to close a window in dnode_move() that allows the objset to be freed @@ -911,6 +912,9 @@ dmu_objset_create_check(void *arg, dmu_tx_t *tx) return (SET_ERROR(EINVAL)); if (strlen(doca->doca_name) >= ZFS_MAX_DATASET_NAME_LEN) + return (SET_ERROR(ENAMETOOLONG)); + + if (dataset_nestcheck(doca->doca_name) != 0) return (SET_ERROR(ENAMETOOLONG)); error = dsl_dir_hold(dp, doca->doca_name, FTAG, &pdd, &tail); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Oct 3 14:46:25 2018 (r339128) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Oct 3 14:47:29 2018 (r339129) @@ -1819,17 +1819,29 @@ typedef struct dsl_dir_rename_arg { cred_t *ddra_cred; } dsl_dir_rename_arg_t; +typedef struct dsl_valid_rename_arg { + int char_delta; + int nest_delta; +} dsl_valid_rename_arg_t; + /* ARGSUSED */ static int dsl_valid_rename(dsl_pool_t *dp, dsl_dataset_t *ds, void *arg) { - int *deltap = arg; + dsl_valid_rename_arg_t *dvra = arg; char namebuf[ZFS_MAX_DATASET_NAME_LEN]; dsl_dataset_name(ds, namebuf); - if (strlen(namebuf) + *deltap >= ZFS_MAX_DATASET_NAME_LEN) + ASSERT3U(strnlen(namebuf, ZFS_MAX_DATASET_NAME_LEN), + <, ZFS_MAX_DATASET_NAME_LEN); + int namelen = strlen(namebuf) + dvra->char_delta; + int depth = get_dataset_depth(namebuf) + dvra->nest_delta; + + if (namelen >= ZFS_MAX_DATASET_NAME_LEN) return (SET_ERROR(ENAMETOOLONG)); + if (dvra->nest_delta > 0 && depth >= zfs_max_dataset_nesting) + return (SET_ERROR(ENAMETOOLONG)); return (0); } @@ -1839,9 +1851,9 @@ dsl_dir_rename_check(void *arg, dmu_tx_t *tx) dsl_dir_rename_arg_t *ddra = arg; dsl_pool_t *dp = dmu_tx_pool(tx); dsl_dir_t *dd, *newparent; + dsl_valid_rename_arg_t dvra; const char *mynewname; int error; - int delta = strlen(ddra->ddra_newname) - strlen(ddra->ddra_oldname); /* target dir should exist */ error = dsl_dir_hold(dp, ddra->ddra_oldname, FTAG, &dd, NULL); @@ -1870,10 +1882,19 @@ dsl_dir_rename_check(void *arg, dmu_tx_t *tx) return (SET_ERROR(EEXIST)); } + ASSERT3U(strnlen(ddra->ddra_newname, ZFS_MAX_DATASET_NAME_LEN), + <, ZFS_MAX_DATASET_NAME_LEN); + ASSERT3U(strnlen(ddra->ddra_oldname, ZFS_MAX_DATASET_NAME_LEN), + <, ZFS_MAX_DATASET_NAME_LEN); + dvra.char_delta = strlen(ddra->ddra_newname) + - strlen(ddra->ddra_oldname); + dvra.nest_delta = get_dataset_depth(ddra->ddra_newname) + - get_dataset_depth(ddra->ddra_oldname); + /* if the name length is growing, validate child name lengths */ - if (delta > 0) { + if (dvra.char_delta > 0 || dvra.nest_delta > 0) { error = dmu_objset_find_dp(dp, dd->dd_object, dsl_valid_rename, - &delta, DS_FIND_CHILDREN | DS_FIND_SNAPSHOTS); + &dvra, DS_FIND_CHILDREN | DS_FIND_SNAPSHOTS); if (error != 0) { dsl_dir_rele(newparent, FTAG); dsl_dir_rele(dd, FTAG); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:48:18 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EE32410C55B2; Wed, 3 Oct 2018 14:48:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A3A557E1CB; Wed, 3 Oct 2018 14:48:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 9EA4F18557; Wed, 3 Oct 2018 14:48:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EmHVP030317; Wed, 3 Oct 2018 14:48:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EmHld030316; Wed, 3 Oct 2018 14:48:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031448.w93EmHld030316@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:48:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339130 - stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Commit-Revision: 339130 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:48:18 -0000 Author: mav Date: Wed Oct 3 14:48:17 2018 New Revision: 339130 URL: https://svnweb.freebsd.org/changeset/base/339130 Log: MFC r337185: MFV r337184: 9457 libzfs_import.c:add_config() has a memory leak A memory leak occurs on lines 209 and 213 because the config is not freed in the error case. The interface to add_config() seems less than ideal - it would be better if it copied any data necessary from the config and the caller freed it. illumos/illumos-gate@ddfe901b12348d31c500fb57f9174e88860a4061 Reviewed by: Matt Ahrens Reviewed by: Serapheim Dimitropoulos Approved by: Robert Mustacchi Author: sara hartse Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Oct 3 14:47:29 2018 (r339129) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Oct 3 14:48:17 2018 (r339130) @@ -33,7 +33,7 @@ * ZFS label of each device. If we successfully read the label, then we * organize the configuration information in the following hierarchy: * - * pool guid -> toplevel vdev guid -> label txg + * pool guid -> toplevel vdev guid -> label txg * * Duplicate entries matching this same tuple will be discarded. Once we have * examined every device, we pick the best label txg config for each toplevel @@ -245,7 +245,6 @@ add_config(libzfs_handle_t *hdl, pool_list_t *pl, cons ne->ne_next = pl->names; pl->names = ne; - nvlist_free(config); return (0); } @@ -265,7 +264,6 @@ add_config(libzfs_handle_t *hdl, pool_list_t *pl, cons &top_guid) != 0 || nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_TXG, &txg) != 0 || txg == 0) { - nvlist_free(config); return (0); } @@ -280,7 +278,6 @@ add_config(libzfs_handle_t *hdl, pool_list_t *pl, cons if (pe == NULL) { if ((pe = zfs_alloc(hdl, sizeof (pool_entry_t))) == NULL) { - nvlist_free(config); return (-1); } pe->pe_guid = pool_guid; @@ -299,7 +296,6 @@ add_config(libzfs_handle_t *hdl, pool_list_t *pl, cons if (ve == NULL) { if ((ve = zfs_alloc(hdl, sizeof (vdev_entry_t))) == NULL) { - nvlist_free(config); return (-1); } ve->ve_guid = top_guid; @@ -319,15 +315,12 @@ add_config(libzfs_handle_t *hdl, pool_list_t *pl, cons if (ce == NULL) { if ((ce = zfs_alloc(hdl, sizeof (config_entry_t))) == NULL) { - nvlist_free(config); return (-1); } ce->ce_txg = txg; - ce->ce_config = config; + ce->ce_config = fnvlist_dup(config); ce->ce_next = ve->ve_configs; ve->ve_configs = ce; - } else { - nvlist_free(config); } /* @@ -1396,9 +1389,7 @@ skipdir: &this_guid) == 0 && iarg->guid == this_guid; } - if (!matched) { - nvlist_free(config); - } else { + if (matched) { /* * use the non-raw path for the config */ @@ -1408,6 +1399,7 @@ skipdir: config) != 0) config_failed = B_TRUE; } + nvlist_free(config); } free(slice->rn_name); free(slice); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:48:57 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ECEFF10C564E; Wed, 3 Oct 2018 14:48:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A01C67E30F; Wed, 3 Oct 2018 14:48:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 9AB061855A; Wed, 3 Oct 2018 14:48:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EmueW030398; Wed, 3 Oct 2018 14:48:56 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Emtf2030393; Wed, 3 Oct 2018 14:48:55 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031448.w93Emtf2030393@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:48:55 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339131 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339131 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:48:57 -0000 Author: mav Date: Wed Oct 3 14:48:55 2018 New Revision: 339131 URL: https://svnweb.freebsd.org/changeset/base/339131 Log: MFC r337191: MFV r337190: 9486 reduce memory used by device removal on fragmented pools In the most fragmented real-world cases, this reduces memory used by the mapping from ~1GB to ~50MB of RAM per 1TB of storage removed. Less fragmented cases will typically also see around 50-100MB of RAM per 1TB of storage. illumos/illumos-gate@cfd63e1b1bcf7ba4bf72f55ddbd87ce008d2986d Reviewed by: George Wilson Reviewed by: Serapheim Dimitropoulos Reviewed by: Brian Behlendorf Reviewed by: Tim Chase Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c Wed Oct 3 14:48:17 2018 (r339130) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c Wed Oct 3 14:48:55 2018 (r339131) @@ -491,7 +491,6 @@ range_tree_resize_segment(range_tree_t *rt, range_seg_ static range_seg_t * range_tree_find_impl(range_tree_t *rt, uint64_t start, uint64_t size) { - avl_index_t where; range_seg_t rsearch; uint64_t end = start + size; @@ -499,7 +498,7 @@ range_tree_find_impl(range_tree_t *rt, uint64_t start, rsearch.rs_start = start; rsearch.rs_end = end; - return (avl_find(&rt->rt_root, &rsearch, &where)); + return (avl_find(&rt->rt_root, &rsearch, NULL)); } range_seg_t * @@ -650,4 +649,24 @@ range_tree_is_empty(range_tree_t *rt) { ASSERT(rt != NULL); return (range_tree_space(rt) == 0); +} + +uint64_t +range_tree_min(range_tree_t *rt) +{ + range_seg_t *rs = avl_first(&rt->rt_root); + return (rs != NULL ? rs->rs_start : 0); +} + +uint64_t +range_tree_max(range_tree_t *rt) +{ + range_seg_t *rs = avl_last(&rt->rt_root); + return (rs != NULL ? rs->rs_end : 0); +} + +uint64_t +range_tree_span(range_tree_t *rt) +{ + return (range_tree_max(rt) - range_tree_min(rt)); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h Wed Oct 3 14:48:17 2018 (r339130) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/range_tree.h Wed Oct 3 14:48:55 2018 (r339131) @@ -95,6 +95,9 @@ boolean_t range_tree_is_empty(range_tree_t *rt); void range_tree_verify(range_tree_t *rt, uint64_t start, uint64_t size); void range_tree_swap(range_tree_t **rtsrc, range_tree_t **rtdst); void range_tree_stat_verify(range_tree_t *rt); +uint64_t range_tree_min(range_tree_t *rt); +uint64_t range_tree_max(range_tree_t *rt); +uint64_t range_tree_span(range_tree_t *rt); void range_tree_add(void *arg, uint64_t start, uint64_t size); void range_tree_remove(void *arg, uint64_t start, uint64_t size); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h Wed Oct 3 14:48:17 2018 (r339130) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_removal.h Wed Oct 3 14:48:55 2018 (r339131) @@ -86,6 +86,9 @@ extern void spa_vdev_remove_suspend(spa_t *); extern int spa_vdev_remove_cancel(spa_t *); extern void spa_vdev_removal_destroy(spa_vdev_removal_t *svr); +extern int vdev_removal_max_span; +extern int zfs_remove_max_segment; + #ifdef __cplusplus } #endif Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 14:48:17 2018 (r339130) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 14:48:55 2018 (r339131) @@ -33,15 +33,15 @@ * 1. Uniquely identify this device as part of a ZFS pool and confirm its * identity within the pool. * - * 2. Verify that all the devices given in a configuration are present + * 2. Verify that all the devices given in a configuration are present * within the pool. * - * 3. Determine the uberblock for the pool. + * 3. Determine the uberblock for the pool. * - * 4. In case of an import operation, determine the configuration of the + * 4. In case of an import operation, determine the configuration of the * toplevel vdev of which it is a part. * - * 5. If an import operation cannot find all the devices in the pool, + * 5. If an import operation cannot find all the devices in the pool, * provide enough information to the administrator to determine which * devices are missing. * @@ -77,9 +77,9 @@ * In order to identify which labels are valid, the labels are written in the * following manner: * - * 1. For each vdev, update 'L1' to the new label - * 2. Update the uberblock - * 3. For each vdev, update 'L2' to the new label + * 1. For each vdev, update 'L1' to the new label + * 2. Update the uberblock + * 3. For each vdev, update 'L2' to the new label * * Given arbitrary failure, we can determine the correct label to use based on * the transaction group. If we fail after updating L1 but before updating the @@ -117,19 +117,19 @@ * * The nvlist describing the pool and vdev contains the following elements: * - * version ZFS on-disk version - * name Pool name - * state Pool state - * txg Transaction group in which this label was written - * pool_guid Unique identifier for this pool - * vdev_tree An nvlist describing vdev tree. + * version ZFS on-disk version + * name Pool name + * state Pool state + * txg Transaction group in which this label was written + * pool_guid Unique identifier for this pool + * vdev_tree An nvlist describing vdev tree. * features_for_read * An nvlist of the features necessary for reading the MOS. * * Each leaf device label also contains the following: * - * top_guid Unique ID for top-level vdev in which this is contained - * guid Unique ID for the leaf vdev + * top_guid Unique ID for top-level vdev in which this is contained + * guid Unique ID for the leaf vdev * * The 'vs' configuration follows the format described in 'spa_config.c'. */ @@ -396,22 +396,33 @@ vdev_config_generate(spa_t *spa, vdev_t *vd, boolean_t * histograms. */ uint64_t seg_count = 0; + uint64_t to_alloc = vd->vdev_stat.vs_alloc; /* * There are the same number of allocated segments * as free segments, so we will have at least one - * entry per free segment. + * entry per free segment. However, small free + * segments (smaller than vdev_removal_max_span) + * will be combined with adjacent allocated segments + * as a single mapping. */ for (int i = 0; i < RANGE_TREE_HISTOGRAM_SIZE; i++) { - seg_count += vd->vdev_mg->mg_histogram[i]; + if (1ULL << (i + 1) < vdev_removal_max_span) { + to_alloc += + vd->vdev_mg->mg_histogram[i] << + i + 1; + } else { + seg_count += + vd->vdev_mg->mg_histogram[i]; + } } /* - * The maximum length of a mapping is SPA_MAXBLOCKSIZE, - * so we need at least one entry per SPA_MAXBLOCKSIZE - * of allocated data. + * The maximum length of a mapping is + * zfs_remove_max_segment, so we need at least one entry + * per zfs_remove_max_segment of allocated data. */ - seg_count += vd->vdev_stat.vs_alloc / SPA_MAXBLOCKSIZE; + seg_count += to_alloc / zfs_remove_max_segment; fnvlist_add_uint64(nv, ZPOOL_CONFIG_INDIRECT_SIZE, seg_count * Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c Wed Oct 3 14:48:17 2018 (r339130) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c Wed Oct 3 14:48:55 2018 (r339131) @@ -106,6 +106,24 @@ int zfs_remove_max_copy_bytes = 64 * 1024 * 1024; int zfs_remove_max_segment = 1024 * 1024; /* + * Allow a remap segment to span free chunks of at most this size. The main + * impact of a larger span is that we will read and write larger, more + * contiguous chunks, with more "unnecessary" data -- trading off bandwidth + * for iops. The value here was chosen to align with + * zfs_vdev_read_gap_limit, which is a similar concept when doing regular + * reads (but there's no reason it has to be the same). + * + * Additionally, a higher span will have the following relatively minor + * effects: + * - the mapping will be smaller, since one entry can cover more allocated + * segments + * - more of the fragmentation in the removing device will be preserved + * - we'll do larger allocations, which may fail and fall back on smaller + * allocations + */ +int vdev_removal_max_span = 32 * 1024; + +/* * This is used by the test suite so that it can ensure that certain * actions happen while in the middle of a removal. */ @@ -726,13 +744,52 @@ vdev_mapping_sync(void *arg, dmu_tx_t *tx) spa_sync_removing_state(spa, tx); } +typedef struct vdev_copy_segment_arg { + spa_t *vcsa_spa; + dva_t *vcsa_dest_dva; + uint64_t vcsa_txg; + range_tree_t *vcsa_obsolete_segs; +} vdev_copy_segment_arg_t; + +static void +unalloc_seg(void *arg, uint64_t start, uint64_t size) +{ + vdev_copy_segment_arg_t *vcsa = arg; + spa_t *spa = vcsa->vcsa_spa; + blkptr_t bp = { 0 }; + + BP_SET_BIRTH(&bp, TXG_INITIAL, TXG_INITIAL); + BP_SET_LSIZE(&bp, size); + BP_SET_PSIZE(&bp, size); + BP_SET_COMPRESS(&bp, ZIO_COMPRESS_OFF); + BP_SET_CHECKSUM(&bp, ZIO_CHECKSUM_OFF); + BP_SET_TYPE(&bp, DMU_OT_NONE); + BP_SET_LEVEL(&bp, 0); + BP_SET_DEDUP(&bp, 0); + BP_SET_BYTEORDER(&bp, ZFS_HOST_BYTEORDER); + + DVA_SET_VDEV(&bp.blk_dva[0], DVA_GET_VDEV(vcsa->vcsa_dest_dva)); + DVA_SET_OFFSET(&bp.blk_dva[0], + DVA_GET_OFFSET(vcsa->vcsa_dest_dva) + start); + DVA_SET_ASIZE(&bp.blk_dva[0], size); + + zio_free(spa, vcsa->vcsa_txg, &bp); +} + /* * All reads and writes associated with a call to spa_vdev_copy_segment() * are done. */ static void -spa_vdev_copy_nullzio_done(zio_t *zio) +spa_vdev_copy_segment_done(zio_t *zio) { + vdev_copy_segment_arg_t *vcsa = zio->io_private; + + range_tree_vacate(vcsa->vcsa_obsolete_segs, + unalloc_seg, vcsa); + range_tree_destroy(vcsa->vcsa_obsolete_segs); + kmem_free(vcsa, sizeof (*vcsa)); + spa_config_exit(zio->io_spa, SCL_STATE, zio->io_spa); } @@ -849,7 +906,8 @@ spa_vdev_copy_one_child(vdev_copy_arg_t *vca, zio_t *n * read from the old location and write to the new location. */ static int -spa_vdev_copy_segment(vdev_t *vd, uint64_t start, uint64_t size, uint64_t txg, +spa_vdev_copy_segment(vdev_t *vd, range_tree_t *segs, + uint64_t maxalloc, uint64_t txg, vdev_copy_arg_t *vca, zio_alloc_list_t *zal) { metaslab_group_t *mg = vd->vdev_mg; @@ -857,9 +915,40 @@ spa_vdev_copy_segment(vdev_t *vd, uint64_t start, uint spa_vdev_removal_t *svr = spa->spa_vdev_removal; vdev_indirect_mapping_entry_t *entry; dva_t dst = { 0 }; + uint64_t start = range_tree_min(segs); - ASSERT3U(size, <=, SPA_MAXBLOCKSIZE); + ASSERT3U(maxalloc, <=, SPA_MAXBLOCKSIZE); + uint64_t size = range_tree_span(segs); + if (range_tree_span(segs) > maxalloc) { + /* + * We can't allocate all the segments. Prefer to end + * the allocation at the end of a segment, thus avoiding + * additional split blocks. + */ + range_seg_t search; + avl_index_t where; + search.rs_start = start + maxalloc; + search.rs_end = search.rs_start; + range_seg_t *rs = avl_find(&segs->rt_root, &search, &where); + if (rs == NULL) { + rs = avl_nearest(&segs->rt_root, where, AVL_BEFORE); + } else { + rs = AVL_PREV(&segs->rt_root, rs); + } + if (rs != NULL) { + size = rs->rs_end - start; + } else { + /* + * There are no segments that end before maxalloc. + * I.e. the first segment is larger than maxalloc, + * so we must split it. + */ + size = maxalloc; + } + } + ASSERT3U(size, <=, maxalloc); + /* * We use allocator 0 for this I/O because we don't expect device remap * to be the steady state of the system, so parallelizing is not as @@ -873,6 +962,31 @@ spa_vdev_copy_segment(vdev_t *vd, uint64_t start, uint return (error); /* + * Determine the ranges that are not actually needed. Offsets are + * relative to the start of the range to be copied (i.e. relative to the + * local variable "start"). + */ + range_tree_t *obsolete_segs = range_tree_create(NULL, NULL); + + range_seg_t *rs = avl_first(&segs->rt_root); + ASSERT3U(rs->rs_start, ==, start); + uint64_t prev_seg_end = rs->rs_end; + while ((rs = AVL_NEXT(&segs->rt_root, rs)) != NULL) { + if (rs->rs_start >= start + size) { + break; + } else { + range_tree_add(obsolete_segs, + prev_seg_end - start, + rs->rs_start - prev_seg_end); + } + prev_seg_end = rs->rs_end; + } + /* We don't end in the middle of an obsolete range */ + ASSERT3U(start + size, <=, prev_seg_end); + + range_tree_clear(segs, start, size); + + /* * We can't have any padding of the allocated size, otherwise we will * misunderstand what's allocated, and the size of the mapping. * The caller ensures this will be true by passing in a size that is @@ -883,13 +997,22 @@ spa_vdev_copy_segment(vdev_t *vd, uint64_t start, uint entry = kmem_zalloc(sizeof (vdev_indirect_mapping_entry_t), KM_SLEEP); DVA_MAPPING_SET_SRC_OFFSET(&entry->vime_mapping, start); entry->vime_mapping.vimep_dst = dst; + if (spa_feature_is_enabled(spa, SPA_FEATURE_OBSOLETE_COUNTS)) { + entry->vime_obsolete_count = range_tree_space(obsolete_segs); + } + vdev_copy_segment_arg_t *vcsa = kmem_zalloc(sizeof (*vcsa), KM_SLEEP); + vcsa->vcsa_dest_dva = &entry->vime_mapping.vimep_dst; + vcsa->vcsa_obsolete_segs = obsolete_segs; + vcsa->vcsa_spa = spa; + vcsa->vcsa_txg = txg; + /* * See comment before spa_vdev_copy_one_child(). */ spa_config_enter(spa, SCL_STATE, spa, RW_READER); zio_t *nzio = zio_null(spa->spa_txg_zio[txg & TXG_MASK], spa, NULL, - spa_vdev_copy_nullzio_done, NULL, 0); + spa_vdev_copy_segment_done, vcsa, 0); vdev_t *dest_vd = vdev_lookup_top(spa, DVA_GET_VDEV(&dst)); if (dest_vd->vdev_ops == &vdev_mirror_ops) { for (int i = 0; i < dest_vd->vdev_children; i++) { @@ -1092,39 +1215,78 @@ spa_vdev_copy_impl(vdev_t *vd, spa_vdev_removal_t *svr mutex_enter(&svr->svr_lock); - range_seg_t *rs = avl_first(&svr->svr_allocd_segs->rt_root); - if (rs == NULL) { + /* + * Determine how big of a chunk to copy. We can allocate up + * to max_alloc bytes, and we can span up to vdev_removal_max_span + * bytes of unallocated space at a time. "segs" will track the + * allocated segments that we are copying. We may also be copying + * free segments (of up to vdev_removal_max_span bytes). + */ + range_tree_t *segs = range_tree_create(NULL, NULL); + for (;;) { + range_seg_t *rs = avl_first(&svr->svr_allocd_segs->rt_root); + if (rs == NULL) + break; + + uint64_t seg_length; + + if (range_tree_is_empty(segs)) { + /* need to truncate the first seg based on max_alloc */ + seg_length = + MIN(rs->rs_end - rs->rs_start, *max_alloc); + } else { + if (rs->rs_start - range_tree_max(segs) > + vdev_removal_max_span) { + /* + * Including this segment would cause us to + * copy a larger unneeded chunk than is allowed. + */ + break; + } else if (rs->rs_end - range_tree_min(segs) > + *max_alloc) { + /* + * This additional segment would extend past + * max_alloc. Rather than splitting this + * segment, leave it for the next mapping. + */ + break; + } else { + seg_length = rs->rs_end - rs->rs_start; + } + } + + range_tree_add(segs, rs->rs_start, seg_length); + range_tree_remove(svr->svr_allocd_segs, + rs->rs_start, seg_length); + } + + if (range_tree_is_empty(segs)) { mutex_exit(&svr->svr_lock); + range_tree_destroy(segs); return; } - uint64_t offset = rs->rs_start; - uint64_t length = MIN(rs->rs_end - rs->rs_start, *max_alloc); - range_tree_remove(svr->svr_allocd_segs, offset, length); - if (svr->svr_max_offset_to_sync[txg & TXG_MASK] == 0) { dsl_sync_task_nowait(dmu_tx_pool(tx), vdev_mapping_sync, svr, 0, ZFS_SPACE_CHECK_NONE, tx); } - svr->svr_max_offset_to_sync[txg & TXG_MASK] = offset + length; + svr->svr_max_offset_to_sync[txg & TXG_MASK] = range_tree_max(segs); /* * Note: this is the amount of *allocated* space * that we are taking care of each txg. */ - svr->svr_bytes_done[txg & TXG_MASK] += length; + svr->svr_bytes_done[txg & TXG_MASK] += range_tree_space(segs); mutex_exit(&svr->svr_lock); zio_alloc_list_t zal; metaslab_trace_init(&zal); - uint64_t thismax = *max_alloc; - while (length > 0) { - uint64_t mylen = MIN(length, thismax); - + uint64_t thismax = SPA_MAXBLOCKSIZE; + while (!range_tree_is_empty(segs)) { int error = spa_vdev_copy_segment(vd, - offset, mylen, txg, vca, &zal); + segs, thismax, txg, vca, &zal); if (error == ENOSPC) { /* @@ -1138,18 +1300,17 @@ spa_vdev_copy_impl(vdev_t *vd, spa_vdev_removal_t *svr */ ASSERT3U(spa->spa_max_ashift, >=, SPA_MINBLOCKSHIFT); ASSERT3U(spa->spa_max_ashift, ==, spa->spa_min_ashift); - thismax = P2ROUNDUP(mylen / 2, + uint64_t attempted = + MIN(range_tree_span(segs), thismax); + thismax = P2ROUNDUP(attempted / 2, 1 << spa->spa_max_ashift); - ASSERT3U(thismax, <, mylen); /* * The minimum-size allocation can not fail. */ - ASSERT3U(mylen, >, 1 << spa->spa_max_ashift); - *max_alloc = mylen - (1 << spa->spa_max_ashift); + ASSERT3U(attempted, >, 1 << spa->spa_max_ashift); + *max_alloc = attempted - (1 << spa->spa_max_ashift); } else { ASSERT0(error); - length -= mylen; - offset += mylen; /* * We've performed an allocation, so reset the @@ -1160,6 +1321,7 @@ spa_vdev_copy_impl(vdev_t *vd, spa_vdev_removal_t *svr } } metaslab_trace_fini(&zal); + range_tree_destroy(segs); } /* From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:49:33 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7999310C56BA; Wed, 3 Oct 2018 14:49:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5103E7E468; Wed, 3 Oct 2018 14:49:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3305B1855D; Wed, 3 Oct 2018 14:49:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EnX63030482; Wed, 3 Oct 2018 14:49:33 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EnXRh030481; Wed, 3 Oct 2018 14:49:33 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031449.w93EnXRh030481@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:49:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339132 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339132 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:49:33 -0000 Author: mav Date: Wed Oct 3 14:49:32 2018 New Revision: 339132 URL: https://svnweb.freebsd.org/changeset/base/339132 Log: MFC r337194: MFV r337193: 9424 ztest failure: "unprotected error in call to Lua API (Invalid value type 'f unction' for key 'error')" illumos/illumos-gate@fe3ba4d1227d8746116ece7240682b13595c3142 Reviewed by: Sebastien Roy Reviewed by: Paul Dagnelie Reviewed by: Don Brady Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zcp.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zcp.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zcp.c Wed Oct 3 14:48:55 2018 (r339131) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zcp.c Wed Oct 3 14:49:32 2018 (r339132) @@ -433,7 +433,7 @@ zcp_lua_to_nvlist_impl(lua_State *state, int index, nv /* * Convert a lua value to an nvpair, adding it to an nvlist with the given key. */ -void +static void zcp_lua_to_nvlist(lua_State *state, int index, nvlist_t *nvl, const char *key) { /* @@ -445,7 +445,7 @@ zcp_lua_to_nvlist(lua_State *state, int index, nvlist_ (void) lua_error(state); } -int +static int zcp_lua_to_nvlist_helper(lua_State *state) { nvlist_t *nv = (nvlist_t *)lua_touserdata(state, 2); @@ -454,11 +454,12 @@ zcp_lua_to_nvlist_helper(lua_State *state) return (0); } -void +static void zcp_convert_return_values(lua_State *state, nvlist_t *nvl, const char *key, zcp_eval_arg_t *evalargs) { int err; + VERIFY3U(1, ==, lua_gettop(state)); lua_pushcfunction(state, zcp_lua_to_nvlist_helper); lua_pushlightuserdata(state, (char *)key); lua_pushlightuserdata(state, nvl); @@ -904,6 +905,7 @@ zcp_eval_impl(dmu_tx_t *tx, boolean_t sync, zcp_eval_a ZCP_RET_RETURN, evalargs); } else if (return_count > 1) { evalargs->ea_result = SET_ERROR(ECHRNG); + lua_settop(state, 0); (void) lua_pushfstring(state, "Multiple return " "values not supported"); zcp_convert_return_values(state, evalargs->ea_outnvl, @@ -965,6 +967,7 @@ static void zcp_pool_error(zcp_eval_arg_t *evalargs, const char *poolname) { evalargs->ea_result = SET_ERROR(ECHRNG); + lua_settop(evalargs->ea_state, 0); (void) lua_pushfstring(evalargs->ea_state, "Could not open pool: %s", poolname); zcp_convert_return_values(evalargs->ea_state, evalargs->ea_outnvl, From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:50:07 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C15410C5780; Wed, 3 Oct 2018 14:50:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5168B7E5EC; Wed, 3 Oct 2018 14:50:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4C33A18565; Wed, 3 Oct 2018 14:50:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Eo7AD030588; Wed, 3 Oct 2018 14:50:07 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Eo7SV030587; Wed, 3 Oct 2018 14:50:07 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031450.w93Eo7SV030587@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:50:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339133 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339133 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:50:07 -0000 Author: mav Date: Wed Oct 3 14:50:06 2018 New Revision: 339133 URL: https://svnweb.freebsd.org/changeset/base/339133 Log: MFC r337196: MFV r337195: 9454 ::zfs_blkstats should count embedded blocks illumos/illumos-gate@dec267e7ea9828898b1c64462daa6636c4ef5e29 Reviewed by: Dan Kimmel Reviewed by: George Wilson Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 14:49:32 2018 (r339132) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 14:50:06 2018 (r339133) @@ -3552,14 +3552,14 @@ dsl_scan_scrub_cb(dsl_pool_t *dp, int zio_flags = ZIO_FLAG_SCAN_THREAD | ZIO_FLAG_RAW | ZIO_FLAG_CANFAIL; int d; + count_block(dp->dp_blkstats, bp); + if (phys_birth <= scn->scn_phys.scn_min_txg || phys_birth >= scn->scn_phys.scn_max_txg) return (0); - if (BP_IS_EMBEDDED(bp)) { - count_block(scn, dp->dp_blkstats, bp); - return (0); - } + /* Embedded BP's have phys_birth==0, so we reject them above. */ + ASSERT(!BP_IS_EMBEDDED(bp)); ASSERT(DSL_SCAN_IS_SCRUB_RESILVER(scn)); if (scn->scn_phys.scn_func == POOL_SCAN_SCRUB) { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:50:41 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 074DB10C57E9; Wed, 3 Oct 2018 14:50:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B1A437E76A; Wed, 3 Oct 2018 14:50:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id AC8D61857C; Wed, 3 Oct 2018 14:50:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Eoeg7031368; Wed, 3 Oct 2018 14:50:40 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Eoews031367; Wed, 3 Oct 2018 14:50:40 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031450.w93Eoews031367@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:50:40 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339134 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339134 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:50:41 -0000 Author: mav Date: Wed Oct 3 14:50:40 2018 New Revision: 339134 URL: https://svnweb.freebsd.org/changeset/base/339134 Log: MFC r337198: MFV r337197: 9456 ztest failure in zil_commit_waiter_timeout illumos/illumos-gate@b6031810da58df96413bf76e068638fcab1f228a Reviewed by: Matt Ahrens Reviewed by: Serapheim Dimitropoulos Approved by: Matt Ahrens Author: Prakash Surya Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Oct 3 14:50:06 2018 (r339133) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Oct 3 14:50:40 2018 (r339134) @@ -2299,7 +2299,7 @@ zil_commit_waiter_timeout(zilog_t *zilog, zil_commit_w */ lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb); - ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED); + IMPLY(nlwb != NULL, lwb->lwb_state != LWB_STATE_OPENED); /* * Since the lwb's zio hadn't been issued by the time this thread From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:51:17 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C4D110C59C0; Wed, 3 Oct 2018 14:51:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C60707EA44; Wed, 3 Oct 2018 14:51:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C102E185A6; Wed, 3 Oct 2018 14:51:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EpG9t031449; Wed, 3 Oct 2018 14:51:16 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EpGC5031448; Wed, 3 Oct 2018 14:51:16 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031451.w93EpGC5031448@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:51:16 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339135 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339135 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:51:17 -0000 Author: mav Date: Wed Oct 3 14:51:16 2018 New Revision: 339135 URL: https://svnweb.freebsd.org/changeset/base/339135 Log: MFC r337201: Fix build after r337196 mismerge. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 14:50:40 2018 (r339134) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 14:51:16 2018 (r339135) @@ -3552,7 +3552,7 @@ dsl_scan_scrub_cb(dsl_pool_t *dp, int zio_flags = ZIO_FLAG_SCAN_THREAD | ZIO_FLAG_RAW | ZIO_FLAG_CANFAIL; int d; - count_block(dp->dp_blkstats, bp); + count_block(scn, dp->dp_blkstats, bp); if (phys_birth <= scn->scn_phys.scn_min_txg || phys_birth >= scn->scn_phys.scn_max_txg) From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:51:51 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1DB9C10C5A44; Wed, 3 Oct 2018 14:51:51 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C7CE07F117; Wed, 3 Oct 2018 14:51:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C29C3185D2; Wed, 3 Oct 2018 14:51:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EpoxY033010; Wed, 3 Oct 2018 14:51:50 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Epoq9033006; Wed, 3 Oct 2018 14:51:50 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031451.w93Epoq9033006@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:51:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339136 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339136 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:51:51 -0000 Author: mav Date: Wed Oct 3 14:51:49 2018 New Revision: 339136 URL: https://svnweb.freebsd.org/changeset/base/339136 Log: MFC r337202: MFV r337200: 9438 Holes can lose birth time info if a block has a mix of birth times Ultimately, the problem here is that when you truncate and write a file in the same transaction group, the dbuf for the indirect block will be zeroed out to deal with the truncation, and then written for the write. During this process, we will lose hole birth time information for any holes in the range. In the case where a dnode is being freed, we need to determine whether the block should be converted to a higher-level hole in the zio pipeline, and if so do it when the dnode is being synced out. illumos/illumos-gate@738e2a3ce3b2579222d6855e7fe75b5bcfcddf8d Reviewed by: Matt Ahrens Reviewed by: George Wilson Approved by: Robert Mustacchi Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 14:51:16 2018 (r339135) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Oct 3 14:51:49 2018 (r339136) @@ -167,6 +167,10 @@ dmu_object_free(objset_t *os, uint64_t object, dmu_tx_ return (err); ASSERT(dn->dn_type != DMU_OT_NONE); + /* + * If we don't create this free range, we'll leak indirect blocks when + * we get to freeing the dnode in syncing context. + */ dnode_free_range(dn, 0, DMU_OBJECT_END, tx); dnode_free(dn, tx); dnode_rele(dn, FTAG); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:51:16 2018 (r339135) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:51:49 2018 (r339136) @@ -1518,6 +1518,72 @@ dnode_dirty_l1(dnode_t *dn, uint64_t l1blkid, dmu_tx_t } } +/* + * Dirty all the in-core level-1 dbufs in the range specified by start_blkid + * and end_blkid. + */ +static void +dnode_dirty_l1range(dnode_t *dn, uint64_t start_blkid, uint64_t end_blkid, + dmu_tx_t *tx) +{ + dmu_buf_impl_t db_search; + dmu_buf_impl_t *db; + avl_index_t where; + + mutex_enter(&dn->dn_dbufs_mtx); + + db_search.db_level = 1; + db_search.db_blkid = start_blkid + 1; + db_search.db_state = DB_SEARCH; + for (;;) { + + db = avl_find(&dn->dn_dbufs, &db_search, &where); + if (db == NULL) + db = avl_nearest(&dn->dn_dbufs, where, AVL_AFTER); + + if (db == NULL || db->db_level != 1 || + db->db_blkid >= end_blkid) { + break; + } + + /* + * Setup the next blkid we want to search for. + */ + db_search.db_blkid = db->db_blkid + 1; + ASSERT3U(db->db_blkid, >=, start_blkid); + + /* + * If the dbuf transitions to DB_EVICTING while we're trying + * to dirty it, then we will be unable to discover it in + * the dbuf hash table. This will result in a call to + * dbuf_create() which needs to acquire the dn_dbufs_mtx + * lock. To avoid a deadlock, we drop the lock before + * dirtying the level-1 dbuf. + */ + mutex_exit(&dn->dn_dbufs_mtx); + dnode_dirty_l1(dn, db->db_blkid, tx); + mutex_enter(&dn->dn_dbufs_mtx); + } + +#ifdef ZFS_DEBUG + /* + * Walk all the in-core level-1 dbufs and verify they have been dirtied. + */ + db_search.db_level = 1; + db_search.db_blkid = start_blkid + 1; + db_search.db_state = DB_SEARCH; + db = avl_find(&dn->dn_dbufs, &db_search, &where); + if (db == NULL) + db = avl_nearest(&dn->dn_dbufs, where, AVL_AFTER); + for (; db != NULL; db = AVL_NEXT(&dn->dn_dbufs, db)) { + if (db->db_level != 1 || db->db_blkid >= end_blkid) + break; + ASSERT(db->db_dirtycnt > 0); + } +#endif + mutex_exit(&dn->dn_dbufs_mtx); +} + void dnode_free_range(dnode_t *dn, uint64_t off, uint64_t len, dmu_tx_t *tx) { @@ -1668,6 +1734,8 @@ dnode_free_range(dnode_t *dn, uint64_t off, uint64_t l last = (blkid + nblks - 1) >> epbs; if (last != first) dnode_dirty_l1(dn, last, tx); + + dnode_dirty_l1range(dn, first, last, tx); int shift = dn->dn_datablkshift + dn->dn_indblkshift - SPA_BLKPTRSHIFT; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:51:16 2018 (r339135) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:51:49 2018 (r339136) @@ -229,9 +229,24 @@ free_verify(dmu_buf_impl_t *db, uint64_t start, uint64 } #endif +/* + * We don't usually free the indirect blocks here. If in one txg we have a + * free_range and a write to the same indirect block, it's important that we + * preserve the hole's birth times. Therefore, we don't free any any indirect + * blocks in free_children(). If an indirect block happens to turn into all + * holes, it will be freed by dbuf_write_children_ready, which happens at a + * point in the syncing process where we know for certain the contents of the + * indirect block. + * + * However, if we're freeing a dnode, its space accounting must go to zero + * before we actually try to free the dnode, or we will trip an assertion. In + * addition, we know the case described above cannot occur, because the dnode is + * being freed. Therefore, we free the indirect blocks immediately in that + * case. + */ static void free_children(dmu_buf_impl_t *db, uint64_t blkid, uint64_t nblks, - dmu_tx_t *tx) + boolean_t free_indirects, dmu_tx_t *tx) { dnode_t *dn; blkptr_t *bp; @@ -283,32 +298,16 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint rw_exit(&dn->dn_struct_rwlock); ASSERT3P(bp, ==, subdb->db_blkptr); - free_children(subdb, blkid, nblks, tx); + free_children(subdb, blkid, nblks, free_indirects, tx); dbuf_rele(subdb, FTAG); } } - /* If this whole block is free, free ourself too. */ - for (i = 0, bp = db->db.db_data; i < 1 << epbs; i++, bp++) { - if (!BP_IS_HOLE(bp)) - break; - } - if (i == 1 << epbs) { - /* - * We only found holes. Grab the rwlock to prevent - * anybody from reading the blocks we're about to - * zero out. - */ - rw_enter(&dn->dn_struct_rwlock, RW_WRITER); + if (free_indirects) { + for (i = 0, bp = db->db.db_data; i < 1 << epbs; i++, bp++) + ASSERT(BP_IS_HOLE(bp)); bzero(db->db.db_data, db->db.db_size); - rw_exit(&dn->dn_struct_rwlock); free_blocks(dn, db->db_blkptr, 1, tx); - } else { - /* - * Partial block free; must be marked dirty so that it - * will be written out. - */ - ASSERT(db->db_dirtycnt > 0); } DB_DNODE_EXIT(db); @@ -321,7 +320,7 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint */ static void dnode_sync_free_range_impl(dnode_t *dn, uint64_t blkid, uint64_t nblks, - dmu_tx_t *tx) + boolean_t free_indirects, dmu_tx_t *tx) { blkptr_t *bp = dn->dn_phys->dn_blkptr; int dnlevel = dn->dn_phys->dn_nlevels; @@ -361,7 +360,7 @@ dnode_sync_free_range_impl(dnode_t *dn, uint64_t blkid TRUE, FALSE, FTAG, &db)); rw_exit(&dn->dn_struct_rwlock); - free_children(db, blkid, nblks, tx); + free_children(db, blkid, nblks, free_indirects, tx); dbuf_rele(db, FTAG); } } @@ -380,6 +379,7 @@ dnode_sync_free_range_impl(dnode_t *dn, uint64_t blkid typedef struct dnode_sync_free_range_arg { dnode_t *dsfra_dnode; dmu_tx_t *dsfra_tx; + boolean_t dsfra_free_indirects; } dnode_sync_free_range_arg_t; static void @@ -389,7 +389,8 @@ dnode_sync_free_range(void *arg, uint64_t blkid, uint6 dnode_t *dn = dsfra->dsfra_dnode; mutex_exit(&dn->dn_mtx); - dnode_sync_free_range_impl(dn, blkid, nblks, dsfra->dsfra_tx); + dnode_sync_free_range_impl(dn, blkid, nblks, + dsfra->dsfra_free_indirects, dsfra->dsfra_tx); mutex_enter(&dn->dn_mtx); } @@ -670,6 +671,11 @@ dnode_sync(dnode_t *dn, dmu_tx_t *tx) dnode_sync_free_range_arg_t dsfra; dsfra.dsfra_dnode = dn; dsfra.dsfra_tx = tx; + dsfra.dsfra_free_indirects = freeing_dnode; + if (freeing_dnode) { + ASSERT(range_tree_contains(dn->dn_free_ranges[txgoff], + 0, dn->dn_maxblkid + 1)); + } mutex_enter(&dn->dn_mtx); range_tree_vacate(dn->dn_free_ranges[txgoff], dnode_sync_free_range, &dsfra); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Wed Oct 3 14:51:16 2018 (r339135) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c Wed Oct 3 14:51:49 2018 (r339136) @@ -1691,7 +1691,8 @@ zfs_trunc(znode_t *zp, uint64_t end) return (0); } - error = dmu_free_long_range(zfsvfs->z_os, zp->z_id, end, -1); + error = dmu_free_long_range(zfsvfs->z_os, zp->z_id, end, + DMU_OBJECT_END); if (error) { zfs_range_unlock(rl); return (error); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:52:36 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7D7A710C5AE0; Wed, 3 Oct 2018 14:52:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 32EF97F35D; Wed, 3 Oct 2018 14:52:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 13D47186FF; Wed, 3 Oct 2018 14:52:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EqZH1035307; Wed, 3 Oct 2018 14:52:35 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EqZLI035306; Wed, 3 Oct 2018 14:52:35 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031452.w93EqZLI035306@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:52:35 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339137 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339137 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:52:36 -0000 Author: mav Date: Wed Oct 3 14:52:35 2018 New Revision: 339137 URL: https://svnweb.freebsd.org/changeset/base/339137 Log: MFC r337205: MFV r337204: 9439 ZFS double-free due to failure to dirty indirect block illumos/illumos-gate@99a19144e82244f3426f055cc73af8a937c0135c Reviewed by: George Wilson Reviewed by: Paul Dagnelie Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:51:49 2018 (r339136) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:52:35 2018 (r339137) @@ -1616,13 +1616,11 @@ dnode_free_range(dnode_t *dn, uint64_t off, uint64_t l if (off == 0 && len >= blksz) { /* * Freeing the whole block; fast-track this request. - * Note that we won't dirty any indirect blocks, - * which is fine because we will be freeing the entire - * file and thus all indirect blocks will be freed - * by free_children(). */ blkid = 0; nblks = 1; + if (dn->dn_nlevels > 1) + dnode_dirty_l1(dn, 0, tx); goto done; } else if (off >= blksz) { /* Freeing past end-of-data */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:51:49 2018 (r339136) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:52:35 2018 (r339137) @@ -263,6 +263,24 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint if (db->db_state != DB_CACHED) (void) dbuf_read(db, NULL, DB_RF_MUST_SUCCEED); + /* + * If we modify this indirect block, and we are not freeing the + * dnode (!free_indirects), then this indirect block needs to get + * written to disk by dbuf_write(). If it is dirty, we know it will + * be written (otherwise, we would have incorrect on-disk state + * because the space would be freed but still referenced by the BP + * in this indirect block). Therefore we VERIFY that it is + * dirty. + * + * Our VERIFY covers some cases that do not actually have to be + * dirty, but the open-context code happens to dirty. E.g. if the + * blocks we are freeing are all holes, because in that case, we + * are only freeing part of this indirect block, so it is an + * ancestor of the first or last block to be freed. The first and + * last L1 indirect blocks are always dirtied by dnode_free_range(). + */ + VERIFY(BP_GET_FILL(db->db_blkptr) == 0 || db->db_dirtycnt > 0); + dbuf_release_bp(db); bp = db->db.db_data; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:53:08 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0EF9E10C5B6C; Wed, 3 Oct 2018 14:53:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B6B5A7F4D0; Wed, 3 Oct 2018 14:53:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B173C18701; Wed, 3 Oct 2018 14:53:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Er7CG035388; Wed, 3 Oct 2018 14:53:07 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Er7wW035387; Wed, 3 Oct 2018 14:53:07 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031453.w93Er7wW035387@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:53:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339138 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339138 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:53:08 -0000 Author: mav Date: Wed Oct 3 14:53:07 2018 New Revision: 339138 URL: https://svnweb.freebsd.org/changeset/base/339138 Log: MFC r337207: MFV r337206: 9338 moved dnode has incorrect dn_next_type illumos/illumos-gate@c7fbe46df966ea665df63b6e6071808987e839d1 Reviewed by: Prashanth Sreenivasa Reviewed by: Serapheim Dimitropoulos Reviewed by: Dan Kimmel Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:52:35 2018 (r339137) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:53:07 2018 (r339138) @@ -742,6 +742,8 @@ dnode_move_impl(dnode_t *odn, dnode_t *ndn) ndn->dn_datablkszsec = odn->dn_datablkszsec; ndn->dn_datablksz = odn->dn_datablksz; ndn->dn_maxblkid = odn->dn_maxblkid; + bcopy(&odn->dn_next_type[0], &ndn->dn_next_type[0], + sizeof (odn->dn_next_type)); bcopy(&odn->dn_next_nblkptr[0], &ndn->dn_next_nblkptr[0], sizeof (odn->dn_next_nblkptr)); bcopy(&odn->dn_next_nlevels[0], &ndn->dn_next_nlevels[0], From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:53:53 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CAD7410C5C3D; Wed, 3 Oct 2018 14:53:52 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 809627F701; Wed, 3 Oct 2018 14:53:52 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 61B5918704; Wed, 3 Oct 2018 14:53:52 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93ErqQG035576; Wed, 3 Oct 2018 14:53:52 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93ErqR1035574; Wed, 3 Oct 2018 14:53:52 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031453.w93ErqR1035574@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:53:52 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339139 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339139 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:53:53 -0000 Author: mav Date: Wed Oct 3 14:53:51 2018 New Revision: 339139 URL: https://svnweb.freebsd.org/changeset/base/339139 Log: MFC r337209: MFV r337208: 9591 ms_shift can be incorrectly changed in MOS config for indirect vdevs that have been historically expanded illumos/illumos-gate@11f6a9680e013a7c9c57dc0b64d3e91e2eee1a6b Reviewed by: Matthew Ahrens Reviewed by: George Wilson Reviewed by: John Kennedy Reviewed by: Prashanth Sreenivasa Reviewed by: Tim Chase Approved by: Richard Lowe Author: Serapheim Dimitropoulos Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c Wed Oct 3 14:53:07 2018 (r339138) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c Wed Oct 3 14:53:51 2018 (r339139) @@ -22,7 +22,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2011, 2015 by Delphix. All rights reserved. + * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright 2017 Joyent, Inc. */ @@ -559,6 +559,18 @@ spa_config_update(spa_t *spa, int what) */ for (c = 0; c < rvd->vdev_children; c++) { vdev_t *tvd = rvd->vdev_child[c]; + + /* + * Explicitly skip vdevs that are indirect or + * log vdevs that are being removed. The reason + * is that both of those can have vdev_ms_array + * set to 0 and we wouldn't want to change their + * metaslab size nor call vdev_expand() on them. + */ + if (!vdev_is_concrete(tvd) || + (tvd->vdev_islog && tvd->vdev_removing)) + continue; + if (tvd->vdev_ms_array == 0) { vdev_ashift_optimize(tvd); vdev_metaslab_set_size(tvd); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 14:53:07 2018 (r339138) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Oct 3 14:53:51 2018 (r339139) @@ -4284,11 +4284,11 @@ vdev_expand(vdev_t *vd, uint64_t txg) { ASSERT(vd->vdev_top == vd); ASSERT(spa_config_held(vd->vdev_spa, SCL_ALL, RW_WRITER) == SCL_ALL); + ASSERT(vdev_is_concrete(vd)); vdev_set_deflate_ratio(vd); - if ((vd->vdev_asize >> vd->vdev_ms_shift) > vd->vdev_ms_count && - vdev_is_concrete(vd)) { + if ((vd->vdev_asize >> vd->vdev_ms_shift) > vd->vdev_ms_count) { VERIFY(vdev_metaslab_init(vd, txg) == 0); vdev_config_dirty(vd); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:54:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D589B10C5D0A; Wed, 3 Oct 2018 14:54:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8A8AE7F894; Wed, 3 Oct 2018 14:54:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 857FC18706; Wed, 3 Oct 2018 14:54:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Esn0q035677; Wed, 3 Oct 2018 14:54:49 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EsmlA035672; Wed, 3 Oct 2018 14:54:48 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031454.w93EsmlA035672@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:54:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339140 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339140 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:54:50 -0000 Author: mav Date: Wed Oct 3 14:54:48 2018 New Revision: 339140 URL: https://svnweb.freebsd.org/changeset/base/339140 Log: MFC r337211: MFV r337210: 9577 remove zfs_dbuf_evict_key tsd The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary - we can instead pass a flag down in a few places to prevent recursive dbuf eviction. Making this change has 3 benefits: 1. The code semantics are easier to understand. 2. On Linux, performance is improved, because creating/removing TSD values (by setting to NULL vs non-NULL) is expensive, and we do it very often. 3. According to Nexenta, the current semantics can cause a deadlock when concurrently calling dmu_objset_evict_dbufs() (which is rare today, but they are working on a "parallel unmount" change that triggers this more easily) illumos/illumos-gate@c2919acbea007fa95c709b60d073db9a24526e01 Reviewed by: George Wilson Reviewed by: Serapheim Dimitropoulos Reviewed by: Brian Behlendorf Reviewed by: Andy Stormont Approved by: Richard Lowe Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 14:53:51 2018 (r339139) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 14:54:48 2018 (r339140) @@ -51,8 +51,6 @@ #include #include -uint_t zfs_dbuf_evict_key; - static boolean_t dbuf_undirty(dmu_buf_impl_t *db, dmu_tx_t *tx); static void dbuf_write(dbuf_dirty_record_t *dr, arc_buf_t *data, dmu_tx_t *tx); @@ -534,14 +532,6 @@ dbuf_evict_one(void) ASSERT(!MUTEX_HELD(&dbuf_evict_lock)); - /* - * Set the thread's tsd to indicate that it's processing evictions. - * Once a thread stops evicting from the dbuf cache it will - * reset its tsd to NULL. - */ - ASSERT3P(tsd_get(zfs_dbuf_evict_key), ==, NULL); - (void) tsd_set(zfs_dbuf_evict_key, (void *)B_TRUE); - dmu_buf_impl_t *db = multilist_sublist_tail(mls); while (db != NULL && mutex_tryenter(&db->db_mtx) == 0) { db = multilist_sublist_prev(mls, db); @@ -561,7 +551,6 @@ dbuf_evict_one(void) } else { multilist_sublist_unlock(mls); } - (void) tsd_set(zfs_dbuf_evict_key, NULL); } /* @@ -615,30 +604,7 @@ dbuf_evict_thread(void *unused __unused) static void dbuf_evict_notify(void) { - /* - * We use thread specific data to track when a thread has - * started processing evictions. This allows us to avoid deeply - * nested stacks that would have a call flow similar to this: - * - * dbuf_rele()-->dbuf_rele_and_unlock()-->dbuf_evict_notify() - * ^ | - * | | - * +-----dbuf_destroy()<--dbuf_evict_one()<--------+ - * - * The dbuf_eviction_thread will always have its tsd set until - * that thread exits. All other threads will only set their tsd - * if they are participating in the eviction process. This only - * happens if the eviction thread is unable to process evictions - * fast enough. To keep the dbuf cache size in check, other threads - * can evict from the dbuf cache directly. Those threads will set - * their tsd values so that we ensure that they only evict one dbuf - * from the dbuf cache. - */ - if (tsd_get(zfs_dbuf_evict_key) != NULL) - return; - - /* * We check if we should evict without holding the dbuf_evict_lock, * because it's OK to occasionally make the wrong decision here, * and grabbing the lock results in massive lock contention. @@ -714,7 +680,6 @@ retry: refcount_create(&dbuf_caches[dcs].size); } - tsd_create(&zfs_dbuf_evict_key, NULL); dbuf_evict_thread_exit = B_FALSE; mutex_init(&dbuf_evict_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&dbuf_evict_cv, NULL, CV_DEFAULT, NULL); @@ -741,7 +706,6 @@ dbuf_fini(void) cv_wait(&dbuf_evict_cv, &dbuf_evict_lock); } mutex_exit(&dbuf_evict_lock); - tsd_destroy(&zfs_dbuf_evict_key); mutex_destroy(&dbuf_evict_lock); cv_destroy(&dbuf_evict_cv); @@ -1026,7 +990,7 @@ dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, db->db_state = DB_CACHED; } cv_broadcast(&db->db_changed); - dbuf_rele_and_unlock(db, NULL); + dbuf_rele_and_unlock(db, NULL, B_FALSE); } static void @@ -2186,7 +2150,8 @@ dbuf_destroy(dmu_buf_impl_t *db) * value in dnode_move(), since DB_DNODE_EXIT doesn't actually * release any lock. */ - dnode_rele(dn, db); + mutex_enter(&dn->dn_mtx); + dnode_rele_and_unlock(dn, db, B_TRUE); db->db_dnode_handle = NULL; dbuf_hash_remove(db); @@ -2213,8 +2178,10 @@ dbuf_destroy(dmu_buf_impl_t *db) * If this dbuf is referenced from an indirect dbuf, * decrement the ref count on the indirect dbuf. */ - if (parent && parent != dndb) - dbuf_rele(parent, db); + if (parent && parent != dndb) { + mutex_enter(&parent->db_mtx); + dbuf_rele_and_unlock(parent, db, B_TRUE); + } } /* @@ -2846,7 +2813,7 @@ void dbuf_rele(dmu_buf_impl_t *db, void *tag) { mutex_enter(&db->db_mtx); - dbuf_rele_and_unlock(db, tag); + dbuf_rele_and_unlock(db, tag, B_FALSE); } void @@ -2857,10 +2824,19 @@ dmu_buf_rele(dmu_buf_t *db, void *tag) /* * dbuf_rele() for an already-locked dbuf. This is necessary to allow - * db_dirtycnt and db_holds to be updated atomically. + * db_dirtycnt and db_holds to be updated atomically. The 'evicting' + * argument should be set if we are already in the dbuf-evicting code + * path, in which case we don't want to recursively evict. This allows us to + * avoid deeply nested stacks that would have a call flow similar to this: + * + * dbuf_rele()-->dbuf_rele_and_unlock()-->dbuf_evict_notify() + * ^ | + * | | + * +-----dbuf_destroy()<--dbuf_evict_one()<--------+ + * */ void -dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag) +dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag, boolean_t evicting) { int64_t holds; @@ -2963,7 +2939,8 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag) db->db.db_size, db); mutex_exit(&db->db_mtx); - if (db->db_caching_status == DB_DBUF_CACHE) { + if (db->db_caching_status == DB_DBUF_CACHE && + !evicting) { dbuf_evict_notify(); } } @@ -3230,7 +3207,7 @@ dbuf_sync_leaf(dbuf_dirty_record_t *dr, dmu_tx_t *tx) kmem_free(dr, sizeof (dbuf_dirty_record_t)); ASSERT(db->db_dirtycnt > 0); db->db_dirtycnt -= 1; - dbuf_rele_and_unlock(db, (void *)(uintptr_t)txg); + dbuf_rele_and_unlock(db, (void *)(uintptr_t)txg, B_FALSE); return; } @@ -3580,7 +3557,7 @@ dbuf_write_done(zio_t *zio, arc_buf_t *buf, void *vdb) ASSERT(db->db_dirtycnt > 0); db->db_dirtycnt -= 1; db->db_data_pending = NULL; - dbuf_rele_and_unlock(db, (void *)(uintptr_t)tx->tx_txg); + dbuf_rele_and_unlock(db, (void *)(uintptr_t)tx->tx_txg, B_FALSE); } static void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:53:51 2018 (r339139) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 14:54:48 2018 (r339140) @@ -1240,11 +1240,11 @@ void dnode_rele(dnode_t *dn, void *tag) { mutex_enter(&dn->dn_mtx); - dnode_rele_and_unlock(dn, tag); + dnode_rele_and_unlock(dn, tag, B_FALSE); } void -dnode_rele_and_unlock(dnode_t *dn, void *tag) +dnode_rele_and_unlock(dnode_t *dn, void *tag, boolean_t evicting) { uint64_t refs; /* Get while the hold prevents the dnode from moving. */ @@ -1275,7 +1275,8 @@ dnode_rele_and_unlock(dnode_t *dn, void *tag) * that the handle has zero references, but that will be * asserted anyway when the handle gets destroyed. */ - dbuf_rele(db, dnh); + mutex_enter(&db->db_mtx); + dbuf_rele_and_unlock(db, dnh, evicting); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:53:51 2018 (r339139) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Oct 3 14:54:48 2018 (r339140) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2017 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. */ @@ -439,6 +439,19 @@ dnode_evict_dbufs(dnode_t *dn) avl_insert_here(&dn->dn_dbufs, &db_marker, db, AVL_BEFORE); + /* + * We need to use the "marker" dbuf rather than + * simply getting the next dbuf, because + * dbuf_destroy() may actually remove multiple dbufs. + * It can call itself recursively on the parent dbuf, + * which may also be removed from dn_dbufs. The code + * flow would look like: + * + * dbuf_destroy(): + * dnode_rele_and_unlock(parent_dbuf, evicting=TRUE): + * if (!cacheable || pending_evict) + * dbuf_destroy() + */ dbuf_destroy(db); db_next = AVL_NEXT(&dn->dn_dbufs, &db_marker); @@ -497,7 +510,7 @@ dnode_undirty_dbufs(list_t *list) list_destroy(&dr->dt.di.dr_children); } kmem_free(dr, sizeof (dbuf_dirty_record_t)); - dbuf_rele_and_unlock(db, (void *)(uintptr_t)txg); + dbuf_rele_and_unlock(db, (void *)(uintptr_t)txg, B_FALSE); } } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Wed Oct 3 14:53:51 2018 (r339139) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Wed Oct 3 14:54:48 2018 (r339140) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2015 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. */ @@ -303,7 +303,7 @@ boolean_t dbuf_try_add_ref(dmu_buf_t *db, objset_t *os uint64_t dbuf_refcount(dmu_buf_impl_t *db); void dbuf_rele(dmu_buf_impl_t *db, void *tag); -void dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag); +void dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag, boolean_t evicting); dmu_buf_impl_t *dbuf_find(struct objset *os, uint64_t object, uint8_t level, uint64_t blkid); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Oct 3 14:53:51 2018 (r339139) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Oct 3 14:54:48 2018 (r339140) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2017 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. */ @@ -291,7 +291,7 @@ int dnode_hold_impl(struct objset *dd, uint64_t object void *ref, dnode_t **dnp); boolean_t dnode_add_ref(dnode_t *dn, void *ref); void dnode_rele(dnode_t *dn, void *ref); -void dnode_rele_and_unlock(dnode_t *dn, void *tag); +void dnode_rele_and_unlock(dnode_t *dn, void *tag, boolean_t evicting); void dnode_setdirty(dnode_t *dn, dmu_tx_t *tx); void dnode_sync(dnode_t *dn, dmu_tx_t *tx); void dnode_allocate(dnode_t *dn, dmu_object_type_t ot, int blocksize, int ibs, From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:55:38 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9D3F810C5DC5; Wed, 3 Oct 2018 14:55:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 524047FBB4; Wed, 3 Oct 2018 14:55:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4D31218708; Wed, 3 Oct 2018 14:55:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Etc08035798; Wed, 3 Oct 2018 14:55:38 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EtbfV035792; Wed, 3 Oct 2018 14:55:37 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031455.w93EtbfV035792@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:55:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339141 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339141 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:55:38 -0000 Author: mav Date: Wed Oct 3 14:55:36 2018 New Revision: 339141 URL: https://svnweb.freebsd.org/changeset/base/339141 Log: MFC r337213: MFV r337212: 9465 ARC check for 'anon_size > arc_c/2' can stall the system illumos/illumos-gate@abe1fd01ce5a83718c5a840daeab4abdaec1c104 Reviewed by: Sebastien Roy Reviewed by: Matt Ahrens Reviewed by: Prashanth Sreenivasa Approved by: Robert Mustacchi Author: Don Brady Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 14:55:36 2018 (r339141) @@ -377,6 +377,13 @@ u_int zfs_arc_free_target = 0; /* Absolute min for arc min / max is 16MB. */ static uint64_t arc_abs_min = 16 << 20; +/* + * ARC dirty data constraints for arc_tempreserve_space() throttle + */ +uint_t zfs_arc_dirty_limit_percent = 50; /* total dirty data limit */ +uint_t zfs_arc_anon_limit_percent = 25; /* anon block dirty limit */ +uint_t zfs_arc_pool_dirty_percent = 20; /* each pool's anon allowance */ + boolean_t zfs_compressed_arc_enabled = B_TRUE; static int sysctl_vfs_zfs_arc_free_target(SYSCTL_HANDLER_ARGS); @@ -6337,12 +6344,10 @@ arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr } static int -arc_memory_throttle(uint64_t reserve, uint64_t txg) +arc_memory_throttle(spa_t *spa, uint64_t reserve, uint64_t txg) { #ifdef _KERNEL uint64_t available_memory = ptob(freemem); - static uint64_t page_load = 0; - static uint64_t last_txg = 0; #if defined(__i386) || !defined(UMA_MD_SMALL_ALLOC) available_memory = @@ -6352,9 +6357,9 @@ arc_memory_throttle(uint64_t reserve, uint64_t txg) if (freemem > (uint64_t)physmem * arc_lotsfree_percent / 100) return (0); - if (txg > last_txg) { - last_txg = txg; - page_load = 0; + if (txg > spa->spa_lowmem_last_txg) { + spa->spa_lowmem_last_txg = txg; + spa->spa_lowmem_page_load = 0; } /* * If we are in pageout, we know that memory is already tight, @@ -6362,18 +6367,19 @@ arc_memory_throttle(uint64_t reserve, uint64_t txg) * continue to let page writes occur as quickly as possible. */ if (curproc == pageproc) { - if (page_load > MAX(ptob(minfree), available_memory) / 4) + if (spa->spa_lowmem_page_load > + MAX(ptob(minfree), available_memory) / 4) return (SET_ERROR(ERESTART)); /* Note: reserve is inflated, so we deflate */ - page_load += reserve / 8; + atomic_add_64(&spa->spa_lowmem_page_load, reserve / 8); return (0); - } else if (page_load > 0 && arc_reclaim_needed()) { + } else if (spa->spa_lowmem_page_load > 0 && arc_reclaim_needed()) { /* memory is low, delay before restarting */ ARCSTAT_INCR(arcstat_memory_throttle_count, 1); return (SET_ERROR(EAGAIN)); } - page_load = 0; -#endif + spa->spa_lowmem_page_load = 0; +#endif /* _KERNEL */ return (0); } @@ -6385,7 +6391,7 @@ arc_tempreserve_clear(uint64_t reserve) } int -arc_tempreserve_space(uint64_t reserve, uint64_t txg) +arc_tempreserve_space(spa_t *spa, uint64_t reserve, uint64_t txg) { int error; uint64_t anon_size; @@ -6414,7 +6420,7 @@ arc_tempreserve_space(uint64_t reserve, uint64_t txg) * in order to compress/encrypt/etc the data. We therefore need to * make sure that there is sufficient available memory for this. */ - error = arc_memory_throttle(reserve, txg); + error = arc_memory_throttle(spa, reserve, txg); if (error != 0) return (error); @@ -6422,12 +6428,24 @@ arc_tempreserve_space(uint64_t reserve, uint64_t txg) * Throttle writes when the amount of dirty data in the cache * gets too large. We try to keep the cache less than half full * of dirty blocks so that our sync times don't grow too large. + * + * In the case of one pool being built on another pool, we want + * to make sure we don't end up throttling the lower (backing) + * pool when the upper pool is the majority contributor to dirty + * data. To insure we make forward progress during throttling, we + * also check the current pool's net dirty data and only throttle + * if it exceeds zfs_arc_pool_dirty_percent of the anonymous dirty + * data in the cache. + * * Note: if two requests come in concurrently, we might let them * both succeed, when one of them should fail. Not a huge deal. */ + uint64_t total_dirty = reserve + arc_tempreserve + anon_size; + uint64_t spa_dirty_anon = spa_dirty_data(spa); - if (reserve + arc_tempreserve + anon_size > arc_c / 2 && - anon_size > arc_c / 4) { + if (total_dirty > arc_c * zfs_arc_dirty_limit_percent / 100 && + anon_size > arc_c * zfs_arc_anon_limit_percent / 100 && + spa_dirty_anon > anon_size * zfs_arc_pool_dirty_percent / 100) { uint64_t meta_esize = refcount_count(&arc_anon->arcs_esize[ARC_BUFC_METADATA]); uint64_t data_esize = Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Oct 3 14:55:36 2018 (r339141) @@ -1388,7 +1388,7 @@ dsl_dir_tempreserve_space(dsl_dir_t *dd, uint64_t lsiz offsetof(struct tempreserve, tr_node)); ASSERT3S(asize, >, 0); - err = arc_tempreserve_space(lsize, tx->tx_txg); + err = arc_tempreserve_space(dd->dd_pool->dp_spa, lsize, tx->tx_txg); if (err == 0) { struct tempreserve *tr; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 14:55:36 2018 (r339141) @@ -2037,6 +2037,12 @@ bp_get_dsize(spa_t *spa, const blkptr_t *bp) return (dsize); } +uint64_t +spa_dirty_data(spa_t *spa) +{ + return (spa->spa_dsl_pool->dp_dirty_total); +} + /* * ========================================================================== * Initialization and Termination Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Oct 3 14:55:36 2018 (r339141) @@ -194,7 +194,7 @@ void arc_freed(spa_t *spa, const blkptr_t *bp); void arc_flush(spa_t *spa, boolean_t retry); void arc_tempreserve_clear(uint64_t reserve); -int arc_tempreserve_space(uint64_t reserve, uint64_t txg); +int arc_tempreserve_space(spa_t *spa, uint64_t reserve, uint64_t txg); uint64_t arc_max_bytes(void); void arc_init(void); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Oct 3 14:55:36 2018 (r339141) @@ -833,6 +833,7 @@ extern uint64_t spa_bootfs(spa_t *spa); extern uint64_t spa_delegation(spa_t *spa); extern objset_t *spa_meta_objset(spa_t *spa); extern uint64_t spa_deadman_synctime(spa_t *spa); +extern uint64_t spa_dirty_data(spa_t *spa); /* Miscellaneous support routines */ extern void spa_load_failed(spa_t *spa, const char *fmt, ...); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Oct 3 14:54:48 2018 (r339140) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Oct 3 14:55:36 2018 (r339141) @@ -388,6 +388,10 @@ struct spa { int spa_queued; } spa_queue_stats[ZIO_PRIORITY_NUM_QUEUEABLE]; #endif + /* arc_memory_throttle() parameters during low memory condition */ + uint64_t spa_lowmem_page_load; /* memory load during txg */ + uint64_t spa_lowmem_last_txg; /* txg window start */ + hrtime_t spa_ccw_fail_time; /* Conf cache write fail time */ /* From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:56:39 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D030F10C5EB6; Wed, 3 Oct 2018 14:56:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 802EA7FD6A; Wed, 3 Oct 2018 14:56:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 61D4618709; Wed, 3 Oct 2018 14:56:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EudHK035907; Wed, 3 Oct 2018 14:56:39 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EucPO035902; Wed, 3 Oct 2018 14:56:38 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031456.w93EucPO035902@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:56:38 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339142 - in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/common/sys/fs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/common/sys/fs X-SVN-Commit-Revision: 339142 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:56:40 -0000 Author: mav Date: Wed Oct 3 14:56:38 2018 New Revision: 339142 URL: https://svnweb.freebsd.org/changeset/base/339142 Log: MFC r337215: MFV 337214: 9621 Make createtxg and guid properties public illumos/illumos-gate@e8d4a73c868afb740396041be80ed2b141065e76 Reviewed by: Andy Stormont Reviewed by: Paul Dagnelie Reviewed by: Matt Ahrens Reviewed by: Yuri Pankov Approved by: Robert Mustacchi Author: Josh Paetzel Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 14:55:36 2018 (r339141) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Oct 3 14:56:38 2018 (r339142) @@ -548,6 +548,13 @@ property. Compression can be turned on by running: .Qq Nm Cm set compression=on Ar dataset The default value is .Cm off . +.It Sy createtxg +The transaction group (txg) in which the dataset was created. +Bookmarks have the same +.Sy createtxg +as the snapshot they are initially tied to. +This property is suitable for ordering a list of snapshots, +e.g. for incremental send and receive. .It Sy creation The time this dataset was created. .It Sy clones @@ -575,6 +582,14 @@ This value is only available when a .Sy filesystem_limit has been set somewhere in the tree under which the dataset resides. +.It Sy guid +The 64 bit GUID of this dataset or bookmark which does not change over its +entire lifetime. +When a snapshot is sent to another pool, the received snapshot has the same +GUID. +Thus, the +.Sy guid +is suitable to identify a snapshot across pools. .It Sy logicalreferenced The amount of space that is .Qq logically Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 14:55:36 2018 (r339141) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 14:56:38 2018 (r339142) @@ -2784,6 +2784,7 @@ zfs_prop_get(zfs_handle_t *zhp, zfs_prop_t prop, char break; case ZFS_PROP_GUID: + case ZFS_PROP_CREATETXG: /* * GUIDs are stored as numbers, but they are identifiers. * We don't want them to be pretty printed, because pretty Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c Wed Oct 3 14:55:36 2018 (r339141) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c Wed Oct 3 14:56:38 2018 (r339142) @@ -427,6 +427,10 @@ zfs_prop_init(void) zprop_register_number(ZFS_PROP_SNAPSHOT_COUNT, "snapshot_count", UINT64_MAX, PROP_DEFAULT, ZFS_TYPE_FILESYSTEM | ZFS_TYPE_VOLUME, "", "SSCOUNT"); + zprop_register_number(ZFS_PROP_GUID, "guid", 0, PROP_READONLY, + ZFS_TYPE_DATASET | ZFS_TYPE_BOOKMARK, "", "GUID"); + zprop_register_number(ZFS_PROP_CREATETXG, "createtxg", 0, PROP_READONLY, + ZFS_TYPE_DATASET | ZFS_TYPE_BOOKMARK, "", "CREATETXG"); /* inherit number properties */ zprop_register_number(ZFS_PROP_RECORDSIZE, "recordsize", @@ -434,8 +438,6 @@ zfs_prop_init(void) ZFS_TYPE_FILESYSTEM, "512 to 1M, power of 2", "RECSIZE"); /* hidden properties */ - zprop_register_hidden(ZFS_PROP_CREATETXG, "createtxg", PROP_TYPE_NUMBER, - PROP_READONLY, ZFS_TYPE_DATASET | ZFS_TYPE_BOOKMARK, "CREATETXG"); zprop_register_hidden(ZFS_PROP_REMAPTXG, "remaptxg", PROP_TYPE_NUMBER, PROP_READONLY, ZFS_TYPE_DATASET, "REMAPTXG"); zprop_register_hidden(ZFS_PROP_NUMCLONES, "numclones", PROP_TYPE_NUMBER, @@ -447,8 +449,6 @@ zfs_prop_init(void) zprop_register_hidden(ZFS_PROP_STMF_SHAREINFO, "stmf_sbd_lu", PROP_TYPE_STRING, PROP_INHERIT, ZFS_TYPE_VOLUME, "STMF_SBD_LU"); - zprop_register_hidden(ZFS_PROP_GUID, "guid", PROP_TYPE_NUMBER, - PROP_READONLY, ZFS_TYPE_DATASET | ZFS_TYPE_BOOKMARK, "GUID"); zprop_register_hidden(ZFS_PROP_USERACCOUNTING, "useraccounting", PROP_TYPE_NUMBER, PROP_READONLY, ZFS_TYPE_DATASET, "USERACCOUNTING"); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Wed Oct 3 14:55:36 2018 (r339141) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Wed Oct 3 14:56:38 2018 (r339142) @@ -118,7 +118,7 @@ typedef enum { ZFS_PROP_SNAPDIR, ZFS_PROP_ACLMODE, ZFS_PROP_ACLINHERIT, - ZFS_PROP_CREATETXG, /* not exposed to the user */ + ZFS_PROP_CREATETXG, ZFS_PROP_NAME, /* not exposed to the user */ ZFS_PROP_CANMOUNT, ZFS_PROP_ISCSIOPTIONS, /* not exposed to the user */ From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:57:20 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1D45210C5F41; Wed, 3 Oct 2018 14:57:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BF2437FEE0; Wed, 3 Oct 2018 14:57:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B9A3E18713; Wed, 3 Oct 2018 14:57:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EvJCv035994; Wed, 3 Oct 2018 14:57:19 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EvJW8035993; Wed, 3 Oct 2018 14:57:19 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031457.w93EvJW8035993@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:57:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339143 - stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Commit-Revision: 339143 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:57:20 -0000 Author: mav Date: Wed Oct 3 14:57:19 2018 New Revision: 339143 URL: https://svnweb.freebsd.org/changeset/base/339143 Log: MFC r337217: MFV r337216: 7263 deeply nested nvlist can overflow stack illumos/illumos-gate@9ca527c3d3dfa7c8f304b34a9e03b5eddace838f Reviewed by: Adam Leventhal Reviewed by: George Wilson Reviewed by: Robert Mustacchi Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:56:38 2018 (r339142) +++ stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:57:19 2018 (r339143) @@ -21,6 +21,7 @@ /* * Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2015, 2016 by Delphix. All rights reserved. */ #include @@ -142,6 +143,11 @@ static int nvlist_add_common(nvlist_t *nvl, const char #define NVPAIR2I_NVP(nvp) \ ((i_nvp_t *)((size_t)(nvp) - offsetof(i_nvp_t, nvi_nvp))) +#ifdef _KERNEL +int nvpair_max_recursion = 20; +#else +int nvpair_max_recursion = 100; +#endif int nv_alloc_init(nv_alloc_t *nva, const nv_alloc_ops_t *nvo, /* args */ ...) @@ -2018,6 +2024,7 @@ typedef struct { const nvs_ops_t *nvs_ops; void *nvs_private; nvpriv_t *nvs_priv; + int nvs_recursion; } nvstream_t; /* @@ -2169,9 +2176,16 @@ static int nvs_embedded(nvstream_t *nvs, nvlist_t *embedded) { switch (nvs->nvs_op) { - case NVS_OP_ENCODE: - return (nvs_operation(nvs, embedded, NULL)); + case NVS_OP_ENCODE: { + int err; + if (nvs->nvs_recursion >= nvpair_max_recursion) + return (EINVAL); + nvs->nvs_recursion++; + err = nvs_operation(nvs, embedded, NULL); + nvs->nvs_recursion--; + return (err); + } case NVS_OP_DECODE: { nvpriv_t *priv; int err; @@ -2184,8 +2198,12 @@ nvs_embedded(nvstream_t *nvs, nvlist_t *embedded) nvlist_init(embedded, embedded->nvl_nvflag, priv); + if (nvs->nvs_recursion >= nvpair_max_recursion) + return (EINVAL); + nvs->nvs_recursion++; if ((err = nvs_operation(nvs, embedded, NULL)) != 0) nvlist_free(embedded); + nvs->nvs_recursion--; return (err); } default: @@ -2273,6 +2291,7 @@ nvlist_common(nvlist_t *nvl, char *buf, size_t *buflen return (EINVAL); nvs.nvs_op = nvs_op; + nvs.nvs_recursion = 0; /* * For NVS_OP_ENCODE and NVS_OP_DECODE make sure an nvlist and From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:57:54 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C1ABA10C5FB8; Wed, 3 Oct 2018 14:57:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 760BB80019; Wed, 3 Oct 2018 14:57:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7130018715; Wed, 3 Oct 2018 14:57:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EvsjF036073; Wed, 3 Oct 2018 14:57:54 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EvsoR036072; Wed, 3 Oct 2018 14:57:54 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031457.w93EvsoR036072@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:57:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339144 - stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Commit-Revision: 339144 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:57:54 -0000 Author: mav Date: Wed Oct 3 14:57:54 2018 New Revision: 339144 URL: https://svnweb.freebsd.org/changeset/base/339144 Log: MFC r337219: MFV r337218: 7261 nvlist code should enforce name length limit illumos/illumos-gate@48dd5e630c9b1773b7b10d08a3b90b6c9062d713 Reviewed by: Sebastien Roy Reviewed by: George Wilson Reviewed by: Robert Mustacchi Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:57:19 2018 (r339143) +++ stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:57:54 2018 (r339144) @@ -914,6 +914,8 @@ nvlist_add_common(nvlist_t *nvl, const char *name, /* calculate sizes of the nvpair elements and the nvpair itself */ name_sz = strlen(name) + 1; + if (name_sz >= 1ULL << (sizeof (nvp->nvp_name_sz) * 8 - 1)) + return (EINVAL); nvp_sz = NVP_SIZE_CALC(name_sz, value_sz); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:58:29 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 03C7110C6026; Wed, 3 Oct 2018 14:58:29 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id ACD3F8018D; Wed, 3 Oct 2018 14:58:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A7CC218718; Wed, 3 Oct 2018 14:58:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93EwS1j036154; Wed, 3 Oct 2018 14:58:28 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93EwSa3036153; Wed, 3 Oct 2018 14:58:28 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031458.w93EwSa3036153@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:58:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339145 - stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/common/nvpair X-SVN-Commit-Revision: 339145 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:58:29 -0000 Author: mav Date: Wed Oct 3 14:58:28 2018 New Revision: 339145 URL: https://svnweb.freebsd.org/changeset/base/339145 Log: MFC r337221: MFV r337220: 8375 Kernel memory leak in nvpair code illumos/illumos-gate@843c2111b160463f014d325560ad4b051711928e Reviewed by: Pavel Zakharov Reviewed by: George Wilson Reviewed by: Prashanth Sreenivasa Reviewed by: Robert Mustacchi Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:57:54 2018 (r339144) +++ stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:58:28 2018 (r339145) @@ -21,7 +21,7 @@ /* * Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2015, 2016 by Delphix. All rights reserved. + * Copyright (c) 2015, 2017 by Delphix. All rights reserved. */ #include @@ -2200,8 +2200,10 @@ nvs_embedded(nvstream_t *nvs, nvlist_t *embedded) nvlist_init(embedded, embedded->nvl_nvflag, priv); - if (nvs->nvs_recursion >= nvpair_max_recursion) + if (nvs->nvs_recursion >= nvpair_max_recursion) { + nvlist_free(embedded); return (EINVAL); + } nvs->nvs_recursion++; if ((err = nvs_operation(nvs, embedded, NULL)) != 0) nvlist_free(embedded); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:59:04 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 66A5E10C60B3; Wed, 3 Oct 2018 14:59:04 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1BFBA80333; Wed, 3 Oct 2018 14:59:04 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 16E4F18738; Wed, 3 Oct 2018 14:59:04 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93Ex4Ip036238; Wed, 3 Oct 2018 14:59:04 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93Ex3QE036234; Wed, 3 Oct 2018 14:59:03 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031459.w93Ex3QE036234@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:59:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339146 - in stable/11: cddl/contrib/opensolaris/lib/libnvpair sys/cddl/contrib/opensolaris/common/nvpair sys/cddl/contrib/opensolaris/uts/common/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/lib/libnvpair sys/cddl/contrib/opensolaris/common/nvpair sys/cddl/contrib/opensolaris/uts/common/sys X-SVN-Commit-Revision: 339146 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:59:04 -0000 Author: mav Date: Wed Oct 3 14:59:03 2018 New Revision: 339146 URL: https://svnweb.freebsd.org/changeset/base/339146 Log: MFC r337227: MFV r337223: 9580 Add a hash-table on top of nvlist to speed-up operations illumos/illumos-gate@2ec7644aab2a726a64681fa66c6db8731b160de1 Reviewed by: Matt Ahrens Reviewed by: Sebastien Roy Approved by: Robert Mustacchi Author: Serapheim Dimitropoulos Modified: stable/11/cddl/contrib/opensolaris/lib/libnvpair/nvpair_json.c stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair.h stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair_impl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libnvpair/nvpair_json.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libnvpair/nvpair_json.c Wed Oct 3 14:58:28 2018 (r339145) +++ stable/11/cddl/contrib/opensolaris/lib/libnvpair/nvpair_json.c Wed Oct 3 14:59:03 2018 (r339146) @@ -10,6 +10,7 @@ */ /* * Copyright (c) 2014, Joyent, Inc. + * Copyright (c) 2017 by Delphix. All rights reserved. */ #include @@ -394,8 +395,10 @@ nvlist_print_json(FILE *fp, nvlist_t *nvl) } case DATA_TYPE_UNKNOWN: + case DATA_TYPE_DONTCARE: return (-1); } + } FPRINTF(fp, "}"); Modified: stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:58:28 2018 (r339145) +++ stable/11/sys/cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c Wed Oct 3 14:59:03 2018 (r339146) @@ -149,6 +149,8 @@ int nvpair_max_recursion = 20; int nvpair_max_recursion = 100; #endif +uint64_t nvlist_hashtable_init_size = (1 << 4); + int nv_alloc_init(nv_alloc_t *nva, const nv_alloc_ops_t *nvo, /* args */ ...) { @@ -256,7 +258,292 @@ nv_priv_alloc_embedded(nvpriv_t *priv) return (emb_priv); } +static int +nvt_tab_alloc(nvpriv_t *priv, uint64_t buckets) +{ + ASSERT3P(priv->nvp_hashtable, ==, NULL); + ASSERT0(priv->nvp_nbuckets); + ASSERT0(priv->nvp_nentries); + + i_nvp_t **tab = nv_mem_zalloc(priv, buckets * sizeof (i_nvp_t *)); + if (tab == NULL) + return (ENOMEM); + + priv->nvp_hashtable = tab; + priv->nvp_nbuckets = buckets; + return (0); +} + static void +nvt_tab_free(nvpriv_t *priv) +{ + i_nvp_t **tab = priv->nvp_hashtable; + if (tab == NULL) { + ASSERT0(priv->nvp_nbuckets); + ASSERT0(priv->nvp_nentries); + return; + } + + nv_mem_free(priv, tab, priv->nvp_nbuckets * sizeof (i_nvp_t *)); + + priv->nvp_hashtable = NULL; + priv->nvp_nbuckets = 0; + priv->nvp_nentries = 0; +} + +static uint32_t +nvt_hash(const char *p) +{ + uint32_t g, hval = 0; + + while (*p) { + hval = (hval << 4) + *p++; + if ((g = (hval & 0xf0000000)) != 0) + hval ^= g >> 24; + hval &= ~g; + } + return (hval); +} + +static boolean_t +nvt_nvpair_match(nvpair_t *nvp1, nvpair_t *nvp2, uint32_t nvflag) +{ + boolean_t match = B_FALSE; + if (nvflag & NV_UNIQUE_NAME_TYPE) { + if (strcmp(NVP_NAME(nvp1), NVP_NAME(nvp2)) == 0 && + NVP_TYPE(nvp1) == NVP_TYPE(nvp2)) + match = B_TRUE; + } else { + ASSERT(nvflag == 0 || nvflag & NV_UNIQUE_NAME); + if (strcmp(NVP_NAME(nvp1), NVP_NAME(nvp2)) == 0) + match = B_TRUE; + } + return (match); +} + +static nvpair_t * +nvt_lookup_name_type(nvlist_t *nvl, const char *name, data_type_t type) +{ + nvpriv_t *priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv; + ASSERT(priv != NULL); + + i_nvp_t **tab = priv->nvp_hashtable; + + if (tab == NULL) { + ASSERT3P(priv->nvp_list, ==, NULL); + ASSERT0(priv->nvp_nbuckets); + ASSERT0(priv->nvp_nentries); + return (NULL); + } else { + ASSERT(priv->nvp_nbuckets != 0); + } + + uint64_t hash = nvt_hash(name); + uint64_t index = hash & (priv->nvp_nbuckets - 1); + + ASSERT3U(index, <, priv->nvp_nbuckets); + i_nvp_t *entry = tab[index]; + + for (i_nvp_t *e = entry; e != NULL; e = e->nvi_hashtable_next) { + if (strcmp(NVP_NAME(&e->nvi_nvp), name) == 0 && + (type == DATA_TYPE_DONTCARE || + NVP_TYPE(&e->nvi_nvp) == type)) + return (&e->nvi_nvp); + } + return (NULL); +} + +static nvpair_t * +nvt_lookup_name(nvlist_t *nvl, const char *name) +{ + return (nvt_lookup_name_type(nvl, name, DATA_TYPE_DONTCARE)); +} + +static int +nvt_resize(nvpriv_t *priv, uint32_t new_size) +{ + i_nvp_t **tab = priv->nvp_hashtable; + + /* + * Migrate all the entries from the current table + * to a newly-allocated table with the new size by + * re-adjusting the pointers of their entries. + */ + uint32_t size = priv->nvp_nbuckets; + uint32_t new_mask = new_size - 1; + ASSERT(((new_size) & ((new_size) - 1)) == 0); + + i_nvp_t **new_tab = nv_mem_zalloc(priv, new_size * sizeof (i_nvp_t *)); + if (new_tab == NULL) + return (ENOMEM); + + uint32_t nentries = 0; + for (uint32_t i = 0; i < size; i++) { + i_nvp_t *next, *e = tab[i]; + + while (e != NULL) { + next = e->nvi_hashtable_next; + + uint32_t hash = nvt_hash(NVP_NAME(&e->nvi_nvp)); + uint32_t index = hash & new_mask; + + e->nvi_hashtable_next = new_tab[index]; + new_tab[index] = e; + nentries++; + + e = next; + } + tab[i] = NULL; + } + ASSERT3U(nentries, ==, priv->nvp_nentries); + + nvt_tab_free(priv); + + priv->nvp_hashtable = new_tab; + priv->nvp_nbuckets = new_size; + priv->nvp_nentries = nentries; + + return (0); +} + +static boolean_t +nvt_needs_togrow(nvpriv_t *priv) +{ + /* + * Grow only when we have more elements than buckets + * and the # of buckets doesn't overflow. + */ + return (priv->nvp_nentries > priv->nvp_nbuckets && + (UINT32_MAX >> 1) >= priv->nvp_nbuckets); +} + +/* + * Allocate a new table that's twice the size of the old one, + * and migrate all the entries from the old one to the new + * one by re-adjusting their pointers. + */ +static int +nvt_grow(nvpriv_t *priv) +{ + uint32_t current_size = priv->nvp_nbuckets; + /* ensure we won't overflow */ + ASSERT3U(UINT32_MAX >> 1, >=, current_size); + return (nvt_resize(priv, current_size << 1)); +} + +static boolean_t +nvt_needs_toshrink(nvpriv_t *priv) +{ + /* + * Shrink only when the # of elements is less than or + * equal to 1/4 the # of buckets. Never shrink less than + * nvlist_hashtable_init_size. + */ + ASSERT3U(priv->nvp_nbuckets, >=, nvlist_hashtable_init_size); + if (priv->nvp_nbuckets == nvlist_hashtable_init_size) + return (B_FALSE); + return (priv->nvp_nentries <= (priv->nvp_nbuckets >> 2)); +} + +/* + * Allocate a new table that's half the size of the old one, + * and migrate all the entries from the old one to the new + * one by re-adjusting their pointers. + */ +static int +nvt_shrink(nvpriv_t *priv) +{ + uint32_t current_size = priv->nvp_nbuckets; + /* ensure we won't overflow */ + ASSERT3U(current_size, >=, nvlist_hashtable_init_size); + return (nvt_resize(priv, current_size >> 1)); +} + +static int +nvt_remove_nvpair(nvlist_t *nvl, nvpair_t *nvp) +{ + nvpriv_t *priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv; + + if (nvt_needs_toshrink(priv)) { + int err = nvt_shrink(priv); + if (err != 0) + return (err); + } + i_nvp_t **tab = priv->nvp_hashtable; + + char *name = NVP_NAME(nvp); + uint64_t hash = nvt_hash(name); + uint64_t index = hash & (priv->nvp_nbuckets - 1); + + ASSERT3U(index, <, priv->nvp_nbuckets); + i_nvp_t *bucket = tab[index]; + + for (i_nvp_t *prev = NULL, *e = bucket; + e != NULL; prev = e, e = e->nvi_hashtable_next) { + if (nvt_nvpair_match(&e->nvi_nvp, nvp, nvl->nvl_flag)) { + if (prev != NULL) { + prev->nvi_hashtable_next = + e->nvi_hashtable_next; + } else { + ASSERT3P(e, ==, bucket); + tab[index] = e->nvi_hashtable_next; + } + e->nvi_hashtable_next = NULL; + priv->nvp_nentries--; + break; + } + } + + return (0); +} + +static int +nvt_add_nvpair(nvlist_t *nvl, nvpair_t *nvp) +{ + nvpriv_t *priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv; + + /* initialize nvpair table now if it doesn't exist. */ + if (priv->nvp_hashtable == NULL) { + int err = nvt_tab_alloc(priv, nvlist_hashtable_init_size); + if (err != 0) + return (err); + } + + /* + * if we don't allow duplicate entries, make sure to + * unlink any existing entries from the table. + */ + if (nvl->nvl_nvflag != 0) { + int err = nvt_remove_nvpair(nvl, nvp); + if (err != 0) + return (err); + } + + if (nvt_needs_togrow(priv)) { + int err = nvt_grow(priv); + if (err != 0) + return (err); + } + i_nvp_t **tab = priv->nvp_hashtable; + + char *name = NVP_NAME(nvp); + uint64_t hash = nvt_hash(name); + uint64_t index = hash & (priv->nvp_nbuckets - 1); + + ASSERT3U(index, <, priv->nvp_nbuckets); + i_nvp_t *bucket = tab[index]; + + /* insert link at the beginning of the bucket */ + i_nvp_t *new_entry = NVPAIR2I_NVP(nvp); + ASSERT3P(new_entry->nvi_hashtable_next, ==, NULL); + new_entry->nvi_hashtable_next = bucket; + tab[index] = new_entry; + + priv->nvp_nentries++; + return (0); +} + +static void nvlist_init(nvlist_t *nvl, uint32_t nvflag, nvpriv_t *priv) { nvl->nvl_version = NV_VERSION; @@ -588,6 +875,7 @@ nvlist_free(nvlist_t *nvl) else nvl->nvl_priv = 0; + nvt_tab_free(priv); nv_mem_free(priv, priv, sizeof (nvpriv_t)); } @@ -648,26 +936,14 @@ nvlist_xdup(nvlist_t *nvl, nvlist_t **nvlp, nv_alloc_t int nvlist_remove_all(nvlist_t *nvl, const char *name) { - nvpriv_t *priv; - i_nvp_t *curr; int error = ENOENT; - if (nvl == NULL || name == NULL || - (priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv) == NULL) + if (nvl == NULL || name == NULL || nvl->nvl_priv == 0) return (EINVAL); - curr = priv->nvp_list; - while (curr != NULL) { - nvpair_t *nvp = &curr->nvi_nvp; - - curr = curr->nvi_next; - if (strcmp(name, NVP_NAME(nvp)) != 0) - continue; - - nvp_buf_unlink(nvl, nvp); - nvpair_free(nvp); - nvp_buf_free(nvl, nvp); - + nvpair_t *nvp; + while ((nvp = nvt_lookup_name(nvl, name)) != NULL) { + VERIFY0(nvlist_remove_nvpair(nvl, nvp)); error = 0; } @@ -680,28 +956,14 @@ nvlist_remove_all(nvlist_t *nvl, const char *name) int nvlist_remove(nvlist_t *nvl, const char *name, data_type_t type) { - nvpriv_t *priv; - i_nvp_t *curr; - - if (nvl == NULL || name == NULL || - (priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv) == NULL) + if (nvl == NULL || name == NULL || nvl->nvl_priv == 0) return (EINVAL); - curr = priv->nvp_list; - while (curr != NULL) { - nvpair_t *nvp = &curr->nvi_nvp; + nvpair_t *nvp = nvt_lookup_name_type(nvl, name, type); + if (nvp == NULL) + return (ENOENT); - if (strcmp(name, NVP_NAME(nvp)) == 0 && NVP_TYPE(nvp) == type) { - nvp_buf_unlink(nvl, nvp); - nvpair_free(nvp); - nvp_buf_free(nvl, nvp); - - return (0); - } - curr = curr->nvi_next; - } - - return (ENOENT); + return (nvlist_remove_nvpair(nvl, nvp)); } int @@ -710,6 +972,10 @@ nvlist_remove_nvpair(nvlist_t *nvl, nvpair_t *nvp) if (nvl == NULL || nvp == NULL) return (EINVAL); + int err = nvt_remove_nvpair(nvl, nvp); + if (err != 0) + return (err); + nvp_buf_unlink(nvl, nvp); nvpair_free(nvp); nvp_buf_free(nvl, nvp); @@ -987,6 +1253,12 @@ nvlist_add_common(nvlist_t *nvl, const char *name, else if (nvl->nvl_nvflag & NV_UNIQUE_NAME_TYPE) (void) nvlist_remove(nvl, name, type); + err = nvt_add_nvpair(nvl, nvp); + if (err != 0) { + nvpair_free(nvp); + nvp_buf_free(nvl, nvp); + return (err); + } nvp_buf_link(nvl, nvp); return (0); @@ -1336,25 +1608,17 @@ static int nvlist_lookup_common(nvlist_t *nvl, const char *name, data_type_t type, uint_t *nelem, void *data) { - nvpriv_t *priv; - nvpair_t *nvp; - i_nvp_t *curr; - - if (name == NULL || nvl == NULL || - (priv = (nvpriv_t *)(uintptr_t)nvl->nvl_priv) == NULL) + if (name == NULL || nvl == NULL || nvl->nvl_priv == 0) return (EINVAL); if (!(nvl->nvl_nvflag & (NV_UNIQUE_NAME | NV_UNIQUE_NAME_TYPE))) return (ENOTSUP); - for (curr = priv->nvp_list; curr != NULL; curr = curr->nvi_next) { - nvp = &curr->nvi_nvp; + nvpair_t *nvp = nvt_lookup_name_type(nvl, name, type); + if (nvp == NULL) + return (ENOENT); - if (strcmp(name, NVP_NAME(nvp)) == 0 && NVP_TYPE(nvp) == type) - return (nvpair_value_common(nvp, type, nelem, data)); - } - - return (ENOENT); + return (nvpair_value_common(nvp, type, nelem, data)); } int @@ -2112,6 +2376,12 @@ nvs_decode_pairs(nvstream_t *nvs, nvlist_t *nvl) return (EFAULT); } + err = nvt_add_nvpair(nvl, nvp); + if (err != 0) { + nvpair_free(nvp); + nvp_buf_free(nvl, nvp); + return (err); + } nvp_buf_link(nvl, nvp); } return (err); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair.h Wed Oct 3 14:58:28 2018 (r339145) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair.h Wed Oct 3 14:59:03 2018 (r339146) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. */ #ifndef _SYS_NVPAIR_H @@ -39,6 +39,7 @@ extern "C" { #endif typedef enum { + DATA_TYPE_DONTCARE = -1, DATA_TYPE_UNKNOWN = 0, DATA_TYPE_BOOLEAN, DATA_TYPE_BYTE, Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair_impl.h Wed Oct 3 14:58:28 2018 (r339145) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/nvpair_impl.h Wed Oct 3 14:59:03 2018 (r339146) @@ -24,11 +24,13 @@ * Use is subject to license terms. */ +/* + * Copyright (c) 2017 by Delphix. All rights reserved. + */ + #ifndef _NVPAIR_IMPL_H #define _NVPAIR_IMPL_H -#pragma ident "%Z%%M% %I% %E% SMI" - #ifdef __cplusplus extern "C" { #endif @@ -47,16 +49,27 @@ typedef struct i_nvp i_nvp_t; struct i_nvp { union { - uint64_t _nvi_align; /* ensure alignment */ + /* ensure alignment */ + uint64_t _nvi_align; + struct { - i_nvp_t *_nvi_next; /* pointer to next nvpair */ - i_nvp_t *_nvi_prev; /* pointer to prev nvpair */ + /* pointer to next nvpair */ + i_nvp_t *_nvi_next; + + /* pointer to prev nvpair */ + i_nvp_t *_nvi_prev; + + /* next pair in table bucket */ + i_nvp_t *_nvi_hashtable_next; } _nvi; } _nvi_un; - nvpair_t nvi_nvp; /* nvpair */ + + /* nvpair */ + nvpair_t nvi_nvp; }; #define nvi_next _nvi_un._nvi._nvi_next #define nvi_prev _nvi_un._nvi._nvi_prev +#define nvi_hashtable_next _nvi_un._nvi._nvi_hashtable_next typedef struct { i_nvp_t *nvp_list; /* linked list of nvpairs */ @@ -64,6 +77,10 @@ typedef struct { i_nvp_t *nvp_curr; /* current walker nvpair */ nv_alloc_t *nvp_nva; /* pluggable allocator */ uint32_t nvp_stat; /* internal state */ + + i_nvp_t **nvp_hashtable; /* table of entries used for lookup */ + uint32_t nvp_nbuckets; /* # of buckets in hash table */ + uint32_t nvp_nentries; /* # of entries in hash table */ } nvpriv_t; #ifdef __cplusplus From owner-svn-src-stable-11@freebsd.org Wed Oct 3 14:59:40 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B8D410C6124; Wed, 3 Oct 2018 14:59:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0933880489; Wed, 3 Oct 2018 14:59:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 03DC518739; Wed, 3 Oct 2018 14:59:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93ExdJs036316; Wed, 3 Oct 2018 14:59:39 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93ExdlU036315; Wed, 3 Oct 2018 14:59:39 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031459.w93ExdlU036315@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 14:59:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339147 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 339147 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 14:59:40 -0000 Author: mav Date: Wed Oct 3 14:59:39 2018 New Revision: 339147 URL: https://svnweb.freebsd.org/changeset/base/339147 Log: MFC r337229: Reduce taskq and context-switch cost of zio pipe When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the logical zio_read(), and then a physical zio. Currently, each of these results in a separate taskq_dispatch(zio_execute). On high-read-iops workloads, this causes a significant performance impact. By processing all 3 ZIO's in a single taskq entry, we reduce the overhead on taskq locking and context switching. We accomplish this by allowing zio_done() to return a "next zio to execute" to zio_execute(). This results in a ~12% performance increase for random reads, from 96,000 iops to 108,000 iops (with recordsize=8k, on SSD's). Reviewed by: Pavel Zakharov Reviewed-by: Brian Behlendorf Reviewed by: George Wilson Signed-off-by: Matthew Ahrens External-issue: DLPX-59292 Closes #7736 zfsonlinux/zfs@62840030a7dceaee013ddbcc1eebcfc7922edf7c Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Oct 3 14:59:03 2018 (r339146) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Oct 3 14:59:39 2018 (r339147) @@ -217,7 +217,7 @@ enum zio_child { #define ZIO_CHILD_DDT_BIT ZIO_CHILD_BIT(ZIO_CHILD_DDT) #define ZIO_CHILD_LOGICAL_BIT ZIO_CHILD_BIT(ZIO_CHILD_LOGICAL) #define ZIO_CHILD_ALL_BITS \ - (ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \ + (ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \ ZIO_CHILD_DDT_BIT | ZIO_CHILD_LOGICAL_BIT) enum zio_wait_type { @@ -356,7 +356,7 @@ typedef struct zio_transform { struct zio_transform *zt_next; } zio_transform_t; -typedef int zio_pipe_stage_t(zio_t *zio); +typedef zio_t *zio_pipe_stage_t(zio_t *zio); /* * The io_reexecute flags are distinct from io_flags because the child must Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 14:59:03 2018 (r339146) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 14:59:39 2018 (r339147) @@ -100,9 +100,6 @@ kmem_cache_t *zio_data_buf_cache[SPA_MAXBLOCKSIZE >> S extern vmem_t *zio_alloc_arena; #endif -#define ZIO_PIPELINE_CONTINUE 0x100 -#define ZIO_PIPELINE_STOP 0x101 - #define BP_SPANB(indblkshift, level) \ (((uint64_t)1) << ((level) * ((indblkshift) - SPA_BLKPTRSHIFT))) #define COMPARE_META_LEVEL 0x80000000ul @@ -539,7 +536,8 @@ zio_wait_for_children(zio_t *zio, uint8_t childbits, e } static void -zio_notify_parent(zio_t *pio, zio_t *zio, enum zio_wait_type wait) +zio_notify_parent(zio_t *pio, zio_t *zio, enum zio_wait_type wait, + zio_t **next_to_executep) { uint64_t *countp = &pio->io_children[zio->io_child_type][wait]; int *errorp = &pio->io_child_error[zio->io_child_type]; @@ -558,13 +556,33 @@ zio_notify_parent(zio_t *pio, zio_t *zio, enum zio_wai ZIO_TASKQ_INTERRUPT; pio->io_stall = NULL; mutex_exit(&pio->io_lock); + /* - * Dispatch the parent zio in its own taskq so that - * the child can continue to make progress. This also - * prevents overflowing the stack when we have deeply nested - * parent-child relationships. + * If we can tell the caller to execute this parent next, do + * so. Otherwise dispatch the parent zio as its own task. + * + * Having the caller execute the parent when possible reduces + * locking on the zio taskq's, reduces context switch + * overhead, and has no recursion penalty. Note that one + * read from disk typically causes at least 3 zio's: a + * zio_null(), the logical zio_read(), and then a physical + * zio. When the physical ZIO completes, we are able to call + * zio_done() on all 3 of these zio's from one invocation of + * zio_execute() by returning the parent back to + * zio_execute(). Since the parent isn't executed until this + * thread returns back to zio_execute(), the caller should do + * so promptly. + * + * In other cases, dispatching the parent prevents + * overflowing the stack when we have deeply nested + * parent-child relationships, as we do with the "mega zio" + * of writes for spa_sync(), and the chain of ZIL blocks. */ - zio_taskq_dispatch(pio, type, B_FALSE); + if (next_to_executep != NULL && *next_to_executep == NULL) { + *next_to_executep = pio; + } else { + zio_taskq_dispatch(pio, type, B_FALSE); + } } else { mutex_exit(&pio->io_lock); } @@ -1275,7 +1293,7 @@ zio_shrink(zio_t *zio, uint64_t size) * ========================================================================== */ -static int +static zio_t * zio_read_bp_init(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -1312,14 +1330,14 @@ zio_read_bp_init(zio_t *zio) if (BP_GET_DEDUP(bp) && zio->io_child_type == ZIO_CHILD_LOGICAL) zio->io_pipeline = ZIO_DDT_READ_PIPELINE; - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_write_bp_init(zio_t *zio) { if (!IO_IS_ALLOCATING(zio)) - return (ZIO_PIPELINE_CONTINUE); + return (zio); ASSERT(zio->io_child_type != ZIO_CHILD_DDT); @@ -1334,7 +1352,7 @@ zio_write_bp_init(zio_t *zio) zio->io_pipeline = ZIO_INTERLOCK_PIPELINE; if (BP_IS_EMBEDDED(bp)) - return (ZIO_PIPELINE_CONTINUE); + return (zio); /* * If we've been overridden and nopwrite is set then @@ -1345,13 +1363,13 @@ zio_write_bp_init(zio_t *zio) ASSERT(!zp->zp_dedup); ASSERT3U(BP_GET_CHECKSUM(bp), ==, zp->zp_checksum); zio->io_flags |= ZIO_FLAG_NOPWRITE; - return (ZIO_PIPELINE_CONTINUE); + return (zio); } ASSERT(!zp->zp_nopwrite); if (BP_IS_HOLE(bp) || !zp->zp_dedup) - return (ZIO_PIPELINE_CONTINUE); + return (zio); ASSERT((zio_checksum_table[zp->zp_checksum].ci_flags & ZCHECKSUM_FLAG_DEDUP) || zp->zp_dedup_verify); @@ -1359,7 +1377,7 @@ zio_write_bp_init(zio_t *zio) if (BP_GET_CHECKSUM(bp) == zp->zp_checksum) { BP_SET_DEDUP(bp, 1); zio->io_pipeline |= ZIO_STAGE_DDT_WRITE; - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -1371,10 +1389,10 @@ zio_write_bp_init(zio_t *zio) zio->io_pipeline = zio->io_orig_pipeline; } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_write_compress(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -1393,11 +1411,11 @@ zio_write_compress(zio_t *zio) */ if (zio_wait_for_children(zio, ZIO_CHILD_LOGICAL_BIT | ZIO_CHILD_GANG_BIT, ZIO_WAIT_READY)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } if (!IO_IS_ALLOCATING(zio)) - return (ZIO_PIPELINE_CONTINUE); + return (zio); if (zio->io_children_ready != NULL) { /* @@ -1456,7 +1474,7 @@ zio_write_compress(zio_t *zio) zio->io_pipeline = ZIO_INTERLOCK_PIPELINE; ASSERT(spa_feature_is_active(spa, SPA_FEATURE_EMBEDDED_DATA)); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } else { /* * Round up compressed size up to the ashift @@ -1544,10 +1562,10 @@ zio_write_compress(zio_t *zio) zio->io_pipeline |= ZIO_STAGE_NOP_WRITE; } } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_free_bp_init(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -1559,7 +1577,7 @@ zio_free_bp_init(zio_t *zio) ASSERT3P(zio->io_bp, ==, &zio->io_bp_copy); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -1633,12 +1651,12 @@ zio_taskq_member(zio_t *zio, zio_taskq_type_t q) return (B_FALSE); } -static int +static zio_t * zio_issue_async(zio_t *zio) { zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, B_FALSE); - return (ZIO_PIPELINE_STOP); + return (NULL); } void @@ -1720,15 +1738,14 @@ static zio_pipe_stage_t *zio_pipeline[]; void zio_execute(zio_t *zio) { - zio->io_executor = curthread; - ASSERT3U(zio->io_queued_timestamp, >, 0); while (zio->io_stage < ZIO_STAGE_DONE) { enum zio_stage pipeline = zio->io_pipeline; enum zio_stage stage = zio->io_stage; - int rv; + zio->io_executor = curthread; + ASSERT(!MUTEX_HELD(&zio->io_lock)); ASSERT(ISP2(stage)); ASSERT(zio->io_stall == NULL); @@ -1758,12 +1775,16 @@ zio_execute(zio_t *zio) zio->io_stage = stage; zio->io_pipeline_trace |= zio->io_stage; - rv = zio_pipeline[highbit64(stage) - 1](zio); - if (rv == ZIO_PIPELINE_STOP) - return; + /* + * The zio pipeline stage returns the next zio to execute + * (typically the same as this one), or NULL if we should + * stop. + */ + zio = zio_pipeline[highbit64(stage) - 1](zio); - ASSERT(rv == ZIO_PIPELINE_CONTINUE); + if (zio == NULL) + return; } } @@ -2226,7 +2247,7 @@ zio_gang_tree_issue(zio_t *pio, zio_gang_node_t *gn, b zio_nowait(zio); } -static int +static zio_t * zio_gang_assemble(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -2238,16 +2259,16 @@ zio_gang_assemble(zio_t *zio) zio_gang_tree_assemble(zio, bp, &zio->io_gang_tree); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_gang_issue(zio_t *zio) { blkptr_t *bp = zio->io_bp; if (zio_wait_for_children(zio, ZIO_CHILD_GANG_BIT, ZIO_WAIT_DONE)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } ASSERT(BP_IS_GANG(bp) && zio->io_gang_leader == zio); @@ -2261,7 +2282,7 @@ zio_gang_issue(zio_t *zio) zio->io_pipeline = ZIO_INTERLOCK_PIPELINE; - return (ZIO_PIPELINE_CONTINUE); + return (zio); } static void @@ -2300,7 +2321,7 @@ zio_write_gang_done(zio_t *zio) abd_put(zio->io_abd); } -static int +static zio_t * zio_write_gang_block(zio_t *pio) { spa_t *spa = pio->io_spa; @@ -2359,7 +2380,7 @@ zio_write_gang_block(zio_t *pio) gbh_copies - copies, pio->io_allocator, pio); } pio->io_error = error; - return (ZIO_PIPELINE_CONTINUE); + return (pio); } if (pio == gio) { @@ -2426,7 +2447,7 @@ zio_write_gang_block(zio_t *pio) zio_nowait(zio); - return (ZIO_PIPELINE_CONTINUE); + return (pio); } /* @@ -2447,7 +2468,7 @@ zio_write_gang_block(zio_t *pio) * used for nopwrite, assuming that the salt and the checksums * themselves remain secret. */ -static int +static zio_t * zio_nop_write(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -2474,7 +2495,7 @@ zio_nop_write(zio_t *zio) BP_GET_COMPRESS(bp) != BP_GET_COMPRESS(bp_orig) || BP_GET_DEDUP(bp) != BP_GET_DEDUP(bp_orig) || zp->zp_copies != BP_GET_NDVAS(bp_orig)) - return (ZIO_PIPELINE_CONTINUE); + return (zio); /* * If the checksums match then reset the pipeline so that we @@ -2494,7 +2515,7 @@ zio_nop_write(zio_t *zio) zio->io_flags |= ZIO_FLAG_NOPWRITE; } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -2522,7 +2543,7 @@ zio_ddt_child_read_done(zio_t *zio) mutex_exit(&pio->io_lock); } -static int +static zio_t * zio_ddt_read_start(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -2542,7 +2563,7 @@ zio_ddt_read_start(zio_t *zio) zio->io_vsd = dde; if (ddp_self == NULL) - return (ZIO_PIPELINE_CONTINUE); + return (zio); for (int p = 0; p < DDT_PHYS_TYPES; p++, ddp++) { if (ddp->ddp_phys_birth == 0 || ddp == ddp_self) @@ -2555,23 +2576,23 @@ zio_ddt_read_start(zio_t *zio) zio->io_priority, ZIO_DDT_CHILD_FLAGS(zio) | ZIO_FLAG_DONT_PROPAGATE, &zio->io_bookmark)); } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } zio_nowait(zio_read(zio, zio->io_spa, bp, zio->io_abd, zio->io_size, NULL, NULL, zio->io_priority, ZIO_DDT_CHILD_FLAGS(zio), &zio->io_bookmark)); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_ddt_read_done(zio_t *zio) { blkptr_t *bp = zio->io_bp; if (zio_wait_for_children(zio, ZIO_CHILD_DDT_BIT, ZIO_WAIT_DONE)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } ASSERT(BP_GET_DEDUP(bp)); @@ -2583,12 +2604,12 @@ zio_ddt_read_done(zio_t *zio) ddt_entry_t *dde = zio->io_vsd; if (ddt == NULL) { ASSERT(spa_load_state(zio->io_spa) != SPA_LOAD_NONE); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } if (dde == NULL) { zio->io_stage = ZIO_STAGE_DDT_READ_START >> 1; zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, B_FALSE); - return (ZIO_PIPELINE_STOP); + return (NULL); } if (dde->dde_repair_abd != NULL) { abd_copy(zio->io_abd, dde->dde_repair_abd, @@ -2601,7 +2622,7 @@ zio_ddt_read_done(zio_t *zio) ASSERT(zio->io_vsd == NULL); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } static boolean_t @@ -2759,7 +2780,7 @@ zio_ddt_ditto_write_done(zio_t *zio) ddt_exit(ddt); } -static int +static zio_t * zio_ddt_write(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -2803,7 +2824,7 @@ zio_ddt_write(zio_t *zio) ASSERT(!BP_GET_DEDUP(bp)); zio->io_pipeline = ZIO_WRITE_PIPELINE; ddt_exit(ddt); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } ditto_copies = ddt_ditto_copies_needed(ddt, dde, ddp); @@ -2829,7 +2850,7 @@ zio_ddt_write(zio_t *zio) zio->io_bp_override = NULL; BP_ZERO(bp); ddt_exit(ddt); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } dio = zio_write(zio, spa, txg, bp, zio->io_orig_abd, @@ -2871,12 +2892,12 @@ zio_ddt_write(zio_t *zio) if (dio) zio_nowait(dio); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } ddt_entry_t *freedde; /* for debugging */ -static int +static zio_t * zio_ddt_free(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -2894,7 +2915,7 @@ zio_ddt_free(zio_t *zio) ddt_phys_decref(ddp); ddt_exit(ddt); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -2932,7 +2953,7 @@ zio_io_to_allocate(spa_t *spa, int allocator) return (zio); } -static int +static zio_t * zio_dva_throttle(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -2942,7 +2963,7 @@ zio_dva_throttle(zio_t *zio) !spa_normal_class(zio->io_spa)->mc_alloc_throttle_enabled || zio->io_child_type == ZIO_CHILD_GANG || zio->io_flags & ZIO_FLAG_NODATA) { - return (ZIO_PIPELINE_CONTINUE); + return (zio); } ASSERT(zio->io_child_type > ZIO_CHILD_GANG); @@ -2968,22 +2989,7 @@ zio_dva_throttle(zio_t *zio) nio = zio_io_to_allocate(zio->io_spa, zio->io_allocator); mutex_exit(&spa->spa_alloc_locks[zio->io_allocator]); - if (nio == zio) - return (ZIO_PIPELINE_CONTINUE); - - if (nio != NULL) { - ASSERT(nio->io_stage == ZIO_STAGE_DVA_THROTTLE); - /* - * We are passing control to a new zio so make sure that - * it is processed by a different thread. We do this to - * avoid stack overflows that can occur when parents are - * throttled and children are making progress. We allow - * it to go to the head of the taskq since it's already - * been waiting. - */ - zio_taskq_dispatch(nio, ZIO_TASKQ_ISSUE, B_TRUE); - } - return (ZIO_PIPELINE_STOP); + return (nio); } void @@ -3002,7 +3008,7 @@ zio_allocate_dispatch(spa_t *spa, int allocator) zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, B_TRUE); } -static int +static zio_t * zio_dva_allocate(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -3045,18 +3051,18 @@ zio_dva_allocate(zio_t *zio) zio->io_error = error; } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_dva_free(zio_t *zio) { metaslab_free(zio->io_spa, zio->io_bp, zio->io_txg, B_FALSE); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_dva_claim(zio_t *zio) { int error; @@ -3065,7 +3071,7 @@ zio_dva_claim(zio_t *zio) if (error) zio->io_error = error; - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -3160,7 +3166,7 @@ zio_alloc_zil(spa_t *spa, uint64_t objset, uint64_t tx * force the underlying vdev layers to call either zio_execute() or * zio_interrupt() to ensure that the pipeline continues with the correct I/O. */ -static int +static zio_t * zio_vdev_io_start(zio_t *zio) { vdev_t *vd = zio->io_vd; @@ -3179,13 +3185,13 @@ zio_vdev_io_start(zio_t *zio) * The mirror_ops handle multiple DVAs in a single BP. */ vdev_mirror_ops.vdev_op_io_start(zio); - return (ZIO_PIPELINE_STOP); + return (NULL); } if (vd->vdev_ops->vdev_op_leaf && zio->io_type == ZIO_TYPE_FREE && zio->io_priority == ZIO_PRIORITY_NOW) { trim_map_free(vd, zio->io_offset, zio->io_size, zio->io_txg); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } ASSERT3P(zio->io_logical, !=, zio); @@ -3299,24 +3305,24 @@ zio_vdev_io_start(zio_t *zio) !vdev_dtl_contains(vd, DTL_PARTIAL, zio->io_txg, 1)) { ASSERT(zio->io_type == ZIO_TYPE_WRITE); zio_vdev_io_bypass(zio); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } if (vd->vdev_ops->vdev_op_leaf) { switch (zio->io_type) { case ZIO_TYPE_READ: if (vdev_cache_read(zio)) - return (ZIO_PIPELINE_CONTINUE); + return (zio); /* FALLTHROUGH */ case ZIO_TYPE_WRITE: case ZIO_TYPE_FREE: if ((zio = vdev_queue_io(zio)) == NULL) - return (ZIO_PIPELINE_STOP); + return (NULL); if (!vdev_accessible(vd, zio)) { zio->io_error = SET_ERROR(ENXIO); zio_interrupt(zio); - return (ZIO_PIPELINE_STOP); + return (NULL); } break; } @@ -3328,14 +3334,14 @@ zio_vdev_io_start(zio_t *zio) if (zio->io_type == ZIO_TYPE_WRITE && !(zio->io_flags & ZIO_FLAG_IO_REPAIR) && !trim_map_write_start(zio)) - return (ZIO_PIPELINE_STOP); + return (NULL); } vd->vdev_ops->vdev_op_io_start(zio); - return (ZIO_PIPELINE_STOP); + return (NULL); } -static int +static zio_t * zio_vdev_io_done(zio_t *zio) { vdev_t *vd = zio->io_vd; @@ -3343,7 +3349,7 @@ zio_vdev_io_done(zio_t *zio) boolean_t unexpected_error = B_FALSE; if (zio_wait_for_children(zio, ZIO_CHILD_VDEV_BIT, ZIO_WAIT_DONE)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } ASSERT(zio->io_type == ZIO_TYPE_READ || @@ -3386,7 +3392,7 @@ zio_vdev_io_done(zio_t *zio) if (unexpected_error) VERIFY(vdev_probe(vd, zio) == NULL); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -3444,13 +3450,13 @@ zio_vsd_default_cksum_report(zio_t *zio, zio_cksum_rep zcr->zcr_free = zio_buf_free; } -static int +static zio_t * zio_vdev_io_assess(zio_t *zio) { vdev_t *vd = zio->io_vd; if (zio_wait_for_children(zio, ZIO_CHILD_VDEV_BIT, ZIO_WAIT_DONE)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } if (vd == NULL && !(zio->io_flags & ZIO_FLAG_CONFIG_WRITER)) @@ -3496,7 +3502,7 @@ zio_vdev_io_assess(zio_t *zio) zio->io_stage = ZIO_STAGE_VDEV_IO_START >> 1; zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, zio_requeue_io_start_cut_in_line); - return (ZIO_PIPELINE_STOP); + return (NULL); } /* @@ -3536,7 +3542,7 @@ zio_vdev_io_assess(zio_t *zio) zio->io_physdone(zio->io_logical); } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } void @@ -3571,7 +3577,7 @@ zio_vdev_io_bypass(zio_t *zio) * Generate and verify checksums * ========================================================================== */ -static int +static zio_t * zio_checksum_generate(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -3585,7 +3591,7 @@ zio_checksum_generate(zio_t *zio) checksum = zio->io_prop.zp_checksum; if (checksum == ZIO_CHECKSUM_OFF) - return (ZIO_PIPELINE_CONTINUE); + return (zio); ASSERT(checksum == ZIO_CHECKSUM_LABEL); } else { @@ -3599,10 +3605,10 @@ zio_checksum_generate(zio_t *zio) zio_checksum_compute(zio, checksum, zio->io_abd, zio->io_size); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } -static int +static zio_t * zio_checksum_verify(zio_t *zio) { zio_bad_cksum_t info; @@ -3617,7 +3623,7 @@ zio_checksum_verify(zio_t *zio) * We're either verifying a label checksum, or nothing at all. */ if (zio->io_prop.zp_checksum == ZIO_CHECKSUM_OFF) - return (ZIO_PIPELINE_CONTINUE); + return (zio); ASSERT(zio->io_prop.zp_checksum == ZIO_CHECKSUM_LABEL); } @@ -3632,7 +3638,7 @@ zio_checksum_verify(zio_t *zio) } } - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -3675,7 +3681,7 @@ zio_worst_error(int e1, int e2) * I/O completion * ========================================================================== */ -static int +static zio_t * zio_ready(zio_t *zio) { blkptr_t *bp = zio->io_bp; @@ -3684,7 +3690,7 @@ zio_ready(zio_t *zio) if (zio_wait_for_children(zio, ZIO_CHILD_GANG_BIT | ZIO_CHILD_DDT_BIT, ZIO_WAIT_READY)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } if (zio->io_ready) { @@ -3730,7 +3736,7 @@ zio_ready(zio_t *zio) */ for (; pio != NULL; pio = pio_next) { pio_next = zio_walk_parents(zio, &zl); - zio_notify_parent(pio, zio, ZIO_WAIT_READY); + zio_notify_parent(pio, zio, ZIO_WAIT_READY, NULL); } if (zio->io_flags & ZIO_FLAG_NODATA) { @@ -3746,7 +3752,7 @@ zio_ready(zio_t *zio) zio->io_spa->spa_syncing_txg == zio->io_txg) zio_handle_ignored_writes(zio); - return (ZIO_PIPELINE_CONTINUE); + return (zio); } /* @@ -3810,7 +3816,7 @@ zio_dva_throttle_done(zio_t *zio) zio_allocate_dispatch(zio->io_spa, pio->io_allocator); } -static int +static zio_t * zio_done(zio_t *zio) { spa_t *spa = zio->io_spa; @@ -3827,7 +3833,7 @@ zio_done(zio_t *zio) * wait for them and then repeat this pipeline stage. */ if (zio_wait_for_children(zio, ZIO_CHILD_ALL_BITS, ZIO_WAIT_DONE)) { - return (ZIO_PIPELINE_STOP); + return (NULL); } /* @@ -4041,7 +4047,12 @@ zio_done(zio_t *zio) if ((pio->io_flags & ZIO_FLAG_GODFATHER) && (zio->io_reexecute & ZIO_REEXECUTE_SUSPEND)) { zio_remove_child(pio, zio, remove_zl); - zio_notify_parent(pio, zio, ZIO_WAIT_DONE); + /* + * This is a rare code path, so we don't + * bother with "next_to_execute". + */ + zio_notify_parent(pio, zio, ZIO_WAIT_DONE, + NULL); } } @@ -4053,7 +4064,11 @@ zio_done(zio_t *zio) */ ASSERT(!(zio->io_flags & ZIO_FLAG_GODFATHER)); zio->io_flags |= ZIO_FLAG_DONT_PROPAGATE; - zio_notify_parent(pio, zio, ZIO_WAIT_DONE); + /* + * This is a rare code path, so we don't bother with + * "next_to_execute". + */ + zio_notify_parent(pio, zio, ZIO_WAIT_DONE, NULL); } else if (zio->io_reexecute & ZIO_REEXECUTE_SUSPEND) { /* * We'd fail again if we reexecuted now, so suspend @@ -4074,7 +4089,7 @@ zio_done(zio_t *zio) ZIO_TASKQ_ISSUE, (task_func_t *)zio_reexecute, zio, 0, &zio->io_tqent); } - return (ZIO_PIPELINE_STOP); + return (NULL); } ASSERT(zio->io_child_count == 0); @@ -4104,12 +4119,17 @@ zio_done(zio_t *zio) zio->io_state[ZIO_WAIT_DONE] = 1; mutex_exit(&zio->io_lock); + /* + * We are done executing this zio. We may want to execute a parent + * next. See the comment in zio_notify_parent(). + */ + zio_t *next_to_execute = NULL; zl = NULL; for (pio = zio_walk_parents(zio, &zl); pio != NULL; pio = pio_next) { zio_link_t *remove_zl = zl; pio_next = zio_walk_parents(zio, &zl); zio_remove_child(pio, zio, remove_zl); - zio_notify_parent(pio, zio, ZIO_WAIT_DONE); + zio_notify_parent(pio, zio, ZIO_WAIT_DONE, &next_to_execute); } if (zio->io_waiter != NULL) { @@ -4121,7 +4141,7 @@ zio_done(zio_t *zio) zio_destroy(zio); } - return (ZIO_PIPELINE_STOP); + return (next_to_execute); } /* From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:31:44 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CB36F10A3000; Wed, 3 Oct 2018 15:31:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7E4AB818D2; Wed, 3 Oct 2018 15:31:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7905318D7B; Wed, 3 Oct 2018 15:31:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FVimu055551; Wed, 3 Oct 2018 15:31:44 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FVi0c055550; Wed, 3 Oct 2018 15:31:44 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031531.w93FVi0c055550@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:31:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339148 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339148 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:31:45 -0000 Author: mav Date: Wed Oct 3 15:31:44 2018 New Revision: 339148 URL: https://svnweb.freebsd.org/changeset/base/339148 Log: MFC r337870: Fix mismerge in r337196. ZoL did the same mistake, and fixed it with separate commit 863522b1f9: dsl_scan_scrub_cb: don't double-account non-embedded blocks We were doing count_block() twice inside this function, once unconditionally at the beginning (intended to catch the embedded block case) and once near the end after processing the block. The double-accounting caused the "zpool scrub" progress statistics in "zpool status" to climb from 0% to 200% instead of 0% to 100%, and showed double the I/O rate it was actually seeing. This was apparently a regression introduced in commit 00c405b4b5e8, which was an incorrect port of this OpenZFS commit: https://github.com/openzfs/openzfs/commit/d8a447a7 Reviewed by: Thomas Caputi Reviewed by: Matt Ahrens Reviewed-by: Brian Behlendorf Reviewed-by: George Melikov Signed-off-by: Steven Noonan Closes #7720 Closes #7738 Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 14:59:39 2018 (r339147) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Oct 3 15:31:44 2018 (r339148) @@ -3551,12 +3551,12 @@ dsl_scan_scrub_cb(dsl_pool_t *dp, boolean_t needs_io; int zio_flags = ZIO_FLAG_SCAN_THREAD | ZIO_FLAG_RAW | ZIO_FLAG_CANFAIL; int d; - - count_block(scn, dp->dp_blkstats, bp); if (phys_birth <= scn->scn_phys.scn_min_txg || - phys_birth >= scn->scn_phys.scn_max_txg) + phys_birth >= scn->scn_phys.scn_max_txg) { + count_block(scn, dp->dp_blkstats, bp); return (0); + } /* Embedded BP's have phys_birth==0, so we reject them above. */ ASSERT(!BP_IS_EMBEDDED(bp)); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:32:43 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D89110A30CF; Wed, 3 Oct 2018 15:32:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4126981ADF; Wed, 3 Oct 2018 15:32:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3C39A18DB6; Wed, 3 Oct 2018 15:32:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FWh8n056340; Wed, 3 Oct 2018 15:32:43 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FWgAT056338; Wed, 3 Oct 2018 15:32:42 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031532.w93FWgAT056338@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:32:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339149 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339149 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:32:43 -0000 Author: mav Date: Wed Oct 3 15:32:42 2018 New Revision: 339149 URL: https://svnweb.freebsd.org/changeset/base/339149 Log: MFC r337883: Add couple tunables/sysctl, missed in r336949. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 15:31:44 2018 (r339148) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 15:32:42 2018 (r339149) @@ -441,6 +441,10 @@ SYSCTL_UQUAD(_vfs_zfs, OID_AUTO, spa_min_slop, CTLFLAG int spa_allocators = 4; +SYSCTL_INT(_vfs_zfs, OID_AUTO, spa_allocators, CTLFLAG_RWTUN, + &spa_allocators, 0, + "Number of allocators per metaslab group"); + /*PRINTFLIKE2*/ void spa_load_failed(spa_t *spa, const char *fmt, ...) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Oct 3 15:31:44 2018 (r339148) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Oct 3 15:32:42 2018 (r339149) @@ -267,6 +267,9 @@ SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, write_gap_limit, C SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, queue_depth_pct, CTLFLAG_RWTUN, &zfs_vdev_queue_depth_pct, 0, "Queue depth percentage for each top-level"); +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, def_queue_depth, CTLFLAG_RWTUN, + &zfs_vdev_def_queue_depth, 0, + "Default queue depth for each allocator"); static int sysctl_zfs_async_write_active_min_dirty_percent(SYSCTL_HANDLER_ARGS) From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:33:21 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B649910A31B3; Wed, 3 Oct 2018 15:33:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6A09B81C11; Wed, 3 Oct 2018 15:33:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 64F4118DB8; Wed, 3 Oct 2018 15:33:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FXLHg056442; Wed, 3 Oct 2018 15:33:21 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FXLUZ056441; Wed, 3 Oct 2018 15:33:21 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031533.w93FXLUZ056441@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:33:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339150 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339150 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:33:21 -0000 Author: mav Date: Wed Oct 3 15:33:20 2018 New Revision: 339150 URL: https://svnweb.freebsd.org/changeset/base/339150 Log: MFC r337923: Make vfs.zfs.zio.dva_throttle_enabled sysctl writable. Not sure what I thought originally, but as I see now runtime changes are working fine, and the code seems like even designed for this. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 15:32:42 2018 (r339149) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Oct 3 15:33:20 2018 (r339150) @@ -83,8 +83,8 @@ const char *zio_type_name[ZIO_TYPES] = { }; boolean_t zio_dva_throttle_enabled = B_TRUE; -SYSCTL_INT(_vfs_zfs_zio, OID_AUTO, dva_throttle_enabled, CTLFLAG_RDTUN, - &zio_dva_throttle_enabled, 0, ""); +SYSCTL_INT(_vfs_zfs_zio, OID_AUTO, dva_throttle_enabled, CTLFLAG_RWTUN, + &zio_dva_throttle_enabled, 0, "Enable allocation throttling"); /* * ========================================================================== From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:34:51 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C57C310A32BE; Wed, 3 Oct 2018 15:34:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 68EED81DD5; Wed, 3 Oct 2018 15:34:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 44B8918DB9; Wed, 3 Oct 2018 15:34:50 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FYoh6056554; Wed, 3 Oct 2018 15:34:50 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FYo8N056553; Wed, 3 Oct 2018 15:34:50 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031534.w93FYo8N056553@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:34:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339151 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339151 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:34:51 -0000 Author: mav Date: Wed Oct 3 15:34:49 2018 New Revision: 339151 URL: https://svnweb.freebsd.org/changeset/base/339151 Log: MFC r337970: 9738 Fix third block copy allocations, broken at 9112. Use METASLAB_WEIGHT_CLAIM weight to allocate tertiary blocks. Previous use of METASLAB_WEIGHT_SECONDARY for that caused errors later on metaslab_activate_allocator() call, leading to massive load of unneeded metaslabs and write freezes. Reviewed by: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 15:33:20 2018 (r339150) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 15:34:49 2018 (r339151) @@ -3094,7 +3094,6 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ metaslab_t *msp = NULL; uint64_t offset = -1ULL; uint64_t activation_weight; - boolean_t tertiary = B_FALSE; activation_weight = METASLAB_WEIGHT_PRIMARY; for (int i = 0; i < d; i++) { @@ -3103,7 +3102,7 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ activation_weight = METASLAB_WEIGHT_SECONDARY; } else if (activation_weight == METASLAB_WEIGHT_SECONDARY && DVA_GET_VDEV(&dva[i]) == mg->mg_vd->vdev_id) { - tertiary = B_TRUE; + activation_weight = METASLAB_WEIGHT_CLAIM; break; } } @@ -3112,10 +3111,8 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ * If we don't have enough metaslabs active to fill the entire array, we * just use the 0th slot. */ - if (mg->mg_ms_ready < mg->mg_allocators * 2) { - tertiary = B_FALSE; + if (mg->mg_ms_ready < mg->mg_allocators * 3) allocator = 0; - } ASSERT3U(mg->mg_vd->vdev_ms_count, >=, 2); @@ -3141,7 +3138,7 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ msp = mg->mg_primaries[allocator]; was_active = B_TRUE; } else if (activation_weight == METASLAB_WEIGHT_SECONDARY && - mg->mg_secondaries[allocator] != NULL && !tertiary) { + mg->mg_secondaries[allocator] != NULL) { msp = mg->mg_secondaries[allocator]; was_active = B_TRUE; } else { @@ -3184,7 +3181,8 @@ metaslab_group_alloc_normal(metaslab_group_t *mg, zio_ continue; } - if (msp->ms_weight & METASLAB_WEIGHT_CLAIM) { + if (msp->ms_weight & METASLAB_WEIGHT_CLAIM && + activation_weight != METASLAB_WEIGHT_CLAIM) { metaslab_passivate(msp, msp->ms_weight & ~METASLAB_WEIGHT_CLAIM); mutex_exit(&msp->ms_lock); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:35:28 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8BA3710A3341; Wed, 3 Oct 2018 15:35:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3C4D981F21; Wed, 3 Oct 2018 15:35:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1D29618DBA; Wed, 3 Oct 2018 15:35:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FZRTQ056648; Wed, 3 Oct 2018 15:35:27 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FZR7e056647; Wed, 3 Oct 2018 15:35:27 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031535.w93FZR7e056647@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:35:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339152 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339152 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:35:28 -0000 Author: mav Date: Wed Oct 3 15:35:27 2018 New Revision: 339152 URL: https://svnweb.freebsd.org/changeset/base/339152 Log: MFC r337972: 9751 Allocation throttling misplacing ditto blocks Relax allocation throttling for ditto blocks. Due to random imbalances in allocation it tends to push block copies to one vdev, that looks slightly better at the moment. Slightly less strict policy allows both improve data security and surprisingly write performance, since we don't need to touch extra metaslabs on each vdev to respect the min distance. Sponsored by: iXsystems, Inc. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 15:34:49 2018 (r339151) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 15:35:27 2018 (r339152) @@ -1090,7 +1090,7 @@ metaslab_group_fragmentation(metaslab_group_t *mg) */ static boolean_t metaslab_group_allocatable(metaslab_group_t *mg, metaslab_group_t *rotor, - uint64_t psize, int allocator) + uint64_t psize, int allocator, int d) { spa_t *spa = mg->mg_vd->vdev_spa; metaslab_class_t *mc = mg->mg_class; @@ -1131,6 +1131,13 @@ metaslab_group_allocatable(metaslab_group_t *mg, metas if (mg->mg_no_free_space) return (B_FALSE); + /* + * Relax allocation throttling for ditto blocks. Due to + * random imbalances in allocation it tends to push copies + * to one vdev, that looks a bit better at the moment. + */ + qmax = qmax * (4 + d) / 4; + qdepth = refcount_count(&mg->mg_alloc_queue_depth[allocator]); /* @@ -1151,7 +1158,7 @@ metaslab_group_allocatable(metaslab_group_t *mg, metas */ for (mgp = mg->mg_next; mgp != rotor; mgp = mgp->mg_next) { qmax = mgp->mg_cur_max_alloc_queue_depth[allocator]; - + qmax = qmax * (4 + d) / 4; qdepth = refcount_count( &mgp->mg_alloc_queue_depth[allocator]); @@ -3437,7 +3444,7 @@ top: */ if (allocatable && !GANG_ALLOCATION(flags) && !try_hard) { allocatable = metaslab_group_allocatable(mg, rotor, - psize, allocator); + psize, allocator, d); } if (!allocatable) { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 15:36:37 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F8F410A3453; Wed, 3 Oct 2018 15:36:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 135FA820AA; Wed, 3 Oct 2018 15:36:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0DCF218DBB; Wed, 3 Oct 2018 15:36:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93FaaYx056752; Wed, 3 Oct 2018 15:36:36 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93FaaMH056751; Wed, 3 Oct 2018 15:36:36 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031536.w93FaaMH056751@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 15:36:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339153 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339153 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 15:36:37 -0000 Author: mav Date: Wed Oct 3 15:36:36 2018 New Revision: 339153 URL: https://svnweb.freebsd.org/changeset/base/339153 Log: MFC r338869: MFV r338866: 9700 ZFS resilvered mirror does not balance reads illumos/illumos-gate@82f63c3c2bf5e4378706e8dcfccf717d67371be9 Reviewed by: Toomas Soome Reviewed by: Sanjay Nadkarni Reviewed by: George Wilson Approved by: Matthew Ahrens Author: Jerry Jelinek Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 15:35:27 2018 (r339152) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 15:36:36 2018 (r339153) @@ -28,7 +28,7 @@ * Copyright 2013 Saso Kiselkov. All rights reserved. * Copyright (c) 2014 Integros [integros.com] * Copyright 2016 Toomas Soome - * Copyright 2017 Joyent, Inc. + * Copyright 2018 Joyent, Inc. * Copyright (c) 2017 Datto Inc. * Copyright 2018 OmniOS Community Edition (OmniOSce) Association. */ @@ -6871,6 +6871,7 @@ spa_vdev_resilver_done_hunt(vdev_t *vd) /* * Check for a completed resilver with the 'unspare' flag set. + * Also potentially update faulted state. */ if (vd->vdev_ops == &vdev_spare_ops) { vdev_t *first = vd->vdev_child[0]; @@ -6891,6 +6892,8 @@ spa_vdev_resilver_done_hunt(vdev_t *vd) vdev_dtl_empty(newvd, DTL_OUTAGE) && !vdev_dtl_required(oldvd)) return (oldvd); + + vdev_propagate_state(vd); /* * If there are more than two spares attached to a disk, From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:10:37 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 41F9B10A6F04; Wed, 3 Oct 2018 17:10:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E71F285974; Wed, 3 Oct 2018 17:10:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E1D0219C99; Wed, 3 Oct 2018 17:10:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HAajv005044; Wed, 3 Oct 2018 17:10:36 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HAWHb005019; Wed, 3 Oct 2018 17:10:32 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810031710.w93HAWHb005019@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 17:10:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339158 - in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzpool/common/sys sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opens... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzpool/common/sys sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs/... X-SVN-Commit-Revision: 339158 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:10:37 -0000 Author: mav Date: Wed Oct 3 17:10:32 2018 New Revision: 339158 URL: https://svnweb.freebsd.org/changeset/base/339158 Log: MFC r337567 (by mmacy): Performance optimization of AVL tree comparator functions MFV: commit ee36c709c3d5f7040e1bd11f5c75318aa03e789f Author: Gvozden Neskovic Date: Sat Aug 27 20:12:53 2016 +0200 perf: 2.75x faster ddt_entry_compare() First 256bits of ddt_key_t is a block checksum, which are expected to be close to random data. Hence, on average, comparison only needs to look at first few bytes of the keys. To reduce number of conditional jump instructions, the result is computed as: sign(memcmp(k1, k2)). Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} , which is computed efficiently. Synthetic performance evaluation of original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R) CPU E5-2660 v3: old 6.85789 s new 2.49089 s perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare() Compute the result directly instead of using conditionals perf: zfs_range_compare() Speedup between 1.1x - 2.5x, depending on compiler version and optimization level. perf: spa_error_entry_compare() `bcmp()` is not suitable for comparator use. Use `memcmp()` instead. perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare() perf: 2.8x faster zil_bp_compare() perf: 2.8x faster mze_compare() perf: faster dbuf_compare() perf: faster compares in spa_misc perf: 2.8x faster layout_hash_compare() perf: 2.8x faster space_reftree_compare() perf: libzfs: faster avl tree comparators perf: guid_compare() perf: dsl_deadlist_compare() perf: perm_set_compare() perf: 2x faster range_tree_seg_compare() perf: faster unique_compare() perf: faster vdev_cache _compare() perf: faster vdev_uberblock_compare() perf: faster fuid _compare() perf: faster zfs_znode_hold_compare() Signed-off-by: Gvozden Neskovic Signed-off-by: Richard Elling Signed-off-by: Brian Behlendorf Closes #5033 Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_iter.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/avl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Oct 3 17:10:32 2018 (r339158) @@ -787,15 +787,13 @@ typedef struct mnttab_node { static int libzfs_mnttab_cache_compare(const void *arg1, const void *arg2) { - const mnttab_node_t *mtn1 = arg1; - const mnttab_node_t *mtn2 = arg2; + const mnttab_node_t *mtn1 = (const mnttab_node_t *)arg1; + const mnttab_node_t *mtn2 = (const mnttab_node_t *)arg2; int rv; rv = strcmp(mtn1->mtn_mt.mnt_special, mtn2->mtn_mt.mnt_special); - if (rv == 0) - return (0); - return (rv > 0 ? 1 : -1); + return (AVL_ISIGN(rv)); } void Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_iter.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_iter.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_iter.c Wed Oct 3 17:10:32 2018 (r339158) @@ -272,12 +272,7 @@ zfs_snapshot_compare(const void *larg, const void *rar lcreate = zfs_prop_get_int(l, ZFS_PROP_CREATETXG); rcreate = zfs_prop_get_int(r, ZFS_PROP_CREATETXG); - if (lcreate < rcreate) - return (-1); - else if (lcreate > rcreate) - return (+1); - else - return (0); + return (AVL_CMP(lcreate, rcreate)); } int Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Oct 3 17:10:32 2018 (r339158) @@ -489,15 +489,10 @@ typedef struct fsavl_node { static int fsavl_compare(const void *arg1, const void *arg2) { - const fsavl_node_t *fn1 = arg1; - const fsavl_node_t *fn2 = arg2; + const fsavl_node_t *fn1 = (const fsavl_node_t *)arg1; + const fsavl_node_t *fn2 = (const fsavl_node_t *)arg2; - if (fn1->fn_guid > fn2->fn_guid) - return (+1); - else if (fn1->fn_guid < fn2->fn_guid) - return (-1); - else - return (0); + return (AVL_CMP(fn1->fn_guid, fn2->fn_guid)); } /* Modified: stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Wed Oct 3 17:10:32 2018 (r339158) @@ -670,6 +670,9 @@ extern zoneid_t getzoneid(void); #define root_mount_wait() do { } while (0) #define root_mounted() (1) +#define noinline __attribute__((noinline)) +#define likely(x) __builtin_expect((x), 1) + struct file { void *dummy; }; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Wed Oct 3 17:10:32 2018 (r339158) @@ -1906,14 +1906,10 @@ typedef struct guid_map_entry { static int guid_compare(const void *arg1, const void *arg2) { - const guid_map_entry_t *gmep1 = arg1; - const guid_map_entry_t *gmep2 = arg2; + const guid_map_entry_t *gmep1 = (const guid_map_entry_t *)arg1; + const guid_map_entry_t *gmep2 = (const guid_map_entry_t *)arg2; - if (gmep1->guid < gmep2->guid) - return (-1); - else if (gmep1->guid > gmep2->guid) - return (1); - return (0); + return (AVL_CMP(gmep1->guid, gmep2->guid)); } static void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Oct 3 17:10:32 2018 (r339158) @@ -78,19 +78,13 @@ dbuf_compare(const void *x1, const void *x2) const dmu_buf_impl_t *d1 = x1; const dmu_buf_impl_t *d2 = x2; - if (d1->db_level < d2->db_level) { - return (-1); - } - if (d1->db_level > d2->db_level) { - return (1); - } + int cmp = AVL_CMP(d1->db_level, d2->db_level); + if (likely(cmp)) + return (cmp); - if (d1->db_blkid < d2->db_blkid) { - return (-1); - } - if (d1->db_blkid > d2->db_blkid) { - return (1); - } + cmp = AVL_CMP(d1->db_blkid, d2->db_blkid); + if (likely(cmp)) + return (cmp); if (d1->db_state == DB_SEARCH) { ASSERT3S(d2->db_state, !=, DB_SEARCH); @@ -100,13 +94,7 @@ dbuf_compare(const void *x1, const void *x2) return (1); } - if ((uintptr_t)d1 < (uintptr_t)d2) { - return (-1); - } - if ((uintptr_t)d1 > (uintptr_t)d2) { - return (1); - } - return (0); + return (AVL_PCMP(d1, d2)); } /* ARGSUSED */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Oct 3 17:10:32 2018 (r339158) @@ -55,15 +55,10 @@ static int dsl_deadlist_compare(const void *arg1, const void *arg2) { - const dsl_deadlist_entry_t *dle1 = arg1; - const dsl_deadlist_entry_t *dle2 = arg2; + const dsl_deadlist_entry_t *dle1 = (const dsl_deadlist_entry_t *)arg1; + const dsl_deadlist_entry_t *dle2 = (const dsl_deadlist_entry_t *)arg2; - if (dle1->dle_mintxg < dle2->dle_mintxg) - return (-1); - else if (dle1->dle_mintxg > dle2->dle_mintxg) - return (+1); - else - return (0); + return (AVL_CMP(dle1->dle_mintxg, dle2->dle_mintxg)); } static void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c Wed Oct 3 17:10:32 2018 (r339158) @@ -384,14 +384,13 @@ typedef struct perm_set { static int perm_set_compare(const void *arg1, const void *arg2) { - const perm_set_t *node1 = arg1; - const perm_set_t *node2 = arg2; + const perm_set_t *node1 = (const perm_set_t *)arg1; + const perm_set_t *node2 = (const perm_set_t *)arg2; int val; val = strcmp(node1->p_setname, node2->p_setname); - if (val == 0) - return (0); - return (val > 0 ? 1 : -1); + + return (AVL_ISIGN(val)); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Oct 3 17:10:32 2018 (r339158) @@ -541,8 +541,8 @@ metaslab_class_expandable_space(metaslab_class_t *mc) static int metaslab_compare(const void *x1, const void *x2) { - const metaslab_t *m1 = x1; - const metaslab_t *m2 = x2; + const metaslab_t *m1 = (const metaslab_t *)x1; + const metaslab_t *m2 = (const metaslab_t *)x2; int sort1 = 0; int sort2 = 0; @@ -568,22 +568,13 @@ metaslab_compare(const void *x1, const void *x2) if (sort1 > sort2) return (1); - if (m1->ms_weight < m2->ms_weight) - return (1); - if (m1->ms_weight > m2->ms_weight) - return (-1); + int cmp = AVL_CMP(m2->ms_weight, m1->ms_weight); + if (likely(cmp)) + return (cmp); - /* - * If the weights are identical, use the offset to force uniqueness. - */ - if (m1->ms_start < m2->ms_start) - return (-1); - if (m1->ms_start > m2->ms_start) - return (1); + IMPLY(AVL_CMP(m1->ms_start, m2->ms_start) == 0, m1 == m2); - ASSERT3P(m1, ==, m2); - - return (0); + return (AVL_CMP(m1->ms_start, m2->ms_start)); } /* @@ -1202,18 +1193,14 @@ metaslab_rangesize_compare(const void *x1, const void uint64_t rs_size1 = r1->rs_end - r1->rs_start; uint64_t rs_size2 = r2->rs_end - r2->rs_start; - if (rs_size1 < rs_size2) - return (-1); - if (rs_size1 > rs_size2) - return (1); + int cmp = AVL_CMP(rs_size1, rs_size2); + if (likely(cmp)) + return (cmp); if (r1->rs_start < r2->rs_start) return (-1); - if (r1->rs_start > r2->rs_start) - return (1); - - return (0); + return (AVL_CMP(r1->rs_start, r2->rs_start)); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Oct 3 17:10:32 2018 (r339158) @@ -242,31 +242,23 @@ sa_cache_fini(void) static int layout_num_compare(const void *arg1, const void *arg2) { - const sa_lot_t *node1 = arg1; - const sa_lot_t *node2 = arg2; + const sa_lot_t *node1 = (const sa_lot_t *)arg1; + const sa_lot_t *node2 = (const sa_lot_t *)arg2; - if (node1->lot_num > node2->lot_num) - return (1); - else if (node1->lot_num < node2->lot_num) - return (-1); - return (0); + return (AVL_CMP(node1->lot_num, node2->lot_num)); } static int layout_hash_compare(const void *arg1, const void *arg2) { - const sa_lot_t *node1 = arg1; - const sa_lot_t *node2 = arg2; + const sa_lot_t *node1 = (const sa_lot_t *)arg1; + const sa_lot_t *node2 = (const sa_lot_t *)arg2; - if (node1->lot_hash > node2->lot_hash) - return (1); - if (node1->lot_hash < node2->lot_hash) - return (-1); - if (node1->lot_instance > node2->lot_instance) - return (1); - if (node1->lot_instance < node2->lot_instance) - return (-1); - return (0); + int cmp = AVL_CMP(node1->lot_hash, node2->lot_hash); + if (likely(cmp)) + return (cmp); + + return (AVL_CMP(node1->lot_instance, node2->lot_instance)); } boolean_t Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Oct 3 17:10:32 2018 (r339158) @@ -905,19 +905,14 @@ spa_change_guid(spa_t *spa) static int spa_error_entry_compare(const void *a, const void *b) { - spa_error_entry_t *sa = (spa_error_entry_t *)a; - spa_error_entry_t *sb = (spa_error_entry_t *)b; + const spa_error_entry_t *sa = (const spa_error_entry_t *)a; + const spa_error_entry_t *sb = (const spa_error_entry_t *)b; int ret; - ret = bcmp(&sa->se_bookmark, &sb->se_bookmark, + ret = memcmp(&sa->se_bookmark, &sb->se_bookmark, sizeof (zbookmark_phys_t)); - if (ret < 0) - return (-1); - else if (ret > 0) - return (1); - else - return (0); + return (AVL_ISIGN(ret)); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Oct 3 17:10:32 2018 (r339158) @@ -1024,18 +1024,13 @@ typedef struct spa_aux { int aux_count; } spa_aux_t; -static int +static inline int spa_aux_compare(const void *a, const void *b) { - const spa_aux_t *sa = a; - const spa_aux_t *sb = b; + const spa_aux_t *sa = (const spa_aux_t *)a; + const spa_aux_t *sb = (const spa_aux_t *)b; - if (sa->aux_guid < sb->aux_guid) - return (-1); - else if (sa->aux_guid > sb->aux_guid) - return (1); - else - return (0); + return (AVL_CMP(sa->aux_guid, sb->aux_guid)); } void @@ -2061,11 +2056,8 @@ spa_name_compare(const void *a1, const void *a2) int s; s = strcmp(s1->spa_name, s2->spa_name); - if (s > 0) - return (1); - if (s < 0) - return (-1); - return (0); + + return (AVL_ISIGN(s)); } int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c Wed Oct 3 17:10:32 2018 (r339158) @@ -54,20 +54,14 @@ static int space_reftree_compare(const void *x1, const void *x2) { - const space_ref_t *sr1 = x1; - const space_ref_t *sr2 = x2; + const space_ref_t *sr1 = (const space_ref_t *)x1; + const space_ref_t *sr2 = (const space_ref_t *)x2; - if (sr1->sr_offset < sr2->sr_offset) - return (-1); - if (sr1->sr_offset > sr2->sr_offset) - return (1); + int cmp = AVL_CMP(sr1->sr_offset, sr2->sr_offset); + if (likely(cmp)) + return (cmp); - if (sr1 < sr2) - return (-1); - if (sr1 > sr2) - return (1); - - return (0); + return (AVL_PCMP(sr1, sr2)); } void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h Wed Oct 3 17:10:32 2018 (r339158) @@ -139,4 +139,7 @@ extern struct mtx zfs_debug_mtx; #define sys_shutdown rebooting +#define noinline __attribute__((noinline)) +#define likely(x) __builtin_expect((x), 1) + #endif /* _SYS_ZFS_CONTEXT_H */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c Wed Oct 3 17:10:32 2018 (r339158) @@ -42,14 +42,10 @@ typedef struct unique { static int unique_compare(const void *a, const void *b) { - const unique_t *una = a; - const unique_t *unb = b; + const unique_t *una = (const unique_t *)a; + const unique_t *unb = (const unique_t *)b; - if (una->un_value < unb->un_value) - return (-1); - if (una->un_value > unb->un_value) - return (+1); - return (0); + return (AVL_CMP(una->un_value, unb->un_value)); } void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Wed Oct 3 17:10:32 2018 (r339158) @@ -1047,19 +1047,13 @@ retry: * among uberblocks with equal txg, choose the one with the latest timestamp. */ static int -vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +vdev_uberblock_compare(const uberblock_t *ub1, const uberblock_t *ub2) { - if (ub1->ub_txg < ub2->ub_txg) - return (-1); - if (ub1->ub_txg > ub2->ub_txg) - return (1); + int cmp = AVL_CMP(ub1->ub_txg, ub2->ub_txg); + if (likely(cmp)) + return (cmp); - if (ub1->ub_timestamp < ub2->ub_timestamp) - return (-1); - if (ub1->ub_timestamp > ub2->ub_timestamp) - return (1); - - return (0); + return (AVL_CMP(ub1->ub_timestamp, ub2->ub_timestamp)); } struct ubl_cbdata { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Oct 3 17:10:32 2018 (r339158) @@ -314,20 +314,15 @@ sysctl_zfs_async_write_active_max_dirty_percent(SYSCTL int vdev_queue_offset_compare(const void *x1, const void *x2) { - const zio_t *z1 = x1; - const zio_t *z2 = x2; + const zio_t *z1 = (const zio_t *)x1; + const zio_t *z2 = (const zio_t *)x2; - if (z1->io_offset < z2->io_offset) - return (-1); - if (z1->io_offset > z2->io_offset) - return (1); + int cmp = AVL_CMP(z1->io_offset, z2->io_offset); - if (z1 < z2) - return (-1); - if (z1 > z2) - return (1); + if (likely(cmp)) + return (cmp); - return (0); + return (AVL_PCMP(z1, z2)); } static inline avl_tree_t * Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Oct 3 17:10:32 2018 (r339158) @@ -281,15 +281,11 @@ mze_compare(const void *arg1, const void *arg2) const mzap_ent_t *mze1 = arg1; const mzap_ent_t *mze2 = arg2; - if (mze1->mze_hash > mze2->mze_hash) - return (+1); - if (mze1->mze_hash < mze2->mze_hash) - return (-1); - if (mze1->mze_cd > mze2->mze_cd) - return (+1); - if (mze1->mze_cd < mze2->mze_cd) - return (-1); - return (0); + int cmp = AVL_CMP(mze1->mze_hash, mze2->mze_hash); + if (likely(cmp)) + return (cmp); + + return (AVL_CMP(mze1->mze_cd, mze2->mze_cd)); } static int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c Wed Oct 3 17:10:32 2018 (r339158) @@ -71,14 +71,10 @@ static char *nulldomain = ""; static int idx_compare(const void *arg1, const void *arg2) { - const fuid_domain_t *node1 = arg1; - const fuid_domain_t *node2 = arg2; + const fuid_domain_t *node1 = (const fuid_domain_t *)arg1; + const fuid_domain_t *node2 = (const fuid_domain_t *)arg2; - if (node1->f_idx < node2->f_idx) - return (-1); - else if (node1->f_idx > node2->f_idx) - return (1); - return (0); + return (AVL_CMP(node1->f_idx, node2->f_idx)); } /* @@ -87,14 +83,13 @@ idx_compare(const void *arg1, const void *arg2) static int domain_compare(const void *arg1, const void *arg2) { - const fuid_domain_t *node1 = arg1; - const fuid_domain_t *node2 = arg2; + const fuid_domain_t *node1 = (const fuid_domain_t *)arg1; + const fuid_domain_t *node2 = (const fuid_domain_t *)arg2; int val; val = strcmp(node1->f_ksid->kd_name, node2->f_ksid->kd_name); - if (val == 0) - return (0); - return (val > 0 ? 1 : -1); + + return (AVL_ISIGN(val)); } void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c Wed Oct 3 17:10:32 2018 (r339158) @@ -594,12 +594,8 @@ zfs_range_reduce(rl_t *rl, uint64_t off, uint64_t len) int zfs_range_compare(const void *arg1, const void *arg2) { - const rl_t *rl1 = arg1; - const rl_t *rl2 = arg2; + const rl_t *rl1 = (const rl_t *)arg1; + const rl_t *rl2 = (const rl_t *)arg2; - if (rl1->r_off > rl2->r_off) - return (1); - if (rl1->r_off < rl2->r_off) - return (-1); - return (0); + return (AVL_CMP(rl1->r_off, rl2->r_off)); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Oct 3 17:10:32 2018 (r339158) @@ -131,17 +131,11 @@ zil_bp_compare(const void *x1, const void *x2) const dva_t *dva1 = &((zil_bp_node_t *)x1)->zn_dva; const dva_t *dva2 = &((zil_bp_node_t *)x2)->zn_dva; - if (DVA_GET_VDEV(dva1) < DVA_GET_VDEV(dva2)) - return (-1); - if (DVA_GET_VDEV(dva1) > DVA_GET_VDEV(dva2)) - return (1); + int cmp = AVL_CMP(DVA_GET_VDEV(dva1), DVA_GET_VDEV(dva2)); + if (likely(cmp)) + return (cmp); - if (DVA_GET_OFFSET(dva1) < DVA_GET_OFFSET(dva2)) - return (-1); - if (DVA_GET_OFFSET(dva1) > DVA_GET_OFFSET(dva2)) - return (1); - - return (0); + return (AVL_CMP(DVA_GET_OFFSET(dva1), DVA_GET_OFFSET(dva2))); } static void @@ -503,12 +497,7 @@ zil_lwb_vdev_compare(const void *x1, const void *x2) const uint64_t v1 = ((zil_vdev_node_t *)x1)->zv_vdev; const uint64_t v2 = ((zil_vdev_node_t *)x2)->zv_vdev; - if (v1 < v2) - return (-1); - if (v1 > v2) - return (1); - - return (0); + return (AVL_CMP(v1, v2)); } static lwb_t * @@ -1626,12 +1615,7 @@ zil_aitx_compare(const void *x1, const void *x2) const uint64_t o1 = ((itx_async_node_t *)x1)->ia_foid; const uint64_t o2 = ((itx_async_node_t *)x2)->ia_foid; - if (o1 < o2) - return (-1); - if (o1 > o2) - return (1); - - return (0); + return (AVL_CMP(o1, o2)); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/avl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/avl.h Wed Oct 3 16:38:36 2018 (r339157) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/sys/avl.h Wed Oct 3 17:10:32 2018 (r339158) @@ -107,6 +107,14 @@ extern "C" { /* + * AVL comparator helpers + */ +#define AVL_ISIGN(a) (((a) > 0) - ((a) < 0)) +#define AVL_CMP(a, b) (((a) > (b)) - ((a) < (b))) +#define AVL_PCMP(a, b) \ + (((uintptr_t)(a) > (uintptr_t)(b)) - ((uintptr_t)(a) < (uintptr_t)(b))) + +/* * Type used for the root of the AVL tree. */ typedef struct avl_tree avl_tree_t; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:14:42 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3900910A71F1; Wed, 3 Oct 2018 17:14:42 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D061385D7A; Wed, 3 Oct 2018 17:14:41 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C1D3D19E24; Wed, 3 Oct 2018 17:14:41 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HEfVQ009718; Wed, 3 Oct 2018 17:14:41 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HEeTn009710; Wed, 3 Oct 2018 17:14:40 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031714.w93HEeTn009710@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:14:40 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339159 - stable/11/usr.bin/dtc X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: stable/11/usr.bin/dtc X-SVN-Commit-Revision: 339159 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:14:42 -0000 Author: kevans Date: Wed Oct 3 17:14:40 2018 New Revision: 339159 URL: https://svnweb.freebsd.org/changeset/base/339159 Log: MFC r337964, r338232: dtc(1) updates r337964: dtc(1): Update to 97d2d5715eeb45108cc60367fdf6bd5b2046b050 Notable fixes: - Overlays may now be generated properly without -@ - /__local_fixups__ were not including unit address in their structure - The error reporting a magic token was misleading, reporting "Bad magic token in header. Got d00dfeed expected 0xd00dfeed" if the token was missing. This has been split out into a separate message. r338232: dtc(1): Update to 0892ec7; HACKING and implicit header fixes Fixes courtesy of arichardson and jmg: - HACKING was pointing to the wrong place - Added headers were being relied on implicitly, but libstdc++ did not comply with the unspoken wishes of dtc. Modified: stable/11/usr.bin/dtc/HACKING stable/11/usr.bin/dtc/dtb.cc stable/11/usr.bin/dtc/fdt.cc stable/11/usr.bin/dtc/string.cc stable/11/usr.bin/dtc/util.hh Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/dtc/HACKING ============================================================================== --- stable/11/usr.bin/dtc/HACKING Wed Oct 3 17:10:32 2018 (r339158) +++ stable/11/usr.bin/dtc/HACKING Wed Oct 3 17:14:40 2018 (r339159) @@ -8,9 +8,9 @@ This file contains some notes for people wishing to ha Upstreaming ----------- -This code is developed in the FreeBSD svn repository: +This code is developed in the git repository: -https://svn.freebsd.org/base/head/usr.bin/dtc +https://github.com/davidchisnall/dtc If you got the source from anywhere else and wish to make changes, please ensure that you are working against the latest version, or you may end up Modified: stable/11/usr.bin/dtc/dtb.cc ============================================================================== --- stable/11/usr.bin/dtc/dtb.cc Wed Oct 3 17:10:32 2018 (r339158) +++ stable/11/usr.bin/dtc/dtb.cc Wed Oct 3 17:14:40 2018 (r339159) @@ -262,9 +262,14 @@ header::write(output_writer &out) bool header::read_dtb(input_buffer &input) { - if (!(input.consume_binary(magic) && magic == 0xd00dfeed)) + if (!input.consume_binary(magic)) { - fprintf(stderr, "Missing magic token in header. Got %" PRIx32 + fprintf(stderr, "Missing magic token in header."); + return false; + } + if (magic != 0xd00dfeed) + { + fprintf(stderr, "Bad magic token in header. Got %" PRIx32 " expected 0xd00dfeed\n", magic); return false; } Modified: stable/11/usr.bin/dtc/fdt.cc ============================================================================== --- stable/11/usr.bin/dtc/fdt.cc Wed Oct 3 17:10:32 2018 (r339158) +++ stable/11/usr.bin/dtc/fdt.cc Wed Oct 3 17:14:40 2018 (r339159) @@ -1337,7 +1337,7 @@ device_tree::resolve_cross_references(uint32_t &phandl phandle_set.insert({&i.val, i}); } std::vector> sorted_phandles; - root->visit([&](node &n, node *parent) { + root->visit([&](node &n, node *) { for (auto &p : n.properties()) { for (auto &v : *p) @@ -1917,116 +1917,121 @@ device_tree::parse_dts(const string &fn, FILE *depfile symbols.push_back(prop); } root->add_child(node::create_special_node("__symbols__", symbols)); - // If this is a plugin, then we also need to create two extra nodes. - // Internal phandles will need to be renumbered to avoid conflicts with - // already-loaded nodes and external references will need to be - // resolved. - if (is_plugin) + } + // If this is a plugin, then we also need to create two extra nodes. + // Internal phandles will need to be renumbered to avoid conflicts with + // already-loaded nodes and external references will need to be + // resolved. + if (is_plugin) + { + std::vector symbols; + // Create the fixups entry. This is of the form: + // {target} = {path}:{property name}:{offset} + auto create_fixup_entry = [&](fixup &i, string target) + { + string value = i.path.to_string(); + value += ':'; + value += i.prop->get_key(); + value += ':'; + value += std::to_string(i.prop->offset_of_value(i.val)); + property_value v; + v.string_data = value; + v.type = property_value::STRING; + auto prop = std::make_shared(std::move(target)); + prop->add_value(v); + return prop; + }; + // If we have any unresolved phandle references in this plugin, + // then we must update them to 0xdeadbeef and leave a property in + // the /__fixups__ node whose key is the label and whose value is + // as described above. + if (!unresolved_fixups.empty()) { - // Create the fixups entry. This is of the form: - // {target} = {path}:{property name}:{offset} - auto create_fixup_entry = [&](fixup &i, string target) - { - string value = i.path.to_string(); - value += ':'; - value += i.prop->get_key(); - value += ':'; - value += std::to_string(i.prop->offset_of_value(i.val)); - property_value v; - v.string_data = value; - v.type = property_value::STRING; - auto prop = std::make_shared(std::move(target)); - prop->add_value(v); - return prop; - }; - // If we have any unresolved phandle references in this plugin, - // then we must update them to 0xdeadbeef and leave a property in - // the /__fixups__ node whose key is the label and whose value is - // as described above. - if (!unresolved_fixups.empty()) + for (auto &i : unresolved_fixups) { - symbols.clear(); - for (auto &i : unresolved_fixups) - { - auto &val = i.get().val; - symbols.push_back(create_fixup_entry(i, val.string_data)); - val.byte_data.push_back(0xde); - val.byte_data.push_back(0xad); - val.byte_data.push_back(0xbe); - val.byte_data.push_back(0xef); - val.type = property_value::BINARY; - } - root->add_child(node::create_special_node("__fixups__", symbols)); + auto &val = i.get().val; + symbols.push_back(create_fixup_entry(i, val.string_data)); + val.byte_data.push_back(0xde); + val.byte_data.push_back(0xad); + val.byte_data.push_back(0xbe); + val.byte_data.push_back(0xef); + val.type = property_value::BINARY; } - symbols.clear(); - // If we have any resolved phandle references in this plugin, then - // we must create a child in the __local_fixups__ node whose path - // matches the node path from the root and whose value contains the - // location of the reference within a property. - - // Create a local_fixups node that is initially empty. - node_ptr local_fixups = node::create_special_node("__local_fixups__", symbols); - for (auto &i : fixups) + root->add_child(node::create_special_node("__fixups__", symbols)); + } + symbols.clear(); + // If we have any resolved phandle references in this plugin, then + // we must create a child in the __local_fixups__ node whose path + // matches the node path from the root and whose value contains the + // location of the reference within a property. + + // Create a local_fixups node that is initially empty. + node_ptr local_fixups = node::create_special_node("__local_fixups__", symbols); + for (auto &i : fixups) + { + if (!i.val.is_phandle()) { - if (!i.val.is_phandle()) + continue; + } + node *n = local_fixups.get(); + for (auto &p : i.path) + { + // Skip the implicit root + if (p.first.empty()) { continue; } - node *n = local_fixups.get(); - for (auto &p : i.path) + bool found = false; + for (auto &c : n->child_nodes()) { - // Skip the implicit root - if (p.first.empty()) + if (c->name == p.first) { - continue; - } - bool found = false; - for (auto &c : n->child_nodes()) - { - if (c->name == p.first) + string path = p.first; + if (!(p.second.empty())) { - n = c.get(); - found = true; - break; + path += '@'; + path += p.second; } - } - if (!found) - { - n->add_child(node::create_special_node(p.first, symbols)); + n->add_child(node::create_special_node(path, symbols)); n = (--n->child_end())->get(); } } - assert(n); - property_value pv; - push_big_endian(pv.byte_data, static_cast(i.prop->offset_of_value(i.val))); - pv.type = property_value::BINARY; - auto key = i.prop->get_key(); - property_ptr prop = n->get_property(key); - // If we don't have an existing property then create one and - // use this property value - if (!prop) + if (!found) { - prop = std::make_shared(std::move(key)); - n->add_property(prop); - prop->add_value(pv); + n->add_child(node::create_special_node(p.first, symbols)); + n = (--n->child_end())->get(); } - else - { - // If we do have an existing property value, try to append - // this value. - property_value &old_val = *(--prop->end()); - if (!old_val.try_to_merge(pv)) - { - prop->add_value(pv); - } - } } - // We've iterated over all fixups, but only emit the - // __local_fixups__ if we found some that were resolved internally. - if (local_fixups->child_begin() != local_fixups->child_end()) + assert(n); + property_value pv; + push_big_endian(pv.byte_data, static_cast(i.prop->offset_of_value(i.val))); + pv.type = property_value::BINARY; + auto key = i.prop->get_key(); + property_ptr prop = n->get_property(key); + // If we don't have an existing property then create one and + // use this property value + if (!prop) { - root->add_child(std::move(local_fixups)); + prop = std::make_shared(std::move(key)); + n->add_property(prop); + prop->add_value(pv); } + else + { + // If we do have an existing property value, try to append + // this value. + property_value &old_val = *(--prop->end()); + if (!old_val.try_to_merge(pv)) + { + prop->add_value(pv); + } + } + } + // We've iterated over all fixups, but only emit the + // __local_fixups__ if we found some that were resolved internally. + if (local_fixups->child_begin() != local_fixups->child_end()) + { + root->add_child(std::move(local_fixups)); } } } Modified: stable/11/usr.bin/dtc/string.cc ============================================================================== --- stable/11/usr.bin/dtc/string.cc Wed Oct 3 17:10:32 2018 (r339158) +++ stable/11/usr.bin/dtc/string.cc Wed Oct 3 17:14:40 2018 (r339159) @@ -36,6 +36,7 @@ #include #include #include +#include #include #include Modified: stable/11/usr.bin/dtc/util.hh ============================================================================== --- stable/11/usr.bin/dtc/util.hh Wed Oct 3 17:10:32 2018 (r339158) +++ stable/11/usr.bin/dtc/util.hh Wed Oct 3 17:14:40 2018 (r339159) @@ -35,6 +35,9 @@ #ifndef _UTIL_HH_ #define _UTIL_HH_ +#include +#include +#include #include // If we aren't using C++11, then just ignore static asserts. From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:16:20 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A9A3810A729E; Wed, 3 Oct 2018 17:16:20 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5D97D85EF3; Wed, 3 Oct 2018 17:16:20 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 5868A19E25; Wed, 3 Oct 2018 17:16:20 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HGK2n009878; Wed, 3 Oct 2018 17:16:20 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HGItj009868; Wed, 3 Oct 2018 17:16:18 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031716.w93HGItj009868@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:16:18 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339160 - in stable/11/usr.bin/diff: . tests X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: in stable/11/usr.bin/diff: . tests X-SVN-Commit-Revision: 339160 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:16:20 -0000 Author: kevans Date: Wed Oct 3 17:16:18 2018 New Revision: 339160 URL: https://svnweb.freebsd.org/changeset/base/339160 Log: MFC r338039: diff(1): Implement -B/--ignore-blank-lines As noted by cem in r338035, coccinelle invokes diff(1) with the -B flag. This was not previously implemented here, so one was forced to create a link for GNU diff to /usr/local/bin/diff Implement the -B flag and add some primitive tests for it. It is implemented in the same fashion that -I is implemented; each chunk's lines are scanned, and if a non-blank line is encountered then the chunk will be output. Otherwise, it's skipped. Added: stable/11/usr.bin/diff/tests/Bflag_C.out - copied unchanged from r338039, head/usr.bin/diff/tests/Bflag_C.out stable/11/usr.bin/diff/tests/Bflag_D.out - copied unchanged from r338039, head/usr.bin/diff/tests/Bflag_D.out stable/11/usr.bin/diff/tests/Bflag_F.out - copied unchanged from r338039, head/usr.bin/diff/tests/Bflag_F.out Modified: stable/11/usr.bin/diff/TODO stable/11/usr.bin/diff/diff.1 stable/11/usr.bin/diff/diff.c stable/11/usr.bin/diff/diff.h stable/11/usr.bin/diff/diffreg.c stable/11/usr.bin/diff/tests/Makefile stable/11/usr.bin/diff/tests/diff_test.sh Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/diff/TODO ============================================================================== --- stable/11/usr.bin/diff/TODO Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/TODO Wed Oct 3 17:16:18 2018 (r339160) @@ -5,7 +5,6 @@ * make a libsdiff and use that directly to avoid duplicating the code to be implemented: ---ignore-blank-lines --horizon-lines --ignore-tab-expansion --line-format Modified: stable/11/usr.bin/diff/diff.1 ============================================================================== --- stable/11/usr.bin/diff/diff.1 Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/diff.1 Wed Oct 3 17:16:18 2018 (r339160) @@ -30,7 +30,7 @@ .\" @(#)diff.1 8.1 (Berkeley) 6/30/93 .\" $FreeBSD$ .\" -.Dd April 20, 2017 +.Dd August 18, 2018 .Dt DIFF 1 .Os .Sh NAME @@ -38,7 +38,7 @@ .Nd differential file and directory comparator .Sh SYNOPSIS .Nm diff -.Op Fl abdipTtw +.Op Fl aBbdipTtw .Oo .Fl c | e | f | .Fl n | q | u @@ -67,7 +67,7 @@ .Op Fl L Ar label | Fl -label Ar label .Ar file1 file2 .Nm diff -.Op Fl abdilpTtw +.Op Fl aBbdilpTtw .Op Fl I Ar pattern | Fl -ignore-matching-lines Ar pattern .Op Fl L Ar label | Fl -label Ar label .Op Fl -brief @@ -93,7 +93,7 @@ .Fl C Ar number | -context Ar number .Ar file1 file2 .Nm diff -.Op Fl abdiltw +.Op Fl aBbdiltw .Op Fl I Ar pattern | Fl -ignore-matching-lines Ar pattern .Op Fl -brief .Op Fl -changed-group-format Ar GFMT @@ -118,7 +118,7 @@ .Fl D Ar string | Fl -ifdef Ar string .Ar file1 file2 .Nm diff -.Op Fl abdilpTtw +.Op Fl aBbdilpTtw .Op Fl I Ar pattern | Fl -ignore-matching-lines Ar pattern .Op Fl L Ar label | Fl -label Ar label .Op Fl -brief @@ -144,7 +144,7 @@ .Fl U Ar number | Fl -unified Ar number .Ar file1 file2 .Nm diff -.Op Fl abdilNPprsTtw +.Op Fl aBbdilNPprsTtw .Oo .Fl c | e | f | .Fl n | q | u @@ -300,6 +300,8 @@ if files contain binary characters. Use of this option forces .Nm to produce a diff. +.It Fl B Fl -ignore-blank-lines +Causes chunks that include only blank lines to be ignored. .It Fl b Causes trailing blanks (spaces and tabs) to be ignored, and other strings of blanks to compare equal. Modified: stable/11/usr.bin/diff/diff.c ============================================================================== --- stable/11/usr.bin/diff/diff.c Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/diff.c Wed Oct 3 17:16:18 2018 (r339160) @@ -66,6 +66,7 @@ static struct option longopts[] = { { "ed", no_argument, 0, 'e' }, { "forward-ed", no_argument, 0, 'f' }, { "speed-large-files", no_argument, NULL, 'H' }, + { "ignore-blank-lines", no_argument, 0, 'B' }, { "ignore-matching-lines", required_argument, 0, 'I' }, { "ignore-case", no_argument, 0, 'i' }, { "paginate", no_argument, NULL, 'l' }, @@ -164,6 +165,9 @@ main(int argc, char **argv) case 'h': /* silently ignore for backwards compatibility */ break; + case 'B': + dflags |= D_SKIPBLANKLINES; + break; case 'I': push_ignore_pats(optarg); break; @@ -447,18 +451,18 @@ void usage(void) { (void)fprintf(stderr, - "usage: diff [-abdilpTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]\n" + "usage: diff [-aBbdilpTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]\n" " [--no-ignore-case] [--normal] [--strip-trailing-cr] [--tabsize]\n" " [-I pattern] [-L label] file1 file2\n" - " diff [-abdilpTtw] [-I pattern] [-L label] [--ignore-case]\n" + " diff [-aBbdilpTtw] [-I pattern] [-L label] [--ignore-case]\n" " [--no-ignore-case] [--normal] [--strip-trailing-cr] [--tabsize]\n" " -C number file1 file2\n" - " diff [-abdiltw] [-I pattern] [--ignore-case] [--no-ignore-case]\n" + " diff [-aBbdiltw] [-I pattern] [--ignore-case] [--no-ignore-case]\n" " [--normal] [--strip-trailing-cr] [--tabsize] -D string file1 file2\n" - " diff [-abdilpTtw] [-I pattern] [-L label] [--ignore-case]\n" + " diff [-aBbdilpTtw] [-I pattern] [-L label] [--ignore-case]\n" " [--no-ignore-case] [--normal] [--tabsize] [--strip-trailing-cr]\n" " -U number file1 file2\n" - " diff [-abdilNPprsTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]\n" + " diff [-aBbdilNPprsTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]\n" " [--no-ignore-case] [--normal] [--tabsize] [-I pattern] [-L label]\n" " [-S name] [-X file] [-x pattern] dir1 dir2\n"); Modified: stable/11/usr.bin/diff/diff.h ============================================================================== --- stable/11/usr.bin/diff/diff.h Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/diff.h Wed Oct 3 17:16:18 2018 (r339160) @@ -67,6 +67,7 @@ #define D_EXPANDTABS 0x100 /* Expand tabs to spaces */ #define D_IGNOREBLANKS 0x200 /* Ignore white space changes */ #define D_STRIPCR 0x400 /* Strip trailing cr */ +#define D_SKIPBLANKLINES 0x800 /* Skip blank lines */ /* * Status values for print_status() and diffreg() return values Modified: stable/11/usr.bin/diff/diffreg.c ============================================================================== --- stable/11/usr.bin/diff/diffreg.c Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/diffreg.c Wed Oct 3 17:16:18 2018 (r339160) @@ -81,6 +81,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include @@ -1071,6 +1072,31 @@ restart: } } return; + } + if (*pflags & D_SKIPBLANKLINES) { + char *line; + /* + * All lines in the change, insert, or delete must not be + * empty for the change to be ignored. + */ + if (a <= b) { /* Changes and deletes. */ + for (i = a; i <= b; i++) { + line = preadline(fileno(f1), + ixold[i] - ixold[i - 1], ixold[i - 1]); + if (*line != '\0') + goto proceed; + } + } + if (a > b || c <= d) { /* Changes and inserts. */ + for (i = c; i <= d; i++) { + line = preadline(fileno(f2), + ixnew[i] - ixnew[i - 1], ixnew[i - 1]); + if (*line != '\0') + goto proceed; + } + } + return; + } proceed: if (*pflags & D_HEADER && diff_format != D_BRIEF) { Copied: stable/11/usr.bin/diff/tests/Bflag_C.out (from r338039, head/usr.bin/diff/tests/Bflag_C.out) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/usr.bin/diff/tests/Bflag_C.out Wed Oct 3 17:16:18 2018 (r339160, copy of r338039, head/usr.bin/diff/tests/Bflag_C.out) @@ -0,0 +1,2 @@ +1a2 +> Copied: stable/11/usr.bin/diff/tests/Bflag_D.out (from r338039, head/usr.bin/diff/tests/Bflag_D.out) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/usr.bin/diff/tests/Bflag_D.out Wed Oct 3 17:16:18 2018 (r339160, copy of r338039, head/usr.bin/diff/tests/Bflag_D.out) @@ -0,0 +1,2 @@ +1a2 +> C Copied: stable/11/usr.bin/diff/tests/Bflag_F.out (from r338039, head/usr.bin/diff/tests/Bflag_F.out) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/usr.bin/diff/tests/Bflag_F.out Wed Oct 3 17:16:18 2018 (r339160, copy of r338039, head/usr.bin/diff/tests/Bflag_F.out) @@ -0,0 +1,4 @@ +7c8 +< G +--- +> X Modified: stable/11/usr.bin/diff/tests/Makefile ============================================================================== --- stable/11/usr.bin/diff/tests/Makefile Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/tests/Makefile Wed Oct 3 17:16:18 2018 (r339160) @@ -5,6 +5,9 @@ PACKAGE= tests ATF_TESTS_SH= diff_test ${PACKAGE}FILES+= \ + Bflag_C.out \ + Bflag_D.out \ + Bflag_F.out \ input1.in \ input2.in \ input_c1.in \ Modified: stable/11/usr.bin/diff/tests/diff_test.sh ============================================================================== --- stable/11/usr.bin/diff/tests/diff_test.sh Wed Oct 3 17:14:40 2018 (r339159) +++ stable/11/usr.bin/diff/tests/diff_test.sh Wed Oct 3 17:16:18 2018 (r339160) @@ -9,6 +9,7 @@ atf_test_case group_format atf_test_case side_by_side atf_test_case brief_format atf_test_case b230049 +atf_test_case Bflag simple_body() { @@ -150,6 +151,21 @@ brief_format_body() diff -Nrq A D } +Bflag_body() +{ + atf_check -x 'printf "A\nB\n" > A' + atf_check -x 'printf "A\n\nB\n" > B' + atf_check -x 'printf "A\n \nB\n" > C' + atf_check -x 'printf "A\nC\nB\n" > D' + atf_check -x 'printf "A\nB\nC\nD\nE\nF\nG\nH" > E' + atf_check -x 'printf "A\n\nB\nC\nD\nE\nF\nX\nH" > F' + + atf_check -s exit:0 -o inline:"" diff -B A B + atf_check -s exit:1 -o file:"$(atf_get_srcdir)/Bflag_C.out" diff -B A C + atf_check -s exit:1 -o file:"$(atf_get_srcdir)/Bflag_D.out" diff -B A D + atf_check -s exit:1 -o file:"$(atf_get_srcdir)/Bflag_F.out" diff -B E F +} + atf_init_test_cases() { atf_add_test_case simple @@ -161,4 +177,5 @@ atf_init_test_cases() atf_add_test_case side_by_side atf_add_test_case brief_format atf_add_test_case b230049 + atf_add_test_case Bflag } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:17:39 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BED4210A733E; Wed, 3 Oct 2018 17:17:39 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6E0A886052; Wed, 3 Oct 2018 17:17:39 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 676E219E26; Wed, 3 Oct 2018 17:17:39 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HHdlv009994; Wed, 3 Oct 2018 17:17:39 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HHdrd009992; Wed, 3 Oct 2018 17:17:39 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031717.w93HHdrd009992@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:17:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339161 - in stable/11/stand: efi/loader fdt X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: in stable/11/stand: efi/loader fdt X-SVN-Commit-Revision: 339161 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:17:40 -0000 Author: kevans Date: Wed Oct 3 17:17:38 2018 New Revision: 339161 URL: https://svnweb.freebsd.org/changeset/base/339161 Log: MFC r338219, r338250: FDT in Loader fixes r338219: fdt_fixups: relocate the /chosen node after applying fixups As indicated by the comment, any fixups applied (which might include overlays) can invalidate the previously located node by adding nodes or setting/adding properties. The later fdt_setprop of fixup-applied property would then fail because of the bad/wrong node offset. This would have generally been harmless, but potentially caused multiple applications of fixups and caused a little bit of bloat. r338250: efiloader: Setup FDT in autoload to fix overlays clobbering kenv manu found in the noted PR that overlays seemed to be clobbering the kenv and killing the boot. Further inspection revealed that one can `fdt ls` at the loader prompt for a successful boot, but autoboot breaks it. In the autoboot case, first setup of FDT is happening in the middle of bi_load, which triggers loading of the DTBO from /boot. This is bad, bad, bad. Files in the loader are loaded somewhere in the middle of the address space one after another. bi_load starts building the needed kernel bootinfo immediately after the highest-addr loaded file. File loads in the middle of bi_load suddenly clobber bootinfo and everything goes off the rails. The solution to this is to use take advantage of arch_autoload to setup FDT in efiloader compiled with LOADER_FDT_SUPPORT. This matches how it works in ubldr land, and is how it should have worked when overlay support was added to efiloader since fdt_setup_fdtp now has the potential to load files (courtesy of fdt_platform_load_dtb). Modified: stable/11/stand/efi/loader/autoload.c stable/11/stand/fdt/fdt_loader_cmd.c Directory Properties: stable/11/ (props changed) Modified: stable/11/stand/efi/loader/autoload.c ============================================================================== --- stable/11/stand/efi/loader/autoload.c Wed Oct 3 17:16:18 2018 (r339160) +++ stable/11/stand/efi/loader/autoload.c Wed Oct 3 17:17:38 2018 (r339161) @@ -27,11 +27,30 @@ #include __FBSDID("$FreeBSD$"); +#if defined(LOADER_FDT_SUPPORT) +#include +#include +#endif + #include "loader_efi.h" int efi_autoload(void) { +#if defined(LOADER_FDT_SUPPORT) + /* + * Setup the FDT early so that we're not loading files during bi_load. + * Any such loading is inherently broken since bi_load uses the space + * just after all currently loaded files for the data that will be + * passed to the kernel and newly loaded files will be positioned in + * that same space. + * + * We're glossing over errors here because LOADER_FDT_SUPPORT does not + * imply that we're on a platform where FDT is a requirement. If we + * fix this, then the error handling here should be fixed accordingly. + */ + fdt_setup_fdtp(); +#endif return (0); } Modified: stable/11/stand/fdt/fdt_loader_cmd.c ============================================================================== --- stable/11/stand/fdt/fdt_loader_cmd.c Wed Oct 3 17:16:18 2018 (r339160) +++ stable/11/stand/fdt/fdt_loader_cmd.c Wed Oct 3 17:17:38 2018 (r339161) @@ -933,6 +933,12 @@ fdt_fixup(void) fdt_platform_fixups(); + /* + * Re-fetch the /chosen subnode; our fixups may apply overlays or add + * nodes/properties that invalidate the offset we grabbed or created + * above, so we can no longer trust it. + */ + chosen = fdt_subnode_offset(fdtp, 0, "chosen"); fdt_setprop(fdtp, chosen, "fixup-applied", NULL, 0); return (1); } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:19:05 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9E8C510A73EE; Wed, 3 Oct 2018 17:19:05 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 550EE861DB; Wed, 3 Oct 2018 17:19:05 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 5011919E28; Wed, 3 Oct 2018 17:19:05 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HJ5kC010120; Wed, 3 Oct 2018 17:19:05 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HJ5CS010119; Wed, 3 Oct 2018 17:19:05 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031719.w93HJ5CS010119@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:19:05 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339162 - stable/11/tools/build/mk X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: stable/11/tools/build/mk X-SVN-Commit-Revision: 339162 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:19:05 -0000 Author: kevans Date: Wed Oct 3 17:19:04 2018 New Revision: 339162 URL: https://svnweb.freebsd.org/changeset/base/339162 Log: MFC r338223, r338263: Missing bits from OptionalObsoleteFiles r338223: Remove ZFS leftovers when WITHOUT_ZFS is set Submitted by: Oliver Pinter Differential Revision: https://reviews.freebsd.org/D16810 r338263: Remove hyper-v leftovers when WITHOUT_HYPERV is set hv_vss_daemon was missed. Modified: stable/11/tools/build/mk/OptionalObsoleteFiles.inc Directory Properties: stable/11/ (props changed) Modified: stable/11/tools/build/mk/OptionalObsoleteFiles.inc ============================================================================== --- stable/11/tools/build/mk/OptionalObsoleteFiles.inc Wed Oct 3 17:17:38 2018 (r339161) +++ stable/11/tools/build/mk/OptionalObsoleteFiles.inc Wed Oct 3 17:19:04 2018 (r339162) @@ -1307,33 +1307,44 @@ OLD_FILES+=boot/gptzfsboot OLD_FILES+=boot/zfsboot OLD_FILES+=boot/zfsloader OLD_FILES+=etc/rc.d/zfs +OLD_FILES+=etc/rc.d/zfsd OLD_FILES+=etc/rc.d/zfsbe OLD_FILES+=etc/rc.d/zvol OLD_FILES+=etc/devd/zfs.conf OLD_FILES+=etc/periodic/daily/404.status-zfs OLD_FILES+=etc/periodic/daily/800.scrub-zfs OLD_LIBS+=lib/libzfs.so.2 +OLD_LIBS+=lib/libzfs.so.3 OLD_LIBS+=lib/libzfs_core.so.2 OLD_LIBS+=lib/libzpool.so.2 OLD_FILES+=rescue/zdb OLD_FILES+=rescue/zfs OLD_FILES+=rescue/zpool +OLD_FILES+=sbin/bectl OLD_FILES+=sbin/zfs OLD_FILES+=sbin/zpool +OLD_FILES+=sbin/zfsbootcfg OLD_FILES+=usr/bin/zinject +OLD_FILES+=usr/bin/zstreamdump OLD_FILES+=usr/bin/ztest +OLD_FILES+=usr/lib/libbe.a +OLD_FILES+=usr/lib/libbe_p.a +OLD_FILES+=usr/lib/libbe.so +OLD_LIBS+=usr/lib/libbe.so.1 OLD_FILES+=usr/lib/libzfs.a -OLD_FILES+=usr/lib/libzfs.so +OLD_LIBS+=usr/lib/libzfs.so OLD_FILES+=usr/lib/libzfs_core.a -OLD_FILES+=usr/lib/libzfs_core.so -OLD_FILES+=usr/lib/libzfs_core_p.a +OLD_LIBS+=usr/lib/libzfs_core.so +OLD_LIBS+=usr/lib/libzfs_core_p.a OLD_FILES+=usr/lib/libzfs_p.a OLD_FILES+=usr/lib/libzpool.a OLD_FILES+=usr/lib/libzpool.so +OLD_LIBS+=usr/lib/libzpool.so.2 .if ${TARGET_ARCH} == "amd64" || ${TARGET_ARCH} == "powerpc64" OLD_FILES+=usr/lib32/libzfs.a OLD_FILES+=usr/lib32/libzfs.so OLD_LIBS+=usr/lib32/libzfs.so.2 +OLD_LIBS+=usr/lib32/libzfs.so.3 OLD_FILES+=usr/lib32/libzfs_core.a OLD_FILES+=usr/lib32/libzfs_core.so OLD_LIBS+=usr/lib32/libzfs_core.so.2 @@ -1343,9 +1354,20 @@ OLD_FILES+=usr/lib32/libzpool.a OLD_FILES+=usr/lib32/libzpool.so OLD_LIBS+=usr/lib32/libzpool.so.2 .endif +OLD_FILES+=usr/sbin/zfsd +OLD_FILES+=usr/sbin/zhack OLD_FILES+=usr/sbin/zdb +OLD_FILES+=usr/share/man/man3/libbe.3.gz +OLD_FILES+=usr/share/man/man7/zpool-features.7.gz +OLD_FILES+=usr/share/man/man8/bectl.8.gz +OLD_FILES+=usr/share/man/man8/gptzfsboot.8.gz OLD_FILES+=usr/share/man/man8/zdb.8.gz +OLD_FILES+=usr/share/man/man8/zfs-program.8.gz OLD_FILES+=usr/share/man/man8/zfs.8.gz +OLD_FILES+=usr/share/man/man8/zfsboot.8.gz +OLD_FILES+=usr/share/man/man8/zfsbootcfg.8.gz +OLD_FILES+=usr/share/man/man8/zfsd.8.gz +OLD_FILES+=usr/share/man/man8/zfsloader.8.gz OLD_FILES+=usr/share/man/man8/zpool.8.gz .endif @@ -9313,6 +9335,7 @@ OLD_FILES+=usr/libexec/hyperv/hv_set_ifconfig OLD_FILES+=usr/libexec/hyperv/hv_get_dns_info OLD_FILES+=usr/libexec/hyperv/hv_get_dhcp_info OLD_FILES+=usr/sbin/hv_kvp_daemon +OLD_FILES+=usr/sbin/hv_vss_daemon OLD_FILES+=usr/share/man/man8/hv_kvp_daemon.8.gz .endif From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:20:31 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6188910A74E6; Wed, 3 Oct 2018 17:20:31 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1643886380; Wed, 3 Oct 2018 17:20:31 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 114B919E43; Wed, 3 Oct 2018 17:20:31 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HKUpj010288; Wed, 3 Oct 2018 17:20:30 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HKUS7010287; Wed, 3 Oct 2018 17:20:30 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031720.w93HKUS7010287@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:20:30 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339163 - stable/11/bin/dd X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: stable/11/bin/dd X-SVN-Commit-Revision: 339163 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:20:31 -0000 Author: kevans Date: Wed Oct 3 17:20:30 2018 New Revision: 339163 URL: https://svnweb.freebsd.org/changeset/base/339163 Log: MFC r338646: dd(1): Correct padding in status=progress Output padding is specified via outlen, which is set using the return value of fprintf. Because it's printing that padding plus a trailing byte, it grows by one each iteration rather than reflecting actual length. Additionally, iec was sized improperly for scaling up similarly to si. Fixing this revealed that the humanize_number(3) call to populate persec was using the wrong width. Modified: stable/11/bin/dd/misc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/bin/dd/misc.c ============================================================================== --- stable/11/bin/dd/misc.c Wed Oct 3 17:19:04 2018 (r339162) +++ stable/11/bin/dd/misc.c Wed Oct 3 17:20:30 2018 (r339163) @@ -109,7 +109,7 @@ progress(void) { static int outlen; char si[4 + 1 + 2 + 1]; /* 123 NUL */ - char iec[4 + 1 + 2 + 1]; /* 123 NUL */ + char iec[4 + 1 + 3 + 1]; /* 123 NUL */ char persec[4 + 1 + 2 + 1]; /* 123 NUL */ char *buf; double secs; @@ -119,11 +119,11 @@ progress(void) HN_DECIMAL | HN_DIVISOR_1000); humanize_number(iec, sizeof(iec), (int64_t)st.bytes, "B", HN_AUTOSCALE, HN_DECIMAL | HN_IEC_PREFIXES); - humanize_number(persec, sizeof(iec), (int64_t)(st.bytes / secs), "B", + humanize_number(persec, sizeof(persec), (int64_t)(st.bytes / secs), "B", HN_AUTOSCALE, HN_DECIMAL | HN_DIVISOR_1000); asprintf(&buf, " %'ju bytes (%s, %s) transferred %.3fs, %s/s", (uintmax_t)st.bytes, si, iec, secs, persec); - outlen = fprintf(stderr, "%-*s\r", outlen, buf); + outlen = fprintf(stderr, "%-*s\r", outlen, buf) - 1; fflush(stderr); free(buf); need_progress = 0; From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:21:46 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD47310A7759; Wed, 3 Oct 2018 17:21:46 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 702A886823; Wed, 3 Oct 2018 17:21:46 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 67DEF19FC3; Wed, 3 Oct 2018 17:21:46 +0000 (UTC) (envelope-from kevans@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HLkC5015251; Wed, 3 Oct 2018 17:21:46 GMT (envelope-from kevans@FreeBSD.org) Received: (from kevans@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HLkcG015250; Wed, 3 Oct 2018 17:21:46 GMT (envelope-from kevans@FreeBSD.org) Message-Id: <201810031721.w93HLkcG015250@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kevans set sender to kevans@FreeBSD.org using -f From: Kyle Evans Date: Wed, 3 Oct 2018 17:21:46 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339164 - stable/11/usr.bin/diff X-SVN-Group: stable-11 X-SVN-Commit-Author: kevans X-SVN-Commit-Paths: stable/11/usr.bin/diff X-SVN-Commit-Revision: 339164 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:21:47 -0000 Author: kevans Date: Wed Oct 3 17:21:45 2018 New Revision: 339164 URL: https://svnweb.freebsd.org/changeset/base/339164 Log: MFC r338040: diff(1): Refactor -B a little bit Instead of doing a second pass to skip empty lines if we've specified -I, go ahead and check both at once. Ignore critera has been split out into its own function to try and keep the logic cleaner. Modified: stable/11/usr.bin/diff/diffreg.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/diff/diffreg.c ============================================================================== --- stable/11/usr.bin/diff/diffreg.c Wed Oct 3 17:20:30 2018 (r339163) +++ stable/11/usr.bin/diff/diffreg.c Wed Oct 3 17:21:45 2018 (r339164) @@ -202,7 +202,8 @@ static void unsort(struct line *, int, int *); static void change(char *, FILE *, char *, FILE *, int, int, int, int, int *); static void sort(struct line *, int); static void print_header(const char *, const char *); -static int ignoreline(char *); +static bool ignoreline_pattern(char *); +static bool ignoreline(char *, bool); static int asciifile(FILE *); static int fetch(long *, int, int, FILE *, int, int, int); static int newcand(int, int, int); @@ -1018,8 +1019,8 @@ preadline(int fd, size_t rlen, off_t off) return (line); } -static int -ignoreline(char *line) +static bool +ignoreline_pattern(char *line) { int ret; @@ -1028,6 +1029,20 @@ ignoreline(char *line) return (ret == 0); /* if it matched, it should be ignored. */ } +static bool +ignoreline(char *line, bool skip_blanks) +{ + + if (ignore_pats != NULL && skip_blanks) + return (ignoreline_pattern(line) || *line == '\0'); + if (ignore_pats != NULL) + return (ignoreline_pattern(line)); + if (skip_blanks) + return (*line == '\0'); + /* No ignore criteria specified */ + return (false); +} + /* * Indicate that there is a difference between lines a and b of the from file * to get to lines c to d of the to file. If a is greater then b then there @@ -1043,12 +1058,14 @@ change(char *file1, FILE *f1, char *file2, FILE *f2, i long curpos; int i, nc, f; const char *walk; + bool skip_blanks; + skip_blanks = (*pflags & D_SKIPBLANKLINES); restart: if ((diff_format != D_IFDEF || diff_format == D_GFORMAT) && a > b && c > d) return; - if (ignore_pats != NULL) { + if (ignore_pats != NULL || skip_blanks) { char *line; /* * All lines in the change, insert, or delete must @@ -1059,7 +1076,7 @@ restart: for (i = a; i <= b; i++) { line = preadline(fileno(f1), ixold[i] - ixold[i - 1], ixold[i - 1]); - if (!ignoreline(line)) + if (!ignoreline(line, skip_blanks)) goto proceed; } } @@ -1067,36 +1084,11 @@ restart: for (i = c; i <= d; i++) { line = preadline(fileno(f2), ixnew[i] - ixnew[i - 1], ixnew[i - 1]); - if (!ignoreline(line)) + if (!ignoreline(line, skip_blanks)) goto proceed; } } return; - } - if (*pflags & D_SKIPBLANKLINES) { - char *line; - /* - * All lines in the change, insert, or delete must not be - * empty for the change to be ignored. - */ - if (a <= b) { /* Changes and deletes. */ - for (i = a; i <= b; i++) { - line = preadline(fileno(f1), - ixold[i] - ixold[i - 1], ixold[i - 1]); - if (*line != '\0') - goto proceed; - } - } - if (a > b || c <= d) { /* Changes and inserts. */ - for (i = c; i <= d; i++) { - line = preadline(fileno(f2), - ixnew[i] - ixnew[i - 1], ixnew[i - 1]); - if (*line != '\0') - goto proceed; - } - } - return; - } proceed: if (*pflags & D_HEADER && diff_format != D_BRIEF) { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:27:21 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8383310A799C; Wed, 3 Oct 2018 17:27:21 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2FA8786B61; Wed, 3 Oct 2018 17:27:21 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 270201A054; Wed, 3 Oct 2018 17:27:21 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HRLcr015594; Wed, 3 Oct 2018 17:27:21 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HRLin015593; Wed, 3 Oct 2018 17:27:21 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031727.w93HRLin015593@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 17:27:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339165 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339165 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:27:21 -0000 Author: kib Date: Wed Oct 3 17:27:20 2018 New Revision: 339165 URL: https://svnweb.freebsd.org/changeset/base/339165 Log: MFC r324950 (by trasz): Reword the conditional. Modified: stable/11/libexec/rtld-elf/rtld.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/rtld.c ============================================================================== --- stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:21:45 2018 (r339164) +++ stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:27:20 2018 (r339165) @@ -1619,27 +1619,54 @@ find_library(const char *xname, const Obj_Entry *refob * nodeflib. */ if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { - if ((pathname = search_library_path(name, ld_library_path)) != NULL || - (refobj != NULL && - (pathname = search_library_path(name, refobj->rpath)) != NULL) || - (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL || - (pathname = search_library_path(name, gethints(false))) != NULL || - (pathname = search_library_path(name, ld_standard_library_path)) != NULL) + pathname = search_library_path(name, ld_library_path); + if (pathname != NULL) return (pathname); + if (refobj != NULL) { + pathname = search_library_path(name, refobj->rpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_pathfds(name, ld_library_dirs, fdp); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, gethints(false)); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, ld_standard_library_path); + if (pathname != NULL) + return (pathname); } else { nodeflib = objgiven ? refobj->z_nodeflib : false; - if ((objgiven && - (pathname = search_library_path(name, refobj->rpath)) != NULL) || - (objgiven && refobj->runpath == NULL && refobj != obj_main && - (pathname = search_library_path(name, obj_main->rpath)) != NULL) || - (pathname = search_library_path(name, ld_library_path)) != NULL || - (objgiven && - (pathname = search_library_path(name, refobj->runpath)) != NULL) || - (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL || - (pathname = search_library_path(name, gethints(nodeflib))) != NULL || - (objgiven && !nodeflib && - (pathname = search_library_path(name, ld_standard_library_path)) != NULL)) + if (objgiven) { + pathname = search_library_path(name, refobj->rpath); + if (pathname != NULL) + return (pathname); + } + if (objgiven && refobj->runpath == NULL && refobj != obj_main) { + pathname = search_library_path(name, obj_main->rpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_path(name, ld_library_path); + if (pathname != NULL) return (pathname); + if (objgiven) { + pathname = search_library_path(name, refobj->runpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_pathfds(name, ld_library_dirs, fdp); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, gethints(nodeflib)); + if (pathname != NULL) + return (pathname); + if (objgiven && !nodeflib) { + pathname = search_library_path(name, ld_standard_library_path); + if (pathname != NULL) + return (pathname); + } } if (objgiven && refobj->path != NULL) { From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:28:28 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6B02A10A7A94; Wed, 3 Oct 2018 17:28:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 20F7C86CB4; Wed, 3 Oct 2018 17:28:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1BDF21A055; Wed, 3 Oct 2018 17:28:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HSRG6015693; Wed, 3 Oct 2018 17:28:27 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HSR77015692; Wed, 3 Oct 2018 17:28:27 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031728.w93HSR77015692@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 17:28:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339166 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339166 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:28:28 -0000 Author: kib Date: Wed Oct 3 17:28:27 2018 New Revision: 339166 URL: https://svnweb.freebsd.org/changeset/base/339166 Log: MFC r324951 (by trasz): Make find_library() conform to style(9). Modified: stable/11/libexec/rtld-elf/rtld.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/rtld.c ============================================================================== --- stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:27:20 2018 (r339165) +++ stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:28:27 2018 (r339166) @@ -1590,92 +1590,93 @@ gnu_hash(const char *s) static char * find_library(const char *xname, const Obj_Entry *refobj, int *fdp) { - char *pathname; - char *name; - bool nodeflib, objgiven; + char *pathname; + char *name; + bool nodeflib, objgiven; - objgiven = refobj != NULL; + objgiven = refobj != NULL; - if (libmap_disable || !objgiven || - (name = lm_find(refobj->path, xname)) == NULL) - name = (char *)xname; + if (libmap_disable || !objgiven || + (name = lm_find(refobj->path, xname)) == NULL) + name = (char *)xname; - if (strchr(name, '/') != NULL) { /* Hard coded pathname */ - if (name[0] != '/' && !trust) { - _rtld_error("Absolute pathname required for shared object \"%s\"", - name); - return (NULL); + if (strchr(name, '/') != NULL) { /* Hard coded pathname */ + if (name[0] != '/' && !trust) { + _rtld_error("Absolute pathname required " + "for shared object \"%s\"", name); + return (NULL); + } + return (origin_subst(__DECONST(Obj_Entry *, refobj), + __DECONST(char *, name))); } - return (origin_subst(__DECONST(Obj_Entry *, refobj), - __DECONST(char *, name))); - } - dbg(" Searching for \"%s\"", name); + dbg(" Searching for \"%s\"", name); - /* - * If refobj->rpath != NULL, then refobj->runpath is NULL. Fall - * back to pre-conforming behaviour if user requested so with - * LD_LIBRARY_PATH_RPATH environment variable and ignore -z - * nodeflib. - */ - if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { - pathname = search_library_path(name, ld_library_path); - if (pathname != NULL) - return (pathname); - if (refobj != NULL) { - pathname = search_library_path(name, refobj->rpath); - if (pathname != NULL) - return (pathname); + /* + * If refobj->rpath != NULL, then refobj->runpath is NULL. Fall + * back to pre-conforming behaviour if user requested so with + * LD_LIBRARY_PATH_RPATH environment variable and ignore -z + * nodeflib. + */ + if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { + pathname = search_library_path(name, ld_library_path); + if (pathname != NULL) + return (pathname); + if (refobj != NULL) { + pathname = search_library_path(name, refobj->rpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_pathfds(name, ld_library_dirs, fdp); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, gethints(false)); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, ld_standard_library_path); + if (pathname != NULL) + return (pathname); + } else { + nodeflib = objgiven ? refobj->z_nodeflib : false; + if (objgiven) { + pathname = search_library_path(name, refobj->rpath); + if (pathname != NULL) + return (pathname); + } + if (objgiven && refobj->runpath == NULL && refobj != obj_main) { + pathname = search_library_path(name, obj_main->rpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_path(name, ld_library_path); + if (pathname != NULL) + return (pathname); + if (objgiven) { + pathname = search_library_path(name, refobj->runpath); + if (pathname != NULL) + return (pathname); + } + pathname = search_library_pathfds(name, ld_library_dirs, fdp); + if (pathname != NULL) + return (pathname); + pathname = search_library_path(name, gethints(nodeflib)); + if (pathname != NULL) + return (pathname); + if (objgiven && !nodeflib) { + pathname = search_library_path(name, + ld_standard_library_path); + if (pathname != NULL) + return (pathname); + } } - pathname = search_library_pathfds(name, ld_library_dirs, fdp); - if (pathname != NULL) - return (pathname); - pathname = search_library_path(name, gethints(false)); - if (pathname != NULL) - return (pathname); - pathname = search_library_path(name, ld_standard_library_path); - if (pathname != NULL) - return (pathname); - } else { - nodeflib = objgiven ? refobj->z_nodeflib : false; - if (objgiven) { - pathname = search_library_path(name, refobj->rpath); - if (pathname != NULL) - return (pathname); - } - if (objgiven && refobj->runpath == NULL && refobj != obj_main) { - pathname = search_library_path(name, obj_main->rpath); - if (pathname != NULL) - return (pathname); - } - pathname = search_library_path(name, ld_library_path); - if (pathname != NULL) - return (pathname); - if (objgiven) { - pathname = search_library_path(name, refobj->runpath); - if (pathname != NULL) - return (pathname); - } - pathname = search_library_pathfds(name, ld_library_dirs, fdp); - if (pathname != NULL) - return (pathname); - pathname = search_library_path(name, gethints(nodeflib)); - if (pathname != NULL) - return (pathname); - if (objgiven && !nodeflib) { - pathname = search_library_path(name, ld_standard_library_path); - if (pathname != NULL) - return (pathname); - } - } - if (objgiven && refobj->path != NULL) { - _rtld_error("Shared object \"%s\" not found, required by \"%s\"", - name, basename(refobj->path)); - } else { - _rtld_error("Shared object \"%s\" not found", name); - } - return NULL; + if (objgiven && refobj->path != NULL) { + _rtld_error("Shared object \"%s\" not found, " + "required by \"%s\"", name, basename(refobj->path)); + } else { + _rtld_error("Shared object \"%s\" not found", name); + } + return (NULL); } /* From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:29:31 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7F75010A7C0F; Wed, 3 Oct 2018 17:29:31 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 31D8886E91; Wed, 3 Oct 2018 17:29:31 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 2B5081A056; Wed, 3 Oct 2018 17:29:31 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HTVXQ015794; Wed, 3 Oct 2018 17:29:31 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HTVJB015793; Wed, 3 Oct 2018 17:29:31 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031729.w93HTVJB015793@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 17:29:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339167 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339167 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:29:31 -0000 Author: kib Date: Wed Oct 3 17:29:30 2018 New Revision: 339167 URL: https://svnweb.freebsd.org/changeset/base/339167 Log: MFC r324952 (by trasz): Replace lseek(2)/read(2) pair with pread(2). Modified: stable/11/libexec/rtld-elf/rtld.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/rtld.c ============================================================================== --- stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:28:27 2018 (r339166) +++ stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:29:30 2018 (r339167) @@ -1823,10 +1823,9 @@ cleanup1: if (dl > hint_stat.st_size) goto cleanup1; p = xmalloc(hdr.dirlistlen + 1); - - if (lseek(fd, hdr.strtab + hdr.dirlist, SEEK_SET) == -1 || - read(fd, p, hdr.dirlistlen + 1) != - (ssize_t)hdr.dirlistlen + 1 || p[hdr.dirlistlen] != '\0') { + if (pread(fd, p, hdr.dirlistlen + 1, + hdr.strtab + hdr.dirlist) != (ssize_t)hdr.dirlistlen + 1 || + p[hdr.dirlistlen] != '\0') { free(p); goto cleanup1; } From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:31:00 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9603610A7DC5; Wed, 3 Oct 2018 17:31:00 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4BDE887111; Wed, 3 Oct 2018 17:31:00 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 46CED1A18C; Wed, 3 Oct 2018 17:31:00 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HV09v017347; Wed, 3 Oct 2018 17:31:00 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HV0dI017346; Wed, 3 Oct 2018 17:31:00 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031731.w93HV0dI017346@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 17:31:00 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339168 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339168 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:31:00 -0000 Author: kib Date: Wed Oct 3 17:30:59 2018 New Revision: 339168 URL: https://svnweb.freebsd.org/changeset/base/339168 Log: MFC r324953 (by traz): Remove unneeded calls to access(2) from rtld(1); just call open(2) instead. Modified: stable/11/libexec/rtld-elf/rtld.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/rtld.c ============================================================================== --- stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:29:30 2018 (r339167) +++ stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:30:59 2018 (r339168) @@ -138,7 +138,7 @@ static int rtld_dirname(const char *, char *); static int rtld_dirname_abs(const char *, char *); static void *rtld_dlopen(const char *name, int fd, int mode); static void rtld_exit(void); -static char *search_library_path(const char *, const char *); +static char *search_library_path(const char *, const char *, int *); static char *search_library_pathfds(const char *, const char *, int *); static const void **get_program_var_addr(const char *, RtldLockState *); static void set_program_var(const char *, const void *); @@ -1619,52 +1619,52 @@ find_library(const char *xname, const Obj_Entry *refob * nodeflib. */ if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { - pathname = search_library_path(name, ld_library_path); + pathname = search_library_path(name, ld_library_path, fdp); if (pathname != NULL) return (pathname); if (refobj != NULL) { - pathname = search_library_path(name, refobj->rpath); + pathname = search_library_path(name, refobj->rpath, fdp); if (pathname != NULL) return (pathname); } pathname = search_library_pathfds(name, ld_library_dirs, fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, gethints(false)); + pathname = search_library_path(name, gethints(false), fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, ld_standard_library_path); + pathname = search_library_path(name, ld_standard_library_path, fdp); if (pathname != NULL) return (pathname); } else { nodeflib = objgiven ? refobj->z_nodeflib : false; if (objgiven) { - pathname = search_library_path(name, refobj->rpath); + pathname = search_library_path(name, refobj->rpath, fdp); if (pathname != NULL) return (pathname); } if (objgiven && refobj->runpath == NULL && refobj != obj_main) { - pathname = search_library_path(name, obj_main->rpath); + pathname = search_library_path(name, obj_main->rpath, fdp); if (pathname != NULL) return (pathname); } - pathname = search_library_path(name, ld_library_path); + pathname = search_library_path(name, ld_library_path, fdp); if (pathname != NULL) return (pathname); if (objgiven) { - pathname = search_library_path(name, refobj->runpath); + pathname = search_library_path(name, refobj->runpath, fdp); if (pathname != NULL) return (pathname); } pathname = search_library_pathfds(name, ld_library_dirs, fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, gethints(nodeflib)); + pathname = search_library_path(name, gethints(nodeflib), fdp); if (pathname != NULL) return (pathname); if (objgiven && !nodeflib) { pathname = search_library_path(name, - ld_standard_library_path); + ld_standard_library_path, fdp); if (pathname != NULL) return (pathname); } @@ -3021,12 +3021,14 @@ struct try_library_args { size_t namelen; char *buffer; size_t buflen; + int fd; }; static void * try_library_path(const char *dir, size_t dirlen, void *param) { struct try_library_args *arg; + int fd; arg = param; if (*dir == '/' || trust) { @@ -3041,17 +3043,23 @@ try_library_path(const char *dir, size_t dirlen, void strcpy(pathname + dirlen + 1, arg->name); dbg(" Trying \"%s\"", pathname); - if (access(pathname, F_OK) == 0) { /* We found it */ + fd = open(pathname, O_RDONLY | O_CLOEXEC | O_VERIFY); + if (fd >= 0) { + dbg(" Opened \"%s\", fd %d", pathname, fd); pathname = xmalloc(dirlen + 1 + arg->namelen + 1); strcpy(pathname, arg->buffer); + arg->fd = fd; return (pathname); + } else { + dbg(" Failed to open \"%s\": %s", + pathname, rtld_strerror(errno)); } } return (NULL); } static char * -search_library_path(const char *name, const char *path) +search_library_path(const char *name, const char *path, int *fdp) { char *p; struct try_library_args arg; @@ -3063,8 +3071,10 @@ search_library_path(const char *name, const char *path arg.namelen = strlen(name); arg.buffer = xmalloc(PATH_MAX); arg.buflen = PATH_MAX; + arg.fd = -1; p = path_enumerate(path, try_library_path, &arg); + *fdp = arg.fd; free(arg.buffer); From owner-svn-src-stable-11@freebsd.org Wed Oct 3 17:32:03 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 200D010A7E85; Wed, 3 Oct 2018 17:32:03 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C7DBE87447; Wed, 3 Oct 2018 17:32:02 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id C2BC01A1DB; Wed, 3 Oct 2018 17:32:02 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w93HW2ra018230; Wed, 3 Oct 2018 17:32:02 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w93HW27w018229; Wed, 3 Oct 2018 17:32:02 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810031732.w93HW27w018229@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 3 Oct 2018 17:32:02 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339169 - stable/11/libexec/rtld-elf X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/libexec/rtld-elf X-SVN-Commit-Revision: 339169 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 17:32:03 -0000 Author: kib Date: Wed Oct 3 17:32:02 2018 New Revision: 339169 URL: https://svnweb.freebsd.org/changeset/base/339169 Log: MFC r338956: Provide refobj context when doing libmap substitution inside search_library_path(). Modified: stable/11/libexec/rtld-elf/rtld.c Directory Properties: stable/11/ (props changed) Modified: stable/11/libexec/rtld-elf/rtld.c ============================================================================== --- stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:30:59 2018 (r339168) +++ stable/11/libexec/rtld-elf/rtld.c Wed Oct 3 17:32:02 2018 (r339169) @@ -123,7 +123,7 @@ static void objlist_remove(Objlist *, Obj_Entry *); static int open_binary_fd(const char *argv0, bool search_in_path); static int parse_args(char* argv[], int argc, bool *use_pathp, int *fdp); static int parse_integer(const char *); -static void *path_enumerate(const char *, path_enum_proc, void *); +static void *path_enumerate(const char *, path_enum_proc, const char *, void *); static void print_usage(const char *argv0); static void release_object(Obj_Entry *); static int relocate_object_dag(Obj_Entry *root, bool bind_now, @@ -138,7 +138,8 @@ static int rtld_dirname(const char *, char *); static int rtld_dirname_abs(const char *, char *); static void *rtld_dlopen(const char *name, int fd, int mode); static void rtld_exit(void); -static char *search_library_path(const char *, const char *, int *); +static char *search_library_path(const char *, const char *, const char *, + int *); static char *search_library_pathfds(const char *, const char *, int *); static const void **get_program_var_addr(const char *, RtldLockState *); static void set_program_var(const char *, const void *); @@ -1590,8 +1591,7 @@ gnu_hash(const char *s) static char * find_library(const char *xname, const Obj_Entry *refobj, int *fdp) { - char *pathname; - char *name; + char *name, *pathname, *refobj_path; bool nodeflib, objgiven; objgiven = refobj != NULL; @@ -1611,6 +1611,7 @@ find_library(const char *xname, const Obj_Entry *refob } dbg(" Searching for \"%s\"", name); + refobj_path = objgiven ? refobj->path : NULL; /* * If refobj->rpath != NULL, then refobj->runpath is NULL. Fall @@ -1619,52 +1620,61 @@ find_library(const char *xname, const Obj_Entry *refob * nodeflib. */ if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) { - pathname = search_library_path(name, ld_library_path, fdp); + pathname = search_library_path(name, ld_library_path, + refobj_path, fdp); if (pathname != NULL) return (pathname); if (refobj != NULL) { - pathname = search_library_path(name, refobj->rpath, fdp); + pathname = search_library_path(name, refobj->rpath, + refobj_path, fdp); if (pathname != NULL) return (pathname); } pathname = search_library_pathfds(name, ld_library_dirs, fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, gethints(false), fdp); + pathname = search_library_path(name, gethints(false), + refobj_path, fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, ld_standard_library_path, fdp); + pathname = search_library_path(name, ld_standard_library_path, + refobj_path, fdp); if (pathname != NULL) return (pathname); } else { nodeflib = objgiven ? refobj->z_nodeflib : false; if (objgiven) { - pathname = search_library_path(name, refobj->rpath, fdp); + pathname = search_library_path(name, refobj->rpath, + refobj->path, fdp); if (pathname != NULL) return (pathname); } if (objgiven && refobj->runpath == NULL && refobj != obj_main) { - pathname = search_library_path(name, obj_main->rpath, fdp); + pathname = search_library_path(name, obj_main->rpath, + refobj_path, fdp); if (pathname != NULL) return (pathname); } - pathname = search_library_path(name, ld_library_path, fdp); + pathname = search_library_path(name, ld_library_path, + refobj_path, fdp); if (pathname != NULL) return (pathname); if (objgiven) { - pathname = search_library_path(name, refobj->runpath, fdp); + pathname = search_library_path(name, refobj->runpath, + refobj_path, fdp); if (pathname != NULL) return (pathname); } pathname = search_library_pathfds(name, ld_library_dirs, fdp); if (pathname != NULL) return (pathname); - pathname = search_library_path(name, gethints(nodeflib), fdp); + pathname = search_library_path(name, gethints(nodeflib), + refobj_path, fdp); if (pathname != NULL) return (pathname); if (objgiven && !nodeflib) { pathname = search_library_path(name, - ld_standard_library_path, fdp); + ld_standard_library_path, refobj_path, fdp); if (pathname != NULL) return (pathname); } @@ -1859,8 +1869,9 @@ cleanup1: hargs.request = RTLD_DI_SERINFOSIZE; hargs.serinfo = &hmeta; - path_enumerate(ld_standard_library_path, fill_search_info, &sargs); - path_enumerate(hints, fill_search_info, &hargs); + path_enumerate(ld_standard_library_path, fill_search_info, NULL, + &sargs); + path_enumerate(hints, fill_search_info, NULL, &hargs); SLPinfo = xmalloc(smeta.dls_size); hintinfo = xmalloc(hmeta.dls_size); @@ -1878,8 +1889,9 @@ cleanup1: hargs.serpath = &hintinfo->dls_serpath[0]; hargs.strspace = (char *)&hintinfo->dls_serpath[hmeta.dls_cnt]; - path_enumerate(ld_standard_library_path, fill_search_info, &sargs); - path_enumerate(hints, fill_search_info, &hargs); + path_enumerate(ld_standard_library_path, fill_search_info, NULL, + &sargs); + path_enumerate(hints, fill_search_info, NULL, &hargs); /* * Now calculate the difference between two sets, by excluding @@ -2988,7 +3000,8 @@ rtld_exit(void) * callback on the result. */ static void * -path_enumerate(const char *path, path_enum_proc callback, void *arg) +path_enumerate(const char *path, path_enum_proc callback, + const char *refobj_path, void *arg) { const char *trans; if (path == NULL) @@ -3000,7 +3013,7 @@ path_enumerate(const char *path, path_enum_proc callba char *res; len = strcspn(path, ":;"); - trans = lm_findn(NULL, path, len); + trans = lm_findn(refobj_path, path, len); if (trans) res = callback(trans, strlen(trans), arg); else @@ -3059,7 +3072,8 @@ try_library_path(const char *dir, size_t dirlen, void } static char * -search_library_path(const char *name, const char *path, int *fdp) +search_library_path(const char *name, const char *path, + const char *refobj_path, int *fdp) { char *p; struct try_library_args arg; @@ -3073,7 +3087,7 @@ search_library_path(const char *name, const char *path arg.buflen = PATH_MAX; arg.fd = -1; - p = path_enumerate(path, try_library_path, &arg); + p = path_enumerate(path, try_library_path, refobj_path, &arg); *fdp = arg.fd; free(arg.buffer); @@ -3790,12 +3804,12 @@ do_search_info(const Obj_Entry *obj, int request, stru _info.dls_size = __offsetof(struct dl_serinfo, dls_serpath); _info.dls_cnt = 0; - path_enumerate(obj->rpath, fill_search_info, &args); - path_enumerate(ld_library_path, fill_search_info, &args); - path_enumerate(obj->runpath, fill_search_info, &args); - path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args); + path_enumerate(obj->rpath, fill_search_info, NULL, &args); + path_enumerate(ld_library_path, fill_search_info, NULL, &args); + path_enumerate(obj->runpath, fill_search_info, NULL, &args); + path_enumerate(gethints(obj->z_nodeflib), fill_search_info, NULL, &args); if (!obj->z_nodeflib) - path_enumerate(ld_standard_library_path, fill_search_info, &args); + path_enumerate(ld_standard_library_path, fill_search_info, NULL, &args); if (request == RTLD_DI_SERINFOSIZE) { @@ -3815,25 +3829,25 @@ do_search_info(const Obj_Entry *obj, int request, stru args.strspace = (char *)&info->dls_serpath[_info.dls_cnt]; args.flags = LA_SER_RUNPATH; - if (path_enumerate(obj->rpath, fill_search_info, &args) != NULL) + if (path_enumerate(obj->rpath, fill_search_info, NULL, &args) != NULL) return (-1); args.flags = LA_SER_LIBPATH; - if (path_enumerate(ld_library_path, fill_search_info, &args) != NULL) + if (path_enumerate(ld_library_path, fill_search_info, NULL, &args) != NULL) return (-1); args.flags = LA_SER_RUNPATH; - if (path_enumerate(obj->runpath, fill_search_info, &args) != NULL) + if (path_enumerate(obj->runpath, fill_search_info, NULL, &args) != NULL) return (-1); args.flags = LA_SER_CONFIG; - if (path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args) + if (path_enumerate(gethints(obj->z_nodeflib), fill_search_info, NULL, &args) != NULL) return (-1); args.flags = LA_SER_DEFAULT; - if (!obj->z_nodeflib && - path_enumerate(ld_standard_library_path, fill_search_info, &args) != NULL) + if (!obj->z_nodeflib && path_enumerate(ld_standard_library_path, + fill_search_info, NULL, &args) != NULL) return (-1); return (0); } From owner-svn-src-stable-11@freebsd.org Thu Oct 4 11:47:54 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73B3E10A495B; Thu, 4 Oct 2018 11:47:54 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2636691CC8; Thu, 4 Oct 2018 11:47:54 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1CEAC25B66; Thu, 4 Oct 2018 11:47:54 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w94BlruV081765; Thu, 4 Oct 2018 11:47:53 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w94BlrG2081764; Thu, 4 Oct 2018 11:47:53 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810041147.w94BlrG2081764@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Thu, 4 Oct 2018 11:47:53 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339180 - stable/11/etc/rc.d X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/etc/rc.d X-SVN-Commit-Revision: 339180 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Oct 2018 11:47:54 -0000 Author: kib Date: Thu Oct 4 11:47:53 2018 New Revision: 339180 URL: https://svnweb.freebsd.org/changeset/base/339180 Log: MFC r338964: Remove -m (update) from ldconfig -32 & -soft invocation on startup. Modified: stable/11/etc/rc.d/ldconfig Directory Properties: stable/11/ (props changed) Modified: stable/11/etc/rc.d/ldconfig ============================================================================== --- stable/11/etc/rc.d/ldconfig Thu Oct 4 09:28:40 2018 (r339179) +++ stable/11/etc/rc.d/ldconfig Thu Oct 4 11:47:53 2018 (r339180) @@ -58,7 +58,7 @@ ldconfig_start() done check_startmsgs && echo '32-bit compatibility ldconfig path:' ${_LDC} - ${ldconfig} -32 -m ${_ins} ${_LDC} + ${ldconfig} -32 ${_ins} ${_LDC} ;; esac @@ -80,7 +80,7 @@ ldconfig_start() done check_startmsgs && echo 'Soft Float compatibility ldconfig path:' ${_LDC} - ${ldconfig} -soft -m ${_ins} ${_LDC} + ${ldconfig} -soft ${_ins} ${_LDC} ;; esac From owner-svn-src-stable-11@freebsd.org Fri Oct 5 07:49:01 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DE99410C54CE; Fri, 5 Oct 2018 07:49:01 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8D3D2758B7; Fri, 5 Oct 2018 07:49:01 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7F1091259B; Fri, 5 Oct 2018 07:49:01 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w957n1uo097093; Fri, 5 Oct 2018 07:49:01 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w957n1Z1097092; Fri, 5 Oct 2018 07:49:01 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201810050749.w957n1Z1097092@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Fri, 5 Oct 2018 07:49:01 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339189 - stable/11/lib/libusb X-SVN-Group: stable-11 X-SVN-Commit-Author: hselasky X-SVN-Commit-Paths: stable/11/lib/libusb X-SVN-Commit-Revision: 339189 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Oct 2018 07:49:02 -0000 Author: hselasky Date: Fri Oct 5 07:49:01 2018 New Revision: 339189 URL: https://svnweb.freebsd.org/changeset/base/339189 Log: MFC r338993: When multiple threads are involved receiving completion events in LibUSB make sure there is always a master polling thread, by setting the "ctx_handler" field in the context. Else the reception of completion events can stop. This happens if event threads are created and destroyed during runtime. Found by: Ludovic Rousseau PR: 231742 Sponsored by: Mellanox Technologies Modified: stable/11/lib/libusb/libusb10_io.c Directory Properties: stable/11/ (props changed) Modified: stable/11/lib/libusb/libusb10_io.c ============================================================================== --- stable/11/lib/libusb/libusb10_io.c Fri Oct 5 05:55:56 2018 (r339188) +++ stable/11/lib/libusb/libusb10_io.c Fri Oct 5 07:49:01 2018 (r339189) @@ -310,6 +310,9 @@ libusb_wait_for_event(libusb_context *ctx, struct time if (tv == NULL) { pthread_cond_wait(&ctx->ctx_cond, &ctx->ctx_lock); + /* try to grab polling of actual events, if any */ + if (ctx->ctx_handler == NO_THREAD) + ctx->ctx_handler = pthread_self(); return (0); } err = clock_gettime(CLOCK_MONOTONIC, &ts); @@ -328,6 +331,9 @@ libusb_wait_for_event(libusb_context *ctx, struct time } err = pthread_cond_timedwait(&ctx->ctx_cond, &ctx->ctx_lock, &ts); + /* try to grab polling of actual events, if any */ + if (ctx->ctx_handler == NO_THREAD) + ctx->ctx_handler = pthread_self(); if (err == ETIMEDOUT) return (1); From owner-svn-src-stable-11@freebsd.org Fri Oct 5 18:12:50 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 528DA10B3011; Fri, 5 Oct 2018 18:12:50 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 089118B490; Fri, 5 Oct 2018 18:12:50 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0365618CF2; Fri, 5 Oct 2018 18:12:50 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w95ICnj0019660; Fri, 5 Oct 2018 18:12:49 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w95ICnQa019659; Fri, 5 Oct 2018 18:12:49 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810051812.w95ICnQa019659@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Fri, 5 Oct 2018 18:12:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339202 - stable/11/sys/vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/vm X-SVN-Commit-Revision: 339202 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Oct 2018 18:12:50 -0000 Author: kib Date: Fri Oct 5 18:12:49 2018 New Revision: 339202 URL: https://svnweb.freebsd.org/changeset/base/339202 Log: MFC r338997: In vm_fault_copy_entry(), collect the code to initialize a newly allocated dst_object in a single place. Modified: stable/11/sys/vm/vm_fault.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/vm/vm_fault.c ============================================================================== --- stable/11/sys/vm/vm_fault.c Fri Oct 5 17:53:47 2018 (r339201) +++ stable/11/sys/vm/vm_fault.c Fri Oct 5 18:12:49 2018 (r339202) @@ -1606,6 +1606,7 @@ vm_fault_copy_entry(vm_map_t dst_map, vm_map_t src_map dst_object->flags |= OBJ_COLORED; dst_object->pg_color = atop(dst_entry->start); #endif + dst_object->charge = dst_entry->end - dst_entry->start; } VM_OBJECT_WLOCK(dst_object); @@ -1614,7 +1615,6 @@ vm_fault_copy_entry(vm_map_t dst_map, vm_map_t src_map if (src_object != dst_object) { dst_entry->object.vm_object = dst_object; dst_entry->offset = 0; - dst_object->charge = dst_entry->end - dst_entry->start; } if (fork_charge != NULL) { KASSERT(dst_entry->cred == NULL, From owner-svn-src-stable-11@freebsd.org Fri Oct 5 18:14:19 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA5FF10B3138; Fri, 5 Oct 2018 18:14:19 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 60A148B72A; Fri, 5 Oct 2018 18:14:19 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 5B1D718CF7; Fri, 5 Oct 2018 18:14:19 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w95IEJB4019886; Fri, 5 Oct 2018 18:14:19 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w95IEJnu019885; Fri, 5 Oct 2018 18:14:19 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810051814.w95IEJnu019885@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Fri, 5 Oct 2018 18:14:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339203 - stable/11/sys/vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/vm X-SVN-Commit-Revision: 339203 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Oct 2018 18:14:19 -0000 Author: kib Date: Fri Oct 5 18:14:18 2018 New Revision: 339203 URL: https://svnweb.freebsd.org/changeset/base/339203 Log: MFC r338998: In vm_fault_copy_entry(), we should not assert that entry is charged if the dst_object is not of swap type. Modified: stable/11/sys/vm/vm_fault.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/vm/vm_fault.c ============================================================================== --- stable/11/sys/vm/vm_fault.c Fri Oct 5 18:12:49 2018 (r339202) +++ stable/11/sys/vm/vm_fault.c Fri Oct 5 18:14:18 2018 (r339203) @@ -1622,7 +1622,9 @@ vm_fault_copy_entry(vm_map_t dst_map, vm_map_t src_map dst_object->cred = curthread->td_ucred; crhold(dst_object->cred); *fork_charge += dst_object->charge; - } else if (dst_object->cred == NULL) { + } else if ((dst_object->type == OBJT_DEFAULT || + dst_object->type == OBJT_SWAP) && + dst_object->cred == NULL) { KASSERT(dst_entry->cred != NULL, ("no cred for entry %p", dst_entry)); dst_object->cred = dst_entry->cred; From owner-svn-src-stable-11@freebsd.org Fri Oct 5 18:15:45 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0FE7A10B31E8; Fri, 5 Oct 2018 18:15:45 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B9CAD8B8BA; Fri, 5 Oct 2018 18:15:44 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B4B3618CFA; Fri, 5 Oct 2018 18:15:44 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w95IFilO020003; Fri, 5 Oct 2018 18:15:44 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w95IFim7020002; Fri, 5 Oct 2018 18:15:44 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201810051815.w95IFim7020002@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Fri, 5 Oct 2018 18:15:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339204 - stable/11/sys/vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/vm X-SVN-Commit-Revision: 339204 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Oct 2018 18:15:45 -0000 Author: kib Date: Fri Oct 5 18:15:44 2018 New Revision: 339204 URL: https://svnweb.freebsd.org/changeset/base/339204 Log: MFC r338999: Correct vm_fault_copy_entry() handling of backing file truncation after the file mapping was wired. Modified: stable/11/sys/vm/vm_fault.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/vm/vm_fault.c ============================================================================== --- stable/11/sys/vm/vm_fault.c Fri Oct 5 18:14:18 2018 (r339203) +++ stable/11/sys/vm/vm_fault.c Fri Oct 5 18:15:44 2018 (r339204) @@ -1711,6 +1711,13 @@ again: dst_m = src_m; if (vm_page_sleep_if_busy(dst_m, "fltupg")) goto again; + if (dst_m->pindex >= dst_object->size) + /* + * We are upgrading. Index can occur + * out of bounds if the object type is + * vnode and the file was truncated. + */ + break; vm_page_xbusy(dst_m); KASSERT(dst_m->valid == VM_PAGE_BITS_ALL, ("invalid dst page %p", dst_m)); From owner-svn-src-stable-11@freebsd.org Fri Oct 5 21:10:04 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 34F4010B712A; Fri, 5 Oct 2018 21:10:04 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DC33771C6D; Fri, 5 Oct 2018 21:10:03 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id D6F7B1AA11; Fri, 5 Oct 2018 21:10:03 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w95LA3j1009116; Fri, 5 Oct 2018 21:10:03 GMT (envelope-from jhb@FreeBSD.org) Received: (from jhb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w95LA3cR009114; Fri, 5 Oct 2018 21:10:03 GMT (envelope-from jhb@FreeBSD.org) Message-Id: <201810052110.w95LA3cR009114@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: jhb set sender to jhb@FreeBSD.org using -f From: John Baldwin Date: Fri, 5 Oct 2018 21:10:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339210 - in stable/11/stand/efi/loader: . arch/i386 X-SVN-Group: stable-11 X-SVN-Commit-Author: jhb X-SVN-Commit-Paths: in stable/11/stand/efi/loader: . arch/i386 X-SVN-Commit-Revision: 339210 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Oct 2018 21:10:04 -0000 Author: jhb Date: Fri Oct 5 21:10:03 2018 New Revision: 339210 URL: https://svnweb.freebsd.org/changeset/base/339210 Log: MFC 338022: Fix casts between 64-bit physical addresses and pointers in EFI. Compiling FreeBSD/i386 with modern GCC triggers warnings for various places that convert 64-bit EFI_ADDRs to pointers and vice versa. - Cast pointers to uintptr_t rather than to uint64_t when assigning to a 64-bit integer. - Cast 64-bit integers to uintptr_t before a cast to a pointer. Modified: stable/11/stand/efi/loader/arch/i386/efimd.c stable/11/stand/efi/loader/bootinfo.c stable/11/stand/efi/loader/copy.c Directory Properties: stable/11/ (props changed) Modified: stable/11/stand/efi/loader/arch/i386/efimd.c ============================================================================== --- stable/11/stand/efi/loader/arch/i386/efimd.c Fri Oct 5 20:49:54 2018 (r339209) +++ stable/11/stand/efi/loader/arch/i386/efimd.c Fri Oct 5 21:10:03 2018 (r339210) @@ -70,14 +70,14 @@ ldr_bootinfo(struct bootinfo *bi, uint64_t *bi_addr) UINTN mmsz, pages, sz; UINT32 mmver; - bi->bi_systab = (uint64_t)ST; - bi->bi_hcdp = (uint64_t)efi_get_table(&hcdp_guid); + bi->bi_systab = (uintptr_t)ST; + bi->bi_hcdp = (uintptr_t)efi_get_table(&hcdp_guid); sz = sizeof(EFI_HANDLE); status = BS->LocateHandle(ByProtocol, &fpswa_guid, 0, &sz, &handle); if (status == 0) status = BS->HandleProtocol(handle, &fpswa_guid, &fpswa); - bi->bi_fpswa = (status == 0) ? (uint64_t)fpswa : 0; + bi->bi_fpswa = (status == 0) ? (uintptr_t)fpswa : 0; bisz = (sizeof(struct bootinfo) + 0x0f) & ~0x0f; @@ -109,7 +109,7 @@ ldr_bootinfo(struct bootinfo *bi, uint64_t *bi_addr) * aligned). */ *bi_addr = addr; - mm = (void *)(addr + bisz); + mm = (void *)(uintptr_t)(addr + bisz); sz = (EFI_PAGE_SIZE * pages) - bisz; status = BS->GetMemoryMap(&sz, mm, &mapkey, &mmsz, &mmver); if (EFI_ERROR(status)) { @@ -117,12 +117,12 @@ ldr_bootinfo(struct bootinfo *bi, uint64_t *bi_addr) (long)status); return (EINVAL); } - bi->bi_memmap = (uint64_t)mm; + bi->bi_memmap = (uintptr_t)mm; bi->bi_memmap_size = sz; bi->bi_memdesc_size = mmsz; bi->bi_memdesc_version = mmver; - bcopy(bi, (void *)(*bi_addr), sizeof(*bi)); + bcopy(bi, (void *)(uintptr_t)(*bi_addr), sizeof(*bi)); return (0); } Modified: stable/11/stand/efi/loader/bootinfo.c ============================================================================== --- stable/11/stand/efi/loader/bootinfo.c Fri Oct 5 20:49:54 2018 (r339209) +++ stable/11/stand/efi/loader/bootinfo.c Fri Oct 5 21:10:03 2018 (r339210) @@ -338,7 +338,7 @@ bi_load_efi_data(struct preloaded_file *kfp) * memory map on a 16-byte boundary (the bootinfo block is page * aligned). */ - efihdr = (struct efi_map_header *)addr; + efihdr = (struct efi_map_header *)(uintptr_t)addr; mm = (void *)((uint8_t *)efihdr + efisz); sz = (EFI_PAGE_SIZE * pages) - efisz; Modified: stable/11/stand/efi/loader/copy.c ============================================================================== --- stable/11/stand/efi/loader/copy.c Fri Oct 5 20:49:54 2018 (r339209) +++ stable/11/stand/efi/loader/copy.c Fri Oct 5 21:10:03 2018 (r339210) @@ -278,9 +278,9 @@ efi_copy_finish(void) { uint64_t *src, *dst, *last; - src = (uint64_t *)staging; - dst = (uint64_t *)(staging - stage_offset); - last = (uint64_t *)staging_end; + src = (uint64_t *)(uintptr_t)staging; + dst = (uint64_t *)(uintptr_t)(staging - stage_offset); + last = (uint64_t *)(uintptr_t)staging_end; while (src < last) *dst++ = *src++;