From owner-freebsd-fs@freebsd.org Sun Jul 2 21:01:12 2017
Message-Id: <201707022101.v62L01sb057667@kenobi.freebsd.org>
From: bugzilla-noreply@FreeBSD.org
To: freebsd-fs@FreeBSD.org
Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention
Date: Sun, 02 Jul 2017 21:01:12 +0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
New         |    203492 | mount_unionfs -o below causes panic
New         |    217062 | for file systems mounted with -o noexec, exec=off
Open        |    136470 | [nfs] Cannot mount / in read-only, over NFS
Open        |    139651 | [nfs] mount(8): read-only remount of NFS volume d
Open        |    140068 | [smbfs] [patch] smbfs does not allow semicolon in
Open        |    144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f
Open        |    211491 | System hangs after "Uptime" on reboot with ZFS

7 problems total for which you should take action.

From owner-freebsd-fs@freebsd.org Mon Jul 3 07:42:20 2017
From: Ben RUBSON
To: freebsd-scsi@freebsd.org, Freebsd fs
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 09:42:15 +0200

> On 29 Jun 2017, at 20:40, Karli Sjöberg wrote:
>
>> On 29 June 2017 at 3:37 PM, Ben RUBSON wrote:
>>
>> I also tried to look for some LSI SAS2008 error counters (on target side),
>> but did not find anything interesting.
>> (sysctl -a | grep -i mps)
>
> Here's a well kept secret I got from LSI once:
>
> /boot/loader.conf:
> dev.mps.0.debug_level="0x1F"

Thank you Karli for this tip !
I just noticed this is explained in MPS(4), DEBUGGING section.

0x1F is then :
0x0001 Enable informational prints (set by default).
0x0002 Enable prints for driver faults (set by default).
0x0004 Enable prints for controller events.
0x0008 Enable prints for controller logging.
0x0010 Enable prints for tracing recovery operations.

Controller events are however extremely verbose.
Perhaps 0x1B (everything above except 0x0004) is then slightly better.

Here is a useful link to "decode" logged messages :
http://blog.disksurvey.org/blog/2014/08/10/decoding-lsi-loginfo-codes/

Ben

From owner-freebsd-fs@freebsd.org Mon Jul 3 09:07:11 2017
From: Ben RUBSON
To: Freebsd fs, freebsd-scsi@freebsd.org
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 11:07:06 +0200
References: <20170629144334.1e283570@fabiankeil.de>

> On 29 Jun 2017, at 15:36, Ben RUBSON wrote:
>
>> On 29 Jun 2017, at 14:43, Fabian Keil wrote:
>
> Thank you for your feedback Fabian.
>
>> Ben RUBSON wrote:
>>
>>> One of my servers did a kernel panic last night, giving the following message :
>>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
>> [...]
>>> Here are some numbers regarding this disk, taken from the server hosting the pool :
>>> (unfortunately not from the iscsi target server)
>>> https://s23.postimg.org/zd8jy9xaj/busydisk.png
>>>
>>> We clearly see that suddenly, disk became 100% busy, meanwhile CPU was almost idle.

We also clearly see that 5 minutes later (02:09) disk seems to be back but became 100% busy again,
and that 16 minutes later (default vfs.zfs.deadman_synctime_ms), panic occurred.

>>> No error message at all on both servers.
>> [...]
>>> The only log I have is the following stacktrace taken from the server console :
>>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
>>> cpuid = 0
>>> KDB: stack backtrace:
>>> #0 0xffffffff80b240f7 at kdb_backtrace+0x67
>>> #1 0xffffffff80ad9462 at vpanic+0x182
>>> #2 0xffffffff80ad92d3 at panic+0x43
>>> #3 0xffffffff82238fa7 at vdev_deadman+0x127
>>> #4 0xffffffff82238ec0 at vdev_deadman+0x40
>>> #5 0xffffffff82238ec0 at vdev_deadman+0x40
>>> #6 0xffffffff8222d0a6 at spa_deadman+0x86
>>> #7 0xffffffff80af32da at softclock_call_cc+0x18a
>>> #8 0xffffffff80af3854 at softclock+0x94
>>> #9 0xffffffff80a9348f at intr_event_execute_handlers+0x20f
>>> #10 0xffffffff80a936f6 at ithread_loop+0xc6
>>> #11 0xffffffff80a900d5 at fork_exit+0x85
>>> #12 0xffffffff80f846fe at fork_trampoline+0xe
>>> Uptime: 92d2h47m6s
>>>
>>> I would have been pleased to make a dump available.
>>> However, despite my (correct ?) configuration, server did not dump :
>>> (nevertheless, "sysctl debug.kdb.panic=1" makes it dump)
>>> # grep ^dump /boot/loader.conf /etc/rc.conf
>>> /boot/loader.conf:dumpdev="/dev/mirror/swap"
>>> /etc/rc.conf:dumpdev="AUTO"
>>
>> You may want to look at the NOTES section in gmirror(8).
>
> Yes, I should already be OK (prefer algorithm set).
>
>>> I use default kernel, with a rebuilt zfs module :
>>> # uname -v
>>> FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
>>>
>>> I use the following iSCSI configuration, which disconnects the disks "as soon as" they are unavailable :
>>> kern.iscsi.ping_timeout=5
>>> kern.iscsi.fail_on_disconnection=1
>>> kern.iscsi.iscsid_timeout=5
>>>
>>> I then think disk was at least correctly reachable during these 20 busy minutes.
>>>
>>> So, any idea why I could have faced this issue ?
>>
>> Is it possible that the system was under memory pressure?
>
> No I don't think it was :
> https://s1.postimg.org/uvsebpyyn/busydisk2.png
> More than 2GB of available memory.
> Swap not used (624kB).
> ARC behaviour seems correct (anon increases because ZFS can't actually write I think).
> Regarding the pool itself, it was receiving data at 6MB/s, sending around 30kB blocks to disks.
> When disk went busy, throughput fell to some kB, with 128kB blocks.
>
>> geli's use of malloc() is known to cause deadlocks under memory pressure:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759
>>
>> Given that gmirror uses malloc() as well it probably has the same issue.
>
> I don't use geli so I should not face this issue.
>
>>> I would have thought ZFS would have taken the busy device offline, instead of raising a panic.
>>> Perhaps it is already possible to make ZFS behave like this ?
>>
>> There's a tunable for this: vfs.zfs.deadman_enabled.
>> If the panic is just a symptom of the deadlock it's unlikely
>> to help though.
>
> I think this tunable should have prevented the server from having raised a panic :
> # sysctl -d vfs.zfs.deadman_enabled
> vfs.zfs.deadman_enabled: Kernel panic on stalled ZFS I/O
> # sysctl vfs.zfs.deadman_enabled
> vfs.zfs.deadman_enabled: 1
>
> But not sure how it would have behaved then...
> (busy disk miraculously back to normal status, memory pressure due to anon increasing...)

I then think it would be nice, once vfs.zfs.deadman_synctime_ms has expired,
to be able to take the busy device offline instead of raising a panic.
Currently, disabling deadman will avoid the panic but will let the device slow down the pool.

I still did not find the root cause of this issue, not sure I will,
quite difficult actually with a stacktrace and some performance graphs only :/

Thank you again,
Ben

From owner-freebsd-fs@freebsd.org Mon Jul 3 11:10:43 2017
Sender: etnapierala@gmail.com
From: Edward Napierala
To: Ben RUBSON
Cc: Freebsd fs, freebsd-scsi
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 12:10:41 +0100
References: <20170629144334.1e283570@fabiankeil.de>

2017-07-03 10:07 GMT+01:00 Ben RUBSON:

> [...]
>
> I then think it would be nice, once vfs.zfs.deadman_synctime_ms has expired,
> to be able to take the busy device offline instead of raising a panic.
> Currently, disabling deadman will avoid the panic but will let the device
> slow down the pool.
>
> I still did not find the root cause of this issue, not sure I will,
> quite difficult actually with a stacktrace and some performance graphs only :/

What exactly is the disk doing when that happens? What does "gstat" say?
If the iSCSI target is also FreeBSD, what does ctlstat say?

If everything else fails, you might want to do a packet trace using
"tcpdump -w" and take a look at it using Wireshark; it decodes iSCSI
quite nicely.

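The packet trace suggested above could be taken along these lines. This is
only a minimal sketch, not a tested recipe: the interface name, the target
address and the output path are hypothetical placeholders, and 3260 is
assumed because it is the default iSCSI port (adjust it if the target
listens elsewhere).

```shell
#!/bin/sh
# Sketch only: IFACE, TARGET and OUT are hypothetical placeholders,
# override them from the environment to match the real setup.
IFACE="${IFACE:-em0}"              # initiator-side network interface (assumption)
TARGET="${TARGET:-192.0.2.10}"     # iSCSI target address (documentation range)
OUT="${OUT:-/var/tmp/iscsi.pcap}"  # capture file to open later in Wireshark
ISCSI_PORT=3260                    # default iSCSI port

# Write full-size raw packets exchanged with the target to $OUT.
# Needs root; stop it with Ctrl-C once the busy period has been captured.
capture_iscsi() {
    tcpdump -i "$IFACE" -s 0 -w "$OUT" \
        host "$TARGET" and tcp port "$ISCSI_PORT"
}
```

After sourcing the file, run capture_iscsi as root and open the resulting
file in Wireshark. On a pool writing several MB/s the capture grows quickly,
so it is probably best started only once the disk shows as busy.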
From owner-freebsd-fs@freebsd.org Mon Jul 3 13:36:45 2017
From: Ben RUBSON
Message-Id: <1F414ECE-1856-4EA3-A141-88B64703D4D6@gmail.com>
To: Freebsd fs, freebsd-scsi
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 15:36:40 +0200
References: <20170629144334.1e283570@fabiankeil.de>

> On 03 Jul 2017, at 13:10, Edward Napierala wrote:
>
> 2017-07-03 10:07 GMT+01:00 Ben RUBSON:
>
> [...]
>
> What exactly is the disk doing when that happens? What does "gstat" say? If the iSCSI
> target is also FreeBSD, what does ctlstat say?

As shown on this graph made with gstat numbers from initiator :
https://s23.postimg.org/zd8jy9xaj/busydisk.png

The disk is continuously writing 3 MBps before the issue happens.
When it occurs, response time increases to around 30 seconds (100% busy),
and consequently disk throughput drops down to some kBps.
CPU stays at an almost fully idle level.

As shown here, no memory pressure :
https://s1.postimg.org/uvsebpyyn/busydisk2.png

At the end of graphs' lines, panic is raised.

iSCSI target is also FreeBSD, unfortunately ctlstat was not running when the issue occurred.
So numbers will be average since system startup (102 days ago).

I also do not have gstat numbers from this disk on target side
(to help finding if it's a hardware issue, an iSCSI issue or something else).
I will think about collecting these numbers if the issue ever occurs again.

> If everything else fails, you might want to do a packet trace using "tcpdump -w" and take
> a look at it using Wireshark; it decodes iSCSI quite nicely.

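For collecting the gstat and ctlstat numbers mentioned above ahead of a
possible recurrence, a small timestamped sampler may be enough. This is a
rough sketch under assumptions: the log path, the interval and the helper
names (sample, collect) are made up for illustration, and the gstat and
ctlstat flags are taken from their man pages as I remember them, so check
gstat(8) and ctlstat(8) on the release in use before relying on it.

```shell
#!/bin/sh
# Sketch only: LOG and INTERVAL are hypothetical defaults.
LOG="${LOG:-/var/tmp/diskstats.log}"
INTERVAL="${INTERVAL:-10}"   # seconds between samples

# Run one command and append its output to $LOG under a UTC timestamp header.
sample() {
    {
        printf '=== %s: %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
        "$@" 2>&1
    } >> "$LOG"
}

# Endless sampling loop; not started automatically.
collect() {
    while :; do
        sample gstat -b            # batch mode: one snapshot, then exit
        sample ctlstat -c 2 -w 1   # target side only; 2nd row is the last second
        sleep "$INTERVAL"
    done
}
```

Calling collect in the background on both the initiator and the target would
give per-interval numbers to compare, instead of averages since boot. The log
grows without bound, so it needs rotating (for example with newsyslog).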
From owner-freebsd-fs@freebsd.org Mon Jul 3 15:27:25 2017
Sender: etnapierala@gmail.com
From: Edward Napierala
To: Ben RUBSON
Cc: Freebsd fs, freebsd-scsi
Subject: Re: I/O to pool appears to be hung, panic !
Date: Mon, 3 Jul 2017 16:27:23 +0100
In-Reply-To: <1F414ECE-1856-4EA3-A141-88B64703D4D6@gmail.com>
References: <20170629144334.1e283570@fabiankeil.de> <1F414ECE-1856-4EA3-A141-88B64703D4D6@gmail.com>

2017-07-03 14:36 GMT+01:00 Ben RUBSON:

> > On 03 Jul 2017, at 13:10, Edward Napierala wrote:
> >
> > 2017-07-03 10:07 GMT+01:00 Ben RUBSON <ben.rubson@gmail.com>:
> >
> > > On 29 Jun 2017, at 15:36, Ben RUBSON <ben.rubson@gmail.com> wrote:
> > >
> > >> On 29 Jun 2017, at 14:43, Fabian Keil wrote:
> > >
> > > Thank you for your feedback Fabian.
> > >
> > >> Ben RUBSON wrote:
> > >>
> > >>> One of my servers did a kernel panic last night, giving the following message :
> > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
> > >> [...]
> > >>> Here are some numbers regarding this disk, taken from the server hosting the pool :
> > >>> (unfortunately not from the iscsi target server)
> > >>> https://s23.postimg.org/zd8jy9xaj/busydisk.png
> > >>>
> > >>> We clearly see that suddenly, disk became 100% busy, meanwhile CPU was almost idle.
> >
> > We also clearly see that 5 minutes later (02:09) disk seems to be back but became 100% busy again,
> > and that 16 minutes later (default vfs.zfs.deadman_synctime_ms), panic occurred.
> >
> > >>> No error message at all on both servers.
> > >> [...]
> > >>> The only log I have is the following stacktrace taken from the server console :
> > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
> > >>> cpuid = 0
> > >>> KDB: stack backtrace:
> > >>> #0 0xffffffff80b240f7 at kdb_backtrace+0x67
> > >>> #1 0xffffffff80ad9462 at vpanic+0x182
> > >>> #2 0xffffffff80ad92d3 at panic+0x43
> > >>> #3 0xffffffff82238fa7 at vdev_deadman+0x127
> > >>> #4 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #5 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #6 0xffffffff8222d0a6 at spa_deadman+0x86
> > >>> #7 0xffffffff80af32da at softclock_call_cc+0x18a
> > >>> #8 0xffffffff80af3854 at softclock+0x94
> > >>> #9 0xffffffff80a9348f at intr_event_execute_handlers+0x20f
> > >>> #10 0xffffffff80a936f6 at ithread_loop+0xc6
> > >>> #11 0xffffffff80a900d5 at fork_exit+0x85
> > >>> #12 0xffffffff80f846fe at fork_trampoline+0xe
> > >>> Uptime: 92d2h47m6s
> > >>>
> > >>> I would have been pleased to make a dump available.
> > >>> However, despite my (correct ?) configuration, server did not dump :
> > >>> (nevertheless, "sysctl debug.kdb.panic=1" makes it dump)
> > >>> # grep ^dump /boot/loader.conf /etc/rc.conf
> > >>> /boot/loader.conf:dumpdev="/dev/mirror/swap"
> > >>> /etc/rc.conf:dumpdev="AUTO"
> > >>
> > >> You may want to look at the NOTES section in gmirror(8).
> > > > > > Yes, I should already be OK (prefer algorithm set). > > > > > >>> I use default kernel, with a rebuilt zfs module : > > >>> # uname -v > > >>> FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017 > root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC > > >>> > > >>> I use the following iSCSI configuration, which disconnects the disks > "as soon as" they are unavailable : > > >>> kern.iscsi.ping_timeout=5 > > >>> kern.iscsi.fail_on_disconnection=1 > > >>> kern.iscsi.iscsid_timeout=5 > > >>> > > >>> I then think disk was at least correctly reachable during these 20 > busy minutes. > > >>> > > >>> So, any idea why I could have faced this issue ? > > >> > > >> Is it possible that the system was under memory pressure? > > > > > > No I don't think it was : > > > https://s1.postimg.org/uvsebpyyn/busydisk2.png < > https://s1.postimg.org/uvsebpyyn/busydisk2.png> > > > More than 2GB of available memory. > > > Swap not used (624kB). > > > ARC behaviour seems correct (anon increases because ZFS can't actually > write I think). > > > Regarding the pool itself, it was receiving data at 6MB/s, sending > around 30kB blocks to disks. > > > When disk went busy, throughput fell to some kB, with 128kB blocks. > > > > > >> geli's use of malloc() is known to cause deadlocks under memory > pressure: > > >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759 < > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759> > > >> > > >> Given that gmirror uses malloc() as well it probably has the same > issue. > > > > > > I don't use geli so I should not face this issue. > > > > > >>> I would have thought ZFS would have taken the busy device offline, > instead of raising a panic. > > >>> Perhaps it is already possible to make ZFS behave like this ? > > >> > > >> There's a tunable for this: vfs.zfs.deadman_enabled. > > >> If the panic is just a symptom of the deadlock it's unlikely > > >> to help though. 
> > > > > > I think this tunable should have prevented the server from having > raised a panic : > > > # sysctl -d vfs.zfs.deadman_enabled > > > vfs.zfs.deadman_enabled: Kernel panic on stalled ZFS I/O > > > # sysctl vfs.zfs.deadman_enabled > > > vfs.zfs.deadman_enabled: 1 > > > > > > But not sure how it would have behaved then... > > > (busy disk miraculously back to normal status, memory pressure due to > anon increasing...) > > > > I then think it would be nice, once vfs.zfs.deadman_synctime_ms has > expired, > > to be able to take the busy device offline instead of raising a panic. > > Currently, disabling deadman will avoid the panic but will let the > device slowing down the pool. > > > > I still did not found the root cause of this issue, not sure I will, > > quite difficult actually with a stacktrace and some performance graphs > only :/ > > > > What exactly is the disk doing when that happens? What does "gstat" > say? If the iSCSI > > target is also FreeBSD, what does ctlstat say? > > As shown on this graph made with gstat numbers from initiator : > https://s23.postimg.org/zd8jy9xaj/busydisk.png > The disk is continuously writing 3 MBps before the issue happens. > When it occurs, response time increases to around 30 seconds (100% busy), > and consequently disk throughput drops down to some kBps. > CPU stays at an almost fully idle level. > > As shown here, no memory pressure : > https://s1.postimg.org/uvsebpyyn/busydisk2.png uvsebpyyn/busydisk2.png> > > At the end of graphs' lines, panic is raised. > > iSCSI target is also FreeBSD, unfortunately ctlstat was not running during > the issue occurred. > So numbers will be average since system startup (102 days ago). > I also do not have gstat numbers from this disk on target side > (to help finding if it's a hardware issue, a iSCSI issue or something > else). > I will think about collecting these numbers if ever issue occurs again. 
> It's kind of hard to say something definitive at this point, but I suspect it's a problem at the target side. I got a report about something quite similar some two years ago, and it turned out to be a problem with a disk controller on the target. From owner-freebsd-fs@freebsd.org Mon Jul 3 15:40:50 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DAD5C9E92B0; Mon, 3 Jul 2017 15:40:50 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: from mail-wm0-x229.google.com (mail-wm0-x229.google.com [IPv6:2a00:1450:400c:c09::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5CB4D74D26; Mon, 3 Jul 2017 15:40:50 +0000 (UTC) (envelope-from ben.rubson@gmail.com) Received: by mail-wm0-x229.google.com with SMTP id f67so59056430wmh.1; Mon, 03 Jul 2017 08:40:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:message-id:mime-version:subject:date:references:to:in-reply-to; bh=ovyVCvyhloMQ/jD0OLGq195wBHMNIKxKL28wsOoe6w0=; b=APPCfGF4zioKvH5abBgg8/dOUCpooS6J4C/jbBbnU9wk0xTzZriBf985lfPrdLH5GB R1eBbxa61k5No5/jaAA3b8Tt5KZ0NbXssmitLSikDCgqJ8KmQAuE81YYs2bp/4lLfJxq 2Tb/M3FbHPBZm5+BpTmofd6Hw5qD6F4pDqWMGTRriTUQ2+1yPFxTjRzv9u9P9slYivjz MPIMhqkNwRiSTBqUKfxTnIcqPQ4VF7rlLoYFlVKO2RhTppiXvR/rRxP6QjYoYBjapq4C wXKw7+lAbWRZbPnzRa/E8Ahp55j5BTQE3QUCCj9JPeELMmoPCwaKG8KK1HrRV7WQ5RXs Sy3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :references:to:in-reply-to; bh=ovyVCvyhloMQ/jD0OLGq195wBHMNIKxKL28wsOoe6w0=; b=mchFoACaKc7oSenWNngsEVH7rMPzDg77ZLQ7w2v00IpOyQwcQs4iAEdszbmrs5MdHs aSk+kwFX4fzLmSxWJDvLxPb+LuTkUPwrMCFgH3WbNd2LRtifhZvalZTDyKHzoA8UhFPk 
gvzcAa0qv8SrqlIPFBlIZp05vwoC/2/dqFC8f5rd6l8s5zvIyatnzocQoOi1HWJ6SFEF 3y6pn1Vx3ZZ2kz88p8bi8nldzJ0VV7bayAiX5JIqBwD/ZOz4gTRhB/hAyQXJK5nJjZbf dyrPTO/RYiRxPkqkxvt5Eqor6YX1hSGvObjwMMeBhkMajc/D0e26A1VwVLyZ5q4w2Y5s V6Sw== X-Gm-Message-State: AIVw113by322m85Njz0ur3EFHHtlRyX/EHzJBB2SsXerP5m0Me68PaCt t+htYiU76gfj9oiimxs= X-Received: by 10.28.184.87 with SMTP id i84mr7233612wmf.22.1499096448410; Mon, 03 Jul 2017 08:40:48 -0700 (PDT) Received: from ben.home (LFbn-1-11339-180.w2-15.abo.wanadoo.fr. [2.15.165.180]) by smtp.gmail.com with ESMTPSA id n71sm18817841wrb.62.2017.07.03.08.40.47 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 03 Jul 2017 08:40:47 -0700 (PDT) From: Ben RUBSON Message-Id: Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: I/O to pool appears to be hung, panic ! Date: Mon, 3 Jul 2017 17:40:46 +0200 References: <20170629144334.1e283570@fabiankeil.de> <1F414ECE-1856-4EA3-A141-88B64703D4D6@gmail.com> To: Freebsd fs , freebsd-scsi In-Reply-To: X-Mailer: Apple Mail (2.3124) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Jul 2017 15:40:51 -0000 > On 03 Jul 2017, at 17:27, Edward Napierala wrote: >=20 > 2017-07-03 14:36 GMT+01:00 Ben RUBSON >: > > On 03 Jul 2017, at 13:10, Edward Napierala > wrote: > > > > 2017-07-03 10:07 GMT+01:00 Ben RUBSON >>: > > > > > On 29 Jun 2017, at 15:36, Ben RUBSON >> wrote: > > > > > >> On 29 Jun 2017, at 14:43, Fabian Keil = = >> wrote: > > > > > > Thank you for your feedback Fabian. > > > > > >> Ben RUBSON = >> wrote: > > >> > > >>> One of my servers did a kernel panic last night, giving the = following message : > > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... = at '/dev/label/G23iscsi'. > > >> [...] 
> > >>> Here are some numbers regarding this disk, taken from the server hosting the pool :
> > >>> (unfortunately not from the iscsi target server)
> > >>> https://s23.postimg.org/zd8jy9xaj/busydisk.png
> > >>>
> > >>> We clearly see that suddendly, disk became 100% busy, meanwhile CPU was almost idle.
> >
> > We also clearly see that 5 minutes later (02:09) disk seems to be back but became 100% busy again,
> > and that 16 minutes later (default vfs.zfs.deadman_synctime_ms), panic occurred.
> >
> > >>> No error message at all on both servers.
> > >> [...]
> > >>> The only log I have is the following stacktrace taken from the server console :
> > >>> panic: I/O to pool 'home' appears to be hung on vdev guid 122... at '/dev/label/G23iscsi'.
> > >>> cpuid = 0
> > >>> KDB: stack backtrace:
> > >>> #0 0xffffffff80b240f7 at kdb_backtrace+0x67
> > >>> #1 0xffffffff80ad9462 at vpanic+0x182
> > >>> #2 0xffffffff80ad92d3 at panic+0x43
> > >>> #3 0xffffffff82238fa7 at vdev_deadman+0x127
> > >>> #4 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #5 0xffffffff82238ec0 at vdev_deadman+0x40
> > >>> #6 0xffffffff8222d0a6 at spa_deadman+0x86
> > >>> #7 0xffffffff80af32da at softclock_call_cc+0x18a
> > >>> #8 0xffffffff80af3854 at softclock+0x94
> > >>> #9 0xffffffff80a9348f at intr_event_execute_handlers+0x20f
> > >>> #10 0xffffffff80a936f6 at ithread_loop+0xc6
> > >>> #11 0xffffffff80a900d5 at fork_exit+0x85
> > >>> #12 0xffffffff80f846fe at fork_trampoline+0xe
> > >>> Uptime: 92d2h47m6s
> > >>>
> > >>> I would have been pleased to make a dump available.
> > >>> However, despite my (correct ?) configuration, server did not dump :
> > >>> (nevertheless, "sysctl debug.kdb.panic=1" make it to dump)
> > >>> # grep ^dump /boot/loader.conf /etc/rc.conf
> > >>> /boot/loader.conf:dumpdev="/dev/mirror/swap"
> > >>> /etc/rc.conf:dumpdev="AUTO"
> > >>
> > >> You may want to look at the NOTES section in gmirror(8).
> > >
> > > Yes, I should already be OK (prefer algorithm set).
> > >
> > >>> I use default kernel, with a rebuilt zfs module :
> > >>> # uname -v
> > >>> FreeBSD 11.0-RELEASE-p8 #0: Wed Feb 22 06:12:04 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> > >>>
> > >>> I use the following iSCSI configuration, which disconnects the disks "as soon as" they are unavailable :
> > >>> kern.iscsi.ping_timeout=5
> > >>> kern.iscsi.fail_on_disconnection=1
> > >>> kern.iscsi.iscsid_timeout=5
> > >>>
> > >>> I then think disk was at least correctly reachable during these 20 busy minutes.
> > >>>
> > >>> So, any idea why I could have faced this issue ?
> > >>
> > >> Is it possible that the system was under memory pressure?
> > >
> > > No I don't think it was :
> > > https://s1.postimg.org/uvsebpyyn/busydisk2.png
> > > More than 2GB of available memory.
> > > Swap not used (624kB).
> > > ARC behaviour seems correct (anon increases because ZFS can't actually write I think).
> > > Regarding the pool itself, it was receiving data at 6MB/s, sending around 30kB blocks to disks.
> > > When disk went busy, throughput fell to some kB, with 128kB blocks.
> > >
> > >> geli's use of malloc() is known to cause deadlocks under memory pressure:
> > >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759
> > >>
> > >> Given that gmirror uses malloc() as well it probably has the same issue.
> > >
> > > I don't use geli so I should not face this issue.
> > >
> > >>> I would have thought ZFS would have taken the busy device offline, instead of raising a panic.
> > >>> Perhaps it is already possible to make ZFS behave like this ?
> > >>
> > >> There's a tunable for this: vfs.zfs.deadman_enabled.
> > >> If the panic is just a symptom of the deadlock it's unlikely
> > >> to help though.
> > >
> > > I think this tunable should have prevented the server from having raised a panic :
> > > # sysctl -d vfs.zfs.deadman_enabled
> > > vfs.zfs.deadman_enabled: Kernel panic on stalled ZFS I/O
> > > # sysctl vfs.zfs.deadman_enabled
> > > vfs.zfs.deadman_enabled: 1
> > >
> > > But not sure how it would have behaved then...
> > > (busy disk miraculously back to normal status, memory pressure due to anon increasing...)
> >
> > I then think it would be nice, once vfs.zfs.deadman_synctime_ms has expired,
> > to be able to take the busy device offline instead of raising a panic.
> > Currently, disabling deadman will avoid the panic but will let the device slowing down the pool.
> >
> > I still did not found the root cause of this issue, not sure I will,
> > quite difficult actually with a stacktrace and some performance graphs only :/
> >
> > What exactly is the disk doing when that happens? What does "gstat" say? If the iSCSI
> > target is also FreeBSD, what does ctlstat say?
>
> As shown on this graph made with gstat numbers from initiator :
> https://s23.postimg.org/zd8jy9xaj/busydisk.png
> The disk is continuously writing 3 MBps before the issue happens.
> When it occurs, response time increases to around 30 seconds (100% busy),
> and consequently disk throughput drops down to some kBps.
> CPU stays at an almost fully idle level.
>
> As shown here, no memory pressure :
> https://s1.postimg.org/uvsebpyyn/busydisk2.png
>
> At the end of graphs' lines, panic is raised.
>
> iSCSI target is also FreeBSD, unfortunately ctlstat was not running during the issue occurred.
> So numbers will be average since system startup (102 days ago).
> I also do not have gstat numbers from this disk on target side
> (to help finding if it's a hardware issue, a iSCSI issue or something else).
> I will think about collecting these numbers if ever issue occurs again.
>
> It's kind of hard to say something definitive at this point, but I suspect it's a problem
> at the target side. I got a report about something quite similar some two years ago,
> and it turned out to be a problem with a disk controller on the target.

Thank you for your feedback.

I then :
- enabled gstat collection on target, to also have numbers on target, not only on initiator ;
- enabled controller logging (dev.mps.0.debug_level=0x1B) ;
- disabled deadman.

We should be able to investigate further in case issue occurs again.

Of course feel free to notify me in case you have other ideas !

Thank you again,

Ben

From owner-freebsd-fs@freebsd.org Tue Jul 4 19:37:17 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0A81ED99484 for ; Tue, 4 Jul 2017 19:37:17 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id ED06D1744 for ; Tue, 4 Jul 2017 19:37:16 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v64JbG2X050217 for ; Tue, 4 Jul 2017 19:37:16 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220440] mount_smbfs fails to mount samba 4.6.5 share Date: Tue, 04 Jul 2017 19:37:17 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.3-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New
X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Jul 2017 19:37:17 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220440

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-fs@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Tue Jul 4 21:15:08 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 53833D9B133 for ; Tue, 4 Jul 2017 21:15:08 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 41DCA3E64 for ; Tue, 4 Jul 2017 21:15:08 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v64LF7Bg047027 for ; Tue, 4 Jul 2017 21:15:07 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220472] 11.1-RC1 kernel panic (zfs recv) Date: Tue, 04 Jul 2017 21:15:08 +0000 X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-STABLE X-Bugzilla-Keywords: regression X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to keywords cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Jul 2017 21:15:08 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220472

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-fs@FreeBSD.org
           Keywords|                            |regression
                 CC|                            |re@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Wed Jul 5 00:00:54 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0A5D2D9D9F9 for ; Wed, 5 Jul 2017 00:00:54 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C6B30675F2 for ; Wed, 5 Jul 2017 00:00:52 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix)
with ESMTP id 289C94486A; Wed, 5 Jul 2017 02:00:44 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Q81mEf0cy8VE; Wed, 5 Jul 2017 02:00:43 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 8EB2544869 for ; Wed, 5 Jul 2017 02:00:43 +0200 (CEST) To: FreeBSD Filesystems From: Willem Jan Withagen Subject: newfs returns cg 0: bad magic number Message-ID: Date: Wed, 5 Jul 2017 02:00:43 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 00:00:54 -0000 Hi, I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. It looks like I can: run dd on it gpart the disk create a zpool on it But when I try to create a UFS file system on it, newfs complains straight from the bat. # sudo newfs -E /dev/ggate0p1 /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment size 4096 using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. Erasing sectors [128...2093055] super-block backups (for fsck_ffs -b #) at: 192, 523520, 1046848, 1570176 cg 0: bad magic number Googling returns that this is on and off a problem with new devices, but there is no generic suggestion on how to debug this.... 
Any/all suggestions are welcome, --WjW From owner-freebsd-fs@freebsd.org Wed Jul 5 01:08:33 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0630FD9EAAF for ; Wed, 5 Jul 2017 01:08:33 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DE9D268C79 for ; Wed, 5 Jul 2017 01:08:32 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v6518Wk2091853 for ; Wed, 5 Jul 2017 01:08:32 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220472] 11.1-RC1 kernel panic (zfs recv) Date: Wed, 05 Jul 2017 01:08:33 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-STABLE X-Bugzilla-Keywords: regression X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: warlock@phouka.net X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 01:08:33 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220472

--- Comment #1 from John Kennedy ---
So I've done a bunch of I/O trying to narrow this down. Between 20170526 and
20170528 on a system where I thought some SSD drives were failing (recoverable
checksum errors), I zfs sent | zfs recv the root disk (pool zroot) to a
different disk, which I then moved to a different (current) box. Now I'm
trying to move it again, and running into problems.

It looks like I probably installed from installation media (11.0-RELEASE-p1 #0
r306420) and upgraded to -p10 after using beadm (possibly with some CMOS clock
offset looking at the timestamps, but that should all just be relative time).

I was running FreeBSD 11.0-RELEASE-p10 #0 r317487+8c96ad701987(releng/11.0)
during the time that the bad snapshot

I've done similar things before (and similar things after) with no problems.

I can send|recv data from that filesystem up to a point (@20170526) without
issues, but kernel panic after that. Right now, it seems to be isolated to the
"zroot" since I can receive incrementals below that (ROOT, var, usr, etc).
I'm trying to narrow it down some more.
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Wed Jul 5 05:15:05 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2182DDA1F86 for ; Wed, 5 Jul 2017 05:15:05 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A12DD725F2 for ; Wed, 5 Jul 2017 05:15:04 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v655Ewf7012928 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 5 Jul 2017 08:14:58 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v655Ewf7012928 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v655EwVQ012927; Wed, 5 Jul 2017 08:14:58 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 5 Jul 2017 08:14:58 +0300 From: Konstantin Belousov To: Willem Jan Withagen Cc: FreeBSD Filesystems Subject: Re: newfs returns cg 0: bad magic number Message-ID: <20170705051458.GU1935@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 05:15:05 -0000

On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote:
> Hi,
>
> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0.
> It looks like I can:
>     run dd on it
>     gpart the disk
>     create a zpool on it
>
> But when I try to create a UFS file system on it, newfs complains
> straight from the bat.
>
> # sudo newfs -E /dev/ggate0p1
> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment
> size 4096
> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes.
> Erasing sectors [128...2093055]
> super-block backups (for fsck_ffs -b #) at:
> 192, 523520, 1046848, 1570176
> cg 0: bad magic number
>
> Googling returns that this is on and off a problem with new devices, but
> there is no generic suggestion on how to debug this....
>
> Any/all suggestions are welcome,

Typically this error means that the drive returns wrong data, not the
bytes that were written to it and expected to be read.
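
One rough way to test the diagnosis above (the device does not return the
bytes that were written to it) is a write-then-read-back check. The following
is only a sketch under stated assumptions: verify_readback() is a hypothetical
helper, not part of newfs or libufs; it uses only POSIX pwrite(2) and pread(2);
and it overwrites the range it tests, so it should only be pointed at a scratch
device or file. On a raw disk device the offset and length would need to be
multiples of the sector size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from newfs or libufs): write a known pattern
 * at `offset`, read it back, and compare.
 * Returns 0 if the data matches, 1 on a mismatch, and -1 on an I/O error,
 * a short transfer, or an allocation failure.
 * WARNING: destroys `len` bytes of whatever `fd` refers to.
 */
int
verify_readback(int fd, off_t offset, size_t len)
{
	unsigned char *wbuf, *rbuf;
	size_t i;
	int ret = -1;

	assert(len > 0);
	wbuf = malloc(len);
	rbuf = malloc(len);
	if (wbuf == NULL || rbuf == NULL)
		goto out;

	/* Position-dependent pattern, so misplaced data does not match. */
	for (i = 0; i < len; i++) {
		uint64_t pos = (uint64_t)offset + i;
		wbuf[i] = (unsigned char)(pos * 31u + 7u);
	}

	if (pwrite(fd, wbuf, len, offset) != (ssize_t)len)
		goto out;

	/* Clear the read buffer so stale contents cannot fake a match. */
	memset(rbuf, 0, len);
	if (pread(fd, rbuf, len, offset) != (ssize_t)len)
		goto out;

	ret = (memcmp(wbuf, rbuf, len) == 0) ? 0 : 1;
out:
	free(wbuf);
	free(rbuf);
	return (ret);
}
```

A result of 1 would point at the device (or whatever sits behind it) returning
different data than was written; a result of -1 means the transfer itself
failed. If the target goes through a cache, a match does not prove that the
data reached stable storage.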
From owner-freebsd-fs@freebsd.org Wed Jul 5 07:12:35 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7036DDA40B6 for ; Wed, 5 Jul 2017 07:12:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 4675375BD1 for ; Wed, 5 Jul 2017 07:12:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id AB409D422E8; Wed, 5 Jul 2017 16:55:08 +1000 (AEST) Date: Wed, 5 Jul 2017 16:55:07 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov cc: Willem Jan Withagen , FreeBSD Filesystems Subject: Re: newfs returns cg 0: bad magic number In-Reply-To: <20170705051458.GU1935@kib.kiev.ua> Message-ID: <20170705154533.M1171@besplex.bde.org> References: <20170705051458.GU1935@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=VbSHBBh9 c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=VPPGVHZAt7igBUv1AsUA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 07:12:35 -0000 On Wed, 5 Jul 2017, Konstantin Belousov wrote: > On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >> Hi, >> >> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. 
>> It looks like I can: >> run dd on it >> gpart the disk >> create a zpool on it >> >> But when I try to create a UFS file system on it, newfs complains >> straight from the bat. >> >> # sudo newfs -E /dev/ggate0p1 >> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >> size 4096 >> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >> Erasing sectors [128...2093055] >> super-block backups (for fsck_ffs -b #) at: >> 192, 523520, 1046848, 1570176 >> cg 0: bad magic number >> >> Googling returns that this is on and off a problem with new devices, but >> there is no generic suggestion on how to debug this.... >> >> Any/all suggestions are welcome, > Typically this error means that the drive returns wrong data, not the > bytes that were written to it and expected to be read. This might be for writing to a nonexistent sector. Checking for write errors was broken by libufs, so some write errors are only sometimes detected as a side effect of reading back garbage. I use the following quick fix (the patch also fixes some style bugs). 
X Index: mkfs.c X =================================================================== X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v X retrieving revision 1.85 X diff -u -1 -r1.85 mkfs.c X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 X @@ -437,16 +441,19 @@ X if (!Nflag && Oflag != 1) { X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE); X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, X + SBLOCKSIZE); X if (i == -1) X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); X - X + err(1, "can't read old UFS1 superblock: %s", X + disk.d_error); X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { X fsdummy.fs_magic = 0; X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE); X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, X + SBLOCKSIZE); X for (i = 0; i < fsdummy.fs_ncg; i++) X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), X - chdummy, SBLOCKSIZE); X + bwrite(&disk, X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), X + chdummy, SBLOCKSIZE); X } X } X - if (!Nflag) X - sbwrite(&disk, 0); X + if (!Nflag && sbwrite(&disk, 0) != 0) X + err(1, "sbwrite: %s", disk.d_error); X if (Eflag == 1) { X @@ -518,4 +525,4 @@ X } X - if (!Nflag) X - sbwrite(&disk, 0); X + if (!Nflag && sbwrite(&disk, 0) != 0) X + err(1, "sbwrite: %s", disk.d_error); X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) libufs broke the error handling for the most important writes -- to the superblock. Error handling is still done almost correctly in wtfs(), and most writes are still done using wtfs() which is now just a wrapper which adds error handling to libufs's bwrite(3), but writes to superblock are (were) now done internally by libufs's sbwrite(3) which (like most of libufs) is too hard to use. Note that -current needs a slightly different fix. Part of libufs being too hard to use is that it is a library so it can't just exit for errors. 
It returns errors in the string disk.d_error and the fix uses that for newfs, unlike for most other calls to sbwrite(3). However, newfs no longer uses sbwrite(3). It uses a wrapper do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set d_error, so it is incompatible with sbwrite(3). This is an example that libufs is even harder to use than might first appear. The version with the do_sbwrite() wrapper fixes a previous version which replaced bwrite(3) instead of wrapping it. bwrite() in the application conflicted with bwrite(3) in libufs, since libufs is not designed to have its internals replaced by inconsistent parts like that. Apparently, a special case is only needed for superblock writes, and do_sbwrite() does that, and since libufs doesn't call any sbwrite() function internally there is no need to replace sbwrite(3); sbwrite(3) is just useless for its main application. All that the bwrite(3) and sbwrite(3) library functions do is handle the block size implicitly in a way that makes them harder to use than just multiplying by the block size like wtfs() used to do and do_sbwrite() now does. 
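To illustrate the two points above (a library reports errors through a string instead of exiting, and the byte offset is just the block number multiplied by the block size), here is a minimal sketch. The names sketch_disk and sketch_bwrite() are hypothetical and are not the libufs API; the struct is not libufs's struct uufsd.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a disk handle; NOT libufs's struct uufsd. */
struct sketch_disk {
	int		 fd;	/* open descriptor for the device or image */
	long		 bsize;	/* device block (sector) size in bytes */
	const char	*error;	/* last error message, or NULL */
};

/*
 * Write `size` bytes starting at block number `blockno`. The byte
 * offset is computed explicitly as blockno * bsize. Returns 0 on
 * success; on failure returns -1 and leaves a message in d->error,
 * so that the caller decides whether to exit.
 */
int
sketch_bwrite(struct sketch_disk *d, off_t blockno, const void *buf,
    size_t size)
{
	ssize_t n;

	d->error = NULL;
	n = pwrite(d->fd, buf, size, blockno * (off_t)d->bsize);
	if (n == -1) {
		d->error = strerror(errno);
		return (-1);
	}
	if ((size_t)n != size) {
		d->error = "short write";
		return (-1);
	}
	return (0);
}
```

A command-line caller can then treat any non-zero return as fatal and print d->error, which is the role wtfs() plays in newfs.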
Bruce From owner-freebsd-fs@freebsd.org Wed Jul 5 08:01:05 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 07B40DA4E2A for ; Wed, 5 Jul 2017 08:01:05 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EA14E76C3F for ; Wed, 5 Jul 2017 08:01:04 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65814b7099637 for ; Wed, 5 Jul 2017 08:01:04 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220440] mount_smbfs fails to mount samba 4.6.5 share Date: Wed, 05 Jul 2017 08:01:05 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.3-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: baillie.joshua@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 08:01:05 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220440 --- Comment #1 from Joshua Baillie --- if you enable NTLMv1 Auth on the debian samba server: "ntlm auth = yes" Then the share mounts with mount_smbfs as expected -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 09:13:55 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6A33ADA6005 for ; Wed, 5 Jul 2017 09:13:55 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0F28578B08 for ; Wed, 5 Jul 2017 09:13:54 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 7F9EE44567; Wed, 5 Jul 2017 11:13:51 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lOM7WKFzloqu; Wed, 5 Jul 2017 11:13:50 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 0799744566; Wed, 5 Jul 2017 11:13:50 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <42b0b86f-5c6b-a0f0-dea4-29f5118c0070@digiware.nl> Date: Wed, 5 Jul 2017 11:13:48 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0)
Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 09:13:55 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. 
I have not tested yet, but this sounds like it could be on the right track, since just dd-ing a file with random data to the image and then comparing checksums passes (shortened): # write test + dd 'if=/dev/urandom' 'of=/tmp/tmp.LIM3lwne/data' 'bs=1M' 'count=64' + dd 'if=/tmp/tmp.LIM3lwne/data' 'of=/dev/ggate3' 'bs=1M' [ "`dd if=${DATA} bs=1M | md5`" = "`rbd -p ${POOL} --no-progress export + [ d19e721c75d36bf2787c273768e3b5a2 '=' d19e721c75d36bf2787c273768e3b5a2 ] So unless the results are all in a cache between userspace and the rbd-ggate, it seems that the data is correctly hitting the platters. I'll go and test if your fix is also my fix. Thanx, --WjW > I use the following quick fix (the patch also fixes some style bugs). > > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ >
X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. Part of libufs > being too hard to use is that it is a library so it can't just exit > for errors. It returns errors in the string disk.d_error and the > fix uses that for newfs, unlike for most other calls to sbwrite(3). > However, newfs no longer uses sbwrite(3). It uses a wrapper > do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set > d_error, so it is incompatible with sbwrite(3). > > This is an example that libufs is even harder to use than might first > appear. The version with the do_sbwrite() wrapper fixes a previous > version which replaced bwrite(3) instead of wrapping it. bwrite() > in the application conflicted with bwrite(3) in libufs, since libufs > is not designed to have its internals replaced by inconsistent parts > like that. Apparently, a special case is only needed for superblock > writes, and do_sbwrite() does that, and since libufs doesn't call any > sbwrite() function internally there is no need to replace sbwrite(3); > sbwrite(3) is just useless for its main application. All that the > bwrite(3) and sbwrite(3) library functions do is handle the block > size implicitly in a way that makes them harder to use than just > multiplying by the block size like wtfs() used to do and do_sbwrite() > now does. 
From owner-freebsd-fs@freebsd.org Wed Jul 5 13:51:03 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C2419DAB9DE for ; Wed, 5 Jul 2017 13:51:03 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AFE1181561 for ; Wed, 5 Jul 2017 13:51:03 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65Dp3j1095496 for ; Wed, 5 Jul 2017 13:51:03 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 162944] [coda] Coda file system module looks broken in 9.0 Date: Wed, 05 Jul 2017 13:51:03 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.0-BETA2 X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: trasz@FreeBSD.org X-Bugzilla-Status: In Progress X-Bugzilla-Resolution: X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 13:51:03 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=162944 Edward Tomasz Napierala changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |trasz@FreeBSD.org --- Comment #3 from Edward Tomasz Napierala --- Could you test with the "coda" branch at https://github.com/trasz/coda and see if the problem persists? -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 13:51:18 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A5DC0DABA02 for ; Wed, 5 Jul 2017 13:51:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9441E815F2 for ; Wed, 5 Jul 2017 13:51:18 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65DpIVK096202 for ; Wed, 5 Jul 2017 13:51:18 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 162944] [coda] Coda file system module looks broken in 9.0 Date: Wed, 05 Jul 2017 13:51:18 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.0-BETA2 X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: trasz@FreeBSD.org X-Bugzilla-Status: Open X-Bugzilla-Resolution: X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: trasz@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_status assigned_to Message-ID: In-Reply-To: References:
Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 13:51:18 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=162944 Edward Tomasz Napierala changed: What |Removed |Added ---------------------------------------------------------------------------- Status|In Progress |Open Assignee|freebsd-fs@FreeBSD.org |trasz@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 14:03:12 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 91D2ADABE94 for ; Wed, 5 Jul 2017 14:03:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 803B281D62 for ; Wed, 5 Jul 2017 14:03:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65E3CwV049361 for ; Wed, 5 Jul 2017 14:03:12 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220203] [zfs] [panic] in dmu_objset_do_userquota_updates on mount Date: Wed, 05 Jul 2017 14:03:12 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version:
11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: neovortex@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 14:03:12 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220203 --- Comment #6 from neovortex@gmail.com --- So since moving the SSDs from the SAS2008 controller to the mainboard SATA controller this issue hasn't reoccurred again. Considering the frequency it was reoccurring previously, I'd say that's done the trick. I guess this issue really has two parts, one is what's causing corruption with SSDs on the SAS2008 controller (my guess being TRIM related), bug in FreeBSD, bug in SAS2008 firmware, or bug in SSD firmware that only gets triggered when on a SAS2008 controller, but not on other controllers. For completeness, the SSDs are the following: === START OF INFORMATION SECTION === Model Family: Marvell based SanDisk SSDs Device Model: SanDisk SDSSDHII480G Serial Number: xxx LU WWN Device Id: xxx Firmware Version: X31200RL User Capacity: 480,103,981,056 bytes [480 GB] Sector Size: 512 bytes logical/physical Rotation Rate: Solid State Device Form Factor: 2.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ACS-2 T13/2015-D revision 3 SATA Version is: SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s) Local Time is: Wed Jul 5 23:56:36 2017 EST SMART support is: Available - device has SMART capability.
SMART support is: Enabled === START OF READ SMART DATA SECTION === The second issue I guess is when a filesystem has corrupt metadata if it could be handled more gracefully, eg zfs mount returns an error rather than causing a panic. I'm not sure how practical this is, but it was an unusual experience having on-disk corruption causing a panic compared to the behaviour of other filesystems. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 15:48:28 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6EF71DADE2C for ; Wed, 5 Jul 2017 15:48:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5D3FEB60 for ; Wed, 5 Jul 2017 15:48:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65FmSXU039804 for ; Wed, 5 Jul 2017 15:48:28 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220203] [zfs] [panic] in dmu_objset_do_userquota_updates on mount Date: Wed, 05 Jul 2017 15:48:28 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ben.rubson@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc
Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 15:48:28 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220203 Ben RUBSON changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |ben.rubson@gmail.com --- Comment #7 from Ben RUBSON --- 2 hints perhaps : Try upgrading your SAS2008 firmware to the same version as the driver ? (20.00.07.00 IT) Yours is pretty old compared to the driver. The following sysctl should also give you more info from the controller if ever the bug occurs again : dev.mps.0.debug_level=0x1B -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 15:56:01 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CCD07D872F8 for ; Wed, 5 Jul 2017 15:56:01 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BAB9B1237 for ; Wed, 5 Jul 2017 15:56:01 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65Fu1Mr060399 for ; Wed, 5 Jul 2017 15:56:01 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To:
freebsd-fs@FreeBSD.org Subject: [Bug 220203] [zfs] [panic] in dmu_objset_do_userquota_updates on mount Date: Wed, 05 Jul 2017 15:56:01 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ben.rubson@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 15:56:01 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220203 --- Comment #8 from Ben RUBSON --- A last one, I read that it's better to avoid mixing SATA and SAS disks, I however don't remember if it's in the same pool, or on the same controller... Your SSD are SATA, not sure however about your other disks plugged to this SAS2008.
-- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Jul 5 16:12:24 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4AD65D879C6 for ; Wed, 5 Jul 2017 16:12:24 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D401C1EE0 for ; Wed, 5 Jul 2017 16:12:23 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 301EF4478D; Wed, 5 Jul 2017 18:12:20 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ksUrxuQnivhp; Wed, 5 Jul 2017 18:12:18 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id C450544789; Wed, 5 Jul 2017 18:12:18 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <9aa4dd5c-7ec9-6fdf-420b-e31bbd5217c3@digiware.nl> Date: Wed, 5 Jul 2017 18:12:16 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: nl Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 16:12:24 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. > > I use the following quick fix (the patch also fixes some style bugs). 
> > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. 
Part of libufs
> being too hard to use is that it is a library so it can't just exit
> for errors. It returns errors in the string disk.d_error and the
> fix uses that for newfs, unlike for most other calls to sbwrite(3).
> However, newfs no longer uses sbwrite(3). It uses a wrapper
> do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set
> d_error, so it is incompatible with sbwrite(3).
>
> This is an example that libufs is even harder to use than might first
> appear. The version with the do_sbwrite() wrapper fixes a previous
> version which replaced bwrite(3) instead of wrapping it. bwrite()
> in the application conflicted with bwrite(3) in libufs, since libufs
> is not designed to have its internals replaced by inconsistent parts
> like that. Apparently, a special case is only needed for superblock
> writes, and do_sbwrite() does that, and since libufs doesn't call any
> sbwrite() function internally there is no need to replace sbwrite(3);
> sbwrite(3) is just useless for its main application. All that the
> bwrite(3) and sbwrite(3) library functions do is handle the block
> size implicitly in a way that makes them harder to use than just
> multiplying by the block size like wtfs() used to do and do_sbwrite()
> now does.

The 11.1RC code reads:

    if (!Nflag)
        do_sbwrite(&disk);

So I'm not certain from the above whether I should now replace that with:

    if (!Nflag && do_sbwrite(&disk) != 0)
        err(1, "sbwrite: %s", disk.d_error);

or put sbwrite() back in again. I guess the latter, because of the error
return value in sbwrite().
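A minimal sketch of the first option, under stated assumptions. The quoted
note says do_sbwrite() reduces to pwrite(2) and does not set d_error, so a
check copied from the sbwrite(3) patch would print a string the wrapper never
filled in; and if the wrapper passes pwrite(2)'s return value through (an
assumption, not verified against the 11.1 source), success is a byte count
rather than 0. The helper below, checked_pwrite(), is hypothetical and is not
part of newfs or libufs; it only illustrates treating -1, short writes, and a
NULL buffer as errors and reporting errno instead of d_error.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in newfs or libufs): write len bytes from buf
 * at byte offset off. A NULL buffer, a failed write, and a short write
 * are all reported as errors. Returns 0 on success, -1 on failure with a
 * human-readable message stored in errbuf.
 */
static int
checked_pwrite(int fd, const void *buf, size_t len, off_t off,
    char *errbuf, size_t errlen)
{
	ssize_t n;

	if (buf == NULL) {
		snprintf(errbuf, errlen, "refusing to write from a NULL buffer");
		return (-1);
	}
	n = pwrite(fd, buf, len, off);
	if (n == -1) {
		/* pwrite(2) failed outright; errno says why. */
		snprintf(errbuf, errlen, "pwrite: %s", strerror(errno));
		return (-1);
	}
	if ((size_t)n != len) {
		/* Fewer bytes than requested reached the device. */
		snprintf(errbuf, errlen, "short write: %zd of %zu bytes", n, len);
		return (-1);
	}
	return (0);
}
```

A caller in the style of the patch could then do something like
`if (checked_pwrite(fd, buf, len, off, msg, sizeof(msg)) != 0) errx(1, "%s", msg);`
so that a failed superblock write stops newfs immediately instead of showing
up later as a bad magic number.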
--WjW From owner-freebsd-fs@freebsd.org Wed Jul 5 16:55:07 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 53EC0D8A890 for ; Wed, 5 Jul 2017 16:55:07 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DF2173782 for ; Wed, 5 Jul 2017 16:55:06 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 8DB0B44EF6; Wed, 5 Jul 2017 18:55:04 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FNhOn1XIbEYH; Wed, 5 Jul 2017 18:55:02 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id BE20144EF5; Wed, 5 Jul 2017 18:55:02 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <718a0437-06ce-8159-b353-6245f176fcd5@digiware.nl> Date: Wed, 5 Jul 2017 18:55:00 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 16:55:07 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. > > I use the following quick fix (the patch also fixes some style bugs). 
> > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. 
Part of libufs
> being too hard to use is that it is a library so it can't just exit
> for errors. It returns errors in the string disk.d_error and the
> fix uses that for newfs, unlike for most other calls to sbwrite(3).
> However, newfs no longer uses sbwrite(3). It uses a wrapper
> do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set
> d_error, so it is incompatible with sbwrite(3).
>
> This is an example that libufs is even harder to use than might first
> appear. The version with the do_sbwrite() wrapper fixes a previous
> version which replaced bwrite(3) instead of wrapping it. bwrite()
> in the application conflicted with bwrite(3) in libufs, since libufs
> is not designed to have its internals replaced by inconsistent parts
> like that. Apparently, a special case is only needed for superblock
> writes, and do_sbwrite() does that, and since libufs doesn't call any
> sbwrite() function internally there is no need to replace sbwrite(3);
> sbwrite(3) is just useless for its main application. All that the
> bwrite(3) and sbwrite(3) library functions do is handle the block
> size implicitly in a way that makes them harder to use than just
> multiplying by the block size like wtfs() used to do and do_sbwrite()
> now does.

Looking at the geom debug info, I see block 0 being read, but nowhere is
block 0 written, so the chance of there being a CG_MAGIC in there is rather
slim. (Running newfs /dev/ggate0)

--WjW

This is what geom.debug.gate gives:

Request received.
ggate0[WRITE(offset=65536, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[WRITE(offset=65536, length=8192)] Request received. ggate0[GETATTR(attr=GEOM::hba_vendor)] Ignoring request. ggate0[GETATTR(attr=GEOM::hba_vendor)] Request received. ggate0[READ(offset=134185472, length=2048)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134185472, length=2048)] Request received. ggate0[GETATTR(attr=GEOM::hba_vendor)] Ignoring request. ggate0[GETATTR(attr=GEOM::hba_vendor)] Request received. ggate0[READ(offset=134216704, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134216704, length=512)] Request received. ggate0[READ(offset=134217216, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134217216, length=512)] Request received. ggate0[GETATTR(attr=GEOM::hba_vendor)] Ignoring request. ggate0[GETATTR(attr=GEOM::hba_vendor)] Request received. ggate0[READ(offset=134217216, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134217216, length=512)] Request received. ggate0[GETATTR(attr=GEOM::hba_vendor)] Ignoring request. ggate0[GETATTR(attr=GEOM::hba_vendor)] Request received. ggate0[READ(offset=134217216, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134217216, length=512)] Request received. 
ggate0[GETATTR(attr=GEOM::ident)] Ignoring request. ggate0[GETATTR(attr=GEOM::ident)] Request received. ggate0[GETATTR(attr=PART::isleaf)] Ignoring request. ggate0[GETATTR(attr=PART::isleaf)] Request received. ggate0[GETATTR(attr=PART::depth)] Ignoring request. ggate0[GETATTR(attr=PART::depth)] Request received. ggate0[READ(offset=512, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=512, length=512)] Request received. ggate0[READ(offset=0, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=0, length=512)] Request received. ggate0[READ(offset=134217216, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134217216, length=512)] Request received. ggate0[GETATTR(attr=PART::scheme)] Ignoring request. ggate0[GETATTR(attr=PART::scheme)] Request received. ggate0[READ(offset=0, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=0, length=512)] Request received. ggate0[READ(offset=134217216, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=134217216, length=512)] Request received. ggate0[READ(offset=65536, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=65536, length=8192)] Request received. 
ggate0[READ(offset=8192, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=8192, length=8192)] Request received. ggate0[READ(offset=0, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=0, length=8192)] Request received. ggate0[READ(offset=262144, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=262144, length=8192)] Request received. ggate0[READ(offset=65536, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=65536, length=8192)] Request received. ggate0[READ(offset=8192, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=8192, length=8192)] Request received. ggate0[READ(offset=0, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=0, length=8192)] Request received. ggate0[READ(offset=262144, length=8192)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=262144, length=8192)] Request received. ggate0[READ(offset=32768, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=32768, length=512)] Request received. 
ggate0[READ(offset=0, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=0, length=512)] Request received. ggate0[READ(offset=1024, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=1024, length=512)] Request received. ggate0[READ(offset=8192, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=8192, length=512)] Request received. ggate0[READ(offset=65536, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. ggate0[READ(offset=65536, length=512)] Request received. ggate0[READ(offset=0, length=512)] ioctl(ggctl, c0386d04, 0xfffffe023c4cc9b0, 3, 0xfffff80037150560) ioctl(ggctl, c0386d05, 0xfffffe023c6229b0, 3, 0xfffff80037150000) Request done. 
ggate0[READ(offset=0, length=512)] From owner-freebsd-fs@freebsd.org Wed Jul 5 17:47:15 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 959E1D8C019 for ; Wed, 5 Jul 2017 17:47:15 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 83A3765770 for ; Wed, 5 Jul 2017 17:47:15 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65HlFNH058939 for ; Wed, 5 Jul 2017 17:47:15 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 215519] [fusefs] strange issue when glusterfs is fuse mounted, files not handled as expected. 
Date: Wed, 05 Jul 2017 17:47:15 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ben.rubson@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 17:47:15 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215519

--- Comment #3 from Ben RUBSON ---
r320451 will be in 11.1!
https://svnweb.freebsd.org/changeset/base/320689

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Wed Jul 5 17:49:35 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DC8F5D8C157 for ; Wed, 5 Jul 2017 17:49:35 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CA481658E2 for ; Wed, 5 Jul 2017 17:49:35 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v65HnZRA062106 for ; Wed, 5 Jul 2017 17:49:35 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 215519] [fusefs] strange issue when glusterfs is fuse mounted, files not handled as expected.
Date: Wed, 05 Jul 2017 17:49:36 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: markj@FreeBSD.org X-Bugzilla-Status: Closed X-Bugzilla-Resolution: FIXED X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: cem@freebsd.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to bug_status cc resolution Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 17:49:36 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215519

Mark Johnston changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-fs@FreeBSD.org      |cem@freebsd.org
             Status|New                         |Closed
                 CC|                            |markj@FreeBSD.org
         Resolution|---                         |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Thu Jul 6 09:56:19 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2BDB7DA9404 for ; Thu, 6 Jul 2017 09:56:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 18AF064A25 for ; Thu, 6 Jul 2017 09:56:19 +0000 (UTC) (envelope-from
bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v669uIl6094872 for ; Thu, 6 Jul 2017 09:56:18 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220203] [zfs] [panic] in dmu_objset_do_userquota_updates on mount Date: Thu, 06 Jul 2017 09:56:19 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: neovortex@gmail.com X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Jul 2017 09:56:19 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220203

--- Comment #9 from neovortex@gmail.com ---
(In reply to Ben RUBSON from comment #8)

They are separate pools. All disks are SATA. The SATA HDD pool is still on the
SAS2008 pool, and has never had a checksum error throughout everything.

I did notice that regarding the controller firmware, although finding newer
versions of the IT mode firmware is proving to be a tad difficult.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Thu Jul 6 22:12:58 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 97BDBD93518 for ; Thu, 6 Jul 2017 22:12:58 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 46AAE81C9B for ; Thu, 6 Jul 2017 22:12:57 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 91A3344CAE; Fri, 7 Jul 2017 00:12:53 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kcZzME5zwK8F; Fri, 7 Jul 2017 00:12:52 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 2F3BF44CAC; Fri, 7 Jul 2017 00:12:52 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> Date: Fri, 7 Jul 2017 00:12:49 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere:
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Jul 2017 22:12:58 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. > > I use the following quick fix (the patch also fixes some style bugs). 
> > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. 
Part of libufs
> being too hard to use is that it is a library so it can't just exit
> for errors. It returns errors in the string disk.d_error and the
> fix uses that for newfs, unlike for most other calls to sbwrite(3).
> However, newfs no longer uses sbwrite(3). It uses a wrapper
> do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set
> d_error, so it is incompatible with sbwrite(3).
>
> This is an example that libufs is even harder to use than might first
> appear. The version with the do_sbwrite() wrapper fixes a previous
> version which replaced bwrite(3) instead of wrapping it. bwrite()
> in the application conflicted with bwrite(3) in libufs, since libufs
> is not designed to have its internals replaced by inconsistent parts
> like that. Apparently, a special case is only needed for superblock
> writes, and do_sbwrite() does that, and since libufs doesn't call any
> sbwrite() function internally there is no need to replace sbwrite(3);
> sbwrite(3) is just useless for its main application. All that the
> bwrite(3) and sbwrite(3) library functions do is handle the block
> size implicitly in a way that makes them harder to use than just
> multiplying by the block size like wtfs() used to do and do_sbwrite()
> now does.

This is where the trouble originates:

/usr/srcs/11/src/lib/libufs/sblock.c:148

    /*
     * Write superblock summary information.
     */
    blks = howmany(fs->fs_cssize, fs->fs_fsize);
    space = (uint8_t *)disk->d_sbcsum;
    for (i = 0; i < blks; i += fs->fs_frag) {

But:

    (gdb) p disk->d_sbcsum
    $19 = (struct csum *) 0x0

and this pointer is later used as the source of the write:

    for (i = 0; i < blks; i += fs->fs_frag) {
        size = fs->fs_bsize;
        if (i + fs->fs_frag > blks)
            size = (blks - i) * fs->fs_fsize;
        if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size)
            == -1) {
            ERROR(disk, "Failed to write sb summary information");
            return (-1);
        }
        space += size;
    }

But bwrite() returns an error, because the pwrite() it calls tries to write
4096 bytes from a null pointer, and that it does not like.

Now the question is: why isn't d_sbcsum filled in?
Note that the disk is filled with random data.

I've been looking for quite some time, but I just don't get it.
Where should the superblock come from if a whole disk is being used?
(So there is no MBR or gpart table written: dangerously dedicated.)

--WjW

From owner-freebsd-fs@freebsd.org Fri Jul 7 06:24:02 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D304ED9B70C for ; Fri, 7 Jul 2017 06:24:02 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6F3EA6AA13 for ; Fri, 7 Jul 2017 06:24:02 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v676NtEQ065168 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 7 Jul 2017 09:23:55 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v676NtEQ065168 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v676NsA0065167; Fri, 7 Jul 2017 09:23:54 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 7 Jul 2017 09:23:54 +0300 From: Konstantin Belousov To: Willem Jan Withagen Cc: Bruce Evans , FreeBSD Filesystems Subject: Re: newfs returns cg 0: bad magic number Message-ID: <20170707062354.GP1935@kib.kiev.ua> References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition:
inline In-Reply-To: <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 06:24:02 -0000 On Fri, Jul 07, 2017 at 12:12:49AM +0200, Willem Jan Withagen wrote: > On 5-7-2017 08:55, Bruce Evans wrote: > > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > > > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: > >>> Hi, > >>> > >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. > >>> It looks like I can: > >>> run dd on it > >>> gpart the disk > >>> create a zpool on it > >>> > >>> But when I try to create a UFS file system on it, newfs complains > >>> straight from the bat. > >>> > >>> # sudo newfs -E /dev/ggate0p1 > >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment > >>> size 4096 > >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. > >>> Erasing sectors [128...2093055] > >>> super-block backups (for fsck_ffs -b #) at: > >>> 192, 523520, 1046848, 1570176 > >>> cg 0: bad magic number > >>> > >>> Googling returns that this is on and off a problem with new devices, but > >>> there is no generic suggestion on how to debug this.... > >>> > >>> Any/all suggestions are welcome, > >> Typically this error means that the drive returns wrong data, not the > >> bytes that were written to it and expected to be read. > > > > This might be for writing to a nonexistent sector. Checking for write > > errors was broken by libufs, so some write errors are only sometimes > > detected as a side effect of reading back garbage. 
> > > > I use the following quick fix (the patch also fixes some style bugs). > > > > X Index: mkfs.c > > X =================================================================== > > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > > X retrieving revision 1.85 > > X diff -u -1 -r1.85 mkfs.c > > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > > X @@ -437,16 +441,19 @@ > > X if (!Nflag && Oflag != 1) { > > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > > SBLOCKSIZE); > > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > > X + SBLOCKSIZE); > > X if (i == -1) > > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > > X - > > X + err(1, "can't read old UFS1 superblock: %s", > > X + disk.d_error); > > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > > X fsdummy.fs_magic = 0; > > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > > SBLOCKSIZE); > > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > > X + SBLOCKSIZE); > > X for (i = 0; i < fsdummy.fs_ncg; i++) > > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > > X - chdummy, SBLOCKSIZE); > > X + bwrite(&disk, > > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > > X + chdummy, SBLOCKSIZE); > > X } > > X } > > X - if (!Nflag) > > X - sbwrite(&disk, 0); > > X + if (!Nflag && sbwrite(&disk, 0) != 0) > > X + err(1, "sbwrite: %s", disk.d_error); > > X if (Eflag == 1) { > > X @@ -518,4 +525,4 @@ > > X } > > X - if (!Nflag) > > X - sbwrite(&disk, 0); > > X + if (!Nflag && sbwrite(&disk, 0) != 0) > > X + err(1, "sbwrite: %s", disk.d_error); > > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > > > libufs broke the error handling for the most important writes -- to > > the superblock. 
Error handling is still done almost correctly in > > wtfs(), and most writes are still done using wtfs() which is now > > just a wrapper which adds error handling to libufs's bwrite(3), but > > writes to superblock are (were) now done internally by libufs's > > sbwrite(3) which (like most of libufs) is too hard to use. > > > > Note that -current needs a slightly different fix. Part of libufs > > being too hard to use is that it is a library so it can't just exit > > for errors. It returns errors in the string disk.d_error and the > > fix uses that for newfs, unlike for most other calls to sbwrite(3). > > However, newfs no longer uses sbwrite(3). It uses a wrapper > > do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set > > d_error, so it is incompatible with sbwrite(3). > > > > This is an example that libufs is even harder to use than might first > > appear. The version with the do_sbwrite() wrapper fixes a previous > > version which replaced bwrite(3) instead of wrapping it. bwrite() > > in the application conflicted with bwrite(3) in libufs, since libufs > > is not designed to have its internals replaced by inconsistent parts > > like that. Apparently, a special case is only needed for superblock > > writes, and do_sbwrite() does that, and since libufs doesn't call any > > sbwrite() function internally there is no need to replace sbwrite(3); > > sbwrite(3) is just useless for its main application. All that the > > bwrite(3) and sbwrite(3) library functions do is handle the block > > size implicitly in a way that makes them harder to use than just > > multiplying by the block size like wtfs() used to do and do_sbwrite() > > now does. > > This is where the trouble originates: > /usr/srcs/11/src/lib/libufs/sblock.c:148 > /* > * Write superblock summary information. 
> */ > blks = howmany(fs->fs_cssize, fs->fs_fsize); > space = (uint8_t *)disk->d_sbcsum; > for (i = 0; i < blks; i += fs->fs_frag) { > > But: > > (gdb) p disk->d_sbcsum > $19 = (struct csum *) 0x0 > > and this pointer is later on used to write: > for (i = 0; i < blks; i += fs->fs_frag) { > size = fs->fs_bsize; > if (i + fs->fs_frag > blks) > size = (blks - i) * fs->fs_fsize; > if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size) > == -1) { > ERROR(disk, "Failed to write sb summary information"); > return (-1); > } > space += size; > } > > But the bwrite returns error because the called pwrite() tries to write > 4096 bytes from a null pointer. And that it does not like. > > Now the question is: why isn't d_sbcsum not filled out? > Note that the disk is filled with random data. > > I've been looking for quite some time, but I just don't get it. > Where should the superblock come from if a whole disk is being used? > (so there no MBR or gpart written. Dangerously dedicated) Indeed I am not sure what do you report there. newfs(8) does not use sbwrite() function from libufs. Set a breakpoint on the sbwrite() function and catch the backtrace of the call. 
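The quoted discussion above turns on superblock writes whose result is never checked, so that a failure only shows up later when garbage is read back. A minimal sketch of a checked positional write, with an optional read-back comparison, is given here. The names checked_pwrite() and readback_matches() and their parameters are hypothetical, chosen only for illustration; they are not part of newfs or libufs, and only pwrite(2) and pread(2) are real interfaces.

```c
#define _XOPEN_SOURCE 700	/* pread/pwrite/mkstemp under strict -std modes */

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from newfs or libufs): write `len` bytes from
 * `buf` at sector `sector` of a partition that starts `part_ofs` sectors
 * into the device.  Returns 0 on success, -1 on failure with errno set.
 * A short write is reported as EIO instead of being silently accepted.
 */
int
checked_pwrite(int fd, const void *buf, size_t len, off_t part_ofs,
    off_t sector, off_t sector_size)
{
	off_t byte_ofs;
	ssize_t n;

	assert(sector_size > 0);
	byte_ofs = (part_ofs + sector) * sector_size;
	n = pwrite(fd, buf, len, byte_ofs);
	if (n == -1)
		return (-1);		/* errno was set by pwrite(2) */
	if ((size_t)n != len) {
		errno = EIO;		/* partial write: treat as an error */
		return (-1);
	}
	return (0);
}

/*
 * Hypothetical helper: read the same range back and compare it with what
 * was written.  Returns 1 if identical, 0 if different, -1 on error.
 */
int
readback_matches(int fd, const void *buf, size_t len, off_t part_ofs,
    off_t sector, off_t sector_size)
{
	off_t byte_ofs = (part_ofs + sector) * sector_size;
	void *tmp;
	ssize_t n;
	int same;

	tmp = malloc(len);
	if (tmp == NULL)
		return (-1);
	n = pread(fd, tmp, len, byte_ofs);
	if (n == -1 || (size_t)n != len) {
		free(tmp);
		return (-1);
	}
	same = (memcmp(tmp, buf, len) == 0);
	free(tmp);
	return (same);
}
```

A caller that wants newfs-style behaviour could follow a -1 return with err(3); the sketch deliberately leaves that decision to the caller, since a library-like helper should not exit on its own.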
From owner-freebsd-fs@freebsd.org Fri Jul 7 13:29:12 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E0999DA370C for ; Fri, 7 Jul 2017 13:29:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CEE917A559 for ; Fri, 7 Jul 2017 13:29:12 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v67DTC6G015880 for ; Fri, 7 Jul 2017 13:29:12 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220472] 11.1-RC1 kernel panic (zfs recv) Date: Fri, 07 Jul 2017 13:29:12 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-STABLE X-Bugzilla-Keywords: regression X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: warlock@phouka.net X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 13:29:13 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220472

--- Comment #2 from John Kennedy ---
I've been extremely frustrated trying to get a kernel dump for this panic.

Currently:

FreeBSD jormungandr.phouka.net 11.1-RC1 FreeBSD 11.1-RC1 #79 r313908+8df37be70f94(releng/11.1): Thu Jul 6 18:20:07 PDT 2017 warlock@jormungandr.phouka.net:/usr/obj/usr/src/sys/GENERIC amd64

From rc.conf:

dumpdev="/dev/ada0p1"
dumpdir="/var/crash"
savecore_enable="YES"

# dumpon -l
ada0p1

# savecore -Cv
unable to open bounds file, using 0
checking for kernel dump on device /dev/ada0p1
mediasize = 35433480192 bytes
sectorsize = 512 bytes
magic mismatch on last dump header on /dev/ada0p1
No dump exists

I've got a pair of ZFS-mirrored Samsung SSD 960 EVO 250GB M.2 drives
(/dev/nvd[01]) as zroot. It didn't seem like dumpon liked the encrypted swap
partitions on there (even with the "late" option), so I threw in an extra
drive and created an unencrypted swap partition on it with enough space for
all the memory, just in case.

# gpart show /dev/nvd0
=>        40  488397088  nvd0  GPT  (233G)
          40       1024     1  freebsd-boot  (512K)
        1064        984        - free -  (492K)
        2048    4194304     2  freebsd-swap  (2.0G)
     4196352  484200448     3  freebsd-zfs  (231G)
   488396800        328        - free -  (164K)

# gpart show /dev/ada0
=>        40  937703008  ada0  GPT  (447G)
          40       4056        - free -  (2.0M)
        4096   69206016     1  freebsd-swap  (33G)
    69210112  868492936        - free -  (414G)

# grep ' memory ' /var/log/dmesg.today
real memory = 34359738368 (32768 MB)
avail memory = 33191440384 (31653 MB)

Nothing is saved to /var/crash. It exists, but only contains "minfree" (with
the contents of "2048"). Booting to single user after the panic and running
savecore by hand doesn't seem to find anything, either.
--=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Jul 7 14:24:06 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7FB34DA456E for ; Fri, 7 Jul 2017 14:24:06 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 0FC0E7C08D for ; Fri, 7 Jul 2017 14:24:05 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 343F4109330; Sat, 8 Jul 2017 00:23:56 +1000 (AEST) Date: Sat, 8 Jul 2017 00:23:55 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov cc: Willem Jan Withagen , FreeBSD Filesystems Subject: Re: newfs returns cg 0: bad magic number In-Reply-To: <20170707062354.GP1935@kib.kiev.ua> Message-ID: <20170707220108.F1124@besplex.bde.org> References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> <20170707062354.GP1935@kib.kiev.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=VbSHBBh9 c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=1kE14xw-pnMb5_ANrOEA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 14:24:06 -0000 On Fri, 7 Jul 2017, Konstantin Belousov wrote: > On Fri, Jul 07, 2017 at 12:12:49AM +0200, Willem Jan Withagen wrote: >> On 5-7-2017 08:55, Bruce Evans wrote: >>> On 
Wed, 5 Jul 2017, Konstantin Belousov wrote: >>> >>>> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>>>> Hi, >>>>> >>>>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>>>> It looks like I can: >>>>> run dd on it >>>>> gpart the disk >>>>> create a zpool on it >>>>> >>>>> But when I try to create a UFS file system on it, newfs complains >>>>> straight from the bat. >>>>> >>>>> # sudo newfs -E /dev/ggate0p1 >>>>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>>>> size 4096 >>>>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>>>> Erasing sectors [128...2093055] >>>>> super-block backups (for fsck_ffs -b #) at: >>>>> 192, 523520, 1046848, 1570176 >>>>> cg 0: bad magic number >>>>> >>>>> Googling returns that this is on and off a problem with new devices, but >>>>> there is no generic suggestion on how to debug this.... >>>>> >>>>> Any/all suggestions are welcome, >>>> Typically this error means that the drive returns wrong data, not the >>>> bytes that were written to it and expected to be read. >>> >>> This might be for writing to a nonexistent sector. Checking for write >>> errors was broken by libufs, so some write errors are only sometimes >>> detected as a side effect of reading back garbage. >>> >>> I use the following quick fix (the patch also fixes some style bugs). 
>>> >>> X Index: mkfs.c >>> X =================================================================== >>> X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v >>> X retrieving revision 1.85 >>> X diff -u -1 -r1.85 mkfs.c >>> X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 >>> X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 >>> X @@ -437,16 +441,19 @@ >>> X if (!Nflag && Oflag != 1) { >>> X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, >>> SBLOCKSIZE); >>> X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, >>> X + SBLOCKSIZE); >>> X if (i == -1) >>> X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); >>> X - >>> X + err(1, "can't read old UFS1 superblock: %s", >>> X + disk.d_error); >>> X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { >>> X fsdummy.fs_magic = 0; >>> X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, >>> SBLOCKSIZE); >>> X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, >>> X + SBLOCKSIZE); >>> X for (i = 0; i < fsdummy.fs_ncg; i++) >>> X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), >>> X - chdummy, SBLOCKSIZE); >>> X + bwrite(&disk, >>> X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), >>> X + chdummy, SBLOCKSIZE); >>> X } >>> X } >>> X - if (!Nflag) >>> X - sbwrite(&disk, 0); >>> X + if (!Nflag && sbwrite(&disk, 0) != 0) >>> X + err(1, "sbwrite: %s", disk.d_error); >>> X if (Eflag == 1) { >>> X @@ -518,4 +525,4 @@ >>> X } >>> X - if (!Nflag) >>> X - sbwrite(&disk, 0); >>> X + if (!Nflag && sbwrite(&disk, 0) != 0) >>> X + err(1, "sbwrite: %s", disk.d_error); >>> X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) >>> >>> libufs broke the error handling for the most important writes -- to >>> the superblock. Error handling is still done almost correctly in >>> wtfs(), and most writes are still done using wtfs() which is now >>> just a wrapper which adds error handling to libufs's bwrite(3), but >>> writes to superblock are (were) now done internally by libufs's >>> sbwrite(3) which (like most of libufs) is too hard to use. 
>>> >>> Note that -current needs a slightly different fix. Part of libufs >>> being too hard to use is that it is a library so it can't just exit >>> for errors. It returns errors in the string disk.d_error and the >>> fix uses that for newfs, unlike for most other calls to sbwrite(3). >>> However, newfs no longer uses sbwrite(3). It uses a wrapper >>> do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set >>> d_error, so it is incompatible with sbwrite(3). >>> >>> This is an example that libufs is even harder to use than might first >>> appear. The version with the do_sbwrite() wrapper fixes a previous >>> version which replaced bwrite(3) instead of wrapping it. bwrite() >>> in the application conflicted with bwrite(3) in libufs, since libufs >>> is not designed to have its internals replaced by inconsistent parts >>> like that. Apparently, a special case is only needed for superblock >>> writes, and do_sbwrite() does that, and since libufs doesn't call any >>> sbwrite() function internally there is no need to replace sbwrite(3); >>> sbwrite(3) is just useless for its main application. All that the >>> bwrite(3) and sbwrite(3) library functions do is handle the block >>> size implicitly in a way that makes them harder to use than just >>> multiplying by the block size like wtfs() used to do and do_sbwrite() >>> now does. >> >> This is where the trouble originates: >> /usr/srcs/11/src/lib/libufs/sblock.c:148 >> /* >> * Write superblock summary information. >> */ >> blks = howmany(fs->fs_cssize, fs->fs_fsize); >> space = (uint8_t *)disk->d_sbcsum; >> for (i = 0; i < blks; i += fs->fs_frag) { I already said that newfs doesn't use this. 
>>
>> But:
>>
>> (gdb) p disk->d_sbcsum
>> $19 = (struct csum *) 0x0
>>
>> and this pointer is later on used to write:
>>         for (i = 0; i < blks; i += fs->fs_frag) {
>>                 size = fs->fs_bsize;
>>                 if (i + fs->fs_frag > blks)
>>                         size = (blks - i) * fs->fs_fsize;
>>                 if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size)
>>                     == -1) {
>>                         ERROR(disk, "Failed to write sb summary information");
>>                         return (-1);
>>                 }
>>                 space += size;
>>         }
>>
>> But the bwrite returns error because the called pwrite() tries to write
>> 4096 bytes from a null pointer. And that it does not like.
>>
>> Now the question is: why isn't d_sbcsum not filled out?
>> Note that the disk is filled with random data.

d_sbcsum isn't filled out because newfs also doesn't call sbread() (or
ufs_disk_fillout()).

>> I've been looking for quite some time, but I just don't get it.
>> Where should the superblock come from if a whole disk is being used?
>> (so there no MBR or gpart written. Dangerously dedicated)
>
> Indeed I am not sure what do you report there. newfs(8) does not use
> sbwrite() function from libufs. Set a breakpoint on the sbwrite()
> function and catch the backtrace of the call.

The full unusability of libufs is now even clearer. sbread(3) is obviously
unusable by newfs since there is no fs initially. libufs's API is already
ugly to support this as well as possible. Closing is done by
ufs_disk_close(), but there is no corresponding ufs_disk_open(). Opening
is done by the badly named functions ufs_disk_fillout() and
ufs_disk_fillout_blank(). These open a file and fill out a struct uufsd.
The 'blank' one is for use in newfs when there is no superblock. It fills
out the struct uufsd except for leaving the fields related to the
superblock blank. The man pages don't give the details about what these
fields are, so you have to look at the implementation to see which libufs
functions can be safely called with blank initialization.
It might seem obvious that sbwrite() should not be called unless sbread()
(or a non-blank fillout) has been called to initialize libufs fields
related to the superblock, but the first libufs'ed versions of newfs made
the call. This was fragile: newfs is chummy with the implementation and
hacks on the libufs variable disk.d_bsize (it still does), so that this
variable is not "blank"; then sbwrite(3) uses this variable to "unblank"
another even more internal variable so that sbwrite(3) sort of worked.

Further history:
- r109926: first use of libufs in newfs
- r185588: to get variable block sizes, don't use libufs bwrite(3) in newfs
- r188520: replacing library functions as in r185588 is fragile in all
  cases and was broken for static linkage, so don't do it. Stop using
  sbwrite(3) instead
- r207141: add soft update/journaling stuff to libufs only. This adds
  complications to sbwrite(3). Since newfs doesn't use sbwrite(3), it is
  unaffected. I think it doesn't need any changes, but the replacement
  for sbwrite(3) is now even more incompatible.

sbwrite(3) is only used by:
- tunefs. This shouldn't write more than the basic superblock
- fsck_ffs: the soft update/journaling parts call sbwrite(3).
So it seems that sbwrite(3) had no useful uses before soft
update/journaling started using it, and all not incorrect uses of it start
with non-blank initialization.

For newfs, there are a lot of silly complications involving the block size:
- the initialization of disk.d_bsize to sectorsize (from the firmware
  or label) or even the -S option doesn't seem to be good for anything.
  Actually, the comment explains this. It says that "Our blocks = sector
  size". This is not what libufs wants. It is what was convenient for
  newfs before libufs. It sometimes works for libufs too, but is not
  documented to work (no method for initializing disk.d_bsize is
  documented).
- oops, the wrapper do_sbwrite() is not there to handle complications for
  the block size. It is to add the partition offset.
For most i/o's newfs tells libufs the full offset. Superblock i/o's are special because the offsets are passed implicitly and the implicit values don't contain the offset. Bruce From owner-freebsd-fs@freebsd.org Fri Jul 7 14:27:31 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7004EDA469A for ; Fri, 7 Jul 2017 14:27:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5DD267C15F for ; Fri, 7 Jul 2017 14:27:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v67ERVvj000467 for ; Fri, 7 Jul 2017 14:27:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220472] 11.1-RC1 kernel panic (zfs recv) Date: Fri, 07 Jul 2017 14:27:31 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-STABLE X-Bugzilla-Keywords: regression X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: warlock@phouka.net X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 14:27:31 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220472

--- Comment #3 from John Kennedy ---
The basic layout of the problem-pool is a stock ZFS layout with some additions
and boot environment:

# zfs list -r zspin/zroot
NAME                           USED  AVAIL  REFER  MOUNTPOINT
zspin/zroot                   96.6G   370G    96K  /zspin/zroot
zspin/zroot/ROOT              10.6G   370G    96K  none
zspin/zroot/ROOT/11.0-releng    48K   370G  3.22G  /zspin/zroot
zspin/zroot/ROOT/default      10.6G   370G  8.74G  /zspin/zroot
zspin/zroot/aux                200K   370G    96K  /zspin/zroot/aux
zspin/zroot/aux/aux            104K   370G    96K  /zspin/zroot/aux/aux
zspin/zroot/git                136K   370G    96K  /zspin/zroot/git
zspin/zroot/release            136K   370G    96K  /zspin/zroot/release
zspin/zroot/tmp               1004K   370G   400K  /tmp
zspin/zroot/usr               85.9G   370G    96K  /usr
zspin/zroot/usr/home          73.2G   370G  73.1G  /zspin/zroot/usr/home
zspin/zroot/usr/ports         9.98G   370G  6.73G  /usr/ports
zspin/zroot/usr/src           2.77G   370G  2.77G  /usr/src
zspin/zroot/var               2.20M   370G    96K  /var
zspin/zroot/var/audit          136K   370G    96K  /var/audit
zspin/zroot/var/crash          136K   370G    96K  /var/crash
zspin/zroot/var/log           1.00M   370G   336K  /var/log
zspin/zroot/var/mail           564K   370G   132K  /var/mail
zspin/zroot/var/tmp            248K   370G    96K  /var/tmp

There are some snapshots. Different filesystems have different stamps on them,
but my current theory is that "aux" is the problem child:

# zfs list -rtall zspin/zroot | fgrep @ | sed -E 's/^[^@]+(@[^ ]+).*$/\1/' | sort | uniq
@2017-05-18-03:55:20
@20170526
@20170527
@20170528
@20170528-2
@backup
@backup2

# zfs list -rt all zspin/zroot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
zspin/zroot                     96.6G   370G    96K  /zspin/zroot
zspin/zroot@backup                 8K      -    96K  -
zspin/zroot@backup2                8K      -    96K  -
zspin/zroot@20170526               8K      -    96K  -
zspin/zroot@20170527               8K      -    96K  -
zspin/zroot@20170528               8K      -    96K  -
zspin/zroot@20170528-2             8K      -    96K  -
...
zspin/zroot/aux                  200K   370G    96K  /zspin/zroot/aux
zspin/zroot/aux@20170528            0      -    96K  -
zspin/zroot/aux/aux              104K   370G    96K  /zspin/zroot/aux/aux
zspin/zroot/aux/aux@20170528       8K      -    96K  -
zspin/zroot/aux/aux@20170528-2      0      -    96K  -
...

This is how I can reliably panic my system:

zfs destroy -rv zaux/ouroboros/zroot
zfs send -RD zspin/zroot@backup | zfs receive -Fu -dv zaux/ouroboros
zfs send -RD -I zspin/zroot@backup zspin/zroot@20170527 | zfs receive -Fu -dv zaux/ouroboros
# panic during this
zfs send -RD -I zspin/zroot@20170527 zspin/zroot@20170528 | zfs receive -Fu -dv zaux/ouroboros

On the zfs receive, the last status message before the panic is receiving
zspin/zroot/aux/aux@20170528 (#12 below, no summary of received bytes
afterwards).

# zfs send -RD -I zspin/zroot@20170527 zspin/zroot@20170528 | zfs receive -Fu -dvn zaux/ouroboros | cat -n
     1  would receive incremental stream of zspin/zroot@20170528 into
zaux/ouroboros/zroot@20170528
     2  would receive incremental stream of zspin/zroot/var@20170528 into
zaux/ouroboros/zroot/var@20170528
     3  would receive incremental stream of zspin/zroot/var/audit@20170528 into
zaux/ouroboros/zroot/var/audit@20170528
     4  would receive incremental stream of zspin/zroot/var/tmp@20170528 into
zaux/ouroboros/zroot/var/tmp@20170528
     5  would receive incremental stream of zspin/zroot/var/crash@20170528 into
zaux/ouroboros/zroot/var/crash@20170528
     6  would receive incremental stream of zspin/zroot/var/mail@20170528 into
zaux/ouroboros/zroot/var/mail@20170528
     7  would receive incremental stream of zspin/zroot/var/log@20170528 into
zaux/ouroboros/zroot/var/log@20170528
     8  would receive incremental stream of zspin/zroot/tmp@20170528 into
zaux/ouroboros/zroot/tmp@20170528
     9  would receive incremental stream of zspin/zroot/release@20170528 into
zaux/ouroboros/zroot/release@20170528
    10  would receive incremental stream of zspin/zroot/git@20170528 into
zaux/ouroboros/zroot/git@20170528
    11  would receive full stream of zspin/zroot/aux@20170528 into
zaux/ouroboros/zroot/aux@20170528
    12  would receive full stream of zspin/zroot/aux/aux@20170528 into
zaux/ouroboros/zroot/aux@20170528
    13  would receive incremental stream of zspin/zroot/ROOT@20170528 into
zaux/ouroboros/zroot/ROOT@20170528
    14  would receive incremental stream of zspin/zroot/ROOT/default@20170528 into
zaux/ouroboros/zroot/ROOT/default@20170528
    15  would receive incremental stream of zspin/zroot/usr@20170528 into
zaux/ouroboros/zroot/usr@20170528
    16  would receive incremental stream of zspin/zroot/usr/home@20170528 into
zaux/ouroboros/zroot/usr/home@20170528
    17  would receive incremental stream of zspin/zroot/usr/ports@20170528 into
zaux/ouroboros/zroot/usr/ports@20170528
    18  would receive incremental stream of zspin/zroot/usr/src@20170528 into
zaux/ouroboros/zroot/usr/src@20170528
    19  would receive incremental stream of zspin/zroot/ROOT/11.0-releng@20170528 into
zaux/ouroboros/zroot/ROOT/11.0-releng@20170528

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Fri Jul 7 19:49:54 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 80E35DAA2C0 for ; Fri, 7 Jul 2017 19:49:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6E610E2D for ; Fri, 7 Jul 2017 19:49:54 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v67Jns3R010575 for ; Fri, 7 Jul 2017 19:49:54 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220546] zfs.ko: 
undefined reference to abd_is_linear(), abd_copy_to_buf() and (probably) others with ASSERT's enabled Date: Fri, 07 Jul 2017 19:49:54 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avos@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 19:49:54 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D220546 Andriy Voskoboinyk changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org --=20 You are receiving this mail because: You are the assignee for the bug.= From owner-freebsd-fs@freebsd.org Fri Jul 7 21:16:28 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A4255DAB90E for ; Fri, 7 Jul 2017 21:16:28 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 524D23822 for ; Fri, 7 Jul 2017 21:16:27 +0000 (UTC) (envelope-from wjw@digiware.nl) 
Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 1D8F144738; Fri, 7 Jul 2017 23:16:19 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VHWh7BxxA6qt; Fri, 7 Jul 2017 23:16:18 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 31B0044737; Fri, 7 Jul 2017 23:16:18 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> <20170707062354.GP1935@kib.kiev.ua> <20170707220108.F1124@besplex.bde.org> From: Willem Jan Withagen Message-ID: <9dd06699-a98b-2c8b-e710-73372b55c5d2@digiware.nl> Date: Fri, 7 Jul 2017 23:16:17 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170707220108.F1124@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Jul 2017 21:16:28 -0000 On 7-7-2017 16:23, Bruce Evans wrote: > On Fri, 7 Jul 2017, Konstantin Belousov wrote: Reverted all I changed and, I have the following change now that diagnose the errors, against 11.1RC1: for mkfs.c: 532,533c533,534 < if (!Nflag) < do_sbwrite(&disk); --- > if (!Nflag && do_sbwrite(&disk) == -1) > err(1, "do_sbwrite(%d): ", __LINE__ ); 601c602,603 < do_sbwrite(&disk); --- > if (do_sbwrite(&disk) == 
-1) > err(1, "do_sbwrite(%d): ", __LINE__ ); But that brings me back to the original issue: cg 0: bad magic number > For newfs, there are a lot of silly complications involving the block size: > - the initialization of disk.d_bsize to sectorsize (from the firmware > or label) or even the -S option doesn't seem to be good for anything. > Actually, the comment explains this. It says that "Our blocks = sector > size". This is not what libufs wants. It is what was convenient for > newfs before libufs. It sometimes works for libufs too, but is not > documented to work (no method for initializing disk.d_bsize is > documented). > - oops, the wrapper do_sbwrite() is not to handle complications for the > block size. It is to add the partition offset. For most i/o's newfs tells > libufs the full offset. Superblock i/o's are special because the offsets > are passed implicitly and the implicit values don't contain the offset. # newfs /dev/ggate0 /dev/ggate0: 64.0MB (131072 sectors) block size 32768, fragment size 4096 using 4 cylinder groups of 16.03MB, 513 blks, 2176 inodes. super-block backups (for fsck_ffs -b #) at: 192, 33024, 65856, 98688 cg 0: bad magic number But the geom access pattern is rather awkward: 1) ggate0[WRITE(offset=67108352, length=512)] 2) ggate0[READ(offset=8192, length=8192)] 3) ggate0[WRITE(offset=65536, length=8192)] 4) ggate0[WRITE(offset=98304, length=131072)] 5) ggate0[WRITE(offset=16908288, length=131072)] 6) ggate0[WRITE(offset=33718272, length=131072)] 7) ggate0[WRITE(offset=50528256, length=131072)] 8) ggate0[READ(offset=131072, length=4096)] WRITE-4 is where initcg writes the first cylinder group, so cg_magic should be set there. READ-8 is the bread that actually fails because it does not read back CG_MAGIC. mkfs.c:1002:alloc(): bread(&disk, part_ofs + fsbtodb(&sblock, cgtod(&sblock, 0)), (char *)&acg, sblock.fs_cgsize) This data was written in WRITE-4.
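The statement that READ-8 re-reads data written by WRITE-4 can be checked directly from the trace. A minimal sketch, assuming the ggate offsets and lengths are byte values exactly as printed above; the struct and function names are illustrative only and are not part of newfs, libufs, or ggate:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One request from the ggate trace (hypothetical record layout). */
struct io {
	int      id;        /* number used in the mail: 1..8 */
	bool     is_write;  /* true for WRITE, false for READ */
	uint64_t offset;    /* byte offset on /dev/ggate0 */
	uint64_t length;    /* byte count */
};

/* The eight requests, copied from the trace in the message. */
static const struct io trace[] = {
	{ 1, true,  67108352,    512 },
	{ 2, false,     8192,   8192 },
	{ 3, true,     65536,   8192 },
	{ 4, true,     98304, 131072 },
	{ 5, true,  16908288, 131072 },
	{ 6, true,  33718272, 131072 },
	{ 7, true,  50528256, 131072 },
	{ 8, false,   131072,   4096 },
};

/* True if the byte range of 'inner' lies entirely inside 'outer'. */
static bool
range_covers(const struct io *outer, const struct io *inner)
{
	return (outer->offset <= inner->offset &&
	    inner->offset + inner->length <= outer->offset + outer->length);
}

/*
 * Return the id of the most recent WRITE before t[read_idx] that fully
 * covers that READ, or -1 if no earlier WRITE covers it.
 */
static int
last_covering_write(const struct io *t, size_t read_idx)
{
	assert(!t[read_idx].is_write);	/* the index must name a READ */
	for (size_t i = read_idx; i-- > 0; ) {
		if (t[i].is_write && range_covers(&t[i], &t[read_idx]))
			return (t[i].id);
	}
	return (-1);
}
```

With this trace, last_covering_write(trace, 7) (READ-8 is at index 7) returns 4: bytes 131072 to 135167 lie inside the 98304 to 229375 range that WRITE-4 was asked to write. So the read offset is consistent with the write, which suggests looking at the contents of that write or at what happens to it underneath.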
The full disk.d_cg seems to be empty, both in the acg block and on disk: # hexdump -s 128k /dev/ggate0 0020000 0000 0000 0000 0000 0000 0000 0000 0000 * 0024880 0000 0000 0255 0009 0000 0000 0003 0000 The second cg, however, does seem to have a correct CG_MAGIC in the second 32-bit word (0x90255) on line 24880. So I'm assuming that the first cg did not get written. Now things are even more awkward, in that once in a while newfs does complete without complaints, which makes me believe that some uninitialized use could play a role. --WjW From owner-freebsd-fs@freebsd.org Sat Jul 8 00:13:01 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 20646DADF82 for ; Sat, 8 Jul 2017 00:13:01 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-io0-x244.google.com (mail-io0-x244.google.com [IPv6:2607:f8b0:4001:c06::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DA5B567C91 for ; Sat, 8 Jul 2017 00:13:00 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: by mail-io0-x244.google.com with SMTP id z62so589422ioi.0 for ; Fri, 07 Jul 2017 17:13:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=5oagBlLjzTiAuZvmd6kDa1FiS2ChdocWLumj0cHVGmg=; b=D9YoudGOBWGEhjkTwSLNKD2rDgdefVwdTbAaJtqiqzlQCb4uFFzSb3aFv+xyHaIywF ZFvULZCoyAH4tRZGDhAFOgm9R5l0vvAO7e0hvsxKxxzSIHz+llp76aPMJm9fncJXoaQw YdN9/KRQqt5eXFdhVA7XFlWeyOCBOfR/2ocpBA5VApZBDioi2nYin99Knd8aBITtfgkv E7G4uG4naGbZb8XDzA25skpP5QkGd8U08LfCXRwkHL0ZrX5fA5kQqZR4AZARXeDELHzZ rFhpGsOwpLPwRLnf3QTMLw0oDzLk5tJQKzb+eshweijnKN2ejc1ji9vgBVpsPhv2x7Ej 6dYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=5oagBlLjzTiAuZvmd6kDa1FiS2ChdocWLumj0cHVGmg=; b=nFdALchZ7JtEb7BCIgN8hnRVaOF+QWNDJggVjYuuZzGMLWPBznVWm4fz4tzAL1+u3P no3xb1jUahPtDFUVFnXL3cAGNJvM3wWmLEF1WEIwlhwOI2GUCS+aewZ+SlWdpAEQUYnB tix3sGDvsf7FfkBqOvraQ7JlpjanRnIYLTzynP/2yOcl8Dq0gDRikrX2CC54lMzAHEoB O8vBHTy2yW/LD7qXK1wNLZkmSZ1PHauo+D3dZ6W6uz+EJ0Fw3ZGm+OyN+AnT/lrQqKDs h+vzJxSa9obV1W1xU/6H2JAxtNemTSkm+rIKLpiwz8xJIG5mteZA2snp3yHnCm8DMpDa BHfA== X-Gm-Message-State: AKS2vOz7G7zs6ibL7lOwm/9Ir6cMA4QlFy+JERotXAI+VcDK04BHq7RQ UeGhYqsm9joLBFBO7ug9rwRLaXwkDhJR X-Received: by 10.107.178.13 with SMTP id b13mr64400672iof.148.1499472779849; Fri, 07 Jul 2017 17:12:59 -0700 (PDT) MIME-Version: 1.0 Sender: wlosh@bsdimp.com Received: by 10.79.212.167 with HTTP; Fri, 7 Jul 2017 17:12:59 -0700 (PDT) X-Originating-IP: [2607:fb90:6e26:c4a6:0:4e:8014:5301] Received: by 10.79.212.167 with HTTP; Fri, 7 Jul 2017 17:12:59 -0700 (PDT) In-Reply-To: <9dd06699-a98b-2c8b-e710-73372b55c5d2@digiware.nl> References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> <20170707062354.GP1935@kib.kiev.ua> <20170707220108.F1124@besplex.bde.org> <9dd06699-a98b-2c8b-e710-73372b55c5d2@digiware.nl> From: Warner Losh Date: Fri, 7 Jul 2017 18:12:59 -0600 X-Google-Sender-Auth: GRSBzaod2VSWbwtj_n2ytGzyyCU Message-ID: Subject: Re: newfs returns cg 0: bad magic number To: Willem Jan Withagen Cc: Bruce Evans , FreeBSD FS , Konstantin Belousov Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Jul 2017 00:13:01 -0000 On Jul 7, 2017 3:16 PM, "Willem Jan Withagen" wrote: On 7-7-2017 16:23, Bruce Evans wrote: > 
On Fri, 7 Jul 2017, Konstantin Belousov wrote: Reverted all I changed, and I now have the following change that diagnoses the errors, against 11.1RC1: for mkfs.c: 532,533c533,534 < if (!Nflag) < do_sbwrite(&disk); --- > if (!Nflag && do_sbwrite(&disk) == -1) > err(1, "do_sbwrite(%d): ", __LINE__ ); 601c602,603 < do_sbwrite(&disk); --- > if (do_sbwrite(&disk) == -1) > err(1, "do_sbwrite(%d): ", __LINE__ ); But that brings me back to the original issue: cg 0: bad magic number > For newfs, there are a lot of silly complications involving the block size: > - the initialization of disk.d_bsize to sectorsize (from the firmware > or label) or even the -S option doesn't seem to be good for anything. > Actually, the comment explains this. It says that "Our blocks = sector > size". This is not what libufs wants. It is what was convenient for > newfs before libufs. It sometimes works for libufs too, but is not > documented to work (no method for initializing disk.d_bsize is > documented). > - oops, the wrapper do_sbwrite() is not to handle complications for the > block size. It is to add the partition offset. For most i/o's newfs tells > libufs the full offset. Superblock i/o's are special because the offsets > are passed implicitly and the implicit values don't contain the offset. # newfs /dev/ggate0 /dev/ggate0: 64.0MB (131072 sectors) block size 32768, fragment size 4096 using 4 cylinder groups of 16.03MB, 513 blks, 2176 inodes.
super-block backups (for fsck_ffs -b #) at: 192, 33024, 65856, 98688 cg 0: bad magic number But the geom access pattern is rather awkward: 1) ggate0[WRITE(offset=67108352, length=512)] 2) ggate0[READ(offset=8192, length=8192)] 3) ggate0[WRITE(offset=65536, length=8192)] 4) ggate0[WRITE(offset=98304, length=131072)] 5) ggate0[WRITE(offset=16908288, length=131072)] 6) ggate0[WRITE(offset=33718272, length=131072)] 7) ggate0[WRITE(offset=50528256, length=131072)] 8) ggate0[READ(offset=131072, length=4096)] WRITE-4 is where initcg writes the first cylinder group, so cg_magic should be set there. READ-8 is the bread that actually fails because it does not read back CG_MAGIC. mkfs.c:1002:alloc(): bread(&disk, part_ofs + fsbtodb(&sblock, cgtod(&sblock, 0)), (char *)&acg, sblock.fs_cgsize) This data was written in WRITE-4. The full disk.d_cg seems to be empty, both in the acg block and on disk: # hexdump -s 128k /dev/ggate0 0020000 0000 0000 0000 0000 0000 0000 0000 0000 * 0024880 0000 0000 0255 0009 0000 0000 0003 0000 The second cg, however, does seem to have a correct CG_MAGIC in the second 32-bit word (0x90255) on line 24880. So I'm assuming that the first cg did not get written. Now things are even more awkward, in that once in a while newfs does complete without complaints, which makes me believe that some uninitialized use could play a role. At work we are seeing corruption where we newfs, create a bunch of directories, and reboot. On the subsequent fsck, it complains that the two superblocks don't match, and 15 of the 256 directories we created return EBADF when listed. It looks like some of the writes may not have reached the disk. We are still chasing it down. This is with CURRENT from maybe 4-8 weeks ago...
Warner --WjW _______________________________________________ freebsd-fs@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-fs To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Sat Jul 8 11:14:23 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 45354D97995 for ; Sat, 8 Jul 2017 11:14:23 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 281027D5BB for ; Sat, 8 Jul 2017 11:14:23 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v68BEMEQ087210 for ; Sat, 8 Jul 2017 11:14:22 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220546] zfs.ko: undefined reference to abd_is_linear(), abd_copy_to_buf() and (probably) others with ASSERT's enabled Date: Sat, 08 Jul 2017 11:14:23 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avos@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: attachments.created Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated 
MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Jul 2017 11:14:23 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220546 --- Comment #1 from Andriy Voskoboinyk --- Created attachment 184171 --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=184171&action=edit Add 'static' keyword to inline functions in abd.h I think this was (also) caused by 'CFLAGS=-O0'; anyway, the patch fixes this issue for me. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Sat Jul 8 11:15:19 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 55500D97A46 for ; Sat, 8 Jul 2017 11:15:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3EC507D659 for ; Sat, 8 Jul 2017 11:15:19 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id v68BFJKe088728 for ; Sat, 8 Jul 2017 11:15:19 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220546] [patch] zfs.ko: undefined reference to abd_is_linear(), abd_copy_to_buf() and (probably) others with ASSERT's enabled Date: Sat, 08 Jul 2017 11:15:19 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT
X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avos@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: short_desc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Jul 2017 11:15:19 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220546 Andriy Voskoboinyk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|zfs.ko: undefined reference |[patch] zfs.ko: undefined
                   |to abd_is_linear(),         |reference to
                   |abd_copy_to_buf() and       |abd_is_linear(),
                   |(probably) others with      |abd_copy_to_buf() and
                   |ASSERT's enabled            |(probably) others with
                   |                            |ASSERT's enabled

-- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Sat Jul 8 11:16:26 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2B22CD97B3F for ; Sat, 8 Jul 2017 11:16:26 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 185D17D6F3 for ; Sat, 8 Jul 2017 11:16:26 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org
(8.15.2/8.15.2) with ESMTP id v68BGPJA090319 for ; Sat, 8 Jul 2017 11:16:25 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 220546] [patch] zfs.ko: undefined reference to abd_is_linear(), abd_copy_to_buf() and (probably) others with ASSERT's enabled Date: Sat, 08 Jul 2017 11:16:26 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: avos@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: keywords Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Jul 2017 11:16:26 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220546 Andriy Voskoboinyk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

-- You are receiving this mail because: You are the assignee for the bug.