From owner-freebsd-current@freebsd.org Mon Jan 8 05:09:39 2018
From: "Chris H"
Reply-To: bsd-lists@BSDforge.com
To: "FreeBSD CURRENT"
Cc: "Michael Tuexen", "Warner Losh", "O. Hartmann"
Subject: Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>
Date: Sun, 07 Jan 2018 21:09:47 -0800
In-Reply-To: <20180107123201.19ea0fde@thor.intern.walstatt.dynvpn.de>
List-Id: Discussions about the use of FreeBSD-current

On Sun, 7 Jan 2018 12:31:34 +0100 "O. Hartmann" said

> On Thu, 4 Jan 2018 12:14:47 +0100
> "O. Hartmann" wrote:
>
> > On Thu, 4 Jan 2018 09:10:37 +0100
> > Michael Tuexen wrote:
> >
> > > > On 31. Dec 2017, at 02:45, Warner Losh wrote:
> > > >
> > > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann wrote:
> > > >
> > > >> On most recent CURRENT I face the error shown below on the /tmp
> > > >> filesystem (UFS2) residing on a Samsung 850 Pro SSD:
> > > >>
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >>
> > > >> I've already formatted the /tmp filesystem, but obviously without any
> > > >> success.
> > > >>
> > > >> Since I face such strange errors also on NanoBSD images dd'ed to SD
> > > >> cards, I guess there is something fishy ...
> > > >
> > > > It indicates a problem. We've seen these 'corruptions' on data in motion
> > > > at work, but I hacked fsck to report checksum mismatches (it silently
> > > > corrects them today) and we've not seen any mismatch when we unmount and
> > > > fsck the filesystem.
> > > Not sure this helps: But we have seen this also after system panics
> > > when having soft update journaling enabled. Having soft update journaling
> > > disabled, we did not observe this after several panics.
> > > Just to be clear: The panics are not related to this issue,
> > > but to other network development we do.
> > >
> > > You can check using tunefs -p devname if soft update journaling is
> > > enabled or not.
> >
> > In all cases I reported in earlier and now, softupdates ARE ENABLED on all
> > partitions in question (always GPT, in my cases also all on flash based
> > devices, SD card and/or SSD).
>
> ... and journalling as well!
>
> In case of the SD, I produced the layout of the NanoBSD image via "dd"
> including the /cfg partition. The problem occurred even after overwriting
> the SD card with a new image.
> The problem went away once I unmounted /cfg and reformatted via newfs. After
> that, I did not see any faults again! I have no explanation for this
> behaviour except that the dd didn't overwrite "faulty" areas or the
> obligatory "gpart recover" at the end of the procedure restored something
> faulty.
>
> The /tmp filesystem I reported in was also from an earlier date - and I
> didn't format it as I said - I confused the partition in question with
> another one. The partition has been created and formatted months ago under
> CURRENT.
>
> In single user mode, I reformatted the partition again - with journaling and
> softupdates enabled. As with the /cfg partition on NanoBSD with SD card, I
> haven't noticed any faults since then.
>

FWIW I *also* experience this on gpart/FFS2 partitioned/formatted drives
*with* journaling enabled. As a result, if the system crashes, more often
than not fsck(8) canNOT use the journal, and indicates that it must
"fall through" to complete the task. This is on a SATA (ahci) driven disk.
My experiences with this seem to suggest that journaling is the cause.

> >
> >
> > >
> > > Best regards
> > > Michael
> > > >
> > > > Warner
> --
> O. Hartmann
>
> I object to the use or transfer of my data for advertising purposes or for
> market or opinion research (§ 28 Abs. 4 BDSG).

--Chris
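
The `tunefs -p devname` check suggested in the quoted text can be scripted when several partitions need comparing. What follows is only a sketch, not anything posted in this thread: it assumes the report has one `name: (-flag) value` entry per line, optionally prefixed with `tunefs:`. The sample text is made up to match that assumed layout, and `SAMPLE`, `parse_tunefs_report` and `suj_enabled` are hypothetical names.

```python
import re

# Made-up sample in the layout `tunefs -p` is ASSUMED to print; it was not
# captured from any machine mentioned in this thread.
SAMPLE = """\
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
"""

# One report entry: optional "tunefs:" prefix, tunable name, "(-x)" flag, value.
ENTRY = re.compile(
    r"^(?:tunefs:\s*)?(?P<name>.+?):\s*\((?P<flag>-\w)\)\s+(?P<value>.+?)\s*$"
)


def parse_tunefs_report(text):
    """Return a dict mapping each tunable name to its reported value."""
    settings = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            settings[match.group("name")] = match.group("value")
    return settings


def suj_enabled(text):
    """True only if both soft updates and soft update journaling are enabled."""
    settings = parse_tunefs_report(text)
    return (
        settings.get("soft updates") == "enabled"
        and settings.get("soft update journaling") == "enabled"
    )


if __name__ == "__main__":
    print(parse_tunefs_report(SAMPLE))
    print("SU+J enabled:", suj_enabled(SAMPLE))
```

To use it on a live system, capture the command's output (both stdout and stderr, since the report may arrive on either) and pass the text to `suj_enabled`.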