From owner-freebsd-current@freebsd.org  Mon Jan  8 05:09:39 2018
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3535BE77144
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Mon,  8 Jan 2018 05:09:39 +0000 (UTC)
 (envelope-from bsd-lists@BSDforge.com)
Received: from udns.ultimatedns.net (static-24-113-41-81.wavecable.com
 [24.113.41.81])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1CAAE691A0;
 Mon,  8 Jan 2018 05:09:38 +0000 (UTC)
 (envelope-from bsd-lists@BSDforge.com)
Received: from udns.ultimatedns.net (localhost [127.0.0.1])
 by udns.ultimatedns.net (8.14.9/8.14.9) with ESMTP id w0859fWS078564;
 Sun, 7 Jan 2018 21:09:47 -0800 (PST)
 (envelope-from bsd-lists@BSDforge.com)
X-Mailer: UDNSMS
MIME-Version: 1.0
Cc: "Michael Tuexen" <tuexen@freebsd.org>, "Warner Losh" <imp@bsdimp.com>,
 "O. Hartmann" <ohartmann@walstatt.org>
In-Reply-To: <20180107123201.19ea0fde@thor.intern.walstatt.dynvpn.de>
From: "Chris H" <bsd-lists@BSDforge.com>
Reply-To: bsd-lists@BSDforge.com
To: "FreeBSD CURRENT" <freebsd-current@freebsd.org>
Subject: Re: r327359: cylinder checksum failed: cg0,
 cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>
Date: Sun, 07 Jan 2018 21:09:47 -0800
Message-Id: <eb67ddbcb4b173c7556faa4ae3cee073@udns.ultimatedns.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Jan 2018 05:09:39 -0000

On Sun, 7 Jan 2018 12:31:34 +0100 "O. Hartmann" <ohartmann@walstatt.org> said

> On Thu, 4 Jan 2018 12:14:47 +0100
> "O. Hartmann" <ohartmann@walstatt.org> wrote:
>
> > On Thu, 4 Jan 2018 09:10:37 +0100
> > Michael Tuexen <tuexen@freebsd.org> wrote:
> >
> > > > On 31. Dec 2017, at 02:45, Warner Losh <imp@bsdimp.com> wrote:
> > > >
> > > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann <ohartmann@walstatt.org> wrote:
> > > >
> > > >> On the most recent CURRENT I face the error shown below on the /tmp
> > > >> filesystem (UFS2) residing on a Samsung 850 Pro SSD:
> > > >>
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >>
> > > >> I've already formatted the /tmp filesystem, but obviously without any
> > > >> success.
> > > >>
> > > >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> > > >> I guess there is something fishy ...
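
To tell whether it is always the same cylinder group and the same pair of
checksums that fails, the decoded form of the kernel message quoted above can
be parsed and tallied. This is only a rough sketch: the regular expression is
modelled on the single message shown in this thread, and the function and
field names (parse_cksum_failure, summarize, "device", "mountpoint") are my
own labels, not anything taken from the kernel sources.

```python
import re

# Pattern modelled on the one message quoted in this thread (an assumption:
# other kernel revisions may word the message differently).
CKSUM_RE = re.compile(
    r"UFS (?P<dev>\S+) \((?P<mnt>[^)]*)\) cylinder checksum failed: "
    r"cg (?P<cg>\d+), cgp: (?P<cgp>0x[0-9a-fA-F]+) != bp: (?P<bp>0x[0-9a-fA-F]+)"
)


def parse_cksum_failure(line):
    """Return a dict describing one checksum-failure message, or None."""
    m = CKSUM_RE.search(line)
    if m is None:
        return None
    return {
        "device": m.group("dev"),
        "mountpoint": m.group("mnt"),
        "cg": int(m.group("cg")),
        "cgp": int(m.group("cgp"), 16),
        "bp": int(m.group("bp"), 16),
    }


def summarize(lines):
    """Count failures per (device, cylinder group, cgp, bp) combination."""
    counts = {}
    for line in lines:
        rec = parse_cksum_failure(line)
        if rec is None:
            continue  # e.g. the handle_workitem_freefile lines
        key = (rec["device"], rec["cg"], rec["cgp"], rec["bp"])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Feeding it the lines of /var/log/messages (or dmesg output) would show whether
the failures all share one key, as they do in the excerpt above, or whether
several cylinder groups are affected.
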
> > > >
> > > >
> > > > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > > > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > > filesystem.
> > > Not sure this helps, but we have also seen this after system panics
> > > with soft update journaling enabled. With soft update journaling
> > > disabled, we have not observed this after several panics.
> > > Just to be clear: the panics are not related to this issue,
> > > but to other network development we do.
> > >
> > > You can check using tunefs -p devname whether soft update journaling is
> > > enabled or not.
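
For checking several partitions in one go, the tunefs -p check suggested above
could be scripted roughly as follows. This is a sketch under assumptions: it
expects report lines of the form "tunefs: soft update journaling: (-j)
enabled", which is how I remember the output looking, and the helper names
(parse_tunefs, journal_flags, run_tunefs) are made up for this example.
Capturing both stdout and stderr is deliberate, so it does not matter which
stream tunefs writes the report to.

```python
import re
import subprocess

# Assumed line layout: "tunefs: <setting name>: (-<flag>)   <value>".
# Adjust the pattern if your tunefs prints something different.
LINE_RE = re.compile(
    r"^(?:tunefs:\s*)?(?P<name>[^:(]+):\s*\(-\w\)\s+(?P<value>.+?)\s*$"
)


def parse_tunefs(text):
    """Map each setting name in a `tunefs -p` report to its value string."""
    settings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            settings[m.group("name").strip()] = m.group("value")
    return settings


def journal_flags(text):
    """Return True/False per setting, or None if the report lacks that line."""
    settings = parse_tunefs(text)

    def flag(name):
        value = settings.get(name)
        return None if value is None else value == "enabled"

    return {
        "soft updates": flag("soft updates"),
        "soft update journaling": flag("soft update journaling"),
    }


def run_tunefs(devname):
    """Run `tunefs -p devname` and return its combined output as text."""
    proc = subprocess.run(
        ["tunefs", "-p", devname],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return proc.stdout
```

On a FreeBSD box, journal_flags(run_tunefs("/dev/gpt/tmp")) would then report
both settings for the /tmp partition from this thread; the device name is the
one from the log lines above, everything else is illustrative.
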
> >
> > In all cases I reported earlier and now, softupdates ARE ENABLED on all
> > partitions in question (always GPT, in my cases also all on flash-based
> > devices, SD card and/or SSD).
>
>
> ... and journalling as well!
>
> In case of the SD, I produced the layout of the NanoBSD image via "dd",
> including the /cfg partition. The problem occurred even after overwriting
> the SD card with a new image. The problem went away once I unmounted /cfg
> and reformatted it via newfs. After that, I did not see any faults again!
> I have no explanation for this behaviour, except that the dd didn't
> overwrite "faulty" areas or the obligatory "gpart recover" at the end of
> the procedure restored something faulty.
>
> The /tmp filesystem I reported was also from an earlier date - and I
> didn't format it as I said - I confused the partition in question with
> another one. The partition was created and formatted months ago under
> CURRENT.
>
> In single user mode, I reformatted the partition again - with journaling
> and softupdates enabled. As with the /cfg partition on NanoBSD with the SD
> card, I haven't noticed any faults since then.
>
FWIW I *also* experience this on gpart/FFS2 partitioned/formatted drives
*with* journaling enabled. As a result, if the system crashes, more often
than not fsck(8) canNOT use the journal, and indicates that it must
"fall through" to complete the task. This is on a SATA (ahci) driven
disk. My experience suggests that journaling is the cause.
> >
> >
> > >
> > > Best regards
> > > Michael
> > > >
> > > > Warner
> --
> O. Hartmann
>
> I object to the use or transmission of my data for advertising purposes
> or for market or opinion research (§ 28 para. 4 BDSG).
--Chris