From: Polytropon
To: freebsd@dreamchaser.org
Cc: Adam Vande More, FreeBSD Questions
Date: Wed, 13 Dec 2017 13:36:27 +0100
Subject: Re: Subject: Thunderbird causing system crash, need guidance
Message-Id: <20171213133627.49a5e53b.freebsd@edvax.de>
In-Reply-To: <603b487e-d1b7-eb98-6bcd-f2c2c6d3b843@dreamchaser.org>

On Tue, 12 Dec 2017 22:06:35 -0700, Gary Aitken wrote:
> On 12/12/17 12:01, Polytropon wrote:
> > On Tue, 12 Dec 2017 11:30:26 -0700, Gary Aitken wrote:
> >> On 12/11/17 05:58, Polytropon wrote:
> >>> On Sun, 10 Dec 2017 21:56:16 -0700, Gary Aitken wrote:
> >>>> On 12/10/17 19:02, Adam Vande More wrote:
> >>>>> On Sun, Dec 10, 2017 at 6:45 PM, Gary Aitken wrote:
> >>
> >>>> However, I'm confused. Upon reboot, the system checks to see if
> >>>> file systems were properly dismounted and is supposed to do an
> >>>> fsck. Since those don't show up in messages, I can't verify
> >>>> this, but I'm pretty certain it must have thought it was clean,
> >>>> which it wasn't. (One reason I'm pretty certain is the time
> >>>> involved when run manually as you suggested).
> >>>
> >>> This is the primary reason for setting
> >>>
> >>> background_fsck="NO"
> >>
> >> Already had that set for just that reason.
> >>
> >>> in /etc/rc.conf - if you can afford a little downtime. The
> >>> background fsck doesn't have all the repair capabilities a forced
> >>> foreground check has, so it _might_ leave the file system in an
> >>> inconsistent state, and the system runs with that unclean
> >>> partition.
> >>>
> >>>> The file system in question was mounted below "/". Does the
> >>>> system only auto-check file systems mounted at "/"?
> >>>
> >>> Yes, / is the first file system it checks. The two last fields in
> >>> /etc/fstab control what fsck will check, and /etc/rc.conf allows
> >>> additional flags for those automatic checks.
> >>
> >> The ordering part I understand; what I don't understand is why it
> >> (as I recall) rebooted successfully with no warnings in spite of
> >> the background_fsck="NO" being set and when one of the disks
> >> apparently didn't fsck properly. I thought it should have halted
> >> in single-user mode and waited for me to do a full fsck manually.
> >> Unfortunately, the fsck output is not printed to the log, and I
> >> logged in as root on the vt0 device, so it had scrolled off by the
> >> time I went to look for it. A good reason never to log into the
> >> vt0 device. Is there any way to get the "transient" boot-time fsck
> >> and other messages recorded in the log?
> >
> > There is an easy explanation:
> >
> > The foreground fsck at boot time can only handle a subset of damages.
> > In some cases, you are required to perform a second run of fsck in
> > order to fix problems. This is where a forced full fsck is very
> > useful (usually in single-user mode).
> >
> > You can specify additional flags for boot-time fsck via /etc/rc.conf,
> > which are:
> >
> > fsck_y_enable="NO" # Set to YES to do fsck -y if the initial preen fails.
> > fsck_y_flags="" # Additional flags for fsck -y
> > background_fsck="YES" # Attempt to run fsck in the background where possible.
> > background_fsck_delay="60" # Time to wait (seconds) before starting the fsck.
> >
> > For example, fsck_y_flags="-f" would be such an addition. As you can
> > see, an initial preen ("limited fsck") can fail, and the filesystem
> > might be in an inconsistent state. This is probably what you've been
> > experiencing.
> >
> > See "man fsck" for details. :-)
>
> My language skills must be degenerating along with everything else... :-(

My language skills aren't better either. ;-)

> I have already read and reread the fsck man page, but thanks for the
> fsck_y_flags example. I'm fairly sure a normal boot fsck should
> not have succeeded.

I would have thought the same, but then I decided to consult the
authoritative source: the source. If I read everything correctly
(and there is sufficient doubt that I do!), fsck will exit with 0 in
case a re-run is required. This requirement is indicated by a text
message ("please re-run fsck"), but only the return code matters
to /etc/rc's "next steps".

So in /usr/src/sbin/fsck_ffs/main.c, we find a function called
checkfilesys() returning int, but this is discarded with (void);
the main() function returns an int called ret, which is initialized
to 0 and only set to 2 in case of "go to single user mode", declared
in fsck.h, set in fsutil.c by catchquit(), which seems to be a signal
handler...

So my assumption could be correct that fsck "false-positively"
returns 0, boot continues as normal (with mount -w), but the
file system is still in an inconsistent state...

> According to the handbook, 12.2.4, if an fsck fails it should drop into
> single user.

Definitely. This usually happens in case of severe errors where
a repair attempt could do more damage.
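
To make the exit-code reasoning above concrete, here is a minimal sh
sketch of how a boot script might choose its next step from the status
of the preen run. This is an illustration only and not the real
/etc/rc.d/fsck: the function name next_boot_step and the strings it
prints are made up, and the real script distinguishes several non-zero
statuses instead of lumping them together.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real /etc/rc.d/fsck: maps the exit status
# of the boot-time "fsck -p" run to the next boot step.
#   $1 = exit status of the preen run
#   $2 = value of fsck_y_enable from rc.conf ("YES" or "NO", default "NO")
next_boot_step()
{
    status=$1
    fsck_y_enable=${2:-NO}

    if [ "$status" -eq 0 ]; then
        # Status 0 is treated as success, even if fsck printed a
        # "please re-run fsck" style message along the way.
        echo "continue"
    elif [ "$fsck_y_enable" = "YES" ]; then
        # Preen failed and the admin opted in to an automatic fsck -y.
        echo "retry-with-fsck-y"
    else
        # Preen failed: stop the boot and wait for a manual fsck.
        echo "single-user"
    fi
}

# 8 is only an example of a non-zero status here.
next_boot_step 0 NO    # prints "continue"
next_boot_step 8 NO    # prints "single-user"
next_boot_step 8 YES   # prints "retry-with-fsck-y"
```

The first call is the interesting one: if fsck really exits with 0
while asking for a re-run, nothing in a decision like this can tell
that case apart from a clean file system.
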

> However, when I manually
> unmounted the file system, ran "fsck -f" (which corrected numerous errors),
> then rebooted, everything was (not surprisingly) once again functioning
> properly.

That matches my assumption. As soon as the file system is in a
consistent state again, things work as intended.

> So I'm looking for, if possible:
>
> 1. An explanation for the above behavior, which seems inconsistent with
> the documented and expected behavior. The only fsck flag set in rc.conf
> is 'background_fsck="NO"'. Is there some state a disk (or something else,
> such as a normal shutdown flag), can (however theoretically) be in, where
> it is possible to have a corrupt disk that won't pass normal boot time
> fsck in preen mode but will not be checked in the first place, even while
> another disk, the one containing all the system files, is checked?

I think we at least have an entry point for an explanation now. :-)

> 2. A way to get the output of boot-time fsck commands recorded in the
> system log, so one can after the fact of a reboot check to see what the
> heck went on in terms of the fsck sequence?

The usual way is the text mode console: press the Scroll Lock key
and scroll up to view the messages that are still in the scrollback
buffer. As far as I know, /var/log/messages does not record
fsck status messages.

> On 12/12/17 11:54, Adam Vande More wrote:
> > On Tue, Dec 12, 2017 at 12:30 PM, Gary Aitken wrote:
> >
> >> The ordering part I understand; what I don't understand is why it (as
> >> I recall) rebooted successfully with no warnings in spite of the
> >> background_fsck="NO" being set and when one of the disks apparently
> >> didn't fsck properly. I thought it should have halted in
> >> single-user mode and waited for me to do a full fsck manually.
> >
> > That happens if the preen fails. See the man page I pointed you to.
> > There are cases where it can miss things.
>
> Are you saying if the preen fails, it mounts the file system normally
> and continues to boot into multi-user? According to the man page for
> fsck, a failure in preen mode will exit with failure, and the rc.d/fsck
> script (by my incompetent reading) will do stop_boot in that case.

See above: I think it really is the case...

> > You can set:
> >
> > fsck_y_enable="YES"
> >
> > If you want it to handle a failed preen automatically.
>
> I don't want that; I want it to drop into single-user and let me do a
> manual fsck.

Very good choice. As I mentioned, a "fsck -fy" _might_ do more
harm than good, depending on the problem it tries to fix. In the
worst case, data recovery is easier with an unclean filesystem (with
inodes partially intact) than with one that has been modified (inodes
cleared, disconnected, or files set to zero length or zero bytes).

-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...