Date:      Tue, 17 Nov 2015 07:31:22 +0100
From:      Gerhard Schmidt <schmidt@ze.tum.de>
To:        kpneal@pobox.com, freebsd-questions@freebsd.org
Subject:   Re: Random Lockup with FreeBSD 10.2 on SuperMicro Boards
Message-ID:  <564AC9BA.70601@ze.tum.de>
In-Reply-To: <20151116164507.GA87691@neutralgood.org>
References:  <56498205.3060806@ze.tum.de> <20151116094334.GS2604@mordor.lan> <5649A761.7040303@ze.tum.de> <20151116111609.a9757a4a.freebsd@edvax.de> <5649AEC3.5090104@ze.tum.de> <20151116164507.GA87691@neutralgood.org>


Am 16.11.2015 um 17:45 schrieb kpneal@pobox.com:
> On Mon, Nov 16, 2015 at 11:24:03AM +0100, Gerhard Schmidt wrote:
>>
>>
>> Am 16.11.2015 um 11:16 schrieb Polytropon:
>>> On Mon, 16 Nov 2015 10:52:33 +0100, Gerhard Schmidt wrote:
>>>> My workstation has been running 10.2 for at least 4 months. I've never
>>>> had a problem, even with SU+J. The difference is that on my workstation
>>>> /var is not on a RAID array. So SU+J isn't the real problem.
>>>
>>> Maybe there's a file system inconsistency? How do you
>>> perform file system checks (automatic in background,
>>> which is discouraged, or in SUM, as recommended)?
>>> I'm asking because I've noticed the following lines
>>> in your previous message:
>>>
>>>> Trying to mount root from ufs:/dev/raid/r0p3 [rw]...
>>>> WARNING: / was not properly dismounted
>>>
>>> Is /var affected as well? If yes - give it a _clean_
>>> fsck, running in foreground on the unmounted partition.
>>>
>>> In case you have already done this, redirect my comment
>>> to /dev/null immediately. :-)
>>
>> /var was also inconsistent because of the lockup; I had to power the
>> server off hard. The fsck recovered the journal and marked the fs as
>> clean without a background fsck.
> 
> When in doubt use 'fsck -f' to force a check despite the filesystem
> being marked clean. 
> 
> Personally, I got bit by SU (plain) a long time ago and I've never really
> trusted it since. I strongly advise you to 'fsck -f' on your /var just to
> rule out _any_ corruption there.

That could explain the problem on one server, but I have three of them,
and all have the same problem. One corruption could be a random event,
but not three. So there is a problem somewhere in 10.2 that wasn't there
in 10.1.

BTW, these are production servers. It's not that easy to shut everything
down just to run an fsck out of distrust of the filesystem. I have 76
FreeBSD servers up and running, all with SU and most with SU+J, and not
once have I had a problem with FS corruption outside of power failures
or kernel panics. All of those problems could be fixed with fsck, mostly
with no or minor data loss, and any loss could be restored from the
backups. And I've been using FreeBSD since version 2.1.5.

Regards
   Estartu

-- 
----------------------------------------------------------
Gerhard Schmidt                | E-Mail: schmidt@ze.tum.de
Technische Universität München | Jabber: estartu@ze.tum.de
WWW & Online Services          |
Tel: +49 89 289-25270          | PGP-PublicKey
Fax: +49 89 289-25257          | on request


