From: Alejandro Imass
To: Erich Dollansky
Cc: Wojciech Puchar, freebsd-questions@freebsd.org
Date: Sat, 28 Apr 2012 21:58:17 -0400
Subject: Re: UFS Crash and directories now missing
In-Reply-To: <201204290403.58388.erich@alogreentechnologies.com>

On Sat, Apr 28, 2012 at 5:03 PM, Erich Dollansky wrote:
> Hi,
>
> On Saturday 28 April 2012 20:15:25 Alejandro Imass wrote:
>> On Sat, Apr 28, 2012 at 3:22 AM, Wojciech Puchar
>> wrote:
>> >> I somewhat agree, but it wasn't a person. I am the only administrator,
>> >> the only one with root access. The jails were effectively moved to the
>> >> /usr/local/etc/apache22 of the single jail that survived at the top level.
>> >> I'm thinking something between mount, EzJail, the journal and the way
>> >> MySQL created a great deal of head contention, so something must have
>> >> gotten corrupted at the directory level like you state, but the
>> >> strange part is no _data_ corruption as such, because I was able to
>> >> physically archive the jails, move them to the correct directory and
>> >
>> >
>> > no matter what you do FreeBSD DOES NOT randomly move directories. If you
>> > are sure you didn't move it yourself then it must be a machine hardware
>> > problem, but that is still unlikely.
>>
>> After a little more research, ___it is NOT unlikely at all___ that
>> under high distress and a hard boot, UFS could have somehow corrupted
>> the directory structure, whilst maintaining the data intact. From what
>> I've learned so far, UFS is actually divided into 2 layers: one that
>> controls the directory structure and metadata and a lower layer
>> containing the data, so the directories being screwed up and the data
>> being intact is actually quite possible.
>>
>> What I'm trying to do is figure out how it happened, and try to
>> prevent it from happening again, so instead of dismissing it as an
>> impossibility, I think we all should spend a little time figuring out
>> how these things can happen and determine how they can be prevented or
>> reduced.
>
> Somebody mentioned the links.
> Did you use links in the jails to access the data? If so, and the
> directories of the jails got screwed, the links are gone but the
> original data is still there. The damaged directory might have been
> fixed during the first reboot after the crash and you never noticed
> the fix.
>

Hi Erich, thanks for your reply. I don't know what links you are
referring to, but please point me in that direction.

I initially suspected that it could have been the journal recovery
and/or fsck. As you can see, a couple of people have said this is
impossible, and I have to admit my ignorance of some specifics of the
UFS filesystem, yet logically it still seems like the most plausible
explanation.

I've been running FBSD since 6.2, and jails since then as well. Today
I run 6 public servers on 8.2 with between 15 and 20 jails each. We
switched to ezjail last year and use it strictly by the book. I do use
flavours though, and I may archive and re-create jails from a specific
archive, but always using ezjail-admin. Since all our servers are 8.2
and all updated the same way, I may port jails from one server to
another using the ezjail archive method, but nothing as stupid as
using cp or soft links, as someone was suggesting.

I've never had any problems except on _this particular server_, where
I have a client that has a problem with MySQL, and under some
conditions it drains the whole server. I suspected corruption of the
fs because of all the contention generated by MySQL, to the point
where the server simply hung and I had to hard-reboot it.

I doubt it's hardware because these are relatively new servers: Xeon
X3370, 8GB RAM, 2 x 150GB 10,000rpm Velociraptor disks. We have the
pristine OS on one disk and the jails on the other. Nothing runs
outside of jails, not even the MTA, which is postfix running inside
one of the jails.

This is the first time a crash has caused anything like this in over
6 years of running FBSD, and I am as surprised as anyone here by the
weirdness of the jail directories moving like that.
We had backups from the previous night, but I didn't even use them.
The data was all there, intact, just moved inside the only surviving
jail, which happens to be the http reverse proxy for all the other
jails.

If you have any leads as to how this can happen, other than cosmic
rays, I would greatly appreciate it.

Thanks!

--
Alejandro

> Erich