From: "Jack L. Stone"
To: "Daniel O'Connor", Bartosz Stec
Cc: Yoshihiro Ota, "Mikhail T.", freebsd-stable@freebsd.org, fs@freebsd.org
Date: Wed, 25 Mar 2009 07:21:37 -0500
Subject: Re: support quality (Re: dump | restore fails: unknown tape headertype 1853384566)

At 07:25 PM 3.25.2009 +1030, Daniel O'Connor wrote:
>On Wednesday 25 March 2009 18:37:17 Bartosz Stec wrote:
>> > Yes, dump is broken for you, deal with it. It is quite possible your FS
>> > is corrupt, and/or your disk is damaged.
>>
>> ..and/or it is some other hardware problem, maybe you also should test
>> your memory with memtest or something similar? I'm using dump/restore
>> very frequently and I had never seen such problem. Neither on -RELEASE,
>> -STABLE, nor -CURRENT.
>> So I think you should make sure that your problem is not
>> hardware/filesystem dependent before you point dump/restore as a cause
>> of the problem. Peter Jeremy already gives you good hints to do that.
>
>One other thing would be to make absolutely sure that your version of dump &
>restore are in sync, they are very machine/version dependent.

I've been watching this thread with some interest since we've had some
similar problems with dump/restore, which we run every morning via cron
scripts on a number of servers to produce bootable clones as part of our
backup program. We have been doing this for years and, like most of you,
never saw a problem. We prefer dump/restore for backups.

However, last month, upon upgrading those servers from FBSD-6.3px (RELEASE)
to 7.0px (RELEASE), we found that about one-half of the servers had a
problem similar to the original poster's while the other half did not. All
of the servers (rackmounts) use the same (type) hardware. We spent many
hours trying to solve the problem on those that failed to dump/restore.

We also searched for any others with the problem and found only a very few,
none with a solution to this issue. (Indeed, the only lead was a reference
to efforts to restore an older OS version, which didn't apply here.) And,
indeed, we tried everything suggested here to fix the problem, without
success.

Sometimes the problem was dump, which would reach 99% and never finish. It
would stick there and overlap with the next cron start the following day,
and the next day, and the next day. (The servers that did work fooled us; we
only found out about this issue on the others when the overlaps appeared and
drew our attention.) That's when our work to solve the issue started, and it
went on for days.

Our script that has always worked contained this (after scrapping and making
a fresh FS):

    /sbin/dump -D /root/dumpdates -0auL -f - / | /sbin/restore -rf -

Indeed, the first thing we did was to remove the pipe and try to restore
from a file. However, because the dumps would not go past 99%, there was no
file to restore from! There were some exceptions when the dump would
complete, but it was not reliable. When these reached the restore stage,
restore would go crazy with errors.

SOLUTION

The "clones" are a very important part of our backup program. Since the dump
side of the problem simply stuck with no error message at all, and the
errors from any restores were not useful, our only solution was to revert to
FBSD-6.3 on the servers with this issue, and dump/restore went back to
working again. We left those that were working on FBSD-7.0-R, and they
continue to work okay.

We could only conclude that the problem was perhaps something with hardware,
perhaps the way memory was handled in 7.0, but that is only a guess. Once
again, every suggestion on this thread was tried during our long efforts to
fix the issue. Perhaps there is yet another suggestion?

In the meantime, we've decided to wait for 7.2R (7.1 did not fix the
problems either).

/Jack

(^_^)
Happy trails,
Jack L. Stone

System Admin
Sage-american