From owner-freebsd-fs@FreeBSD.ORG  Sun Aug 31 10:26:18 2003
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id ED29416A4BF
	for <freebsd-fs@freebsd.org>; Sun, 31 Aug 2003 10:26:18 -0700 (PDT)
Received: from web41805.mail.yahoo.com (web41805.mail.yahoo.com
	[66.218.93.139])	by mx1.FreeBSD.org (Postfix) with SMTP id 8B2BD43FEA
	for <freebsd-fs@freebsd.org>; Sun, 31 Aug 2003 10:26:18 -0700 (PDT)
	(envelope-from neoninternet@yahoo.com)
Message-ID: <20030831172618.95711.qmail@web41805.mail.yahoo.com>
Received: from [68.2.118.193] by web41805.mail.yahoo.com via HTTP;
	Sun, 31 Aug 2003 10:26:18 PDT
Date: Sun, 31 Aug 2003 10:26:18 -0700 (PDT)
From: Kevin Bockman <neoninternet@yahoo.com>
To: freebsd-fs@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Filesystem problem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 31 Aug 2003 17:26:19 -0000

Hi. I have been experiencing some filesystem problems
for the last month or so. I was running 4.8-STABLE and
updated to 5.1-RELEASE-p2.  While I was running 4.8,
whenever I ran a command that required hard disk
activity, the process would 'hang' and I would no
longer be able to ssh or telnet in.  I would get stuck
after typing in my login.

Running 5.1 is a different story.  I did a clean
install of 5.1-RELEASE and cvsup'd to -p2.  Every time
I do this, it's great for a day or so, then it acts up.
On 4.8, once it had started, even a reboot did not
help: the hangs would start again immediately.  On 5.1,
only that one process hangs and everything else is
fine.  I can still log in, the webserver responds, etc.

Here is a little info:

FreeBSD devel.neoninternet.net 5.1-RELEASE-p2 FreeBSD 5.1-RELEASE-p2 #0: Sat Aug 23 20:12:41 PDT 2003     kevin@devel.ph.cox.net:/usr/src/sys/i386/compile/SLURPEE  i386

CPU: AMD Athlon(tm) XP 2600+ (2086.51-MHz 686-class CPU)
real memory  = 1073676288 (1023 MB)
ad0: 117246MB <Maxtor 6Y120P0> [238216/16/63] at ata0-master UDMA133

root   1173  0.0  0.1  1436  916  p3  D+    6:38PM   0:00.00 man vmstat
root    784  0.0  0.1   752  636  d0  D     4:34PM   0:00.02 make all DIRPRFX=i386/libi386/
root    847  0.0  0.0   312  212  d0  D     4:34PM   0:00.00  (cc)
root    848  0.0  0.3  4104 3488  d0  D     4:34PM   0:00.01  (cc1)
root    849  0.0  0.1   928  668  d0  D     4:34PM   0:00.00 /usr/bin/as -o comconsole.o -

last pid:  1252;  load averages:  0.00,  0.00,  0.00    up 0+02:37:22  19:04:48
64 processes:  1 running, 63 sleeping
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 34M Active, 23M Inact, 38M Wired, 204K Cache, 22M Buf, 906M Free
Swap: 2048M Total, 2048M Free

devel# vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad0 da0   in   sy  cs us sy id
 1 7 0  144612 928056   16   0   0   0   9   0   0   0  331    0 254  0  0 100

Anyone have any suggestions?  I cannot control-C out
of 'man vmstat'.  While doing 'make' in
/usr/src/sys/boot, it hung on as.  When I restarted
it, it got to i386/libi386 and would not do anything
else.  I'm running that through the serial console,
which did let me ^C out of it.  I tried going into
single user mode and running umount; now it just sits
there and I can't ^C.  I have no ideas; this was all
working yesterday!! :-)

Any ideas on what else to check or other helpful hints
would help bunches.

Sorry for the cross-posts.  Just not sure where to go
with this one.

Thanks,

Kevin

 



From owner-freebsd-fs@FreeBSD.ORG  Tue Sep  2 13:10:35 2003
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1788816A4BF
	for <freebsd-fs@freebsd.org>; Tue,  2 Sep 2003 13:10:35 -0700 (PDT)
Received: from freebsd.org (dhcp065-024-168-078.columbus.rr.com
	[65.24.168.78])	by mx1.FreeBSD.org (Postfix) with SMTP id B510243F93
	for <freebsd-fs@freebsd.org>; Tue,  2 Sep 2003 12:55:49 -0700 (PDT)
	(envelope-from stuck_in_telnet@freebsd.org)
To: freebsd-fs@freebsd.org
From: stuck_in_telnet@no.where
Message-Id: <20030902195549.B510243F93@mx1.FreeBSD.org>
Subject: /sbin/newfs 4.8-STABLE_20030803 segfault and core
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
Date: Tue, 02 Sep 2003 20:10:35 -0000
X-Original-Date: Tue Sep  2 15:30:00 EDT     
X-List-Received-Date: Tue, 02 Sep 2003 20:10:35 -0000

#        size   offset    fstype   [fsize bsize bps/cpg]
  h: 19925880        0    4.2BSD     2048 16384    89   # (Cyl.    0 - 19767*)

/tmp/newfs: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), for FreeBSD 4.8, statically linked, not stripped

/tmp/newfs -NU -i 3 ad0h    
Warning: Block size restricts cylinders per group to 40200.
Warning: 1160 sector(s) in last cylinder unallocated
/dev/ad0h:      19925880 sectors in 4865 cylinders of 1 tracks, 4096 sectors
        9729.4MB in 1 cyl groups (40200 c/g, 80400.00MB/g, -48203008 i/g) SOFTUPDATES
super-block backups (for fsck -b #) at:
zsh: segmentation fault (core dumped)  /tmp/newfs -NU -i 3 ad0h


Core was generated by `newfs'.
Program terminated with signal 11, Segmentation fault.
#0  0x804c6c8 in initcg (cylno=0, utime=1062500000) at /usr/src/sbin/newfs/mkfs.c:835
(gdb) #0  0x804c6c8 in initcg (cylno=0, utime=1062500000) at /usr/src/sbin/newfs/mkfs.c:835
#1  0x804c080 in mkfs (pp=0x806eec4, fsys=0x8093bc0 "/dev/ad0h", fi=3, fo=-1) at /usr/src/sbin/newfs/mkfs.c:709
#2  0x8049581 in main (argc=1, argv=0xbfbffbc4) at /usr/src/sbin/newfs/newfs.c:617

The binary is from the sup date above, running on 4.8-RELEASE.
If run on smaller partitions it simply emits a warning:
/tmp/newfs -NU -i 3 ad1b
Minimum bytes per inode is 773576

no biggie, now off to fix the mail client...

From owner-freebsd-fs@FreeBSD.ORG  Wed Sep  3 14:36:30 2003
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2D87016A4BF
	for <freebsd-fs@freebsd.org>; Wed,  3 Sep 2003 14:36:30 -0700 (PDT)
Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E2EBB43FDF
	for <freebsd-fs@freebsd.org>; Wed,  3 Sep 2003 14:36:28 -0700 (PDT)
	(envelope-from arno@heho.snv.jussieu.fr)
Received: from heho.snv.jussieu.fr (heho.snv.jussieu.fr [134.157.184.22])
          by shiva.jussieu.fr (8.12.9/jtpda-5.4) with ESMTP id h83LaR9U001589
          for <freebsd-fs@freebsd.org>; Wed, 3 Sep 2003 23:36:27 +0200 (CEST)
Received: from heho.snv.jussieu.fr (localhost [127.0.0.1])
	h83LaRTe027921          for <freebsd-fs@freebsd.org>;
	Wed, 3 Sep 2003 23:36:27 +0200 (MEST)
Received: (from arno@localhost)
	by heho.snv.jussieu.fr (8.12.9/8.12.9/Submit) id h83LaR94027918;
	Wed, 3 Sep 2003 23:36:27 +0200 (MEST)
To: freebsd-fs@freebsd.org
From: arno@heho.snv.jussieu.fr (Arno J. Klaassen)
Date: 03 Sep 2003 23:36:26 +0200
Message-ID: <wpwucpsgf9.fsf@heho.snv.jussieu.fr>
Lines: 51
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Antivirus: scanned by sophie at shiva.jussieu.fr
Subject: very slow fsck on bad disk
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 03 Sep 2003 21:36:30 -0000


hello,

is this normal ? :

 - I have a 40G IBM Deskstar IDE disk, about 1-2 years old,
   connected to some -stable Pentium-Pro box
 - a while ago, I got lots of "disk errors", I decided
   to reboot the box, but then it did not recognise the disk
   any longer (so I pulled out the (data) disk)
 - ...
 - Time passed, I found out that the last backup of that
   disk was rather old, and I'd like to spend some time to
   get off the disk whatever is still readable
 - I managed to find an ASUS-AMD MB, running 5.1-RELEASE,
   whose BIOS accepts the disk as secondary master
 - I could not mount the disk, because of "I/O Error" or
   something like that
 - I started :

    fsck_ffs -y -b 32 /dev/ad2s1e

I started this .... "last saturday" and it's still running.
When I look at dmesg or /var/log/messages, my eye
was caught by this :


  Sep  3 23:02:27 tabarnac kernel: ad2: hard error cmd=read fsbn 16405947 status=59 error=40
  Sep  3 23:02:32 tabarnac kernel: ad2: hard error cmd=read fsbn 16405948 status=59 error=40
  Sep  3 23:02:37 tabarnac kernel: ad2: hard error cmd=read fsbn 16405949 status=59 error=40
  Sep  3 23:02:42 tabarnac kernel: ad2: hard error cmd=read fsbn 16405950 status=59 error=40

  Sep  3 23:02:47 tabarnac kernel: ad2: hard error cmd=read fsbn 16769183 of 16769183-16769310 status=59 error=40

  Sep  3 23:02:52 tabarnac kernel: ad2: hard error cmd=read fsbn 16769183 status=59 error=40
  Sep  3 23:02:57 tabarnac kernel: ad2: hard error cmd=read fsbn 16769184 status=59 error=40
  Sep  3 23:03:02 tabarnac kernel: ad2: hard error cmd=read fsbn 16769185 status=59 error=40


i.e., for a long time, every five seconds the error says "next block"
then suddenly, it says "no more errors in between blocks 16405950 and
16769183" (i.e. 363233 blocks ....)
and once again 5 seconds later it says "next block bad as well"

Am I wrong or does this smell like "each surface error is queued
for syslog, syslog print is triggered every 5 seconds, progress
of fsck on ata-disks is held until the next syslog message is printed" ?
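
As a rough check of the timing described above, the quoted kernel
messages can be parsed to show the interval between errors and the
jump in block numbers. This is only a sketch: the year 2003 is assumed
from the mail date (syslog timestamps carry none), month names are
parsed in the default C locale, and the names LOG, PATTERN and
parse_errors are made up for illustration.

```python
import re
from datetime import datetime

# The kernel messages quoted above, copied verbatim.
LOG = """\
Sep  3 23:02:27 tabarnac kernel: ad2: hard error cmd=read fsbn 16405947 status=59 error=40
Sep  3 23:02:32 tabarnac kernel: ad2: hard error cmd=read fsbn 16405948 status=59 error=40
Sep  3 23:02:37 tabarnac kernel: ad2: hard error cmd=read fsbn 16405949 status=59 error=40
Sep  3 23:02:42 tabarnac kernel: ad2: hard error cmd=read fsbn 16405950 status=59 error=40
Sep  3 23:02:47 tabarnac kernel: ad2: hard error cmd=read fsbn 16769183 of 16769183-16769310 status=59 error=40
Sep  3 23:02:52 tabarnac kernel: ad2: hard error cmd=read fsbn 16769183 status=59 error=40
Sep  3 23:02:57 tabarnac kernel: ad2: hard error cmd=read fsbn 16769184 status=59 error=40
Sep  3 23:03:02 tabarnac kernel: ad2: hard error cmd=read fsbn 16769185 status=59 error=40
"""

# Timestamp, then the first block number after "fsbn".
PATTERN = re.compile(
    r"(\w{3}\s+\d+\s+\d\d:\d\d:\d\d) \S+ kernel: ad2: hard error cmd=read fsbn (\d+)"
)


def parse_errors(log, year=2003):
    """Return a list of (timestamp, block number) for each error line."""
    events = []
    for line in log.splitlines():
        m = PATTERN.search(line)
        if m is None:
            continue
        # Collapse the double space syslog uses for one-digit days.
        stamp = " ".join(m.group(1).split())
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, int(m.group(2))))
    return events


events = parse_errors(LOG)
# Pair each error with the next one: (seconds elapsed, blocks advanced).
gaps = [
    (int((t2 - t1).total_seconds()), b2 - b1)
    for (t1, b1), (t2, b2) in zip(events, events[1:])
]
for seconds, blocks in gaps:
    print(f"{seconds:3d} s later, fsbn advanced by {blocks}")
```

Every interval comes out as 5 seconds, including the one where the
block number jumps by 363233, so the delay is per reported error and
not per block scanned. That alone does not say whether syslog or the
drive is responsible for it.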

Thank you very much in advance.

Arno

From owner-freebsd-fs@FreeBSD.ORG  Wed Sep  3 21:27:14 2003
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 649EF16A4E0
	for <freebsd-fs@freebsd.org>; Wed,  3 Sep 2003 21:27:14 -0700 (PDT)
Received: from carver.gumbysoft.com (carver.gumbysoft.com [66.220.23.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9E12A43FE5
	for <freebsd-fs@freebsd.org>; Wed,  3 Sep 2003 21:27:13 -0700 (PDT)
	(envelope-from dwhite@gumbysoft.com)
Received: by carver.gumbysoft.com (Postfix, from userid 1000)
	id 9074D72DA4; Wed,  3 Sep 2003 21:27:13 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
	by carver.gumbysoft.com (Postfix) with ESMTP
	id 8DBCA72DA3; Wed,  3 Sep 2003 21:27:13 -0700 (PDT)
Date: Wed, 3 Sep 2003 21:27:13 -0700 (PDT)
From: Doug White <dwhite@gumbysoft.com>
To: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
In-Reply-To: <wpwucpsgf9.fsf@heho.snv.jussieu.fr>
Message-ID: <20030903212556.C88884@carver.gumbysoft.com>
References: <wpwucpsgf9.fsf@heho.snv.jussieu.fr>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-fs@freebsd.org
Subject: Re: very slow fsck on bad disk
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 04 Sep 2003 04:27:14 -0000

On Wed, 3 Sep 2003, Arno J. Klaassen wrote:

>     fsck_ffs -y -b 32 /dev/ad2s1e
>
> I started this .... "last saturday" and it's still running.
> When I look at dmesg or /var/log/messages, my eye
> was caught by this :
>
>
>   Sep  3 23:02:27 tabarnac kernel: ad2: hard error cmd=read fsbn 16405947 status=59 error=40
>   Sep  3 23:02:32 tabarnac kernel: ad2: hard error cmd=read fsbn 16405948 status=59 error=40

[... disk errors ...]

> Am I wrong or does this smell like "each surface error is queued
> for syslog, syslog print is triggered every 5 seconds, progress
> of fsck on ata-disks is held until the next syslog message is printed" ?

It isn't trying to log to the defective volume, is it? And the delays are
probably the disk resetting.  Errors and timeouts take a while.

I would say you are the proud owner of a new doorstop. :)

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG  Thu Sep  4 13:25:27 2003
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3604D16A4BF; Thu,  4 Sep 2003 13:25:27 -0700 (PDT)
Received: from rwcrmhc11.comcast.net (rwcrmhc11.comcast.net [204.127.198.35])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 0C7B943F3F; Thu,  4 Sep 2003 13:25:26 -0700 (PDT)
	(envelope-from julian@elischer.org)
Received: from interjet.elischer.org ([12.233.125.100])
          by attbi.com (rwcrmhc11) with ESMTP
          id <2003090420252501300g9gqve>; Thu, 4 Sep 2003 20:25:25 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id NAA42450;
	Thu, 4 Sep 2003 13:25:25 -0700 (PDT)
Date: Thu, 4 Sep 2003 13:25:23 -0700 (PDT)
From: Julian Elischer <julian@elischer.org>
To: Andrew Kinney <andykinney@advantagecom.net>
In-Reply-To: <3F573729.8917.53574D7@localhost>
Message-ID: <Pine.BSF.4.21.0309041319450.41602-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-hackers@freebsd.org
cc: fs@freebsd.org
Subject: Re: 20TB Storage System (fsck????)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 04 Sep 2003 20:25:27 -0000



On Thu, 4 Sep 2003, Andrew Kinney wrote:
> > 
> 
> Our experience has been that with 4GB of RAM (or more) you 
> really must increase your KVA to 2GB, leaving only 2GB of UVA.  
> So, I would concur with what Julian said.
> 
> <ducks his head to avoid the rotten tomatoes that are sure to be 
> thrown> ;-)
> 
> With the lack of third party filesystem support in FreeBSD, might 
> you be better served by looking at a Linux system running 
> ReiserFS or one of the other file systems designed for such 
> behemoth disk systems?
> 
> These days, I think Sun even gives away Solaris licenses with their 
> low end x86 servers, so that might even be an option.
> 
> UFS is great, but there are other filesystems out there that have 
> already addressed such problems from their use in academic, 
> government, and scientific computing where gigantic filesystems 
> tend to be more prevalent.
> 

UFS2 will make the filesystem..
All we need is a way to FIX such a filesystem.

My brief analysis of this indicates that a 'serial' fsck should be
possible.

What this would do is read through the filesystem metadata, creating 
several 'list' files on another filesystem. These would then be
duplicated and sorted on  several different fields, and then
recombined in a 'merge' manner, to produce lists of unallocated files,
bad directory entries, duplicate allocated blocks etc. etc.

This would probably be workable in a similar order of magnitude
of time as a normal fsck, except 'offline' and able to handle a much
larger filesystem.
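
The sort-and-merge idea above can be sketched roughly as follows. This
is an illustration only, not an fsck design: in-memory lists stand in
for the 'list' files that would live on another filesystem, sorted()
stands in for an external sort, and the record layouts and the names
duplicate_blocks and merge_inodes are assumptions made up for the
example.

```python
from itertools import groupby
from operator import itemgetter


def duplicate_blocks(claims):
    """claims: iterable of (inode, block) pairs read from the metadata.

    Sort by block number, then one sequential pass finds every block
    that is claimed by more than one inode.
    """
    by_block = sorted(claims, key=itemgetter(1))
    dups = {}
    for block, run in groupby(by_block, key=itemgetter(1)):
        owners = [inode for inode, _ in run]
        if len(owners) > 1:
            dups[block] = owners
    return dups


def merge_inodes(allocated, dirents):
    """allocated: inode numbers marked in use; dirents: (name, inode).

    Both lists are sorted by inode and walked in step, like a merge
    join. Returns (allocated inodes that no directory entry names,
    directory entries that name an unallocated inode).
    """
    alloc = sorted(set(allocated))
    refs = sorted(dirents, key=itemgetter(1))
    unreferenced, dangling = [], []
    i = j = 0
    while i < len(alloc) and j < len(refs):
        if alloc[i] < refs[j][1]:
            unreferenced.append(alloc[i])
            i += 1
        elif alloc[i] > refs[j][1]:
            dangling.append(refs[j])
            j += 1
        else:
            inode = alloc[i]
            # Skip every entry naming this inode (hard links).
            while j < len(refs) and refs[j][1] == inode:
                j += 1
            i += 1
    unreferenced.extend(alloc[i:])
    dangling.extend(refs[j:])
    return unreferenced, dangling


# Tiny made-up example: block 100 is claimed twice, inodes 4 and 7
# have no name, and entry "c" points at an unallocated inode.
dups = duplicate_blocks([(2, 100), (3, 101), (4, 100)])
unreferenced, dangling = merge_inodes(
    [2, 3, 4, 7], [("a", 2), ("b", 3), ("c", 9), ("d", 2)]
)
print(dups, unreferenced, dangling)
```

Each pass only reads its sorted input sequentially, which is what
would let the working set stay small however large the filesystem is.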

julian