Date:      Sat, 18 Nov 2000 03:31:40 -0500 (EST)
From:      Chris Jesseman <chris@sitemajic.net>
To:        Mark Garcia <markg@go2net.com>
Cc:        Mike Meyer <mwm@mired.org>, Chris Jesseman <whacker@sitemajic.net>, questions@freebsd.org
Subject:   Re: du -df inconsistency - get fsck to fix?
Message-ID:  <974536300.3a163e6c94542@www.sitemajic.net>
In-Reply-To: <Pine.LNX.3.96.1001116233208.12295A-100000@skritz.go2net.com>
References:  <Pine.LNX.3.96.1001116233208.12295A-100000@skritz.go2net.com>

Thanks a bunch, guys. I unmounted, ran fsck, and remounted, and it cleared up.

Before I was familiar with this type of problem, I used lsof and a shell script
I found while researching open files for this thread; at the time there was only
one open file. Restarting the offending process didn't seem to help this time.
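
For anyone who finds this thread later, here is a minimal Python sketch of the behavior Mike describes in the quoted text: a file that has been rm'ed but is still held open keeps its data until the descriptor is closed. The function name and the returned dictionary keys are made up for illustration, and it uses a temporary file rather than a real log, so treat it as a demonstration under those assumptions, not a diagnostic tool.

```python
import os
import tempfile


def unlinked_open_file_demo(size=1024 * 1024):
    """Write `size` bytes, unlink the file while it is still open,
    and report what the open descriptor can still see.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.unlink(path)            # what rm does: remove the directory entry
        st = os.fstat(f.fileno())  # the inode is still reachable via the descriptor
        result = {
            "path_exists": os.path.exists(path),  # name is gone, so du cannot find it
            "link_count": st.st_nlink,            # no directory entries remain
            "size": st.st_size,                   # data is still there
        }
    # Leaving the with-block closes the descriptor; only now can the
    # file system release the space.
    return result


if __name__ == "__main__":
    print(unlinked_open_file_demo())
```

While the descriptor is open, du (which walks directory entries) no longer counts the file, but df (which reads the file system's block counts) still does, which matches the mismatch shown in the du and df output quoted further down.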

-Chris Jesseman

Quoting Mark Garcia <markg@go2net.com>:

> On Fri, 17 Nov 2000, Mike Meyer wrote:
> 
> > Mark Garcia <markg@skritz.go2net.com> types:
> > > On Fri, 17 Nov 2000, Mike Meyer wrote:
> > > 
> > > > Chris Jesseman <whacker@sitemajic.net> types:
> > > > > I did have an open file but trashing it didn't help. I'm just curious if this
> > > > > is on a lower file system level or would restarting the offending logger
> > > > > release the space? I really don't want to reboot...
> > > > 
> > > > What do you mean by "trashed"? Files aren't removed from the file
> > > > system until the last link to them is broken. Rm on a file just breaks
> > > > one link. If a process has an open file descriptor pointing at the
> > > > file, *that's* also a link - and the file won't go away until the
> > > > process closes the link.
> > > > Stopping a process closes all its open files, so restarting a server
> > > > should work.
> > 
> > My bad - I wasn't very clear. By "server", I meant the server process
> > with the open log file, not the hardware that software is running on.
> > 
> > > I'd hope it would work <snicker>  Actually, restarting the server is an
> > > arcane way of handling this.  It's probably the easiest and most intrusive
> > > way of telling the kernel to go stuff itself and forget about what it was
> > > doing.  Rather, you should umount the partition, run fsck -p on it, choose
> > > 'yes' to the appropriate inode to clear, and then remount.
> > 
> > No, restarting a server that has an open log file filling a partition
> > isn't an arcane way of handling this, it's SOP. The kernel isn't
> > involved at all, except that once the file closes, it will notice
> > there are no longer any links to it, and free the disk space.
> > 
> > The problem with your solution is that - if I've diagnosed the problem
> > correctly - you won't be able to unmount the disk until the file is
> 
> True in most cases.  But, you can still have a kernel pointer to a memory
> allocation associated with what _was_ the space taken by the pre-existing
> filehandle.  The unmounting of a file system will be stopped if there is a
> _link_ from a process to a file.  By him removing that link, the kernel
> has lost the pointer of the process to that allocated space.  The process
> can no longer communicate to the kernel to tell it to clear the pointer.
> And the kernel will hold in memory that reference to the space.  The disk
> can be unmounted in this case.
> 
> > closed anyway. At which time, the problem is fixed, so why bother
> > going through all that extra work?
> 
> I've experienced this a lot.  Usually from a program, e.g. qmail, which is
> trying to handle some 50 MB file that someone sent, and it tries to
> deliver it, but qmail-remote barfs a horrible death, and the kernel still
> thinks there is a 50 MB file... this happens a couple of times and the disk
> runs out of space... when a 'du' reflects that there is only 2% used.
> 
> There is no link anymore of a process with a filehandle open, but the
> allocated space was never cleaned up from the crashing qmail process.  I
> guess you can say a memory leak.  But this is the case... an umount met
> with success, fscking smacked the fs in shape, and the kernel cleared
> itself (some garbage collecting) and the space came back in a blink of an
> eye...
> 
> ->MAG
> 
> > 	<mike
> > > > > > Thanks much you guys help!
> > > > > Chris Jesseman
> > > > > > 
> > > > > > > Well, "optimal" depends on your goals.  Looks like you've got a server
> > > > > > > log file you rm'ed from the file system, but the server is still
> > > > > > > logging to it. You need to convince the server to close the log file,
> > > > > > > which will cause it to be removed from the disk. If worst comes to
> > > > > > > worst, doing a shutdown and reboot will solve the problem.
> > > > > > 
> > > > > > 	<mike
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > > Thank you,
> > > > > > > Chris Jesseman
> > > > > > > 
> > > > > > > [0] /var#fsck  /dev/da0s1e
> > > > > > > ** /dev/da0s1e (NO WRITE)
> > > > > > > ** Last Mounted on /var
> > > > > > > ** Phase 1 - Check Blocks and Sizes
> > > > > > > ** Phase 2 - Check Pathnames
> > > > > > > ** Phase 3 - Check Connectivity
> > > > > > > ** Phase 4 - Check Reference Counts
> > > > > > > UNREF FILE I=94  OWNER=root MODE=100644
> > > > > > > SIZE=18173952 MTIME=Nov 15 22:09 2000 
> > > > > > > CLEAR? no
> > > > > > > 
> > > > > > > ** Phase 5 - Check Cyl groups
> > > > > > > > 55 files, 17867 used, 1948 free (108 frags, 230 blocks, 0.5% fragmentation)
> > > > > > > 
> > > > > > > 
> > > > > > > Misc. info from my 4.1.1 Stable box:
> > > > > > > 
> > > > > > > [0] /var#fsck -p /dev/da0s1e
> > > > > > > /dev/da0s1e: NO WRITE ACCESS
> > > > > > > /dev/da0s1e: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
> > > > > > > 
> > > > > > > [8] /var#cat /etc/fstab
> > > > > > > > # Device                Mountpoint      FStype  Options         Dump    Pass#
> > > > > > > > /dev/da0s1e             /var            ufs     rw              2       2
> > > > > > > 
> > > > > > > [0] /var#mount
> > > > > > > > /dev/da0s1e on /var (ufs, local, writes: sync 1486005 async 715463, reads: sync 20046 async 6527)
> > > > > > > 
> > > > > > > [0] /var#du -h /var
> > > > > > > 1.0K    /var/at/jobs
> > > > > > > 1.0K    /var/at/spool
> > > > > > > 3.0K    /var/at
> > > > > > > 2.0K    /var/crash
> > > > > > > 4.0K    /var/cron/tabs
> > > > > > > 5.0K    /var/cron
> > > > > > > 2.0K    /var/msgs
> > > > > > > 1.0K    /var/preserve
> > > > > > >  52K    /var/run
> > > > > > > 1.0K    /var/rwho
> > > > > > > 1.0K    /var/tmp/vi.recover
> > > > > > > 3.0K    /var/tmp
> > > > > > >  20K    /var/yp
> > > > > > > 1.0K    /var/pwcheck
> > > > > > >  91K    /var
> > > > > > > [0] /var#df -H
> > > > > > > Filesystem    Size   Used  Avail Capacity  Mounted on
> > > > > > > /dev/da0s1e    20M    18M   372K    98%    /var
> > > > > > > 
> > > > > > > 
> > > > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > > > > > > > with "unsubscribe freebsd-questions" in the body of the message
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > > 
> > > > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > > > with "unsubscribe freebsd-questions" in the body of the message
> > > > 
> > > 
> > > 
> > 
> > 
> > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > with "unsubscribe freebsd-questions" in the body of the message
> > 
> 
> 



Chris Jesseman, President
http://www.sitemajic.net





