Date:      Sun, 24 Mar 1996 01:00:09 +0200 (EET)
From:      Heikki Suonsivu <hsu@clinet.fi>
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   kern/1098: File system corruption (2 cases)
Message-ID:  <199603232300.BAA17185@katiska.clinet.fi>
Resent-Message-ID: <199603232310.PAA17981@freefall.freebsd.org>

>Number:         1098
>Category:       kern
>Synopsis:       File system corruption (2 cases)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Mar 23 15:10:01 PST 1996
>Last-Modified:
>Originator:     Heikki Suonsivu
>Organization:
Clinet, Espoo, Finland
>Release:        FreeBSD 2.2-CURRENT i386
>Environment:

	First case: P90 news server, two Adaptec 2940s, 2 * 4G news spool,
	1 * 4G root/usr, 1 * 1G news history.  The news disks were mounted
	async when this happened.

	Second case: 486-100, IDE disks, 1*400M root/usr, 1.2G local.
	No async mounts.

	Kernels are from mid-February sups.

>Description:

	Filesystems get seriously corrupted.  Files and directories
	that are in active use at the time of the crash are corrupted.

	fsck -y does not fix the damaged directories; it goes into a loop
	until the broken directory is manually clri'd.  fsck removes the
	cleared inodes, but after salvaging the damaged directory it
	salvages the cleared inodes again, so it never gets rid of the
	cleared files.  clri'ing the damaged directory itself fixes this.

	In the first case, a large number of news directories were
	corrupted, in addition to minor damage on all disks.

	In the second case, things like /dev, /usr/libexec and several
	other actively used directories were lost.  libc.so.2.2 was
	corrupted (at least).  The common factor was that everything
	corrupted was in heavy active use, and the things corrupted
	were not necessarily being modified (libexec and libc hardly
	ever change).

	In the second case the computer first seemed to become confused,
	and finally locked up.  After the reboot it did not come up, as
	/dev was corrupted.

>How-To-Repeat:

	I don't know.  One case happened on an unattended machine (the
	news server).  The second case could have been caused by a shock
	from someone hitting the table the computer was on, but the
	symptoms are exactly the same as on the unattended machine.

	It seems that everything in the buffer cache was corrupted and
	then written back to disk, i.e. it smashes everything important.

>Fix:
	
	I wish I knew.
>Audit-Trail:
>Unformatted:


