Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 26 Sep 1998 07:54:50 -0700 (PDT)
From:      Don Lewis <Don.Lewis@tsc.tdk.com>
To:        FreeBSD-gnats-submit@FreeBSD.ORG
Subject:   bin/8055: [PATCH] more fsck bugs
Message-ID:  <199809261454.HAA15489@w3.gv.tsc.tdk.com>

next in thread | raw e-mail | index | archive | help

>Number:         8055
>Category:       bin
>Synopsis:       fsck has a number of bugs
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 26 08:00:01 PDT 1998
>Last-Modified:
>Originator:     Don Lewis
>Organization:
TDK Semiconductor
>Release:        FreeBSD 3.0-BETA
>Environment:

	Most likely all versions of *BSD

>Description:

While trying to track down some softupdates problems, I noticed that
fsck -p was reconnecting orphaned directories under lost+found and
these directories had multiple hard links to them that were found
in subsequent fsck runs, even though the first fsck run had marked
the filesystem clean.  I found a couple of bugs in fsck that could
potentially cause this problem, but I'm not convinced that I found
the real cause.

	When pass3() finds a directory that does not have connectivity
	to the filesystem tree, it attempts to find the root of the
	directory tree that directory belongs to in order to reconnect
	the root of that tree under lost+found.

		In order to prevent an infinite loop, pass3() will only
		traverse "numdirs" directories before giving up.
		Unfortunately, "numdirs" is obtained from the filesystem
		superblock and may be much less than the number of
		directories that fsck has already found in the filesystem.
		This may cause fsck to give up too early, and reconnect
		the middle of the orphaned tree under lost+found.  Pass3()
		will continue to run and will sooner or later reconnect
		the root of the orphaned tree under lost+found as well,
		resulting in multiple hard links to the intermediate
		directory.

		If pass3() actually encounters a loop of orphaned
		directories, it will pick one of the directories to
		reconnect under lost+found, resulting in multiple
		hard links to this directory.

Other bugs that I stumbled across while digging around in fsck:

	When an orphaned file with a link count greater than one is
	reconnected under lost+found, adjust() does not correct the
	file's link count.

	If malloc() fails in cacheino(), cacheino() just silently
	returns.

	If the call to linkup() in pass3() causes the lost+found
	directory to be created, it may cause the inpsort array to
	be remalloc'ed.  This could be a problem since pass3() holds
	a pointer to somewhere in the middle of the array.


>How-To-Repeat:

>Fix:
	
The number of directories actually found should be counted, and this
number should be used for the loop detection in pass3().

When reconnecting an orphaned ring of directories, the ring should be
broken and the front of the resulting chain should be reconnected under
lost+found.  Kirk McKusick suggests using the name of the link that
was removed as the directory name to be used in lost+found instead of
synthesizing a name.

Adjust() should fix the link count when it reconnects a file under
lost+found.

The malloc() result should be checked.

Pass3() should not keep a pointer into inpsort across calls to linkup().


Here's the patch:


--- dir.c.orig	Wed Sep 23 01:27:22 1998
+++ dir.c	Sat Sep 26 06:00:39 1998
@@ -310,9 +310,13 @@
 
 	dp = ginode(idesc->id_number);
 	if (dp->di_nlink == lcnt) {
-		if (linkup(idesc->id_number, (ino_t)0) == 0)
+		if (linkup(idesc->id_number, (ino_t)0, NULL) == 0) {
 			clri(idesc, "UNREF", 0);
-	} else {
+			return;
+		}
+		lcnt--; /* matches lncntp[orphan]--; in linkup */
+	}
+	if (lcnt != 0) {
 		pwarn("LINK COUNT %s", (lfdir == idesc->id_number) ? lfname :
 			((dp->di_mode & IFMT) == IFDIR ? "DIR" : "FILE"));
 		pinode(idesc->id_number);
@@ -395,9 +399,10 @@
 }
 
 int
-linkup(orphan, parentdir)
+linkup(orphan, parentdir, name)
 	ino_t orphan;
 	ino_t parentdir;
+	char *name;
 {
 	register struct dinode *dp;
 	int lostdir;
@@ -431,6 +436,7 @@
 				lfdir = allocdir(ROOTINO, (ino_t)0, lfmode);
 				if (lfdir != 0) {
 					if (makeentry(ROOTINO, lfdir, lfname) != 0) {
+						numdirs++;
 						if (preen)
 							printf(" (CREATED)\n");
 					} else {
@@ -475,7 +481,7 @@
 		return (0);
 	}
 	(void)lftempname(tempname, orphan);
-	if (makeentry(lfdir, orphan, tempname) == 0) {
+	if (makeentry(lfdir, orphan, (name ? name : tempname)) == 0) {
 		pfatal("SORRY. NO SPACE IN lost+found DIRECTORY");
 		printf("\n\n");
 		return (0);
--- fsck.h.orig	Sun Mar  8 01:55:16 1998
+++ fsck.h	Wed Sep 23 23:54:20 1998
@@ -163,6 +163,7 @@
 	ufs_daddr_t i_blks[1];		/* actually longer */
 } **inphead, **inpsort;
 long numdirs, listmax, inplast;
+long countdirs;			/* number of directories we actually found */
 
 char	*cdevname;		/* name of device being checked */
 long	dev_bsize;		/* computed value of DEV_BSIZE */
@@ -262,7 +263,7 @@
 struct dinode  *ginode __P((ino_t inumber));
 void		inocleanup __P((void));
 void		inodirty __P((void));
-int		linkup __P((ino_t orphan, ino_t parentdir));
+int		linkup __P((ino_t orphan, ino_t parentdir, char *name));
 int		makeentry __P((ino_t parent, ino_t ino, char *name));
 void		panic __P((const char *fmt, ...));
 void		pass1 __P((void));
--- inode.c.orig	Sat Aug  1 11:03:28 1998
+++ inode.c	Wed Sep 23 03:14:23 1998
@@ -380,7 +380,7 @@
 	inp = (struct inoinfo *)
 		malloc(sizeof(*inp) + (blks - 1) * sizeof(ufs_daddr_t));
 	if (inp == NULL)
-		return;
+		errx(EEXIT, "cannot allocate space for inoinfo structure");
 	inpp = &inphead[inumber % numdirs];
 	inp->i_nexthash = *inpp;
 	*inpp = inp;
--- pass1.c.orig	Mon Jul  6 12:11:35 1998
+++ pass1.c	Wed Sep 23 02:19:13 1998
@@ -221,6 +221,7 @@
 		else
 			statemap[inumber] = DSTATE;
 		cacheino(dp, inumber);
+		countdirs++;
 	} else
 		statemap[inumber] = FSTATE;
 	typemap[inumber] = IFTODT(mode);
--- pass3.c.orig	Mon Jun 15 00:07:18 1998
+++ pass3.c	Thu Sep 24 01:04:04 1998
@@ -42,18 +42,29 @@
 #include <sys/param.h>
 
 #include <ufs/ufs/dinode.h>
+#include <ufs/ufs/dir.h>
 
 #include "fsck.h"
 
+static int pass3clear __P((struct inodesc *));
+static int pass3findname __P((struct inodesc *));
+
 void
 pass3()
 {
-	register struct inoinfo **inpp, *inp;
+	register struct inoinfo *inp;
 	ino_t orphan;
 	int loopcnt;
+	long inpindex;
+	struct inodesc idesc;
+	char namebuf[MAXNAMLEN+1];
 
-	for (inpp = &inpsort[inplast - 1]; inpp >= inpsort; inpp--) {
-		inp = *inpp;
+	for (inpindex = inplast - 1; inpindex >= 0; inpindex--) {
+		/*
+		 * inpsort might be remalloc()'d by linkup() if it creates
+		 * lost+found
+		 */
+		inp = inpsort[inpindex];
 		if (inp->i_number == ROOTINO ||
 		    !(inp->i_parent == 0 || statemap[inp->i_number] == DSTATE))
 			continue;
@@ -62,15 +73,76 @@
 		for (loopcnt = 0; ; loopcnt++) {
 			orphan = inp->i_number;
 			if (inp->i_parent == 0 ||
-			    statemap[inp->i_parent] != DSTATE ||
-			    loopcnt > numdirs)
+			    statemap[inp->i_parent] != DSTATE) {
+				if (linkup(orphan, inp->i_dotdot, NULL)) {
+					inp->i_parent = inp->i_dotdot = lfdir;
+					lncntp[lfdir]--;
+					statemap[orphan] = DFOUND;
+					propagate();
+				}
+				break;
+			} else if (loopcnt > countdirs) {
+				pfatal("ORPHANED DIRECTORY LOOP DETECTED I=%lu",
+				       orphan);
+				if (reply("RECONNECT") == 0)
+					break;
+				idesc.id_type = DATA;
+				idesc.id_fix = IGNORE;
+				idesc.id_number = inp->i_parent;
+				idesc.id_parent = orphan;
+				idesc.id_func = pass3findname;
+				idesc.id_name = namebuf;
+				idesc.id_entryno = 0;
+				if ((ckinode(ginode(idesc.id_number), &idesc)
+				     &FOUND) == 0)
+					panic("COULD NOT FIND NAME IN PARENT DIRECTORY");
+				if (linkup(orphan, inp->i_parent, namebuf)) {
+					idesc.id_type = DATA;
+					idesc.id_fix = IGNORE;
+					idesc.id_number = inp->i_parent;
+					idesc.id_parent = orphan;
+					idesc.id_func = pass3clear;
+					if ((ckinode(ginode(idesc.id_number), &idesc)
+					     &FOUND) == 0)
+						panic("COULD NOT FIND NAME IN PARENT DIRECTORY");
+					inp->i_parent = inp->i_dotdot = lfdir;
+					lncntp[lfdir]--;
+					lncntp[orphan]++;
+					statemap[orphan] = DFOUND;
+					propagate();
+				}
 				break;
-			inp = getinoinfo(inp->i_parent);
+			} else {
+				inp = getinoinfo(inp->i_parent);
+			}
 		}
-		(void)linkup(orphan, inp->i_dotdot);
-		inp->i_parent = inp->i_dotdot = lfdir;
-		lncntp[lfdir]--;
-		statemap[orphan] = DFOUND;
-		propagate();
 	}
+}
+
+int
+pass3findname(idesc)
+	struct inodesc *idesc;
+{
+	register struct direct *dirp = idesc->id_dirp;
+
+	if (dirp->d_ino != idesc->id_parent || idesc->id_entryno < 2) {
+		idesc->id_entryno++;
+		return (KEEPON);
+	}
+	memmove(idesc->id_name, dirp->d_name, (size_t)dirp->d_namlen + 1);
+	return (STOP|FOUND);
+}
+
+int
+pass3clear(idesc)
+	struct inodesc *idesc;
+{
+	register struct direct *dirp = idesc->id_dirp;
+
+	if (dirp->d_ino != idesc->id_parent || idesc->id_entryno < 2) {
+		idesc->id_entryno++;
+		return (KEEPON);
+	}
+	dirp->d_ino = 0;
+	return (STOP|FOUND|ALTERED);
 }
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199809261454.HAA15489>