Date:      Sun, 07 Dec 2014 11:16:57 +0100
From:      Ireneusz Pluta <ipluta@wp.pl>
To:        freebsd-fs@freebsd.org
Subject:   A way to quick fix of "leaking lots of unreferenced inodes"
Message-ID:  <54842919.3030204@wp.pl>

Hello,

I need to fix a server running:

9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012

which suffers a lot from the bug fixed with this commit:

http://lists.freebsd.org/pipermail/svn-src-releng/2013-September/000114.html

The main and practically only purpose of this machine is running a PostgreSQL server with several 
multi-hundred GB databases.

And yes, my current "procedure" is to restart postgres, together with umount -f 
/mountpoint/of/pgsql/data, every few weeks, so that the leak does not accumulate too much over longer uptimes.

To avoid the risk of trouble with freebsd-update, and with it the machine being down for longer than 
the expected half hour to an hour, I am thinking of making the fix quicker by applying just the

-            VOP_UNLOCK(nvp, 0);
+            vput(nvp);

patch to /usr/src/sys/ufs/ufs/ufs_vnops.c and rebuilding the kernel. But I want to make sure that 
applying only this patch to 9.1-RELEASE, and ignoring all other changes within sys/ufs/ufs 
made before this commit, is sufficient and safe for the filesystem.
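
For illustration only, here is a minimal user-space sketch of why this one-line change matters, assuming (as the vnode API is documented) that VOP_UNLOCK() drops only the vnode lock while vput() drops the lock and the reference. This is not the kernel code: struct toy_vnode and all toy_* functions are hypothetical names made up for this model.

```c
#include <assert.h>

/* Toy stand-in for a vnode: one lock flag and one use count. */
struct toy_vnode {
    int locked;    /* 1 while the vnode lock is held */
    int usecount;  /* references that keep the vnode in use */
};

/* Models acquiring a vnode: it comes back locked and referenced. */
static void toy_vget(struct toy_vnode *vp)
{
    assert(!vp->locked);
    vp->locked = 1;
    vp->usecount++;
}

/* Models VOP_UNLOCK(): drops the lock only, the reference stays. */
static void toy_unlock(struct toy_vnode *vp)
{
    assert(vp->locked);
    vp->locked = 0;
}

/* Models vput(): drops the lock and then the reference. */
static void toy_vput(struct toy_vnode *vp)
{
    toy_unlock(vp);
    assert(vp->usecount > 0);
    vp->usecount--;
}

/* Runs `passes` acquire/release cycles and returns the leftover references. */
static int toy_leftover_refs(int passes, int use_vput)
{
    struct toy_vnode vn = { 0, 0 };

    for (int i = 0; i < passes; i++) {
        toy_vget(&vn);
        if (use_vput)
            toy_vput(&vn);    /* patched behaviour */
        else
            toy_unlock(&vn);  /* unpatched behaviour */
    }
    return vn.usecount;
}
```

In this model, toy_leftover_refs(1000, 0) leaves 1000 references behind, while toy_leftover_refs(1000, 1) leaves none, which would match the leak growing with uptime until the filesystem is forcibly unmounted.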

Browsing the commit log of my local git clone of the FreeBSD repo between release/9.1.0 and this fix, I 
can see that there was one more, earlier change to ufs_vnops.c itself:

$ git log  -U0 --oneline release/9.1.0..releng/9.2 sys/ufs/ufs/ufs_vnops.c
0ec41e1 MFS of 255104: MFC of 253998:
diff --git a/sys/ufs/ufs/ufs_vnops.c b/sys/ufs/ufs/ufs_vnops.c
index 487477c..b70166d 100644
--- a/sys/ufs/ufs/ufs_vnops.c
+++ b/sys/ufs/ufs/ufs_vnops.c
@@ -1274 +1274 @@ relock:
-                       VOP_UNLOCK(nvp, 0);
+                       vput(nvp);
3d32639 MFC r248422: Remove negative name cache entry pointing to the target name, which could be 
instantiated while tdvp was unlocked.
diff --git a/sys/ufs/ufs/ufs_vnops.c b/sys/ufs/ufs/ufs_vnops.c
index 03c8bb0..487477c 100644
--- a/sys/ufs/ufs/ufs_vnops.c
+++ b/sys/ufs/ufs/ufs_vnops.c
@@ -1564,0 +1565 @@ relock:
+       cache_purge_negative(tdvp);

and several more under the sys/ufs/ufs path, as listed at the end of this message.

So, once again, my question: can I just apply to my /usr/src/sys/ufs/ufs/ufs_vnops.c the change 
from http://lists.freebsd.org/pipermail/svn-src-releng/2013-September/000114.html, recompile, and not 
worry about all the other changes?

Thanks
Irek.

$ git log  -U0 --oneline release/9.1.0..releng/9.2 sys/ufs/ufs | tee
0ec41e1 MFS of 255104: MFC of 253998:
diff --git a/sys/ufs/ufs/ufs_vnops.c b/sys/ufs/ufs/ufs_vnops.c
index 487477c..b70166d 100644
--- a/sys/ufs/ufs/ufs_vnops.c
+++ b/sys/ufs/ufs/ufs_vnops.c
@@ -1274 +1274 @@ relock:
-                       VOP_UNLOCK(nvp, 0);
+                       vput(nvp);
a89175a Merge the second part of the unmapped I/O changes.  This enables the infrastructure in the 
block layer and UFS filesystem as well as a few drivers.  The list of MFC revisions is long, so I 
won't quote changelogs.
diff --git a/sys/ufs/ufs/ufs_extern.h b/sys/ufs/ufs/ufs_extern.h
index c590748..31a2ba8 100644
--- a/sys/ufs/ufs/ufs_extern.h
+++ b/sys/ufs/ufs/ufs_extern.h
@@ -123,0 +124 @@ void  softdep_revert_rmdir(struct inode *, struct inode *);
+#define        BA_UNMAPPED     0x00040000      /* Do not mmap resulted buffer. */
63c193a MFC of 248561:
diff --git a/sys/ufs/ufs/ufs_lookup.c b/sys/ufs/ufs/ufs_lookup.c
index 35fe8fd..8d11e24 100644
--- a/sys/ufs/ufs/ufs_lookup.c
+++ b/sys/ufs/ufs/ufs_lookup.c
@@ -1388 +1388,2 @@ static int
-ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino)
+ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino,
+    struct vnode **dd_vp)
@@ -1390,0 +1392 @@ ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino)
+       struct vnode *ddvp;
@@ -1392,0 +1395 @@ ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino)
+       ASSERT_VOP_LOCKED(vp, "ufs_dir_dd_ino");
@@ -1394,0 +1398,13 @@ ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino)
+       /*
+        * First check to see if we have it in the name cache.
+        */
+       if ((ddvp = vn_dir_dd_ino(vp)) != NULL) {
+               KASSERT(ddvp->v_mount == vp->v_mount,
+                   ("ufs_dir_dd_ino: Unexpected mount point crossing"));
+               *dd_ino = VTOI(ddvp)->i_number;
+               *dd_vp = ddvp;
+               return (0);
+       }
+       /*
+        * Have to read the directory.
+        */
@@ -1411,0 +1428 @@ ufs_dir_dd_ino(struct vnode *vp, struct ucred *cred, ino_t *dd_ino)
+       *dd_vp = NULL;
@@ -1436 +1453 @@ ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct u
-               error = ufs_dir_dd_ino(vp, cred, &dd_ino);
+               error = ufs_dir_dd_ino(vp, cred, &dd_ino, &vp1);
@@ -1447,15 +1464,7 @@ ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct u
-               error = VFS_VGET(mp, dd_ino, LK_SHARED | LK_NOWAIT, &vp1);
-               if (error != 0) {
-                       *wait_ino = dd_ino;
-                       break;
-               }
-               /* Recheck that ".." still points to vp1 after relock of vp */
-               error = ufs_dir_dd_ino(vp, cred, &dd_ino);
-               if (error != 0) {
-                       vput(vp1);
-                       break;
-               }
-               /* Redo the check of ".." if directory was reparented */
-               if (dd_ino != VTOI(vp1)->i_number) {
-                       vput(vp1);
-                       continue;
+               if (vp1 == NULL) {
+                       error = VFS_VGET(mp, dd_ino, LK_SHARED | LK_NOWAIT,
+                           &vp1);
+                       if (error != 0) {
+                               *wait_ino = dd_ino;
+                               break;
+                       }
@@ -1462,0 +1472,2 @@ ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct u
+               KASSERT(dd_ino == VTOI(vp1)->i_number,
+                   ("directory %d reparented\n", VTOI(vp1)->i_number));
@@ -1469,0 +1481,2 @@ ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct u
+       if (vp1 != NULL)
+               vput(vp1);
3d32639 MFC r248422: Remove negative name cache entry pointing to the target name, which could be 
instantiated while tdvp was unlocked.
diff --git a/sys/ufs/ufs/ufs_vnops.c b/sys/ufs/ufs/ufs_vnops.c
index 03c8bb0..487477c 100644
--- a/sys/ufs/ufs/ufs_vnops.c
+++ b/sys/ufs/ufs/ufs_vnops.c
@@ -1564,0 +1565 @@ relock:
+       cache_purge_negative(tdvp);
b89ace2 MFC r247388: Work around the hold of references to the struct dquot by the freeblk workitems 
for some time at unmount.
diff --git a/sys/ufs/ufs/ufs_quota.c b/sys/ufs/ufs/ufs_quota.c
index c3789c3..88437c9 100644
--- a/sys/ufs/ufs/ufs_quota.c
+++ b/sys/ufs/ufs/ufs_quota.c
@@ -83 +83 @@ static int dqsync(struct vnode *, struct dquot *);
-static void dqflush(struct vnode *);
+static int dqflush(struct vnode *);
@@ -683,2 +683,6 @@ again:
-       dqflush(qvp);
-       /* Clear um_quotas before closing the quota vnode to prevent
+       error = dqflush(qvp);
+       if (error != 0)
+               return (error);
+
+       /*
+        * Clear um_quotas before closing the quota vnode to prevent
@@ -1618 +1622 @@ out:
-static void
+static int
@@ -1622,0 +1627 @@ dqflush(struct vnode *vp)
+       int error;
@@ -1628,0 +1634 @@ dqflush(struct vnode *vp)
+       error = 0;
@@ -1636,3 +1642,5 @@ dqflush(struct vnode *vp)
-                               panic("dqflush: stray dquot");
-                       LIST_REMOVE(dq, dq_hash);
-                       dq->dq_ump = (struct ufsmount *)0;
+                               error = EBUSY;
+                       else {
+                               LIST_REMOVE(dq, dq_hash);
+                               dq->dq_ump = NULL;
+                       }
@@ -1641,0 +1650 @@ dqflush(struct vnode *vp)
+       return (error);
3436e90 MFC     r246562:
diff --git a/sys/ufs/ufs/inode.h b/sys/ufs/ufs/inode.h
index 51f0197..25142dd 100644
--- a/sys/ufs/ufs/inode.h
+++ b/sys/ufs/ufs/inode.h
@@ -154,4 +153,0 @@ struct inode {
-#define        MAXSYMLINKLEN(ip) \
-       ((ip)->i_ump->um_fstype == UFS1) ? \
-       ((NDADDR + NIADDR) * sizeof(ufs1_daddr_t)) : \
-       ((NDADDR + NIADDR) * sizeof(ufs2_daddr_t))
1572df8 MFC     r239359:
diff --git a/sys/ufs/ufs/inode.h b/sys/ufs/ufs/inode.h
index 2b02000..51f0197 100644
--- a/sys/ufs/ufs/inode.h
+++ b/sys/ufs/ufs/inode.h
@@ -170 +169,0 @@ struct indir {
-       int     in_exists;              /* Flag if the block exists. */
diff --git a/sys/ufs/ufs/ufs_bmap.c b/sys/ufs/ufs/ufs_bmap.c
index e0fb307..22887c8 100644
--- a/sys/ufs/ufs/ufs_bmap.c
+++ b/sys/ufs/ufs/ufs_bmap.c
@@ -215 +214,0 @@ ufs_bmaparray(vp, bn, bnp, nbp, runp, runb)
-               ap->in_exists = 1;
@@ -360 +358,0 @@ ufs_getlbns(vp, bn, ap, nump)
-       ap->in_exists = 0;
@@ -373 +370,0 @@ ufs_getlbns(vp, bn, ap, nump)
-               ap->in_exists = 0;
a53e5a7 MFC     r246299;
diff --git a/sys/ufs/ufs/ufs_lookup.c b/sys/ufs/ufs/ufs_lookup.c
index 56ca058..35fe8fd 100644
--- a/sys/ufs/ufs/ufs_lookup.c
+++ b/sys/ufs/ufs/ufs_lookup.c
@@ -1435 +1434,0 @@ ufs_checkpath(ino_t source_ino, ino_t parent_ino, struct inode *target, struct u
-       error = 0;
75f830b MFC r243245:
diff --git a/sys/ufs/ufs/ufsmount.h b/sys/ufs/ufs/ufsmount.h
index 6447dce..b55d958 100644
--- a/sys/ufs/ufs/ufsmount.h
+++ b/sys/ufs/ufs/ufsmount.h
@@ -100,0 +101 @@ struct ufsmount {
+       int     um_writesuspended;              /* suspension in progress */
bb61831 MFC r242476: The r241025 fixed the case when a binary, executed from nullfs mount, was still 
possible to open for write from the lower filesystem.  There is a symmetric situation where the 
binary could already has file descriptors opened for write, but it can be executed from the nullfs 
overlay.
diff --git a/sys/ufs/ufs/ufs_extattr.c b/sys/ufs/ufs/ufs_extattr.c
index 777f385..51bef86 100644
--- a/sys/ufs/ufs/ufs_extattr.c
+++ b/sys/ufs/ufs/ufs_extattr.c
@@ -337 +337 @@ ufs_extattr_enable_with_open(struct ufsmount *ump, struct vnode *vp,
-       vp->v_writecount++;
+       VOP_ADD_WRITECOUNT(vp, 1);
521315f MFC r244239: Fix a typo, resulting in the NULL pointer dereference.
diff --git a/sys/ufs/ufs/ufs_quota.c b/sys/ufs/ufs/ufs_quota.c
index d353167..c3789c3 100644
--- a/sys/ufs/ufs/ufs_quota.c
+++ b/sys/ufs/ufs/ufs_quota.c
@@ -1055 +1055 @@ again:
-                               MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
+                               MNT_VNODE_FOREACH_ACTIVE_ABORT(mp, mvp);





