Date:      Thu, 7 Jul 2005 15:11:45 GMT
From:      Steve Sears <sjs@acm.org>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/83098: 'vrele: negative ref cnt' in shutdown with root and ufs/union
Message-ID:  <200507071511.j67FBjTb019515@www.freebsd.org>
Resent-Message-ID: <200507071520.j67FKO3I098614@freefall.freebsd.org>

>Number:         83098
>Category:       kern
>Synopsis:       'vrele: negative ref cnt' in shutdown with root and ufs/union
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jul 07 15:20:23 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Steve Sears
>Release:        5.3-RELEASE
>Organization:
self
>Environment:
FreeBSD bowler.local.com 5.3-RELEASE FreeBSD 5.3-RELEASE #funky.0: Mon Jun 27 18:26:10 PDT 2005     root@bldf3.local.com:/amd/svlusr01/vol/home1/sjs/p4/ng/funky/freebsd/usr/src/sys/i386/compile/funky  i386
>Description:
I have an embedded application using 5.3 that loads the root FS from compact flash (CF) into a memory disk. The CF is too small for all of the tools, so we keep the largest part of the root image more or less locally, copy it to a second memory disk, and union the two together. This is pretty slick. The memory disk is just ufs, so my fstab looks like this:

# Device              Mountpoint      FStype  Options         Dump    Pass#
/dev/md1             /               ufs     ro,union        1       1
/dev/acd0            /cdrom          cd9660  ro,noauto       0       0
proc                 /proc           procfs  rw              0       0
md                   /tmp            mfs     rw,-M,-s32m     2       0
/dev/nvrd1           /var            ufs     rw              2       0

...and the mount points look like this (output of the mount command):

# mount
/dev/md0 on / (ufs, local)
devfs on /dev (devfs, local)
/dev/ad4s1 on /cfcard (msdosfs, local)
/dev/md1 on / (ufs, local, read-only, union)
devfs on /dev (devfs, local)
/dev/nvrd1 on /var (ufs, local)
procfs on /proc (procfs, local)
/dev/md2 on /tmp (ufs, local, soft-updates)


As I implied above, this actually works pretty well. The problem occurs when you invoke 'shutdown -r': we get a panic because /dev/md1 is missing a reference:

(gdb) bt
#0  0x802f620b in kdb_enter (msg=0x12 <Address 0x12 out of bounds>)
    at cpufunc.h:63
#1  0x802dc650 in panic (fmt=0x8042a9f1 "vrele: negative ref cnt")
    at ../../../kern/kern_shutdown.c:561
#2  0x80335b0e in vrele (vp=0x81b65318) at ../../../kern/vfs_subr.c:2141
#3  0x80330f8b in checkdirs (olddp=0x81b65318, newdp=0x81b64c60)
    at ../../../kern/vfs_mount.c:978
#4  0x803313a2 in dounmount (mp=0x81b5bc00, flags=524288, td=0x81824320)
    at ../../../kern/vfs_mount.c:1119
#5  0x8033744a in vfs_unmountall () at ../../../kern/vfs_subr.c:3112
#6  0x802dc21a in boot (howto=0) at ../../../kern/kern_shutdown.c:391
#7  0x802dbb10 in reboot (td=0x81824320, uap=0x12)
    at ../../../kern/kern_shutdown.c:181
#8  0x80404083 in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 2139090692, tf_esi = 2139090680, tf_ebp = 2139090440, tf_isp = -1554072204, tf_ebx = 2139090552, tf_edx = -1, tf_ecx = 4, tf_eax = 55, tf_trapno = 0, tf_err = 2, tf_eip = 134543359, tf_cs = 31, tf_eflags = 518, tf_esp = 2139090236, tf_ss = 47})
    at ../../../i386/i386/trap.c:1077

...and here is the mount structure, so we know what's happening:

(gdb) frame 4
#4  0x803313a2 in dounmount (mp=0x81b5bc00, flags=524288, td=0x81824320)
    at ../../../kern/vfs_mount.c:1119
1119                            checkdirs(fsrootvp, mp->mnt_vnodecovered);
(gdb) p *mp
$1 = {mnt_list = {tqe_next = 0x0, tqe_prev = 0x81b5d400}, mnt_op = 0x8045c680, 
  mnt_vfc = 0x8045c6c0, mnt_vnodecovered = 0x81b64c60, mnt_syncer = 0x0, 
  mnt_nvnodelist = {tqh_first = 0x81b65318, tqh_last = 0x8c186198}, 
  mnt_lock = {lk_interlock = 0x806f2220, lk_flags = 16794624, 
    lk_sharecount = 0, lk_waitcount = 0, lk_exclusivecount = 1, lk_prio = 80, 
    lk_wmesg = 0x8042a46d "vfslock", lk_timo = 0, lk_lockholder = 0x81824320, 
    lk_newlock = 0x0}, mnt_mtx = {mtx_object = {lo_class = 0x8044d6bc, 
      lo_name = 0x8042a45c "struct mount mtx", 
      lo_type = 0x8042a45c "struct mount mtx", lo_flags = 196608, lo_list = {
        tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, 
    mtx_recurse = 0}, mnt_writeopcount = 1, mnt_flag = 4129, mnt_opt = 0x0, 
  mnt_optnew = 0x0, mnt_kern_flag = 16777217, mnt_maxsymlinklen = 120, 
  mnt_stat = {f_version = 537068824, f_type = 4, f_flags = 4129, 
    f_bsize = 2048, f_iosize = 16384, f_blocks = 123863, f_bfree = 29584, 
    f_bavail = 19675, f_files = 32254, f_ffree = 26992, f_syncwrites = 0, 
    f_asyncwrites = 0, f_syncreads = 801, f_asyncreads = 403, f_spare = {0, 0, 
      0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {
        1120142860, 1346961508}}, f_charspare = '\0' <repeats 79 times>, 
    f_fstypename = "ufs", '\0' <repeats 12 times>, 
    f_mntfromname = "/dev/md1", '\0' <repeats 79 times>, 
    f_mntonname = "/", '\0' <repeats 86 times>}, mnt_cred = 0x81a82e80, 
  mnt_data = 0x819a2d00, mnt_time = 0, mnt_iosize_max = 131072, 
  mnt_export = 0x0, mnt_mntlabel = 0x0, mnt_fslabel = 0x0, 
  mnt_nvnodelistsize = 337}

When I look at checkdirs(), I see that it properly gives up all of its references. But there seems to be a missing reference on the vnode: the final act of checkdirs() is to check whether this is the rootvnode and drop that last ref, which is where we die as the count goes negative.

I believe the code in checkdirs() is correct. I think we are failing to take a reference when we UPDATE-mount on the root; that is, the code fails to recognize this as a root mount and take a root reference on the
vnode. I cannot find code in vfs_mount() or ffs_omount() that checks for a root mount.

When the second disk (md1) is mounted with the 'union' option above,
the MNT_UPDATE flag is passed into the mount routines. This is not a
unionfs issue.

      I think vfs_domount() needs to check for an update to root and
take another reference on the vnode when this happens. I'm not familiar
enough with this code to know what the right fix is, but it really seems
that it should be fixed in the VFS layer, not in the ufs or lower layers.
>How-To-Repeat:
      See description. Mount a disk on /. Mount another disk on / using
the 'union' option. Invoke 'shutdown -r now' and catch the panic in gdb. You can't just unmount it, as this is the root.
>Fix:
      
>Release-Note:
>Audit-Trail:
>Unformatted: