Date:      Mon, 16 Jun 2003 02:10:09 -0700 (PDT)
From:      Tom Alsberg <alsbergt@cs.huji.ac.il>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: kern/53004: union_lookup returning . (0xbc332e90) not same as startdir (0xc1fa8a40)
Message-ID:  <200306160910.h5G9A97u086923@freefall.freebsd.org>

The following reply was made to PR kern/53004; it has been noted by GNATS.

From: Tom Alsberg <alsbergt@cs.huji.ac.il>
To: freebsd-gnats-submit@FreeBSD.org, scrappy@hub.org
Cc:  
Subject: Re: kern/53004: union_lookup returning . (0xbc332e90) not same as startdir (0xc1fa8a40)
Date: Mon, 16 Jun 2003 12:01:47 +0300

 I noticed this a few days ago too, and sent a message to the
 FreeBSD-hackers list.  David Schultz <das@FreeBSD.ORG> asked me to
 repost it to GNATS as a followup to this PR.  It follows below,
 including a simple (and, as far as I can tell, reliable) way to
 reproduce it:
 
 <snip>
 From: Tom Alsberg <alsbergt@cs.huji.ac.il>
 To: FreeBSD Hackers List <freebsd-hackers@freebsd.org>
 Subject: (bug?) panic in union filesystem - file/.
 
 Hi there.
 
 I recently stumbled upon a crash in the union filesystem.  It seems
 that when trying to stat "<file>/." where file is a regular
 (non-directory) file in a union mounted filesystem, the system will
 panic.
 
 I first noticed this as a side effect of tab completion in zsh (the
 Z shell).  As far as I could tell, when the file exists and there are
 no other completions, zsh calls lstat on "<file>/." to see whether it
 is a directory containing other files it should try to complete (I do
 not know why they chose to do it this way).
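 
 For reference, the probe described above can be sketched in user
 space roughly as follows.  This is only a minimal illustration, not
 zsh's actual code; the names probe_dot and report_dot are made up for
 the example.  On an ordinary (non-union) filesystem, calling lstat on
 "<file>/." for a regular file is expected to fail with ENOTDIR rather
 than do anything dramatic:
 
```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: lstat() a path of the form "<name>/.".
 * Returns 0 if the call succeeds (so <name> resolves to a directory),
 * otherwise the errno value that lstat() reported.
 */
int
probe_dot(const char *path_with_dot)
{
	struct stat sb;

	if (lstat(path_with_dot, &sb) == 0)
		return (0);
	return (errno);
}

/* Hypothetical helper: print the outcome of the probe in readable form. */
void
report_dot(const char *path_with_dot)
{
	int err = probe_dot(path_with_dot);

	if (err == 0)
		printf("%s: resolves to a directory\n", path_with_dot);
	else
		printf("%s: lstat failed: %s\n", path_with_dot, strerror(err));
}
```
 
 Calling report_dot("meow/.") from a small main() in a directory where
 meow is a regular file should print an lstat failure on a normal
 filesystem.  Do not try it inside a union mount on an affected
 kernel unless a panic is acceptable.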
 
 It seems like a bug in the union filesystem to me.  I can reproduce it
 on both 4.8-STABLE and 5.1-CURRENT.
 
 The simplest way I found to reproduce it:
 
 # Create two directories somewhere:
 	cd /var/tmp
 	mkdir foo
 	mkdir bar
 # union-mount one on top of the other:
 	mount -t union bar foo
 # enter the mounted directory, create a regular file there, and read
 # <file>/.:
 	cd foo
 	touch meow
 	cat meow/.
 
 Everywhere I checked, there is a panic at that point:
 
 panic: union_lookup returning . (0xc8d83edc) not same as startdir (0xc8cb2e00)
 
 Relevant part of a backtrace (with gdb -k on saved core files of a
 4.8-STABLE kernel compiled with debugging):
 
 <snip>
 #0  dumpsys () at /r+d/4.8/src/sys/kern/kern_shutdown.c:487
 #1  0xc022b067 in boot (howto=256) at /r+d/4.8/src/sys/kern/kern_shutdown.c:316
 #2  0xc022b4a5 in panic (
     fmt=0xc0420e80 "union_lookup returning . (%p) not same as startdir (%p)")
     at /r+d/4.8/src/sys/kern/kern_shutdown.c:595
 #3  0xc02674b8 in union_lookup (ap=0xc8d83d70)
     at /r+d/4.8/src/sys/miscfs/union/union_vnops.c:615
 #4  0xc02577fd in lookup (ndp=0xc8d83ec8) at vnode_if.h:52
 #5  0xc02572f8 in namei (ndp=0xc8d83ec8)
     at /r+d/4.8/src/sys/kern/vfs_lookup.c:153
 #6  0xc025fd43 in vn_open (ndp=0xc8d83ec8, fmode=1, cmode=0)
     at /r+d/4.8/src/sys/kern/vfs_vnops.c:138
 #7  0xc025be78 in open (p=0xc8d74ac0, uap=0xc8d83f80)
     at /r+d/4.8/src/sys/kern/vfs_syscalls.c:1029
 #8  0xc03c5a45 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
       tf_edi = 134564005, tf_esi = -1077939303, tf_ebp = -1077939744, 
       tf_isp = -925351980, tf_ebx = -1077939304, tf_edx = 134578912, 
       tf_ecx = 1, tf_eax = 5, tf_trapno = 12, tf_err = 2, tf_eip = 134531788, 
       tf_cs = 31, tf_eflags = 663, tf_esp = -1077939788, tf_ss = 47})
     at /r+d/4.8/src/sys/i386/i386/trap.c:1175
 #9  0xc03b5995 in Xint0x80_syscall ()
 </snip>
 
 I looked a bit at the code of the union filesystem, and the best I
 can tell so far is that it is caused by union_allocvp storing NULL in
 (*ap->a_vpp) in (src/sys/miscfs/union/union_vnops.c,
                  union_lookup(...), about line 543):
 
         error = union_allocvp(ap->a_vpp, dvp->v_mount, dvp, upperdvp, cnp,
                               uppervp, lowervp, 1);
 
 which later triggers (src/sys/miscfs/union/union_vnops.c,
                       union_lookup(...), about line 573):
 
 #ifdef DIAGNOSTIC
         if (cnp->cn_namelen == 1 &&
             cnp->cn_nameptr[0] == '.' &&
             *ap->a_vpp != dvp) {
                 panic("union_lookup returning . (%p) not same as startdir (%p)", ap->a_vpp, dvp);
         }
 #endif
 
 But I'm not sure what exactly goes wrong in or before union_allocvp,
 and I don't yet understand what is going on in the code there.  In
 particular, I'm not sure what the DIAGNOSTIC-marked code is for, or
 why this specific case is special.  I do see that union_lookup would
 just fail (and not panic) without it, so doing without it is perhaps
 a workaround...
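 
 To make the condition easier to reason about, here is a small
 user-space model of the DIAGNOSTIC test quoted above.  It is not
 kernel code: struct model_vnode and both function names are invented
 for the illustration.  The second function shows one possible
 variant, purely as an assumption and not as a tested patch, in which
 the check is applied only when the lookup itself succeeded:
 
```c
#include <stddef.h>

/* Hypothetical stand-in for a vnode; not the kernel's struct vnode. */
struct model_vnode {
	int id;
};

/*
 * Model of the quoted condition: the component being looked up is "."
 * and the vnode about to be returned is not the starting directory.
 * Returns nonzero when the kernel check would call panic().
 */
int
dot_check_fires(const char *name, size_t namelen,
    const struct model_vnode *result, const struct model_vnode *startdir)
{
	return (namelen == 1 && name[0] == '.' && result != startdir);
}

/*
 * Assumed variant: skip the check when the lookup already failed, so
 * that a NULL result left behind by a failed allocation leads to an
 * error return instead of a panic.
 */
int
dot_check_fires_guarded(int error, const char *name, size_t namelen,
    const struct model_vnode *result, const struct model_vnode *startdir)
{
	return (error == 0 &&
	    dot_check_fires(name, namelen, result, startdir));
}
```
 
 In this model, a lookup of "." that leaves the result NULL makes the
 unguarded check fire whether or not an error was reported, while the
 guarded variant fires only if the lookup claimed success.  Whether
 such a guard is the right fix in union_lookup is exactly the open
 question above.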
 
 Can someone with more experience/understanding of the union filesystem
 take a look at this?
 
   Thanks,
   -- Tom
 </snip>
 
 -- 
   Tom Alsberg - hacker (being the best description fitting this space)
   Web page:	http://www.cs.huji.ac.il/~alsbergt/
 DISCLAIMER:  The above message does not even necessarily represent what
 my fingers have typed on the keyboard, save anything further.
 
 



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200306160910.h5G9A97u086923>