Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 21 Feb 2001 09:03:26 -0500
From:      "James F. Hranicky" <jfh@cise.ufl.edu>
To:        freebsd-fs@FreeBSD.ORG
Subject:   Re: Softupdates umount bug? Or vinum problem? 
Message-ID:  <20010221140326.8AF89D80A@mail.cise.ufl.edu>
In-Reply-To: Message from Poul-Henning Kamp <phk@critter.freebsd.dk>  of "Tue, 20 Feb 2001 21:56:46 %2B0100." <95034.982702606@critter> 

next in thread | previous in thread | raw e-mail | index | archive | help
Poul-Henning Kamp <phk@critter.freebsd.dk> wrote: 
> 
> You may want to try to look at the commit mckusic did this
> morning/night/whatever, it looked like it had something to
> do with this problem.
 
Is there a time frame for either of these patches making their
way into -stable (I don't see either going into releng_4) ? 
Or can anyone suggest a workaround (e.g., run sync 10 times 
before umounting) ? Or, should I just disable softupdates for 
now (bummer)?

I'm willing to do more testing and work with anyone who's interested.

BTW, all my testing has been on 4.2-STABLE, with latest crash
occuring on this rev: 

	FreeBSD 4.2-STABLE #1: Thu Feb 15 11:31:58 EST 2001

I'm including all the info I mentioned in my first post here.

Jim

--------------------------------------------------------------------
Trace of 1st crash:
-------------------

   db> trace
   Debugger(c0338723) at Debugger+0x34
   panic(c0358bc7,ce22b400,d6f42d74,c02ad4dc,c7606800) at panic+0x70
   ufs_dirbad(c7606800,400,c0358b70,0,d6f9b240) at ufs_dirbad+0x3a
   ufs_lookup(d6f42db0,d6f42dc4,c01bf226,d6f42db0,d6f9b240) at ufs_lookup+0x260
   ufs_vnoperate(d6f42db0,d6f9b240,d6e3d403,d6f42eac,c02b2715) at ufs_vnoperate+0x15
   vfs_cache_lookup(d6f42e08,d6f42e18,c01c1fc4,d6f42e08,d6f9b240) at vfs_cache_lookup+0x28a
   ufs_vnoperate(d6f42e08,d6f9b240,c74bf600,d6f42eac,d6e6bbe0) at ufs_vnoperate+0x15
   lookup(d6f42e84,d6e6bbe0,d6e6bbe0,d6f42f80,d6e6bbe0) at lookup+0x290
   namei(d6f42e84,d6e6bbe0,2,d6f42f80,809c340) at namei+0x178
   stat(d6e6bbe0,d6f42f80,809c348,809e080,809c300) at stat+0x41
   syscall2(2f,2f,2f,809c300,809e080) at syscall2+0x1f1
   Xint0x80_syscall() at Xint0x80_syscall+0x25

Unfortunately, I don't have a dump.

Trace of 2nd crash + shell commands:
------------------------------------

   <root@dogwood:~> # mount /export/mail 
   <root@dogwood:~> # mount
   /dev/da0s1a on / (ufs, local)
   /dev/da0s1f on /private (ufs, local)
   /dev/da0s1e on /var (ufs, local)
   procfs on /proc (procfs, local)
   /dev/vinum/mail on /export/mail (ufs, NFS exported, local, soft-updates)
   <root@dogwood:~> # tunefs -p /export/mail 
   tunefs: soft updates:  (-n)                                enabled
   tunefs: maximum contiguous block count: (-a)               15
   tunefs: rotational delay between contiguous blocks: (-d)   0 ms
   tunefs: maximum blocks per file in a cylinder group: (-e)  2048
   tunefs: minimum percentage of free space: (-m)             1%
   tunefs: optimization preference: (-o)                      time
   tunefs: should optimize for space with minfree < 8%

[ Seeing if I could optimize for time with 1% minfree -- no. ]

   <root@dogwood:~> # time /var/tmp/make_imap_files
   /cise/mail/mhspool/aa1 done

[ crash ]   

   db> trace
   Debugger(c033bf63) at Debugger+0x34
   panic(c035a1e1,c035a1c0,41c0,7ada80,c70618d4) at panic+0x70
   ffs_valloc(d724ad00,41ed,c732c600,d6f01d08,d6f01e70) at ffs_valloc+0xf8
   ufs_mkdir(d6f01e70,d6f01f2c,c01ccc9a,d6f01e70,d6e58d40) at ufs_mkdir+0x85
   ufs_vnoperate(d6f01e70,d6e58d40,2,d6f01f80,c0374780) at ufs_vnoperate+0x15
   mkdir(d6e58d40,d6f01f80,8099030,4,8099037) at mkdir+0x162
   syscall2(2f,2f,2f,8099037,4) at syscall2+0x1f1
   Xint0x80_syscall() at Xint0x80_syscall+0x25

Here's the make_imap_script (I'll be running uw-imapd with mods
for MH mailboxes):

   #/bin/sh
   ypcat passwd | grep tcsh | perl -nle 'print "$1 $2" if (m,^(\S+):\S+:(\d+:\d+):,)' | \
   sort -u | egrep '^[abc]' |
     while read user ugid ; do
         mkdir -p $top/$user/Mail/inbox
   #
   # Tar about 1000 files into the user's mailbox for testing purposes
   #
   # Approx time to untar:
   #
   #  Vol with two 1 drive plexes: 2-3 seconds
   #  Vol with two 2 drive striped plexes: 1 second 
   #
         ( cd $top/$user/Mail/inbox ; tar xfBp /tmp/c.tar )
         /usr/bin/printf '%s\n%s\n%s\n' 'Path: Mail' 'draft-folder: drafts' 'unseen-sequence: unseen ' > \
                 $top/$user/.mh_profile ;
         /usr/sbin/chown -R $ugid $top/$user
         /bin/chmod -R go-rwx $top/$user
         echo $top/$user done
     done

This runs for about 2300 users, generating quite a bit of fs activity.
In this case, I'm only creating the accounts beginning with [abc]
though...

Info from the dump of the second crash:
---------------------------------------

   <root@palm:/usr/obj/private/freebsd-src/src/sys/CISEKERN> # gdb -k kernel.debug /var/crash/vmcore.1 
   GNU gdb 4.18
   [ ... ]
   IdlePTD 4567040
   initial pcb at 3a6000
   panicstr: ffs_valloc: dup alloc
   panic messages:
   ---
   panic: ffs_valloc: dup alloc
   
   syncing disks... 47 15 7 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
   giving up on 2 buffers
   Uptime: 9h59m26s
   
   dumping to dev #da/0x20001, offset 1572992
   dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
   ---
   #0  dumpsys () at /private/freebsd-src/src/sys/kern/kern_shutdown.c:469
   469             if (dumping++) {
   (kgdb) where
   #0  dumpsys () at /private/freebsd-src/src/sys/kern/kern_shutdown.c:469
   #1  0xc019bd63 in boot (howto=256) at /private/freebsd-src/src/sys/kern/kern_shutdown.c:309
   #2  0xc019c0f9 in panic (fmt=0xc035a1e1 "ffs_valloc: dup alloc")
       at /private/freebsd-src/src/sys/kern/kern_shutdown.c:556
   #3  0xc02a4054 in ffs_valloc (pvp=0xd724ad00, mode=16877, cred=0xc732c600, vpp=0xd6f01d08)
       at /private/freebsd-src/src/sys/ufs/ffs/ffs_alloc.c:609
   #4  0xc02b5c25 in ufs_mkdir (ap=0xd6f01e70) at /private/freebsd-src/src/sys/ufs/ufs/ufs_vnops.c:1310
   #5  0xc02b6cbd in ufs_vnoperate (ap=0xd6f01e70) at /private/freebsd-src/src/sys/ufs/ufs/ufs_vnops.c:2287
   #6  0xc01ccc9a in mkdir (p=0xd6e58d40, uap=0xd6f01f80) at vnode_if.h:653
   #7  0xc03004ed in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134844471, tf_esi = 4, 
         tf_ebp = -1077937700, tf_isp = -688906284, tf_ebx = 134844464, tf_edx = 134844464, tf_ecx = 4, 
         tf_eax = 136, tf_trapno = 7, tf_err = 2, tf_eip = 134605108, tf_cs = 31, tf_eflags = 518, 
         tf_esp = -1077938064, tf_ss = 47}) at /private/freebsd-src/src/sys/i386/i386/trap.c:1150
   #8  0xc02f1ca5 in Xint0x80_syscall ()
   #9  0x8050a1d in ?? ()
   #10 0x8052da2 in ?? ()
   #11 0x8048135 in ?? ()
   (kgdb) up
   #1  0xc019bd63 in boot (howto=256) at /private/freebsd-src/src/sys/kern/kern_shutdown.c:309
   309                     dumpsys();
   (kgdb) 
   #2  0xc019c0f9 in panic (fmt=0xc035a1e1 "ffs_valloc: dup alloc")
       at /private/freebsd-src/src/sys/kern/kern_shutdown.c:556
   556             boot(bootopt);
   (kgdb) 
   #3  0xc02a4054 in ffs_valloc (pvp=0xd724ad00, mode=16877, cred=0xc732c600, vpp=0xd6f01d08)
       at /private/freebsd-src/src/sys/ufs/ffs/ffs_alloc.c:609
   609                     panic("ffs_valloc: dup alloc");
   (kgdb) list
   604             }
   605             ip = VTOI(*vpp);
   606             if (ip->i_mode) {
   607                     printf("mode = 0%o, inum = %lu, fs = %s\n",
   608                         ip->i_mode, (u_long)ip->i_number, fs->fs_fsmnt);
-->609                     panic("ffs_valloc: dup alloc");
   610             }
   611             if (ip->i_blocks) {                             /* XXX */
   612                     printf("free inode %s/%lu had %ld blocks\n",
   613                         fs->fs_fsmnt, (u_long)ino, (long)ip->i_blocks);
   (kgdb) up
   #4  0xc02b5c25 in ufs_mkdir (ap=0xd6f01e70) at /private/freebsd-src/src/sys/ufs/ufs/ufs_vnops.c:1310
   1310            error = UFS_VALLOC(dvp, dmode, cnp->cn_cred, &tvp);
   (kgdb) list
   1305            /*
   1306             * Must simulate part of ufs_makeinode here to acquire the inode,
   1307             * but not have it entered in the parent directory. The entry is
   1308             * made later after writing "." and ".." entries.
   1309             */
-->1310            error = UFS_VALLOC(dvp, dmode, cnp->cn_cred, &tvp);
   1311            if (error)
   1312                    goto out;
   1313            ip = VTOI(tvp);
   1314            ip->i_gid = dp->i_gid;


Shell history of 3rd fs crash:
------------------------------
   newfs -v -i 2048 -m 1 /dev/vinum/mail
   fsck /dev/vinum/mail

[ ok ]

   tunefs -n enable /export/mail
   tunefs -n enable /dev/vinum/mail
   fsck /dev/vinum/mail

[ ok ]

   tunefs -p /dev/vinum/mail
   tunefs -o time /dev/vinum/mail

[ fs changed back to space during fs IO ]

   vi /etc/fstab
   mount /export/mail
   df -k
   showmount -e
   cd /export/mail
   ls
   cd
   time /var/tmp/make_imap_files
   cat > /var/tmp/mkimf0
   vinum list
   more /tmp/mail.p1
   vinum create -f /tmp/mail.p1
   vinum list
   vinum attach mail.p1 mail
   vinum start mail.p1
   vinum list
   find /export/mail

[ ok ]

   time /var/tmp/make_imap_files
   rm -rf /cise/mail/mhspool/a*
   vinum list
   man iostat
   iostat -n 5 1
   ls -d /cise/mail/mhspool/a*
   rm -rf /cise/mail/mhspool/[bc]*
   vinum list
   time /var/tmp/make_imap_files
   vinum list
   cat > /var/tmp/mkimf1
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list
   vinum list

[ waited for mirror to sync up, between 1-2 hours ]

   df -k /export/mail

[ seemed ok ]

   umount /export/mail
   fsck /dev/vinum/mail

[ failed, many errors ]

   mount /export/mail

[ failed ]

   vinum list
   echo '***** filesystem dead *****'

I went ahead and ran fsck again just now (Wed morning). Here are the 
highlights (3M worth of output).

fsck output:
---------------

   70578 DUP I=31872
   UNEXPECTED SOFT UPDATE INCONSISTENCY
   70579 DUP I=31872
   UNEXPECTED SOFT UPDATE INCONSISTENCY
   70580 DUP I=31873
   UNEXPECTED SOFT UPDATE INCONSISTENCY
   70581 DUP I=31873

[....]

   UNEXPECTED SOFT UPDATE INCONSISTENCY
   1806410 DUP I=849031
   UNEXPECTED SOFT UPDATE INCONSISTENCY
   INTERNAL ERROR: dups with -p

[....]

   UNEXPECTED SOFT UPDATE INCONSISTENCY
   UNALLOCATED NAME=/lost+found/#01311046

   UNEXPECTED SOFT UPDATE INCONSISTENCY
   UNALLOCATED NAME=/lost+found/#01311043

[....]

   UNEXPECTED SOFT UPDATE INCONSISTENCY
   DUP/BAD FILE=/lost+found/#00370626

[....]

    I=1125952  OWNER=aarcot MODE=0
   SIZE=0 MTIME=Feb 20 10:23 2001 

   REMOVE? yes

[....]

    I=2175671  OWNER=cjn MODE=0
   SIZE=0 MTIME=Feb 20 10:27 2001  COUNT 0 SHOULD BE -1
   ADJUST? yes

[....]

   ** Phase 5 - Check Cyl groups

   SALVAGE? yes


   SALVAGE? yes


   SALVAGE? yes

   (53252 frags, 4059511 blocks, 0.2% fragmentation)

   ***** FILE SYSTEM MARKED CLEAN *****

[ Seems like this is premature ]

   ***** FILE SYSTEM WAS MODIFIED *****

Here's what the dir looks like now:
-----------------------------------

   <root@dogwood:~> # ls /export/mail/mhspool 
   aa1/            aesquive/       ajohnson/       amm/            aritter/        av0/
   aa2/            aev/            ajp/            amolina/        arowan/         avance/

[....]

   adossani/       ajheim/         amlone/         aring/          aup/            inbox/
    
Many of the files were missing, and an inbox was put back in the wrong place.

lost+found:

   <root@dogwood:~> # ls /export/mail/lost+found 
   ls: #00354752: Bad file descriptor
   ls: #00539840: Bad file descriptor
   ls: #00802048: Bad file descriptor

[30 lines of that, then many # files]

   <root@dogwood:/export/mail/lost+found> # /bin/ls -1 | wc -l
       4449

I'm willing to do more looking around if somone can tell me
where to look.

--------------------------------------------------------------------

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20010221140326.8AF89D80A>