Date:      Sun, 3 Nov 1996 23:18:23 -0500 (EST)
From:      Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
To:        dyson@freebsd.org
Cc:        ponds!freefall.cdrom.com!freebsd-hackers, ponds!lakes.water.net!rivers
Subject:   More info on the daily panics...
Message-ID:  <199611040418.XAA01799@lakes.water.net>

> > 
> > 
> > Everyone -
> > 
> >  As an example of something that crashes 2.1.5 (fairly quickly, I
> > may add) and will not likely be determined by 'crashme'.  I offer
> > the following shell script.
> > 
> If you can digest the problem with your system down to this level, I am sure
> that a fix will be quickly forthcoming.  Try the fix that Bruce just
> posted, and see if your problem goes away.
> 
> John
> 

 Well - I don't really have any ideas... I was trying to suggest that
running 'crashme' for weeks on end probably isn't going to root out
too many, if any, bugs...  The problems I'm seeing all appear to
be file-system related.

 I've applied Bruce's fix; but it really was for a different problem.

 I got another panic today:

	ffs_valloc: dup alloc

 That's the one I usually get when doing the newfs during the install
(for 2.1.0 and 2.1.5) if I don't alter the newfs parameters.  Just to
be clear: this is a 2.1.5-Stable kernel with Bruce's ufs_vnops.c
fix.

 Here's the gdb -k traceback.  This isn't a debuggable kernel (I should
probably fix that), so there's not too much information:


Script started on Sun Nov  3 22:39:49 1996
[ponds.water.net]$ gdb -k kernel.6 vmcore.6
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), 
Copyright 1994 Free Software Foundation, Inc...(no debugging symbols found)...
IdlePTD 1e4000
current pcb at 1d5484
panic: ffs_valloc: dup alloc
#0  0xf0193c7b in boot ()
(kgdb) where
#0  0xf0193c7b in boot ()
#1  0xf0112b83 in panic ()
#2  0xf0175183 in ffs_valloc ()
#3  0xf01813c6 in ufs_makeinode ()
#4  0xf017ed85 in ufs_create ()
#5  0xf012cb97 in vn_open ()
#6  0xf012a3cf in open ()
#7  0xf019bff6 in syscall ()
#8  0xf01914bb in Xsyscall ()
#9  0x33c0 in ?? ()
#10 0x32cd in ?? ()
#11 0x327d in ?? ()
#12 0x31d7 in ?? ()
#13 0x2f7b in ?? ()
#14 0x2e76 in ?? ()
#15 0x3b87 in ?? ()
#16 0x4854 in ?? ()
#17 0x474e in ?? ()
#18 0x467a in ?? ()
#19 0x1ffe in ?? ()
#20 0x1e6b in ?? ()
#21 0x1d21 in ?? ()
#22 0x16a7 in ?? ()
#23 0x10d3 in ?? ()
(kgdb) quit
[ponds.water.net]$ 
Script done on Sun Nov  3 22:40:09 1996

 As you can see from the ls -l on /var/crash, since I enabled
savedump I get a crash almost every day (although at one point I
went almost a week and ran just fine):

[ponds.water.net]$ pwd
/var/crash
[ponds.water.net]$ ls -l
total 131684
-rw-r--r--  1 root  wheel        2 Nov  3 13:21 bounds
-rw-r--r--  1 root  wheel   956177 Oct 23 01:29 kernel.0
-rw-r--r--  1 root  wheel   956177 Oct 24 03:07 kernel.1
-rw-r--r--  1 root  wheel   965207 Oct 30 07:54 kernel.2
-rw-r--r--  1 root  wheel   965207 Oct 31 03:20 kernel.3
-rw-r--r--  1 root  wheel   965207 Nov  1 03:21 kernel.4
-rw-r--r--  1 root  wheel   965207 Nov  2 03:24 kernel.5
-rw-r--r--  1 root  wheel   965207 Nov  3 13:22 kernel.6
-rw-rw-r--  1 root  wheel        5 Jul 16 22:37 minfree
-rw-r--r--  1 root  wheel  8650752 Oct 23 01:29 vmcore.0
-rw-r--r--  1 root  wheel  8650752 Oct 24 03:07 vmcore.1
-rw-r--r--  1 root  wheel  8650752 Oct 30 07:54 vmcore.2
-rw-r--r--  1 root  wheel  8650752 Oct 31 03:20 vmcore.3
-rw-r--r--  1 root  wheel  8650752 Nov  1 03:21 vmcore.4
-rw-r--r--  1 root  wheel  8650752 Nov  2 03:23 vmcore.5
-rw-r--r--  1 root  wheel  8650752 Nov  3 13:22 vmcore.6


(By the way, the panic on Nov 2nd - vmcore.5, was 
    panic: ifree: freeing free inode
 the most frequent one I see...)

You'll note from the times that these crashes tend to occur at
roughly the same time of day (four of the seven dumps above fall
between 03:07 and 03:24).  I've been scanning the cron logs; the
only things I see starting up about that time are the 03:15:00
/usr/libexec/atrun and a 03:15:00 /usr/lib/newsbin/input/newsrun.
Both of these run every hour (well, atrun runs more often than that,
of course) - so I'm not yet sure exactly which command(s) is/are
causing the panic.

Almost all the entries in the cron logs look like:

Oct 23 01:15:01 ponds CRON[3494]: (news) CMD (/usr/lib/newsbin/input/newsrun)
Oct 23 01:20:00 ponds CRON[3610]: (root) CMD (/usr/libexec/atrun)
Oct 23 01:30:01 ponds cron[138]: (CRON) STARTUP (fork ok)

although a few have something besides 'newsrun' starting up...  it all
seems pretty harmless.

Let me add that there are no 'at' entries in the queue, for anyone
on the system...

Perhaps my next step is to build a debuggable kernel and start
debugging...

Let me add that a duplicate alloc, freeing already-freed inodes,
etc. all tend to point to a problem in the same area: something
fishy in the inode management in ffs_alloc.c.

Of course, a reliable reproduction of the problem would help toward
locating the fix...

The 'freeing free inode' panic occurs if the cg_inosused bit for the
inode being freed is already clear.  The 'dup alloc' panic occurs if the
i_mode of the inode returned from the allocator was already set
(indicating the inode was already allocated).  The allocator here
is ffs_nodealloccg(), which can return a wrong result if the cg_inosused
bit for an in-use inode is clear...
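
The two checks described above can be modelled in a few lines of
user-space C.  This is only a minimal sketch, not the kernel code:
struct toy_cg, toy_valloc(), toy_vfree() and all the TOY_* names are
hypothetical, and the real allocation and free paths do a good deal
more.  It just shows how a map bit that disagrees with an inode's
i_mode would produce each of the two conditions.

```c
#define TOY_NBBY    8     /* bits per byte, playing the role of NBBY */
#define TOY_NINODES 64    /* hypothetical inode count for one toy group */

/* Hypothetical stand-ins for the inode-used map and the inodes' modes. */
struct toy_cg {
    unsigned char  inosused[TOY_NINODES / TOY_NBBY]; /* one bit per inode */
    unsigned short i_mode[TOY_NINODES];              /* 0 means "free" */
};

enum toy_result { TOY_OK, TOY_DUP_ALLOC, TOY_FREEING_FREE, TOY_NO_INODES };

/* Same test as isclr(): is bit i of the map clear? */
static int toy_isclr(const unsigned char *map, int i)
{
    return (map[i / TOY_NBBY] & (1 << (i % TOY_NBBY))) == 0;
}

/*
 * Allocate the first inode whose map bit is clear, then check the
 * inode itself.  A non-zero i_mode here means the map and the inode
 * disagree: the "dup alloc" condition.
 */
static enum toy_result toy_valloc(struct toy_cg *cg, unsigned short mode,
                                  int *ino_out)
{
    for (int i = 0; i < TOY_NINODES; i++) {
        if (!toy_isclr(cg->inosused, i))
            continue;
        cg->inosused[i / TOY_NBBY] |= (unsigned char)(1 << (i % TOY_NBBY));
        *ino_out = i;
        if (cg->i_mode[i] != 0)
            return TOY_DUP_ALLOC;
        cg->i_mode[i] = mode;
        return TOY_OK;
    }
    return TOY_NO_INODES;
}

/*
 * Free an inode.  If its map bit is already clear, this is the
 * "freeing free inode" condition.
 */
static enum toy_result toy_vfree(struct toy_cg *cg, int ino)
{
    if (toy_isclr(cg->inosused, ino))
        return TOY_FREEING_FREE;
    cg->inosused[ino / TOY_NBBY] &= (unsigned char)~(1 << (ino % TOY_NBBY));
    cg->i_mode[ino] = 0;
    return TOY_OK;
}
```

Starting from a zeroed toy_cg, one allocate followed by two frees of
the same inode returns TOY_OK, TOY_OK, TOY_FREEING_FREE.  Setting
i_mode[0] by hand while its bit is clear makes the next toy_valloc()
return TOY_DUP_ALLOC, which is the kind of map/inode disagreement
suspected here.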

isclr() (from /usr/include/sys/param.h) is:

   #define isclr(a,i)      (((a)[(i)/NBBY] & (1<<((i)%NBBY))) == 0)

[NBBY is '8' from types.h.]  This seems completely reasonable for
accessing a bit from a character array...
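
As a quick sanity check of that reading, here is a small sketch that
uses the same macro forms on an unsigned char array.  The macros are
guarded so they do not clash if sys/param.h has already defined them;
first_clear_bit() is a hypothetical helper written for this example,
not something from the kernel.

```c
#ifndef NBBY
#define NBBY 8                          /* bits per byte */
#endif
#ifndef setbit
#define setbit(a,i) ((a)[(i)/NBBY] |= 1<<((i)%NBBY))
#define clrbit(a,i) ((a)[(i)/NBBY] &= ~(1<<((i)%NBBY)))
#define isset(a,i)  ((a)[(i)/NBBY] & (1<<((i)%NBBY)))
#define isclr(a,i)  (((a)[(i)/NBBY] & (1<<((i)%NBBY))) == 0)
#endif

/*
 * Hypothetical helper: return the index of the first clear bit in
 * map[0..nbits), or -1 if every bit is set.
 */
static int first_clear_bit(const unsigned char *map, int nbits)
{
    for (int i = 0; i < nbits; i++)
        if (isclr(map, i))
            return i;
    return -1;
}
```

Bit i lives in byte i/NBBY at bit position i%NBBY, so setbit(map, 10)
on a zeroed two-byte map leaves map[1] == 0x04 and map[0] untouched.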

The cg_inosused() macro to access the used inodes table would be
incorrect if CG_MAGIC wasn't set for some reason - then you would
wind up pointing to trash for the inode table...  
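
To illustrate that failure mode, here is a toy model of a map lookup
that depends on a magic number.  Everything in it is an assumption
made for the sketch: struct toy_cg, its field layout, TOY_CG_MAGIC
(an arbitrary value) and toy_inosused() are hypothetical and are not
the definitions from fs.h.

```c
#include <stddef.h>

#define TOY_CG_MAGIC 0x1234     /* arbitrary value for this sketch */

/* Hypothetical group header; not the real struct cg layout. */
struct toy_cg {
    int  magic;            /* expected to hold TOY_CG_MAGIC */
    int  iusedoff;         /* byte offset of the inode map from the header */
    char legacy_iused[8];  /* fixed-position map of an assumed older layout */
    char space[32];        /* variable area; the current map lives in here */
};

/*
 * Locate the inode-used map.  With the expected magic the map is found
 * through iusedoff; with any other magic the lookup falls back to the
 * fixed legacy position, which holds unrelated bytes when the group is
 * really in the current layout.
 */
static char *toy_inosused(struct toy_cg *cgp)
{
    if (cgp->magic != TOY_CG_MAGIC)
        return cgp->legacy_iused;
    return (char *)cgp + cgp->iusedoff;
}
```

In this model, a group whose magic has been clobbered still yields a
valid-looking pointer, just to the wrong bytes, so every later
isclr() test on it would be answered from trash.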

This is what makes me think that allocating more inodes/blocks
in a file system tickles this problem...

Ideas on possible avenues of investigation (i.e. hints on where to
look) would be appreciated :-)   But I'm starting to become
familiar with ffs_alloc.c & the cg structure :-)

	- Dave Rivers -





