Date:      Mon, 10 Dec 2001 17:08:00 +0300
From:      "Vladimir B.Grebenschikov" <vova@express.ru>
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   kern/32672: Invalid FFS node allocation algorithm on systems with a lot of memory and lots of small files accessed
Message-ID:  <E16DR68-0000tT-00@vbook.express.ru>

>Number:         32672
>Category:       kern
>Synopsis:       Invalid FFS node allocation algorithm on systems with a lot of memory and lots of small files accessed
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Dec 10 06:10:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator:     Vladimir B. Grebenschikov
>Release:        FreeBSD 4.4-RELEASE i386
>Organization:
SW Soft
>Environment:
FreeBSD vrebuild 4.4-RELEASE FreeBSD 4.4-RELEASE #4: Mon Dec 10 15:23:49 GMT 2001 root@vrebuild:/usr/src/sys/compile/VREBUILD  i386
maxusers        512
(tried both with UFS_DIRHASH and without UFS_DIRHASH, with SOFTUPDATES and without SOFTUPDATES)

System: 2 GB RAM, 2 x 800 MHz CPUs:

CPU: Pentium III/Pentium III Xeon/Celeron (803.41-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x68a  Stepping = 10

  Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory  = 2147483648 (2097152K bytes)
avail memory = 2087796736 (2038864K bytes)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  4, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id:  5, version: 0x000f0011, at 0xfec01000

>Description:

On a system with a lot of memory and a workload that touches many small files
('make release' in my case), the system can reach the maximum number of
M_FFSNODE (inode) objects and then deadlock in
ufs/ffs/ffs_vfsops.c:ffs_vget():

==============================================================================
        /*
         * Lock out the creation of new entries in the FFS hash table in
         * case getnewvnode() or MALLOC() blocks, otherwise a duplicate 
         * may occur!
         */
        if (ffs_inode_hash_lock) {
                while (ffs_inode_hash_lock) {
                        ffs_inode_hash_lock = -1;
                        tsleep(&ffs_inode_hash_lock, PVM, "ffsvgt", 0);
                }
                goto restart;
        }
        ffs_inode_hash_lock = 1;

        /*
         * If this MALLOC() is performed after the getnewvnode()
         * it might block, leaving a vnode with a NULL v_data to be
         * found by ffs_sync() if a sync happens to fire right then,
         * which will cause a panic because ffs_sync() blindly
         * dereferences vp->v_data (as well it should).
         */
        MALLOC(ip, struct inode *, sizeof(struct inode),
            ump->um_malloctype, M_WAITOK);
=========================================================================


One process goes to sleep on "FFS Node" (inside MALLOC in the code above)
because the M_FFSNODE limit has been reached (for me it is 0x6400000). In my
case the process was 'cvs checkout', run by the make release scripts.

All other processes that try to access the disk then block on "ffsvgt",
because ffs_inode_hash_lock is held by the sleeping cvs process.

Some comments:

1st: I think the order of the lock and the MALLOC in ffs_vget() needs to be
swapped to avoid the deadlock: do the MALLOC first, then take
ffs_inode_hash_lock.

2nd: We need to do something when the limit on allocated M_FFSNODE objects is
reached (it is set to vm_kmem_size/2 by default), such as freeing some cached
objects.
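
A minimal userland sketch of the ordering suggested in the first comment; it
is not the actual kernel change. It assumes a pthread mutex in place of
ffs_inode_hash_lock and plain malloc() in place of MALLOC(..., M_WAITOK).
The names model_inode, model_vget, hash_lookup and HASH_SIZE are hypothetical.
Because the lock is no longer held across the allocation, another thread may
insert the same inode in the meantime, so the table is checked again under the
lock and the spare allocation is freed. Vnode handling (getnewvnode()) is left
out entirely.

```c
/* Userland model only: every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_SIZE 64

struct model_inode {
	unsigned long       ino;   /* inode number, used as the hash key */
	struct model_inode *next;  /* next entry in the same hash bucket */
};

static struct model_inode *hash_tbl[HASH_SIZE];
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

/* Look up an inode by number. The caller must hold hash_lock. */
static struct model_inode *
hash_lookup(unsigned long ino)
{
	struct model_inode *ip;

	for (ip = hash_tbl[ino % HASH_SIZE]; ip != NULL; ip = ip->next)
		if (ip->ino == ino)
			return ip;
	return NULL;
}

/*
 * Return the in-core inode for "ino", creating it if needed.
 * The allocation, which may block, happens with no lock held, so a
 * sleeping allocator can never keep other callers out of the table.
 */
struct model_inode *
model_vget(unsigned long ino)
{
	struct model_inode *ip, *found;

	/* Fast path: the inode is already in the table. */
	pthread_mutex_lock(&hash_lock);
	found = hash_lookup(ino);
	pthread_mutex_unlock(&hash_lock);
	if (found != NULL)
		return found;

	/* Step 1: allocate first (stands in for the MALLOC in ffs_vget). */
	ip = malloc(sizeof(*ip));
	if (ip == NULL)
		return NULL;
	ip->ino = ino;

	/* Step 2: take the lock and re-check for a racing insert. */
	pthread_mutex_lock(&hash_lock);
	found = hash_lookup(ino);
	if (found == NULL) {
		ip->next = hash_tbl[ino % HASH_SIZE];
		hash_tbl[ino % HASH_SIZE] = ip;
		found = ip;
		ip = NULL;          /* ownership moved to the table */
	}
	pthread_mutex_unlock(&hash_lock);

	free(ip);                   /* non-NULL only if we lost the race */
	assert(found->ino == ino);
	return found;
}
```

The cost of this ordering is an occasional wasted allocation when two callers
race for the same inode; in exchange, nobody sleeps in the allocator while
holding the hash lock.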

>How-To-Repeat:

Take a system with 2 GB of RAM and run 'make release' (with ports and docs).

>Fix:

See above
>Release-Note:
>Audit-Trail:
>Unformatted:

