Date:      Tue, 2 Feb 1999 12:14:08 -0800 (PST)
From:      templin@erg.sri.com
To:        freebsd-gnats-submit@FreeBSD.ORG
Subject:   kern/9883: MGET()(and variants) return NULL with M_WAIT flag; system crashes!
Message-ID:  <199902022014.MAA26033@hub.freebsd.org>

>Number:         9883
>Category:       kern
>Synopsis:       MGET()(and variants) return NULL with M_WAIT flag; system crashes!
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Feb  2 12:20:01 PST 1999
>Closed-Date:
>Last-Modified:
>Originator:     Fred L. Templin
>Release:        2.2.7-19981122-SNAP
>Organization:
SRI International (The ANCORS Project)
>Environment:
FreeBSD sequoia.csl.sri.com 2.2.7-19981122-SNAP FreeBSD 2.2.7-19981122-SNAP #3: Fri Jan 29 23:21:23 PST 1999     livio@sequoia.csl.sri.com:/usr/src/sys/compile/EMERALD  i386
>Description:
(This problem was discovered during research activities with the ANCORS
Project. Project information is at: http://www.csl.sri.com/ancors)

We have a system that periodically runs out of mbufs. (The nature
of the mbuf leak is beyond the scope of this problem report.) When
the 'mb_map' is depleted, the kernel routines 'm_get()', 'MGET()'
and 'MGETHDR()' can return NULL *even when the caller has set the
M_WAIT flag*. However, numerous callers in the kernel do not check
the returned pointer and simply begin referencing what they believe
to be a valid mbuf. In our case, the routine 'nfs_rephead()'
dereferences a NULL mbuf pointer returned from MGETHDR(), causing
the system to panic with a 'page fault' trap.
>How-To-Repeat:
Insert an artificial mbuf leak into the kernel. Configure the machine
as an NFS server and have some NFS activity going on before triggering
the mbuf leak. A crash out of the NFS code should eventually result
after the mb_map has been depleted by the leak.
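
The depletion can also be modelled outside the kernel. The following is
only a userland sketch under stated assumptions: the names 'mock_mget()',
'mock_leak_until_null()', 'MOCK_M_WAIT' and 'MOCK_POOL_SIZE' are
hypothetical and are not the kernel's interfaces. A small fixed pool
stands in for the 'mb_map', and the allocator ignores the wait flag once
the pool is empty, which mirrors the behaviour described above.

```c
#include <stddef.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
#define MOCK_M_DONTWAIT 0
#define MOCK_M_WAIT     1
#define MOCK_POOL_SIZE  4   /* stands in for the finite mb_map */

struct mock_mbuf {
    int in_use;
    int len;
};

static struct mock_mbuf mock_pool[MOCK_POOL_SIZE];

/*
 * Return a free buffer, or NULL once the pool is depleted.
 * As in the report, the wait flag does not prevent a NULL return.
 */
static struct mock_mbuf *mock_mget(int how)
{
    size_t i;

    (void)how;  /* deliberately ignored in this model */
    for (i = 0; i < MOCK_POOL_SIZE; i++) {
        if (!mock_pool[i].in_use) {
            mock_pool[i].in_use = 1;
            mock_pool[i].len = 0;
            return &mock_pool[i];
        }
    }
    return NULL;
}

/*
 * Artificial leak: allocate without ever freeing, and report how many
 * allocations succeeded before the first NULL came back.
 */
static int mock_leak_until_null(void)
{
    int count = 0;

    while (mock_mget(MOCK_M_WAIT) != NULL)
        count++;
    return count;
}
```

After 'mock_leak_until_null()' has run, every further 'mock_mget()' call
returns NULL regardless of the flag passed, which is the state an
unchecked caller would crash in.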
>Fix:
Find every instance in the kernel in which the pointer returned by
the MGET() variants (with the M_WAIT flag set) is not checked. Check
it and handle a NULL mbuf pointer in the way appropriate to each
instance. Numerous examples that fail to make this check are in the
kernel NFS code in /usr/src/sys/nfs, but there are many other
instances scattered throughout the kernel. This all assumes that
returning NULL with the M_WAIT flag set is a reasonable thing to do.
Since so much existing code seems to assume that setting the M_WAIT
flag always yields a valid mbuf, perhaps allowing a NULL return is
a bad thing to do in the first place...
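
The kind of check proposed here might look roughly like the following.
This is a userland sketch, not a patch: 'mock_mgethdr()',
'mock_build_reply()' and 'mock_pool_depleted' are hypothetical names
introduced for illustration, and returning ENOBUFS is an assumption
about what the "appropriate way" could be for one particular caller.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define MOCK_M_WAIT 1

struct mock_mbuf {
    int len;
};

static struct mock_mbuf mock_storage;
static int mock_pool_depleted;  /* set to 1 to model an exhausted mb_map */

/* Models an allocator that can return NULL even when asked to wait. */
static struct mock_mbuf *mock_mgethdr(int how)
{
    (void)how;
    return mock_pool_depleted ? NULL : &mock_storage;
}

/*
 * Hypothetical caller: checks the returned pointer before using it.
 * Returns 0 on success, or ENOBUFS if no buffer could be obtained.
 */
static int mock_build_reply(struct mock_mbuf **out)
{
    struct mock_mbuf *m = mock_mgethdr(MOCK_M_WAIT);

    if (m == NULL) {
        *out = NULL;
        return ENOBUFS;     /* report the failure instead of faulting */
    }
    m->len = 0;             /* safe: m is known to be valid here */
    *out = m;
    return 0;
}
```

Each real call site would need its own decision about how to unwind; the
sketch only shows the test that is currently missing.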
>Release-Note:
>Audit-Trail:
>Unformatted:
