Date:      Wed, 6 Nov 2002 13:13:33 +0200 (EET)
From:      Iasen Kostov <ikostov@otel.net>
To:        Archie Cobbs <archie@dellroad.org>
Cc:        freebsd-net@FreeBSD.ORG
Subject:   Re: NFS functions does *NOT* check if they really have allocated any memory
Message-ID:  <20021106120422.G80368-100000@shadowhand.OTEL.net>
In-Reply-To: <200211052106.gA5L6igd039808@arch20m.dellroad.org>



On Tue, 5 Nov 2002, Archie Cobbs wrote:

> Iasen Kostov writes:
> >   As I experience system crashes at times of mbuf exhaustion, I've compiled
> > a debug kernel and traced the problem. It seems the NFS functions
> > (nfsm_rpchead, nfsm_reqh ...) do *NOT* check whether they have really
> > allocated memory with the MGET macro.
>
> No check is necessary if M_WAIT is specified; the MGET() macro
> always succeeds in that case. Same for malloc().

  If that were true, I should not be seeing any trap 12s, should I? :)
  In nfsm_reqh(), MGET() called as MGET(mb, M_WAIT, MT_DATA) returns
NULL when mbufs are exhausted.

This is the fix/test I added to the nfsm_reqh() function:
nfs/nfs_subs.c:591
        MGET(mb, M_WAIT, MT_DATA);
/*
 *	This becomes true when there are no more mbufs available.
 *	If you don't believe me - test it :)
 */
        if (mb == NULL) {
            printf("nfsm_reqh: no memory for header\n");
            return NULL;
        }
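
The shape of that check can be tried outside the kernel. The following is only a userland sketch, not kernel code: fake_mbuf, fake_mget() and build_header() are hypothetical stand-ins invented for illustration, and the fixed-size pool merely imitates mbuf exhaustion. It shows the caller bailing out before the first dereference once the allocator starts returning NULL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel mbuf; NOT the real struct mbuf. */
struct fake_mbuf {
    int  len;
    char data[64];
};

/* A tiny fixed pool, so that exhaustion is easy to provoke. */
#define POOL_SIZE 2
static struct fake_mbuf pool[POOL_SIZE];
static int pool_used = 0;

/* Hypothetical allocator: returns NULL once the pool is exhausted. */
static struct fake_mbuf *fake_mget(void)
{
    if (pool_used >= POOL_SIZE)
        return NULL;
    return &pool[pool_used++];
}

/* Same shape as the check above: test the pointer before using it. */
static struct fake_mbuf *build_header(void)
{
    struct fake_mbuf *mb = fake_mget();

    if (mb == NULL) {
        fprintf(stderr, "build_header: no memory for header\n");
        return NULL;
    }
    mb->len = 0;    /* safe: mb is known to be non-NULL here */
    return mb;
}
```

With a pool of two, the first two calls to build_header() return a buffer and the third returns NULL instead of faulting; the caller then has to handle that NULL in turn.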

	Without this check, the kernel crashes at this point:
nfs/nfs_subs.c:592 (in the original file)

        if (hsiz >= MINCLSIZE) {
                MCLGET(mb, M_WAIT);
        }


	Here is the panic message:

IdlePTD at phsyical address 0x00326000
initial pcb at physical address 0x00299ba0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc01d1864
stack pointer           = 0x10:0xcd717d68
frame pointer           = 0x10:0xcd717d7c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 167 (ls)
interrupt mask          = none
trap number             = 12
panic: page fault
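
A note on the fault address: 0xc is what a write through a NULL structure pointer produces when the target field sits 12 bytes into the structure. The sketch below is an assumption-laden illustration, not a copy of sys/mbuf.h: it assumes a header that starts with three 4-byte pointers followed by a length field, as on a 32-bit machine, and the struct and field names are hypothetical. Under that assumption, a store such as mb->m_len = 0 with mb == NULL would land at address 0xc.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 32-bit header layout (assumption, not the real mbuf header):
 * pointers are modelled as uint32_t so the offsets are the same on any host.
 */
struct hdr32 {
    uint32_t next;      /* stand-in for a 4-byte pointer, offset 0 */
    uint32_t nextpkt;   /* stand-in for a 4-byte pointer, offset 4 */
    uint32_t data;      /* stand-in for a 4-byte pointer, offset 8 */
    int32_t  len;       /* length field, offset 12 */
    int16_t  type;
    int16_t  flags;
};

/* Address touched when writing the length field through a NULL pointer. */
static size_t fault_address_for_len(void)
{
    return offsetof(struct hdr32, len);
}
```

If the real header differs from this assumed layout, the offset will differ too; the point is only that a small non-zero fault address is typical of a field access through a NULL pointer.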

	And backtrace:

#0  dumpsys () at ../../kern/kern_shutdown.c:487
#1  0xc015182b in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2  0xc0151c50 in poweroff_wait (junk=0xc02734ec, howto=-1071173617)
    at ../../kern/kern_shutdown.c:595
#3  0xc0241382 in trap_fatal (frame=0xcd717d28, eva=12)
    at ../../i386/i386/trap.c:974
#4  0xc0241055 in trap_pfault (frame=0xcd717d28, usermode=0, eva=12)
    at ../../i386/i386/trap.c:867
#5  0xc0240c3f in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 0,
      tf_esi = 0, tf_ebp = -848200324, tf_isp = -848200364, tf_ebx = 1,
      tf_edx = 0, tf_ecx = 6685184, tf_eax = 0, tf_trapno = 12, tf_err = 2,
      tf_eip = -1071835036, tf_cs = 8, tf_eflags = 66183, tf_esp = 512,
      tf_ss = -848199992}) at ../../i386/i386/trap.c:466
#6  0xc01d1864 in nfsm_reqh (vp=0xcd70dc00, procid=4, hsiz=72,
    bposp=0xcd717dc0) at ../../nfs/nfs_subs.c:593
#7  0xc01d83c5 in nfs3_access_otw (vp=0xcd70dc00, wmode=63, p=0xcbff2080,
    cred=0xc131b100) at ../../nfs/nfs_vnops.c:292
#8  0xc01d8dab in nfs_getattr (ap=0xcd717e20) at ../../nfs/nfs_vnops.c:637
#9  0xc018660f in vn_stat (vp=0xcd70dc00, sb=0xcd717ec8, p=0xcbff2080)
    at vnode_if.h:276
#10 0xc01865cc in vn_statfile (fp=0xc1320fc0, sb=0xcd717ec8, p=0xcbff2080)
    at ../../kern/vfs_vnops.c:451
#11 0xc01468cf in fstat (p=0xcbff2080, uap=0xcd717f80)
    at ../../sys/file.h:206
.
.
.

(kgdb) l nfs_subs.c:593
588             struct nfsmount *nmp;
589             int nqflag;
590
591             MGET(mb, M_WAIT, MT_DATA);  << Here MGET returns NULL in mb (I'm sure - I saw it :)
592             if (hsiz >= MINCLSIZE)
593                     MCLGET(mb, M_WAIT);  << At this point the kernel crashes
594             mb->m_len = 0;
595             bpos = mtod(mb, caddr_t);
596
597             /*

	As you said, MGET used with the M_WAIT flag should never return a NULL
pointer. Is this a problem with the MGET macro itself, or is it somewhere in
the functions that it calls? Wherever the problem is, it is a big one :). It
makes (at least) NFS servers unstable and could lead to data loss (when the
kernel crashes).

