Date:      Tue, 27 May 2003 20:00:32 -0300 (ADT)
From:      "Marc G. Fournier" <scrappy@hub.org>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        stable@freebsd.org
Subject:   Re: system slowdown - vnode related
Message-ID:  <20030527194741.B9254@hub.org>
In-Reply-To: <200305272105.h4RL5ppG067806@apollo.backplane.com>
References:  <20030521171941.364325314@netcom1.netcom.com> <20030524190051.R598@hub.org> <20030526123617.D56519@hub.org> <20030526130556.G56519@hub.org> <1054060050.640.35.camel@netcom1.netcom.com> <200305272105.h4RL5ppG067806@apollo.backplane.com>


As a side-bar (or addition) ... we recently had one of those 'certain
circumstances' ... basically, the system still reported a fair # of vnodes
free, but one of the file systems seemed to "lock up" ... it's hard to
describe ... but, basically, I could do an 'ls -lt /', or an 'ls -lt /usr',
but if I tried to do an 'ls -lt /vm', the ls would just hang there ... if
I was logged into one of the VMs that were running *on* that file system,
I could do an ls no problem ... I could do a plain 'df', but not a
'df /vm' ...
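
One thing that can be worth doing in that state, just as a general
suggestion (I can't say it would have told us much here), is to check from
another shell what the stuck process is actually sleeping on:

    # find the hung ls and look at its wait channel (WCHAN/MWCHAN column)
    ps -axlww | grep 'ls -lt'

A wait channel that looks vnode- or inode-related at least narrows down
which subsystem is wedged ...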

I was able to get a ctl-alt-esc -> panic, which dump'd core ... David
Schultz (between moves) took a quick look at it and thinks that there
might be a memory leak in there somewhere, and is hoping over the next few
weeks to take a deeper look into it ...

The reason (David, correct me if I'm wrong here?) that he suspected a
memory leak was the size of:

         temp  12882897  204801K  204801K  204800K  182280393  289  0  16,32,64,128,256,512,1K,2K,4K,8K,16K,32K,64K,128K
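
For anyone who wants to keep an eye on the same thing, that should be the
'temp' malloc type from the kernel's per-type allocation statistics; the
columns are (roughly) InUse, MemUse, HighUse, Limit and Requests, and
MemUse sitting right at the 200MB limit is what looked suspicious ...
something like:

    # kernel malloc statistics for the 'temp' type
    vmstat -m | grep -w temp

run every so often will show whether it keeps creeping up ...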

I had something similar happen on a different server (with about half as
many unionfs mounts) but was unable to get a core dump out of that one ...

One thing that hasn't been mentioned in this thread so far as I've seen
... if anyone is playing with stuff like this, make sure you have DDB
enabled ... it doesn't always work, especially with these hangs, but if
you get lucky and it does, it tends to go a long way towards helping to
debug such things ...

On Tue, 27 May 2003, Matthew Dillon wrote:

>
> :I'll try this if I can tickle the bug again.
> :
> :I may have just run out of freevnodes - I only have about 1-2000 free
> :right now.  I was just surprised because I have never seen a reference
> :to tuning this sysctl.
> :
> :- Mike H.
>
>     The vnode subsystem is *VERY* sensitive to running out of KVM, meaning
>     that setting too high a kern.maxvnodes value is virtually guaranteed to
>     lock up the system under certain circumstances.  If you can reliably
>     reproduce the lockup with maxvnodes set fairly low (e.g. less than
>     100,000) then it ought to be easier to track the deadlock down.
>
>     Historically speaking, systems did not have enough physical memory to
>     actually run out of vnodes; they would run out of physical memory
>     first, which would cause VM pages to be reused and their underlying
>     vnodes deallocated when the last page went away.  Hence the amount of
>     KVM being used to manage vnodes (vnode and inode structures) was kept
>     under control.
>
>     But today's Intel systems have far more physical memory relative to
>     available KVM and it is possible for the vnode management to run
>     out of KVM before the VM system runs out of physical memory.
>
>     The vnlru kernel thread is an attempt to control this problem but it
>     has had only mixed success in complex vnode management situations
>     like unionfs where an operation on a vnode may cause accesses to
>     additional underlying vnodes.  In other words, vnlru can potentially
>     shoot itself in the foot in such situations while trying to flush out
>     vnodes.
>
> 						-Matt
>
>
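
P.S. For anyone who wants to poke at the knobs Matt mentions, kern.maxvnodes
is an ordinary read/write sysctl, and the current/free vnode counts are
exported as well (under debug.* on 4.x if memory serves, vfs.* on newer
trees), so something like this is enough to watch them and to try the lower
limit he suggests:

    sysctl kern.maxvnodes                      # current cap
    sysctl kern.maxvnodes=100000               # lower it to make the hang easier to reproduce
    sysctl debug.numvnodes debug.freevnodes    # vnode counts (vfs.numvnodes / vfs.freevnodes on 5.x)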


