Date:      Wed, 30 Jul 2003 00:41:49 -0300 (ADT)
From:      The Hermit Hacker <scrappy@hub.org>
To:        Don Bowman <don@sandvine.com>
Cc:        'Robert Watson' <rwatson@freebsd.org>
Subject:   RE: kernel deadlock
Message-ID:  <20030730004133.X17191@hub.org>
In-Reply-To: <FE045D4D9F7AED4CBFF1B3B813C853370274206C@mail.sandvine.com>
References:  <FE045D4D9F7AED4CBFF1B3B813C853370274206C@mail.sandvine.com>


Suggestion: upgrade to 4.8-STABLE ... there hav

On Tue, 29 Jul 2003, Don Bowman wrote:

> From: Don Bowman [mailto:don@sandvine.com]
> >
> > From: Robert Watson [mailto:rwatson@freebsd.org]
> > > On Tue, 29 Jul 2003, Dave Dolson wrote:
> > >
> > > > To follow up, I've discovered that the system has exhausted its
> > > > "FFS node" malloc type.
> >  ...
> > >
> > > Some problems with this have turned up in -CURRENT on large-memory
> > > machines where some of the scaling factors have been off.  In
> >
> > We currently have kern.maxvnodes=70354 set (automatically scaled).
> > This is a 1GB box.
> >
> > I will try re-running the test with less.
> >
> > when it hits kern.maxvnodes, what will it do?
>
> After applying the fixes from RELENG_4 for kern/52425,
> I can still easily reproduce this hang without low memory.
> Further debugging shows that the vnlru process is waiting on
> vlrup. The relevant tsleep() line is shown below, i.e. vnlru_nowhere
> is being incremented every 3 seconds.
>
> static void
> vnlru_proc(void)
> {
>  ...
>         s = splbio();
>         for (;;) {
>  ...
>                 if (done == 0) {
>                         vnlru_nowhere++;
>                         tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3);
>                 }
>         }
>         splx(s);
>
> syncer is in vlruwk wait from getnewvnode().
>
> lots of other processes waiting on ffsvgt.
>
> this implies that vlrureclaim() was unable to free anything.
>
> i have kern.maxvnodes = 35k. as soon as i hit this value, my system locked
> up [bash on serial shell non-responsive, serial driver echoes chars,
> can drop into ddb]. Processes which don't use the filesystem seem to continue
> to run ok.
>
> A couple of procs are waiting on inode: env, cron. These never come
> out of waiting for it.
>
> suggestions?
>
> db> ps
>   pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
>   649 dc35a8a0 e0a32000    0   641   641 004104  3  ffsvgt c03698a8 atrun
>   648 dc35a3c0 e0e36000    0   647   648 000014  3  vlruwk c0364c90 cron
>   647 dc35b740 e03d4000    0   135   135 000004  3  ppwait dc35b740 cron
>   646 dc35b0c0 e03ee000    0   635   101 004004  3   inode c368ee00 env
>   645 dc35ad80 e03f1000    0   212   644 004006  3  ffsvgt c03698a8 grep
>   644 dc35aa40 e0400000    0   212   644 004006  3  ffsvgt c03698a8 sysctl
>   641 dc35a080 e0e4c000    0   640   641 004084  3    wait dc35a080 sh
>   640 dc35a220 e0e39000    0   135   135 000084  3  piperd e037c5c0 cron
>   635 dc35a560 e0e32000    0   101   101 004084  3  piperd e037cd40 sh
>   456 dc35abe0 e03fc000    0   133   456 4004004  3  ffsvgt c03698a8 tclsh83
>   212 dc35bdc0 e0392000    0   199   212 004086  3    wait dc35bdc0 bash
>   199 dc35c440 e036e000    0     1   199 004186  3    wait dc35c440 login
>   187 dc35c2a0 e0376000    0     1     7 000086  3  select c037c460 snmpd
>   169 dc35af20 e03e7000    0     1   169 000084  3  nanslp c0364970 siocontrol
>   163 dc35b260 e03e2000    0     1   163 000084  3  nanslp c0364970 wddt
>   143 dc35b400 e03dd000   25     1   143 2000184  3   pause e03dd260 sendmail
>   140 dc35b5a0 e03d9000    0     1   140 000184  3  select c037c460 sendmail
>   137 dc35b8e0 e03d0000    0     1   137 000184  3  select c037c460 sshd
>   135 dc35ba80 e03c2000    0     1   135 000004  3   inode c35f4400 cron
>   133 dc35bc20 e0397000    0     1   133 000084  3  select c037c460 inetd
>   124 dc35bf60 e0382000    0     1   124 000084  3  select c037c460 syslogd
>   101 dc35c100 e037e000    0     1   101 000084  3    wait dc35c100 dhclient
>     6 dc35c5e0 defd1000    0     0     0 000204  3   vlrup dc35c5e0 vnlru
>     5 dc35c780 defce000    0     0     0 000204  3  syncer c037c388 syncer
>     4 dc35c920 defcb000    0     0     0 000204  3  psleep c0364b3c bufdaemon
>     3 dc35cac0 defc8000    0     0     0 000204  3  psleep c0373280 vmdaemon
>     2 dc35cc60 defc5000    0     0     0 000204  3  psleep c0352118 pagedaemon
>     1 dc35ce00 dc361000    0     0     1 004284  3    wait dc35ce00 init
>     0 c037b760 c040e000    0     0     0 000204  3   sched c037b760 swapper
>
>
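
One way to see at a glance where everything in the quoted ddb "ps" output is stuck is to tally the rows by wait message. The following is only a minimal user-space sketch, not anything from the kernel or from ddb: the function name count_wmesg() is hypothetical, and it assumes the column layout quoted above (pid proc addr uid ppid pgrp flag stat wmesg wchan cmd), with optional "> " quote prefixes and wrapped command names skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the rows of a ddb "ps" listing whose wmesg column equals `wmesg`.
 * Hypothetical helper: assumes the column order
 *   pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
 * Lines that do not parse as a process row (the header, a command name
 * wrapped onto its own line) are silently skipped.
 */
int
count_wmesg(const char *listing, const char *wmesg)
{
	int count = 0;
	const char *line = listing;

	assert(listing != NULL && wmesg != NULL);

	while (*line != '\0') {
		const char *end = strchr(line, '\n');
		size_t len = (end != NULL) ? (size_t)(end - line) : strlen(line);
		char buf[256];
		size_t n = (len < sizeof(buf) - 1) ? len : sizeof(buf) - 1;
		char proc[32], addr[32], flag[32], msg[32];
		int pid, uid, ppid, pgrp, stat;
		const char *p;

		/* Copy one line into a NUL-terminated scratch buffer. */
		memcpy(buf, line, n);
		buf[n] = '\0';

		/* Skip e-mail quote markers and leading blanks. */
		p = buf;
		while (*p == '>' || *p == ' ' || *p == '\t')
			p++;

		/* A process row yields nine fields up to and including wmesg. */
		if (sscanf(p, "%d %31s %31s %d %d %d %31s %d %31s",
		    &pid, proc, addr, &uid, &ppid, &pgrp, flag, &stat, msg) == 9 &&
		    strcmp(msg, wmesg) == 0)
			count++;

		/* Advance past this line and its newline, if any. */
		line += len;
		if (*line == '\n')
			line++;
	}
	return (count);
}
```

Fed the listing quoted above as a single string, count_wmesg(listing, "ffsvgt") should come out as 4 (atrun, grep, sysctl, tclsh83) and count_wmesg(listing, "inode") as 2 (env, cron).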

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


