Date: Tue, 12 Dec 2000 10:01:41 -0800
From: Kachun Lee <kachun@pathlink.com>
To: Matt Dillon <dillon@earth.backplane.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: Extreme high load with 12/7 4-releng
Message-ID: <200012121801.KAA42878@pathlink.net>
In-Reply-To: <200012120257.eBC2vC798971@earth.backplane.com>
References: <200012120230.SAA32402@pathlink.net>

At 06:57 PM 12/11/00 -0800, you wrote:
>
>:I upgraded 2 of our servers from 4-releng around 4.1.1-release to one that
>:cvsup on Dec 7. Before the upgrade, the systems were running at load around
>:2. After the upgrade, the load went to over 40 just after few hours of
>:usage. Here was some data from top...
>
>    513 processes?  What are you running on the machine?  'ps axl'
>
>    By the feel of it I'm guessing a news machine, in which case it could
>    simply be catching up on the feed.
>
>                                        -Matt

Yep, they are news (nntp) frontends, but no feed stuff. They run a custom
nntpd that accesses the spool over NFS. They have been running that way for
several years, and the last revision of the nntpd is at least 6 months old.

This is the top output from a server still running 4.1.1-releng (from just
before 4.2-release):

--------
last pid: 73544;  load averages:  1.52,  2.07,  2.24   up 1+20:15:10 20:43:28
574 processes: 7 running, 567 sleeping
CPU states: 19.5% user,  0.4% nice, 22.9% system,  9.9% interrupt, 47.3% idle
Mem: 286M Active, 92M Inact, 107M Wired, 15M Cache, 61M Buf, 996K Free
Swap: 600M Total, 600M Free
-------

Compared with the upgraded server below, there is a big difference in Inact,
and no swap is used here.

>:-----
>:last pid: 26893;  load averages: 36.05, 40.55, 47.01   up 0+04:29:07
>:18:14:54
>:513 processes: 9 running, 503 sleeping, 1 zombie
>:CPU states: 21.3% user,  0.7% nice, 30.0% system, 10.7% interrupt, 37.3% idle
>:Mem: 196M Active, 204M Inact, 83M Wired, 17M Cache, 61M Buf, 1152K Free
>:Swap: 600M Total, 17M Used, 583M Free, 2% Inuse, 232K Out

Here is the 'ps axl' output from one of the servers. Thanks for helping.

  UID   PID  PPID CPU PRI  NI    VSZ   RSS WCHAN  STAT TT      TIME COMMAND
    0     0     0   0 -18   0      0     0 sched  DLs  ??   0:00.74 (swapper)
    0     1     0   0  10   0    528   204 wait   ILs  ??   0:00.65 /sbin/init -
    0     2     0   0 -18   0      0     0 psleep DL   ??   5:18.74 (pagedaemon
    0     3     0   0  18   0      0     0 psleep DL   ??   0:00.00 (vmdaemon)
    0     4     0   0 -18   0      0     0 psleep DL   ??   5:36.70 (bufdaemon)
    0     5     0   0  -6   0      0     0 biord  DL   ??  16:50.85 (syncer)
    0    25     1   0  10   0 524932 17064 mfsidl SLs  ??   0:02.37 mfs /dev/ad 0
    0    32     1  21  18   0    208    64 pause  Is   ??   0:00.00 adjkerntz -i
    0   141     1   0   2   0    904   536 select Ss   ??   2:02.97 syslogd -s
    0   144     1   0   2   0   1224   512 select Ss   ??   0:04.64 timed
    1   146     1   0   2   0    928   528 select Is   ??   0:00.31 /usr/sbin/po
    0   151     1   0   2   0    552   308 select Is   ??   0:00.33 mountd -r
    0   153     1   0   2   0    360   140 accept Is   ??   0:00.00 nfsd: master
    0   155   153   0   2   0    352   132 nfsd   S    ??   1:01.77 nfsd: server
    0   156   153   0   2   0    352   132 nfsd   I    ??   0:00.81 nfsd: server
    0   157   153   0   2   0    352   132 nfsd   I    ??   0:00.06 nfsd: server
    0   158   153   0   2   0    352   132 nfsd   I    ??   0:00.01 nfsd: server
    0   161     1  93   2   0 263056   488 select Is   ??   0:00.00 rpc.statd
    0   165     1   7  10   0    208    28 nfsidl S    ??  33:53.24 nfsiod -n 8
    0   166     1   1  10   0    208    28 nfsidl S    ??  25:32.08 nfsiod -n 8
    0   167     1   0  10   0    208    28 nfsidl S    ??  13:27.34 nfsiod -n 8
    0   168     1   0  10   0    208    28 nfsidl S    ??   8:34.35 nfsiod -n 8
    0   169     1   0  10   0    208    28 nfsidl S    ??   4:50.07 nfsiod -n 8
    0   170     1   1  10   0    208    28 nfsidl S    ??   3:03.59 nfsiod -n 8
    0   171     1   0  10   0    208    28 nfsidl S    ??   1:53.19 nfsiod -n 8
    0   172     1   0  10   0    208    28 nfsidl S    ??   1:19.36 nfsiod -n 8
    0   177     1   0   2   0   1172   796 select Ss   ??  15:33.84 amd
    1   181     1   0   2   0    876   504 sbwait Ss   ??   0:21.87 rwhod
    0   196     1   0   2   0   1036   624 select Is   ??   0:03.52 inetd -wW
    0   198     1   0  10   0    960   612 nanslp Is   ??   0:02.82 cron
    0   204     1   0   2   0   1536  1048 select Is   ??   0:01.03 sendmail: ac
    0   275     1   0   2   0   1372   900 select Ss   ??   0:07.28 /usr/local/e
 7003   336     1   0   2 -10   1656   784 accept S<s  ??   1:11.74 nnrpd: acce
    0 18088   196   0   2   0   2116  1100 select Ss   ??   0:00.05 telnetd
 7003 18268   326 248  10   0    176    32 nanslp I    ??   0:00.00 sleep 60
 7003 18274   336   0   2 -10   1656   876 kqread S<   ??   0:00.00 nnrpd: conn
    0 43789   196   0   2   0   2116   996 select Is   ??   0:00.05 telnetd
 7005 83944   275   0  18   0   1440   952 lockf  I    ??   0:00.04 /usr/local/e
 7005 83945   275   0   2   0   1440   956 accept I    ??   0:00.04 /usr/local/e
 7005 86962   275   0  18   0   1440   956 lockf  I    ??   0:00.03 /usr/local/e
    0   719     1   0   3   0    216    80 ttyin  I    p0-  0:00.00 cat /dev/rex
 7003   422   336   0   2   0   4084  1556 sbwait S    ??   0:14.05 znnrpd
 7003  1172   336   0   2   0   4772  1656 select S    ??   0:32.93 znnrpd
 7003  1357   336   1   2   0   4312  1808 sbwait S    ??   0:29.02 znnrpd
 7003  1453   336   0  28   0   4124  1612 -      R    ??   0:35.77 znnrpd
... (400-500 of these)
 7003  1555   336   0   2   0   4364  2064 sbwait S    ??   0:03.38 znnrpd
30001 18089 18088 202  10   0   1308   936 wait   Is   p0   0:00.07 -bash (bash)
    0 18109 18089   1  10   0   1312   940 wait   S    p0   0:00.05 -su (bash)
    0 18276 18109   2  28   0    956   196 -      R+   p0   0:00.01 ps axl
30004 43791 43789 192  10   0   1320   920 wait   Is   p2   0:00.07 -bash (bash)
    0 43803 43791   0   3   0   1296   896 ttyin  I+   p2   0:00.05 -su (bash)
    0   374     1   0   3   0    920   512 ttyin  Is+  v0   0:00.00 /usr/libexec
    0   375     1   0   3   0    920   516 ttyin  Is+  v1   0:00.00 /usr/libexec
    0   376     1   0   3   0    920   516 ttyin  Is+  v2   0:00.00 /usr/libexec
 7003   326     1 248  10   0   1336   940 wait   I    d0-  0:25.95 /bin/sh /usr
 7003   342     1   5  18   0   1860   952 opause I    d0-  0:02.33 /usr/local/b
    0   377     1   0   3   0    920   516 ttyin  Is+  d0   0:00.00 /usr/libexec

>:
>:The one thing I noticed was the system started swapping constantly, even
>:though the Swap Used did not go up. Also, the system still had 204M Inact.
>:No Swap Used before the upgrade.
>:
>:I did some search on the mail lists and saw a long thread in hacker related
>:to vm_paging, but I could not find any conclusion to that thread. I did not
>:see any MFC, other than a vm issue that needed to turn on by sysctl, that
>:looked might be related. Any insight to this problem?
>:
>:Best regards
>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message