From owner-freebsd-performance@FreeBSD.ORG Wed Aug 11 18:56:52 2010
Message-ID: <4C62F272.4030703@ssimicro.com>
Date: Wed, 11 Aug 2010 12:56:50 -0600
From: markham breitbach
To: freebsd-performance@freebsd.org
References: <4C62D827.2030409@ssimicro.com> <949C0FF2-04AA-4440-82B0-F44A13B8F0C2@mac.com>
In-Reply-To: <949C0FF2-04AA-4440-82B0-F44A13B8F0C2@mac.com>
Subject: Re: massive load average spikes
List-Id: Performance/tuning

On 11/08/10 11:59 AM, Chuck Swiger wrote:
> Hi--
>
> On Aug 11, 2010, at 10:04 AM, markham breitbach wrote:
>> I am running into an issue where I am seeing load average on a server suddenly jump from
>> nominal values around 0.5 to anywhere from 10 up over 70 in under 1 second.  This does not
>> seem to be related to CPU overload, and LA immediately begins to fall back again to
>> nominal.  This does not seem to happen with any regular frequency, and can happen several
>> times an hour or not for hours.
> [ ... ]
>> Can anyone suggest what may be causing this or how to track that down?
>
> From the (limited) available data, I'd imagine someone is doing wardialling of your mail
> service to try common username/password combinations and break in.  Especially if they are
> connecting via POP3S / IMAPS ports and doing SSL negotiation, there's a very high burst of
> CPU load, as imap or pop daemons get forked to handle the requests, then quit immediately
> afterwards when the login attempt fails.  You won't see much change in memory loading
> unless they do get a valid login since the Dovecot daemons are already resident & there's
> no real I/O made to disk until it looks up a real user's mail.
>
> Looking at tcpdump for new connection requests or checking the Dovecot mail logs for a
> slew of attempted logins for invalid users, and correlating with your load spikes would be
> a way of checking on this theory....
>
> Regards,

Sorry for the limited data. It's hard to know where to draw the line between useful data
and information overload, but I'm more than happy to supply whatever other info you might
find useful.

I did take a look at my Dovecot logs, and there are no more than a couple of failed auth
attempts in any given minute. The Sendmail logs don't show any excessive activity when LA
spikes either.

"vmstat -w1" shows occasional spikes of processes in the run queue, but those don't usually
correlate with the spikes in load average (although sometimes it is close).

Here is a sample of vmstat output and an approximately correlating output of load average
for ~20s. Notice that the load average spikes above 40, but there are virtually no
processes in the run queue.
(/var/mail is isolated on ad10)

procs     memory             page               disks            faults         cpu
r b w     avm    fre  flt re pi po   fr sr ad4 ad6 ad8 ad10    in    sy    cs us sy id
0 1 2 1852712 141476 2535  1  1  0 1834  0   0   0   0    7 31232  5202  4403  1  2 97
0 1 1 1852176 141184 2022  0  0  0 1673  0  14  14   0   16 31278  4706  3826  0  1 98
0 1 2 1850264 142468 2213  0  0  0 2234  0  29  29   0   10 31394  5251  4948  0  2 98
0 1 0 1851364 142584 1717  0  0  0 1407  0   0   0   0    0 31251  3869  4753  0  3 97
0 1 2 1852200 141712 2054  0  0  0 1500  0   4   4   0    0 31197  3893  3393  0  1 99
1 1 1 1857440 138980 2306  0  0  0 1384  0   7   6   0    0 31420  6814  5436  0  2 97
0 1 2 1857984 138380 2631  0  0  0 1992  0   8   9   0   10 31469  6318  4227  0  3 97
0 1 0 1856708 138576 2372  0  0  0 2032  0   1   1   0    0 31496  6473  4839  0  2 98
0 1 3 1857044 138176 3602  0  0  0 2899  0   1   1   0    0 31573  9621  6006  1  3 96
0 1 0 1856836 138208 1120  0  0  0 1106  0   1   1   0  270 32221  6226  5031  0  1 99
2 1 1 1855824 138500 2522  0  0  0 2196  0  15  15   0   11 31619  9254  5394  0  2 97
0 1 0 1854304 138936 2380  0  0  0 2671  0  22  22   0   20 31484  8465  5864  2  3 96
3 1 1 1857960 136608 3026  0  0  0 1941  0   0   0   0    0 31331  9048  5327  0  3 97
0 1 1 1865832 133420 7232  0  0  0 5103  0  14  14   0   12 31721 19322 11197  1  9 90
3 1 0 1872044 129148 3629  0  0  0 1982  0   4   5   0    0 31904 11714  5716  0  3 97
0 1 0 1868948 131136 4417  0  0  0 4303  0  39  39   0   38 31937 12498  7073  1  4 95
0 1 2 1868220 131748 2117  0  0  0 1905  0   2   2   0    0 31203  4858  3604  1  2 98
0 1 1 1867152 132172 1518  0  0  0 1367  0   0   0   0    3 31202  3190  3923  0  2 98
0 1 2 1867016 132296 1556  0  0  0 1325  0   0   0   0    0 31133  2802  3568  0  2 98
0 1 1 1864572 132672 2020  0  0  0 1715  0   0   0   0    0 31286  4487  5098  0  3 97
0 1 0 1869548 130208 2117  0  0  0 1235  0   1   1   0    1 31283  4378  3211  0  1 99
0 1 0 1868416 130040 1767  0  0  0 1485  0   0   0   0    2 31379  4929  4294  0  1 98

Wed Aug 11 12:40:55 MDT 2010
0.60 3.38 3.79
0.60 3.38 3.79
0.55 3.33 3.77
0.55 3.33 3.77
0.55 3.33 3.77
0.55 3.33 3.77
40.94 11.66 6.70
40.94 11.66 6.70
40.94 11.66 6.70
40.94 11.66 6.70
Wed Aug 11 12:41:05 MDT 2010
40.94 11.66 6.70
40.94 11.66 6.70
37.67 11.46 6.66
37.67 11.46 6.66
37.67 11.46 6.66
37.67 11.46 6.66
37.67 11.46 6.66
34.65 11.27 6.63
34.65 11.27 6.63
34.65 11.27 6.63
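
A rough sanity check on the load average figures above, offered as a sketch rather than a
diagnosis. It assumes the 1-minute load average is an exponentially decayed average that
the kernel updates about every 5 seconds with a decay factor of exp(-5/60). The 5-second
interval, the function names and the floating-point arithmetic are assumptions made for
illustration (a real kernel may use fixed-point math and a jittered interval). Under that
assumption, the update can be inverted to estimate how many runnable threads a single
sample must have seen to move the average from 0.55 to 40.94.

```python
import math

# Assumed values for illustration; not taken from the original post.
SAMPLE_INTERVAL_S = 5.0   # assumed kernel sampling interval, in seconds
WINDOW_S = 60.0           # window of the 1-minute load average, in seconds


def decay(interval_s=SAMPLE_INTERVAL_S, window_s=WINDOW_S):
    """Decay factor applied to the previous load average at each sample."""
    return math.exp(-interval_s / window_s)


def next_load(prev, runnable, d=None):
    """One update step: decay the old average and blend in the new sample."""
    d = decay() if d is None else d
    return prev * d + runnable * (1.0 - d)


def implied_runnable(prev, new, d=None):
    """Invert next_load(): the runnable count needed to go from prev to new."""
    d = decay() if d is None else d
    return (new - prev * d) / (1.0 - d)


if __name__ == "__main__":
    # Runnable count implied by the jump from 0.55 to 40.94 in one sample.
    print(implied_runnable(0.55, 40.94))
    # Pure decay from 40.94 with nothing runnable at the next sample.
    print(next_load(40.94, 0))
```

With these assumptions, pure decay from 40.94 gives about 37.67 and then about 34.65,
which agrees with the samples listed above, and the jump itself would correspond to
several hundred runnable threads at one sampling instant. If that holds, the spike could
be a very short burst that a once-per-second vmstat snapshot might not catch.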