From: Adam Guimont
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Date: Fri, 03 Apr 2015 16:33:32 -0500
Subject: Re: NFSD high CPU usage
Message-ID: <551F072C.1000505@tezzaron.com>
In-Reply-To: <1199661815.10758124.1427941695874.JavaMail.root@uoguelph.ca>

Rick Macklem wrote:
> I can think of two explanations for this.
> 1 - The server nfsd threads get confused when the TCP recv Q fills
>     and start looping around.
> OR
> 2 - The client is sending massive #s of RPCs (or crap that is
>     incomplete RPCs).
>
> To get a better idea w.r.t. what is going on, I'd suggest that
> you capture packets (for a relatively short period) when the
> server is 100% CPU busy.
> # tcpdump -s 0 -w out.pcap host
> - run on the server should do it.
> Then look at out.pcap in wireshark and see what the packets
> look like. (wireshark understands NFS, whereas tcpdump doesn't)
> If #1, I'd guess very little traffic (maybe TCP layer stuff),
> if #2, I'd guess you'll see a lot of RPC requests or garbage
> that isn't a valid request. (This latter case would suggest a
> CentOS problem.)
>
> If you capture the packets but can't look at them in wireshark,
> you could email me the packet capture as an attachment and I
> can look at it after Apr. 10, when I get home.
>
> rick
>

Thanks Rick,

I was able to capture this today while it was happening. The capture
covers about 100 seconds.

I took a look at it in wireshark, and to me it looks like the #2
situation you were describing.

If you would like to confirm that, I've uploaded the pcap file here:

https://www.dropbox.com/s/pdhwj5z5tz7iwou/out.pcap.20150403

I will continue running tests and gathering as much data as I can.

Regards,
Adam Guimont
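
For a quick first look at whether a capture like this is closer to
case #1 (very little traffic) or case #2 (a flood of requests), the
packet rate can be read straight out of the pcap file. The following
is only a minimal sketch, not something from this thread: it assumes
the classic pcap format with microsecond timestamps (it rejects
pcapng and the nanosecond variant), it counts every captured packet
rather than only NFS RPCs, and the function name packets_per_second
is made up for illustration. It does not replace looking at the RPCs
in wireshark.

```python
import struct
import sys
from collections import Counter


def packets_per_second(pcap_bytes):
    """Count captured packets per one-second bucket in a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    counts = Counter()
    offset = 24  # skip the 24-byte global file header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        ts_sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset
        )
        counts[ts_sec] += 1
        offset += 16 + incl_len  # jump over the captured bytes
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        per_second = packets_per_second(f.read())
    for second in sorted(per_second):
        print(second, per_second[second])
```

Run against the file named above (the script name is hypothetical),
for example "python pps.py out.pcap.20150403", it prints one line per
second of capture. There is no fixed threshold here; the idea is only
that a sustained high rate points toward case #2 and a near-idle
trace toward case #1.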