From: Adam Guimont
Date: Wed, 01 Apr 2015 15:03:41 -0500
To: freebsd-fs@freebsd.org
Subject: NFSD high CPU usage

I have an issue where NFSD will max out the CPU (1200% in this case) when a client workstation runs out of memory while trying to write via NFS. When this happens, the TCP Recv-Q on the server also fills up, which causes connection timeouts for any other client trying to use the NFS server.

I can reproduce the issue by running stress on a low-end client workstation: change into the NFS-mounted directory and then use stress to write via NFS while exhausting memory, for example:

stress --cpu 2 --io 4 --vm 20 --hdd 4

The client workstation eventually runs out of memory while writing into the NFS directory, the TCP Recv-Q on the NFS server fills up, and then NFSD maxes out the CPU.

The actual client workstations (~50) are not running stress when this happens; they run a mixture of EDA tools (simulation and verification).

For what it's worth, this is how I've been monitoring the TCP buffer queues, where "xx.xxx.xx.xxx" is the IP address of the NFS server:

cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e "Proto" -e "xx.xxx.xx.xxx.2049"'

I have tried several tuning recommendations, but none of them has solved the problem.

Has anyone else experienced this, and is anyone able to reproduce it?
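If cmdwatch is not installed, a plain /bin/sh loop along the following lines should capture the same information (a rough sketch only; the server IP is a placeholder and the one-second interval is arbitrary):

#!/bin/sh
# Rough sketch: sample the NFS server's TCP queues and nfsd CPU usage
# once per second. SERVER is a placeholder for the NFS server's IP.
SERVER="xx.xxx.xx.xxx"
while true; do
    date
    # Recv-Q/Send-Q for all TCP sessions on the NFS port (2049)
    netstat -an | grep -e "Proto" -e "${SERVER}.2049"
    # CPU usage of the nfsd threads ([n] keeps grep from matching itself)
    ps -ax -o pid,%cpu,command | grep "[n]fsd"
    sleep 1
done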
---
NFS server specs:

OS     = FreeBSD 10.0-RELEASE
CPU    = E5-1650 v3
Memory = 96GB
Disks  = 24x ST6000NM0034 in 4x raidz2
HBA    = LSI SAS 9300-8i
NIC    = Intel 10Gb X540-T2

---
/boot/loader.conf

autoboot_delay="3"
geom_mirror_load="YES"
mpslsi3_load="YES"
cc_htcp_load="YES"

---
/etc/rc.conf

hostname="***"
ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso"
defaultrouter="***"
sshd_enable="YES"
ntpd_enable="YES"
zfs_enable="YES"
sendmail_enable="NO"
nfs_server_enable="YES"
nfs_server_flags="-h *** -t -n 128"
nfs_client_enable="YES"
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
samba_enable="YES"
atop_enable="YES"
atop_interval="5"
zabbix_agentd_enable="YES"

---
/etc/sysctl.conf

vfs.nfsd.server_min_nfsvers=3
vfs.nfsd.cachetcp=0
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576
net.inet.tcp.sendbuf_inc=32768
net.inet.tcp.recvbuf_inc=65536
net.inet.tcp.keepidle=10000
net.inet.tcp.keepintvl=2500
net.inet.tcp.always_keepalive=1
net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.cc.htcp.adaptive_backoff=1
net.inet.tcp.cc.htcp.rtt_scaling=1
net.inet.tcp.sack.enable=0
kern.ipc.soacceptqueue=1024
net.inet.tcp.mssdflt=1460
net.inet.tcp.minmss=1300
net.inet.tcp.tso=0

---
Client workstations:

OS = CentOS 6.6 x64

Mount options from `cat /proc/mounts`:

rw,nosuid,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=***,mountvers=3,mountport=916,mountproto=udp,local_lock=none,addr=***

---
Regards,

Adam Guimont
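P.S. As a sanity check, the tunables above can be confirmed at runtime with sysctl(8), using the same names as in /etc/sysctl.conf; nfsstat's -e/-s flags show the new NFS server's counters (a minimal sketch, not a full checklist):

# Verify the socket-buffer and NFS-related tunables took effect at runtime
sysctl kern.ipc.maxsockbuf net.inet.tcp.recvbuf_max net.inet.tcp.recvspace
sysctl vfs.nfsd.server_min_nfsvers vfs.nfsd.cachetcp
# Server-side statistics for the new NFS server
nfsstat -e -s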