From: Eugene Grosbein
Date: Mon, 04 Apr 2011 13:48:08 +0700
To: "net@freebsd.org"
Subject: mbuf clusters exhaustion & keglimit
Message-ID: <4D9969A8.1060701@rdtc.ru>

Hi!

I'm running several loaded PPPoE access servers based on FreeBSD 8.2-STABLE/amd64 with em and igb network interfaces and 4GB RAM. They run no memory-intensive tasks other than routing about 2Gbit/s (1G "in" and a bit less "out"). kern.ipc.nmbclusters is set to 100000 in /etc/sysctl.conf, and for several months I had no problems with mbufs.

Last week one of the routers stopped servicing users for several hours, but it responded to pings and the console was alive. Outgoing ping worked fine too, but any process trying to use a TCP or UDP kernel service got stuck in the "keglimit" state.

I dropped to KDB from the console, ran "call doadump", got a full crashdump, returned from KDB, saved the crashdump and tried to reboot cleanly.
mpd5 failed to stop within the 30-second timeout, but the file systems were unmounted cleanly and the system rebooted.

"vmstat -z -M vmcore" says that the system was out of mbuf clusters:

ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES
mbuf_cluster:            2048,   100000,   100000,        0, 18897242,   317691

After that I created graphs of mbuf cluster usage for all my routers and see no apparent leaks.

The question is: how much kernel memory is it safe to dedicate to mbuf clusters? This system still runs with a maximum of 100000 mbuf clusters:

Mem: 65M Active, 2759M Inact, 455M Wired, 31M Cache, 398M Buf, 435M Free

It seems that 100000 mbuf clusters take only 207MB (2048+256 bytes for each), don't they?

Eugene Grosbein
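
For reference, the sizing arithmetic in the last paragraph can be checked with a short script. This is only a rough sketch: it takes the 2048-byte cluster size from the vmstat output and the 256-byte per-cluster mbuf overhead from the post, and it assumes every cluster is in use and paired with exactly one mbuf. The function names are made up for illustration. Under those assumptions the total comes to about 230 MB (roughly 220 MiB), somewhat above the 207MB quoted, though still small next to 4GB of RAM.

```python
# Rough sizing of the mbuf cluster pool, using only figures from the post.
CLUSTER_BYTES = 2048  # mbuf_cluster item size reported by "vmstat -z"
MBUF_BYTES = 256      # per-cluster mbuf overhead, as assumed in the post


def cluster_pool_bytes(nmbclusters: int) -> int:
    """Bytes used if every cluster is allocated, each with one mbuf."""
    return nmbclusters * (CLUSTER_BYTES + MBUF_BYTES)


def failure_ratio(requests: int, failures: int) -> float:
    """Fraction of allocation requests that failed."""
    return failures / requests


if __name__ == "__main__":
    total = cluster_pool_bytes(100000)
    print(f"{total} bytes = {total / 10**6:.1f} MB = {total / 2**20:.1f} MiB")
    # REQUESTS and FAILURES columns from the vmstat line in the post.
    print(f"failed allocations: {failure_ratio(18897242, 317691):.2%}")
```

The second figure is the share of cluster allocation requests that failed over the uptime covered by the crashdump. It says nothing about how the failures were distributed in time.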