From: Erik Klavon
To: freebsd-net@freebsd.org
Date: Mon, 26 Apr 2010 16:52:20 -0700
Message-ID: <20100426235220.GA20501@malcolm.berkeley.edu>
Subject: ifa_free panic in 8 stable

Hi,

I have a dual-processor, single-core amd64 machine running a recent
cvsup of 8-STABLE. On this development machine I use netgraph(3) to
implement one-to-one NAT with one ng_nat(4) node.
I use ipfw(8) rules to direct traffic to netgraph nodes as needed,
based on table entries, using an ng_ipfw(4) node. When I load test
ng_nat on the development system using iperf(1) running on an
independent system, the development system panics after a couple of
days as follows.

  panic: negative refcount 0xffffff0002a344d4
  cpuid = 1
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
  panic() at panic+0x182
  ifa_free() at ifa_free+0x5d
  ip_output() at ip_output+0x49d
  ip_forward() at ip_forward+0x199
  ip_input() at ip_input+0x4cd
  ng_ipfw_rcvdata() at ng_ipfw_rcvdata+0xad
  VOP_STRATEGY: bp is not locked but should be
  KDB: enter: lock violation
  [thread pid 17 tid 100042 ]
  Stopped at      kdb_enter+0x3d: movq    $0,0x6bb730(%rip)

This panic is repeatable on this machine. I am unable to obtain a core
dump after these panics; after I attempt to dump core using panic, the
system does not respond and must be power cycled. I first encountered
this problem with 8.0p1. I've reproduced it with both em and bge
interfaces.

I've looked around in the functions mentioned in the backtrace, but
haven't made any progress in identifying why this panic occurs. Please
share any suggestions you can think of for tracking down the source of
this problem.

My kernel is configured with options DEBUG=-g, KDB, DDB, KDB_TRACE,
BREAK_TO_DEBUGGER, INVARIANTS, INVARIANT_SUPPORT, WITNESS, DEBUG_LOCKS,
DEBUG_VFS_LOCKS, DIAGNOSTIC, SW_WATCHDOG, DEADLKRES, IPFIREWALL,
IPFIREWALL_VERBOSE, IPFIREWALL_VERBOSE_LIMIT=100, IPFIREWALL_FORWARD,
and IPDIVERT.

I use the following ipfw(8) rules to direct traffic from the
independent system to netgraph and vice versa. (x and y below replace
the first two octets of the globally routable addresses I'm using in
this test.)

  # direct traffic from the independent system into ng_nat
  01100 netgraph tablearg ip from table(87) to any in
  # direct traffic from the internet into ng_nat
  01110 netgraph tablearg ip from any to table(88) in via vlan615
  # forward NATed traffic to the subnet's router if it isn't local
  01120 fwd x.y.254.1 ip4 from x.y.254.0/25 to not x.y.254.0/25 in via vlan613
  # pass traffic after it is NATed, so the default deny rule doesn't block it
  01130 allow ip from any to table(87)
  01140 allow ip from table(88) to any

ipfw(8) table 87 contains the entry

  10.10.0.10/32 200254017

and table 88 contains the entry

  x.y.254.17/32 100254017

The above two table entries direct traffic to the following ng_nat(4)
node.

  Name: NAT0254017      Type: nat             ID: 0000000b   Num hooks: 2
  Local hook      Peer name       Peer type    Peer ID         Peer hook
  ----------      ---------       ---------    -------         ---------
  in              ipfw            ipfw         00000001        100254017
  out             ipfw            ipfw         00000001        200254017

This ng_nat(4) node was created using the following commands.

  ngctl mkpeer ipfw: nat 100254017 out
  ngctl name ipfw:100254017 NAT0254017
  ngctl connect ipfw: NAT0254017: 200254017 in
  ngctl msg NAT0254017: setaliasaddr x.y.254.17
  ngctl msg NAT0254017: redirectaddr { "local_addr=10.10.0.10" "alias_addr=x.y.254.17" 'description="Static NAT"' }

I've assigned the independent system the address 10.10.0.10 on vlan613
with a default router of 10.10.0.1. The development system's address
on vlan613 is 10.10.0.1. Based on the above setup, traffic from the
independent system is NATed by the development system to IP address
x.y.254.17. I use iperf -d -U -P 20 for the load testing, with another
system outside of the test setup acting as an iperf server.

Erik