From: Luigi Rizzo
Date: Tue, 2 Jul 2002 00:10:50 -0700
To: net@freebsd.org
Subject: Mbuf allocator performance (was Should we keep a cache of mbuf+cluster ready for use ?)
Message-ID: <20020702001050.B2250@iguana.icir.org>

[Bcc to -current as relevant there]

As a followup to my question below, I did some simple experiments on
a -current box acting as a router, using two "em" cards and
DEVICE_POLLING (just for stability).

The whole of the code is basically below -- the receiving side tries
to grab the mbuf+cluster from the free pool first, and falls back to
the standard method on failure; the sending side puts the freed
buffer back on the free pool when it is eligible (a single mbuf with
a writable cluster, and the pool is not yet full).

There is a simplification here with respect to what could go into
an m_freem()/mgethdr_cluster() pair: the driver is already holding
a lock (on Giant in this case), so no further locking is needed.
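The pool-first, fall-back-on-failure scheme described above can be modelled in userland as follows. This is only a sketch under stated assumptions, not the driver code: struct buf, buf_get(), buf_put() and buf_alloc_slow() are hypothetical names, malloc()/free() stand in for the standard mbuf+cluster allocation, and there is no locking because the caller is assumed to hold the relevant lock already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf with a cluster attached. */
struct buf {
    struct buf *nextpkt;        /* link used while the buffer is pooled */
    char       *data;           /* current start of data */
    size_t      len;            /* current data length */
    char        storage[2048];  /* plays the role of the cluster */
};

static struct buf *pool;        /* head of the LIFO free pool */
static int pool_now;            /* buffers currently in the pool */
static int pool_max = 50;       /* cap on the pool size (assumed value) */

/* Slow path: models the standard allocation (assumption: plain malloc). */
static struct buf *buf_alloc_slow(void)
{
    struct buf *b = malloc(sizeof(*b));

    if (b == NULL)
        return NULL;
    b->nextpkt = NULL;
    b->data = b->storage;
    b->len = sizeof(b->storage);
    return b;
}

/* Receive side: take a ready buffer from the pool, else fall back. */
struct buf *buf_get(void)
{
    struct buf *b = pool;

    if (b != NULL) {
        pool = b->nextpkt;
        b->nextpkt = NULL;
        pool_now--;
        return b;
    }
    return buf_alloc_slow();
}

/* Transmit-done side: recycle while the pool has room, else free. */
void buf_put(struct buf *b)
{
    assert(b != NULL);
    if (pool_now < pool_max) {
        /* Reset the fields a consumer expects on a fresh buffer. */
        b->data = b->storage;
        b->len = sizeof(b->storage);
        b->nextpkt = pool;
        pool = b;
        pool_now++;
    } else {
        free(b);
    }
}
```

A buffer released with buf_put() is the next one handed out by buf_get(), so the common path touches only the list head and a counter. Setting pool_max to 0 disables the pool and sends every request down the slow path.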
Still, to come to the data points:

                                CURRENT         STABLE (*)

fastfwd=1, pool_max=0           276 kpps        365 kpps
fastfwd=1, pool_max=50          355 kpps        383 kpps

fastfwd=0, pool_max=0           195 kpps        142 kpps
fastfwd=0, pool_max=50          227 kpps        146 kpps

(*) This version of STABLE that I am using for comparison has some
proprietary optimizations which make it a bit faster than normal.
However it still uses the old ipfw code, which is invoked when
fastfwd=0, and is significantly slower than the new one.

Now this really seems to call for adding this interface into the
mbuf subsystem.

I believe we have to find a name for the allocator (the deallocator
might well go into m_freem(), depending on how we implement the
locking) and decide whether it makes sense to lock mgethdr_cluster()
under per-cpu locks or under Giant, or even to let the caller make
sure that it holds the proper lock before invoking the procedure
(as I expect the "producers" or "consumers" of these pairs to be
located in the network stack, chances are that they already hold a
lock on Giant).

        cheers
        luigi

The code:

        struct mbuf *em_pool;

        static int em_pool_max = 50;
        SYSCTL_INT(_hw, OID_AUTO, em_pool_max, CTLFLAG_RW,
            &em_pool_max, 0, "max size of mbuf pool");
        static int em_pool_now;
        SYSCTL_INT(_hw, OID_AUTO, em_pool_now, CTLFLAG_RD,
            &em_pool_now, 0, "Current size of mbuf pool");

        ... in em_get_buf() ..

                if (em_pool) {
                        mp = em_pool;
                        em_pool = mp->m_nextpkt;
                        em_pool_now--;
                        goto have_it;
                }

        ... in em_clean_transmit_interrupts() ...

                if ((m = tx_buffer->m_head)) {
                        if (em_pool_now < em_pool_max &&
                            m->m_next == NULL &&
                            (m->m_flags & M_EXT) &&
                            M_WRITABLE(m)) {
                                m->m_nextpkt = em_pool;
                                em_pool = m;
                                em_pool_now++;
                        } else
                                m_freem(m);
                        tx_buffer->m_head = NULL;
                }

On Sat, Jun 29, 2002 at 02:53:03PM -0700, Luigi Rizzo wrote:
> Hi,
> during some experiments i was doing recently, i noticed that there
> is a significant improvement in the forwarding speed (especially
> at very high speeds) if we keep a small pool of mbuf+cluster
> ready for use.
> This is because most network drivers do something
> like this
>
>         MGETHDR(m_new, M_DONTWAIT, MT_DATA);
>         if (m_new == NULL)
>                 return(ENOBUFS);
>
>         MCLGET(m_new, M_DONTWAIT);
>         if (!(m_new->m_flags & M_EXT)) {
>                 m_freem(m_new);
>                 return(ENOBUFS);
>         }
>
> when replenishing the receive buffers, and both macros are quite
> long even if there are available blocks in the free lists. We can
> store buffers of this form when/if they are released with some code
> like this:
>
>         if (my_pool_now < my_pool_max && m->m_next == NULL &&
>             (m->m_flags & M_EXT) && M_EXT_WRITABLE(m) ) {
>                 m->m_nextpkt = my_pool;
>                 m->m_data = m->m_ext.ext_buf;
>                 m->m_len = m->m_pkthdr.len = MCLBYTES;
>                 my_pool = m;
>                 my_pool_now++;
>         } else {
>                 ... rest of m_freem() ...
>         }
>
> and save a lot of overhead (we just need to reset m_data and
> m_len and m_pkthdr.len) when someone wants to allocate them.
>
> Is there interest in committing some code like this to
> mbuf.h and maybe uipc_mbuf*.c ?
>
>         cheers
>         luigi
>
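To put the data points reported earlier in this message in relative terms, the gain from enabling the pool can be worked out per configuration. The sketch below is plain arithmetic on the quoted figures; gain_percent() and print_gains() are hypothetical helper names, and reading the 195 figure for CURRENT with fastfwd=0 as kpps is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Relative gain, in percent, of `pooled` over `base` (both in kpps). */
double gain_percent(double base, double pooled)
{
    assert(base > 0.0);
    return (pooled - base) / base * 100.0;
}

/* Print the gain for each configuration quoted in the message. */
void print_gains(void)
{
    /* pool_max=0 is the baseline, pool_max=50 the pooled case. */
    static const struct {
        const char *name;
        double base;    /* kpps with pool_max=0 */
        double pooled;  /* kpps with pool_max=50 */
    } rows[] = {
        { "CURRENT, fastfwd=1", 276.0, 355.0 },
        { "STABLE,  fastfwd=1", 365.0, 383.0 },
        { "CURRENT, fastfwd=0", 195.0, 227.0 }, /* 195 assumed to be kpps */
        { "STABLE,  fastfwd=0", 142.0, 146.0 },
    };
    size_t i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%-20s %+5.1f%%\n", rows[i].name,
            gain_percent(rows[i].base, rows[i].pooled));
}
```

Under those assumptions the pool is worth roughly 29% on CURRENT and 5% on STABLE with fastfwd=1, and roughly 16% and 3% respectively with fastfwd=0.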