Date:      Thu, 17 Dec 1998 02:13:10 GMT
From:      Michael Robinson <robinson@netrinsics.com>
To:        dot@dotat.at
Cc:        fenner@parc.xerox.com, freebsd-net@FreeBSD.ORG
Subject:   Re: MLEN < write length < MINCLSIZE "bug"
Message-ID:  <199812170213.CAA00532@netrinsics.com>
In-Reply-To: <E0zqHzA-0000Le-00@fanf.noc.demon.net>

Tony Finch <dot@dotat.at> writes:
>Having read this bit of the red demon book recently (although I can't
>find the precise reference again at the moment), ISTR that the
>heuristic is that since allocating an mbuf with a cluster takes two
>allocations, MINCLSIZE is just bigger than two mbufs.

So it is as I suspected.  MINCLSIZE is a parameter for a classic time/space
performance tradeoff.  A small MINCLSIZE gives you fewer mbuf allocations,
but with lots of unused space in mbuf clusters.  A big MINCLSIZE gives you
more mbuf allocations, and more copy operations, but with more efficient 
memory use.

As such, MINCLSIZE seems like a good candidate for a sysctl (a patch for
which can be found at the end of this message).  People running heavily-used
dedicated network servers may find it useful to be able to tune this
parameter.

It seems to me that this is largely orthogonal, though, to the issue of 
segmenting writes in sosend before sending them to the protocol.  That is 
more an issue of hardware speed vs. kernel speed.  For example, on a dialup
PPP connection, the additional packet header overhead vastly outweighs any
gain from the mostly non-existent parallelism of the serial interface.
However, a 100MHz 64-bit PCI gigabit Ethernet controller can process buffers
faster than the CPU can spit them out, so segmenting the writes could result
in significant improvements in throughput and latency.

So I think this behavior is something that one should be able to turn on and
off.  The question is with what granularity: kernel, interface, or socket?

A socket option would be trivial to implement, but wouldn't work for existing
code until it was retrofitted in.

A sysctl would also be trivial to implement and would work with existing code,
but the granularity is probably too coarse.

A new option for ifconfig would work at the interface level, but I don't 
know if that's what people want or will accept.

Comments?

	-Michael Robinson


Index: sys/mbuf.h
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/sys/mbuf.h,v
retrieving revision 1.18
diff -u -r1.18 mbuf.h
--- mbuf.h	1996/08/19 18:30:15	1.18
+++ mbuf.h	1998/12/17 01:39:44
@@ -52,7 +52,8 @@
 #define	MLEN		(MSIZE - sizeof(struct m_hdr))	/* normal data len */
 #define	MHLEN		(MLEN - sizeof(struct pkthdr))	/* data len w/pkthdr */
 
-#define	MINCLSIZE	(MHLEN + MLEN)	/* smallest amount to put in cluster */
+extern int minclsize;
+#define	MINCLSIZE	minclsize	/* smallest amount to put in cluster */
 #define	M_MAXCOMPRESS	(MHLEN / 2)	/* max amount to copy for compression */
 
 /*
Index: sys/sysctl.h
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/sys/sysctl.h,v
retrieving revision 1.48.2.2
diff -u -r1.48.2.2 sysctl.h
--- sysctl.h	1997/08/30 14:08:56	1.48.2.2
+++ sysctl.h	1998/12/17 01:39:58
@@ -231,6 +231,7 @@
 #define	KERN_PS_STRINGS		32	/* int: address of PS_STRINGS */
 #define	KERN_USRSTACK		33	/* int: address of USRSTACK */
 #define KERN_MAXID		34      /* number of valid kern ids */
+#define KERN_MINCLSIZE		35      /* minimum size for mbuf cluster */
 
 #define CTL_KERN_NAMES { \
 	{ 0, 0 }, \
@@ -267,6 +268,7 @@
 	{ "maxsockbuf", CTLTYPE_INT }, \
 	{ "ps_strings", CTLTYPE_INT }, \
 	{ "usrstack", CTLTYPE_INT }, \
+	{ "minclsize", CTLTYPE_INT }, \
 }
 
 /*
Index: kern/uipc_socket.c
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/kern/uipc_socket.c,v
retrieving revision 1.20.2.5
diff -u -r1.20.2.5 uipc_socket.c
--- uipc_socket.c	1998/03/02 07:58:12	1.20.2.5
+++ uipc_socket.c	1998/12/17 01:40:26
@@ -53,6 +53,9 @@
 static int somaxconn = SOMAXCONN;
 SYSCTL_INT(_kern, KERN_SOMAXCONN, somaxconn, CTLFLAG_RW, &somaxconn, 0, "");
 
+int minclsize = (MHLEN + MLEN);
+SYSCTL_INT(_kern, KERN_MINCLSIZE, minclsize, CTLFLAG_RW, &minclsize, 0, "");
+
 /*
  * Socket operation routines.
  * These routines are called by the routines in




