5.  Memory management

      A single mechanism is used for data storage: memory buffers, or mbufs. An mbuf is a structure of the form:

struct mbuf {
	struct	mbuf *m_next;		/* next buffer in chain */
	u_long	m_off;			/* offset of data */
	short	m_len;			/* amount of data in this mbuf */
	short	m_type;			/* mbuf type (accounting) */
	u_char	m_dat[MLEN];		/* data storage */
	struct	mbuf *m_act;		/* link in higher-level mbuf list */
};
The m_next field is used to chain mbufs together on linked lists, while the m_act field allows lists of mbuf chains to be accumulated. By convention, the mbufs common to a single object (for example, a packet) are chained together with the m_next field, while groups of objects are linked via the m_act field (possibly when in a queue).
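
The two linkage conventions can be illustrated with a small user-space model. This is only a sketch: struct model_mbuf, chain_length, queue_objects and demo_queue are hypothetical names introduced for illustration, and the model keeps only the link and length fields rather than the full kernel structure.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct mbuf (assumption: only the fields
 * needed to show the two kinds of linkage are modeled). */
struct model_mbuf {
    struct model_mbuf *m_next;  /* next buffer of the same object */
    struct model_mbuf *m_act;   /* next object in a list of chains */
    short m_len;                /* bytes of data in this buffer */
};

/* Total bytes in one object (e.g. a packet): follow m_next. */
static long chain_length(const struct model_mbuf *m)
{
    long total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* Number of objects on a queue: follow m_act from the head of each chain. */
static int queue_objects(const struct model_mbuf *head)
{
    int n = 0;

    for (; head != NULL; head = head->m_act)
        n++;
    return n;
}

/* Build a queue of two packets: one of two buffers, one of a single buffer. */
static struct model_mbuf *demo_queue(void)
{
    static struct model_mbuf a1, a2, b1;

    a1.m_next = &a2;  a1.m_act = &b1;  a1.m_len = 20;
    a2.m_next = NULL; a2.m_act = NULL; a2.m_len = 100;
    b1.m_next = NULL; b1.m_act = NULL; b1.m_len = 64;
    assert(chain_length(&a1) == 120);
    return &a1;
}
```

In this model only the first mbuf of each chain carries a meaningful m_act link; the buffers further down a chain are reached through m_next alone.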

      Each mbuf has a small data area for storing information, m_dat. The m_len field indicates the amount of data, while the m_off field is an offset to the beginning of the data from the base of the mbuf. Thus, for example, the macro mtod, which converts a pointer to an mbuf to a pointer to the data stored in the mbuf, has the form

#define	mtod(x,t)	((t)((int)(x) + (x)->m_off))
(note the t parameter, a C type cast, which is used to cast the resultant pointer for proper assignment).
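
The effect of m_off and m_len on the pointer returned by mtod can be seen in a user-space sketch. The names model_mbuf, MODEL_MLEN, model_mtod and model_fill are hypothetical. The macro below uses char * arithmetic rather than the cast to int shown above, an adaptation assumed here so that the example also works where pointers are wider than int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MLEN 112  /* assumed size of the internal data area */

/* User-space model of the structure shown above. */
struct model_mbuf {
    struct model_mbuf *m_next;
    unsigned long m_off;            /* offset of data from base of mbuf */
    short m_len;                    /* amount of data */
    short m_type;
    unsigned char m_dat[MODEL_MLEN];
    struct model_mbuf *m_act;
};

/* Same idea as mtod: base address of the mbuf plus m_off, cast to type t. */
#define model_mtod(x, t) ((t)((char *)(x) + (x)->m_off))

/* Place a string at the start of the internal data area. */
static void model_fill(struct model_mbuf *m, const char *s)
{
    size_t n = strlen(s);

    assert(n <= MODEL_MLEN);
    m->m_next = NULL;
    m->m_act = NULL;
    m->m_type = 0;
    m->m_off = offsetof(struct model_mbuf, m_dat);
    m->m_len = (short)n;
    memcpy(m->m_dat, s, n);
}
```

After model_fill, model_mtod(m, unsigned char *) points at m_dat. Adding 4 to m_off and subtracting 4 from m_len then makes the same macro return a pointer 4 bytes further into the data, without any data being moved.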

      In addition to storing data directly in the mbuf's data area, data of page size may also be stored in a separate area of memory. The mbuf utility routines maintain a pool of pages for this purpose and manipulate a private page map for such pages. An mbuf with an external data area may be recognized by the larger offset to the data area; this is formalized by the macro M_HASCL(m), which is true if the mbuf whose address is m has an external page cluster. An array of reference counts on pages is also maintained so that copies of pages may be made without core-to-core copying (copies are created simply by duplicating the reference to the data and incrementing the associated reference counts for the pages). Separate data pages are currently used only when copying data from a user process into the kernel, and when bringing data in at the hardware level. Routines which manipulate mbufs are not normally aware of whether data is stored directly in the mbuf data array or kept in separate pages.
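
The reference-counting scheme can be sketched in user space as follows. This is a minimal model, not the kernel's page pool: the names model_pages, model_refcnt, page_alloc, page_copy and page_release are hypothetical, and the pool size and the 1024-byte page size are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NPAGES 4     /* assumed pool size */
#define MODEL_PAGESZ 1024  /* assumed page size */

static unsigned char model_pages[MODEL_NPAGES][MODEL_PAGESZ];
static int model_refcnt[MODEL_NPAGES];  /* one count per page */

/* Claim a free page; returns its index, or -1 if the pool is exhausted. */
static int page_alloc(void)
{
    int i;

    for (i = 0; i < MODEL_NPAGES; i++) {
        if (model_refcnt[i] == 0) {
            model_refcnt[i] = 1;
            return i;
        }
    }
    return -1;
}

/* "Copy" a page by taking another reference; no bytes are moved. */
static unsigned char *page_copy(int i)
{
    model_refcnt[i]++;
    return model_pages[i];
}

/* Drop one reference; the page is reusable once its count reaches zero. */
static void page_release(int i)
{
    assert(model_refcnt[i] > 0);
    model_refcnt[i]--;
}
```

Both holders of a page see the same bytes, so this kind of copy is only appropriate when neither holder modifies the shared data in place.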

      The following may be used to allocate and free mbufs:

m = m_get(wait, type);
MGET(m, wait, type);

n = m_free(m);
MFREE(m,n);
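
Because m_free returns the successor of the mbuf it frees, an entire chain can be released with a short loop. The sketch below models this in user space with calloc and free. The names model_get, model_free, model_free_chain and model_in_use are hypothetical, and the wait argument is omitted because this model never blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_mbuf {
    struct model_mbuf *m_next;
    short m_len;
    short m_type;
};

static int model_in_use;  /* simple usage statistic */

/* Allocate one zeroed buffer of the given type; NULL on failure. */
static struct model_mbuf *model_get(short type)
{
    struct model_mbuf *m = calloc(1, sizeof *m);

    if (m == NULL)
        return NULL;
    m->m_type = type;
    model_in_use++;
    return m;
}

/* Free one buffer and return its successor, mirroring n = m_free(m). */
static struct model_mbuf *model_free(struct model_mbuf *m)
{
    struct model_mbuf *n;

    assert(m != NULL);
    n = m->m_next;
    free(m);
    model_in_use--;
    return n;
}

/* Release a whole chain by repeatedly freeing its head. */
static void model_free_chain(struct model_mbuf *m)
{
    while (m != NULL)
        m = model_free(m);
}
```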

      The following utility routines are available for manipulating mbuf chains:

m = m_copy(m0, off, len);

The m_copy routine creates a copy of all, or part, of the list of mbufs in m0. Len bytes of data, starting off bytes from the front of the chain, are copied. Where possible, reference counts on pages are used instead of core-to-core copies. The original mbuf chain must have at least off + len bytes of data. If len is specified as M_COPYALL, all the data present, offset as before, is copied.
m_cat(m, n);

The mbuf chain, n, is appended to the end of m. Where possible, compaction is performed.
m_adj(m, diff);

The mbuf chain, m, is adjusted in size by diff bytes. If diff is non-negative, diff bytes are shaved off the front of the mbuf chain. If diff is negative, the alteration is performed from back to front. No space is reclaimed in this operation; alterations are accomplished by changing the m_len and m_off fields of mbufs.
m = m_pullup(m0, size);

After a successful call to m_pullup, the mbuf at the head of the returned list, m, is guaranteed to have at least size bytes of data in contiguous memory within the data area of the mbuf (allowing access via a pointer, obtained using the mtod macro, and allowing the mbuf to be located from a pointer to the data area using dtom, defined below). If the original data was less than size bytes long, size was greater than the size of an mbuf data area (112 bytes), or required resources were unavailable, m is 0 and the original mbuf chain is deallocated.

This routine is particularly useful when verifying packet header lengths on reception. For example, if a packet is received and only 8 of the 16 bytes required for a valid packet header are present at the head of the list of mbufs representing the packet, the remaining 8 bytes may be "pulled up" with a single m_pullup call. If the call fails, the invalid packet will have been discarded.
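
The header example can be reproduced with a simplified user-space model of the pull-up operation. This is a sketch under stated assumptions, not the kernel routine: model_mbuf, model_new, model_free_chain and model_pullup are hypothetical names, data always begins at the start of m_dat (there is no m_off), and bytes are gathered into the existing first buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MLEN 112  /* size of the data area, as quoted in the text */

struct model_mbuf {
    struct model_mbuf *m_next;
    short m_len;
    unsigned char m_dat[MODEL_MLEN];  /* data starts at m_dat[0] here */
};

/* Allocate a buffer holding a copy of len bytes from src. */
static struct model_mbuf *model_new(const void *src, short len)
{
    struct model_mbuf *m;

    assert(len >= 0 && len <= MODEL_MLEN);
    m = calloc(1, sizeof *m);
    if (m != NULL) {
        memcpy(m->m_dat, src, (size_t)len);
        m->m_len = len;
    }
    return m;
}

static void model_free_chain(struct model_mbuf *m)
{
    while (m != NULL) {
        struct model_mbuf *n = m->m_next;
        free(m);
        m = n;
    }
}

/* Make the first buffer hold at least size contiguous bytes.
 * On failure the whole chain is freed and NULL is returned. */
static struct model_mbuf *model_pullup(struct model_mbuf *m0, int size)
{
    struct model_mbuf *n;

    if (m0 == NULL || size > MODEL_MLEN)
        goto bad;
    while (m0->m_len < size && (n = m0->m_next) != NULL) {
        int want = size - m0->m_len;
        int take = n->m_len < want ? n->m_len : want;

        /* Move bytes from the front of n to the end of m0. */
        memcpy(m0->m_dat + m0->m_len, n->m_dat, (size_t)take);
        m0->m_len += take;
        memmove(n->m_dat, n->m_dat + take, (size_t)(n->m_len - take));
        n->m_len -= take;
        if (n->m_len == 0) {        /* drained: unlink and free it */
            m0->m_next = n->m_next;
            free(n);
        }
    }
    if (m0->m_len < size)
        goto bad;                   /* chain held fewer than size bytes */
    return m0;
bad:
    model_free_chain(m0);
    return NULL;
}
```

A protocol input routine in this model would call model_pullup(m, 16) before reading a 16-byte header from the first buffer, and would simply return if the result is NULL, since the chain has already been freed.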

      By ensuring that mbufs always reside on 128 byte boundaries, it is always possible to locate the mbuf associated with a data area by masking off the low bits of the virtual address. This allows modules to store data structures in mbufs and pass them around without concern for locating the original mbuf when it comes time to free the structure. Note that this works only with objects stored in the internal data buffer of the mbuf. The dtom macro is used to convert a pointer into an mbuf's data area to a pointer to the mbuf:

#define	dtom(x)	((struct mbuf *)((int)x & ~(MSIZE-1)))
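
The masking technique can be tried in user space with a pool of buffers aligned on 128 byte boundaries. This is a sketch only: model_mbuf, model_pool, model_dtom and model_owner are hypothetical names, the data area is sized so that the model structure occupies exactly 128 bytes on the host, C11 _Alignas stands in for the kernel's allocator, and uintptr_t replaces the cast to int so that the mask also works with 64-bit pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MSIZE 128  /* every buffer occupies one aligned 128-byte slot */
#define MODEL_NBUF  4    /* assumed pool size */

struct model_mbuf {
    struct model_mbuf *m_next;
    short m_len;
    /* Data area sized so that the whole structure fills one slot. */
    unsigned char m_dat[MODEL_MSIZE - sizeof(struct model_mbuf *)
                        - sizeof(short)];
};

_Static_assert(sizeof(struct model_mbuf) == MODEL_MSIZE,
               "model mbuf must fill exactly one slot");

/* Aligning the array aligns every element, since each is MODEL_MSIZE bytes. */
static _Alignas(MODEL_MSIZE) struct model_mbuf model_pool[MODEL_NBUF];

/* Same idea as dtom: clear the low bits of any address inside the buffer. */
#define model_dtom(x) \
    ((struct model_mbuf *)((uintptr_t)(x) & ~(uintptr_t)(MODEL_MSIZE - 1)))

/* Recover the owning buffer and check that it lies inside the pool. */
static struct model_mbuf *model_owner(const void *p)
{
    struct model_mbuf *m = model_dtom(p);

    assert(m >= model_pool && m < model_pool + MODEL_NBUF);
    return m;
}
```

A module that keeps a structure in m_dat can therefore hand around a pointer to the structure alone and later recover the buffer with model_owner (or model_dtom) in order to free it.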

      Mbufs are used for dynamically allocated data structures such as sockets as well as memory allocated for packets and headers. Statistics are maintained on mbuf usage and can be viewed by users using the netstat(1) program.