Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 28 Aug 2000 10:42:00 -0700 (PDT)
From:      Archie Cobbs <archie@whistle.com>
To:        freebsd-arch@freebsd.org
Subject:   Proposal to clarify mbuf handling rules (fwd)
Message-ID:  <200008281742.KAA69859@bubba.whistle.com>

next in thread | raw e-mail | index | archive | help
David O'Brien correctly pointed out that I should have sent
the email below to -arch and -net instead of -current and
-net. So at the risk of repetition for some folks, here it is
again for anybody on -arch but not -net or -current. Followups
should go to -net I suppose. Sorry for the confusion.

Thanks,
-Archie

> From owner-freebsd-net@FreeBSD.ORG Sun Aug 27 14:38:34 2000
> From: Archie Cobbs <archie@whistle.com>
> Message-Id: <200008272125.OAA66159@bubba.whistle.com>
> Subject: Proposal to clarify mbuf handling rules
> To: freebsd-net@FreeBSD.ORG, freebsd-current@FreeBSD.ORG
> 
> In looking at some of the problems relating to divert, bridging,
> etc., it's apparent that lots of code is breaking one of the rules
> for handling mbufs: that mbuf data can sometimes be read-only.
> 
> Each mbuf may be either a normal mbuf or a cluster mbuf (if the
> mbuf flags contains M_EXT). Cluster mbufs point to an entire page
> of memory, and this page of memory may be shared by more than one
> cluster mbuf (see m_copypacket()). This effectively makes the mbuf
> data read-only, because a change to one mbuf affects all of the
> mbufs, not just the one you're working on. There have been (and
> still are) several FreeBSD bugs because of this subtlety.
> 
> A test for an mbuf being "read-only" is:
> 
>   if ((m->m_flags & M_EXT) != 0 && MEXT_IS_REF(m))  ...
> 
> So an implicit rule for handling mbufs is that they should be
> treated as read-only unless/until you either check that it's not,
> and/or pullup a new (non-cluster) mbuf that covers the data area
> that you're going to modify.
> 
> However, many routines that take an mbuf parameter assume that the
> mbuf given to them is modifiable and proceed to write all over it.
> A few examples are: ip_input(), in_arpinput(), tcp_input(),
> divert_packt(), etc.
> 
> In practice, this is often not a problem because the mbuf is actually
> modifiable (because there are no other references to it). But this
> is just because we've been lucky. When you throw things like bridging,
> dummynet, divert, and netgraph into the mix, not to mention other
> site-specific hacks, then these assumptions no longer hold. At the
> minimum these assumptions should be clearly commented, but that's
> not even the case right now.
> 
> Routines that don't change any data, or that only do m_pullup(),
> M_PREPEND(), m_adj(), etc. don't have a problem.
> 
> So I'd like to propose a mini-project to clarify and fix this problem.
> Here is the propsal:
> 
>   1.  All routines that take an mbuf as an argument must not assume
>       that any mbuf in the chain is modifyable, unless expclicitly
>       and clearly documented (in the comment at the top of the function)
>       as doing so.
> 
>   2.  For routines that don't modify data, incorporate liberal use
>       of the "const" keyword to make this clear. For example, change
> 
> 	  struct ip *ip;
> 	  ip = mtod(m, struct ip *);
> 
>       to:
> 
> 	  const struct ip *ip;
> 	  ip = mtod(m, const struct ip *);
> 
>   3.  For any routines that do need to modify mbuf data, but don't
>       assume anything about the mbuf, alter those routines to do
>       an m_pullup() when necessary to make the data are they are
>       working on modifiable. For example:
> 
> 	struct ip *ip;
> 
> 	/* Pull up IP header */
> 	if (m->m_len < sizeof(*ip) && !(m = m_pullup(m, sizeof(*ip))))
> 		return;
> 	ip = mtod(m, struct ip *);
> 
>     #ifdef NEW_CODE_BEING_ADDED
> 	/* Make sure the IP header area is writable */
> 	if ((m->m_flags & M_EXT) != 0 && MEXT_IS_REF(m)) {
> 		/* m_pullup() *always* prepends a fresh, non-cluster mbuf */
> 		if ((m = m_pullup(m, sizeof(struct ip))) == 0)
> 			return;
> 		ip = mtod(m, struct ip *);
> 	}
>     #endif
> 
> 	/* Modify the header */
> 	ip->ip_len = 123;
> 	...
> 
> The only negative is the addition of the NEW_CODE_BEING_ADDED code
> in the relevant places. In practice this test will usually fail,
> as most mbufs are modifiable, so there should be no noticable
> slowdown. However, robustness should improve, especially when
> bridging, diverting, etc.
> 
> What do people think? If this is generally agreeable I'll try to
> work on putting together a patch set for review.

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200008281742.KAA69859>