Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 31 May 2002 13:25:09 -0700 (PDT)
From:      Archie Cobbs <archie@dellroad.org>
To:        Julian Elischer <julian@elischer.org>
Cc:        Bosko Milekic <bmilekic@unixdaemons.com>, Archie Cobbs <archie@dellroad.org>, freebsd-net@FreeBSD.ORG
Subject:   Re: m_split() considered harmful
Message-ID:  <200205312025.g4VKP9t02205@arch20m.dellroad.org>
In-Reply-To: <Pine.BSF.4.21.0205311243460.29361-100000@InterJet.elischer.org> "from Julian Elischer at May 31, 2002 12:54:39 pm"

next in thread | previous in thread | raw e-mail | index | archive | help
Julian Elischer writes:
> '90-'95 under MACH/BSD and when I moved the code to FreeBSD1.x it cam
> along.. there was no M_LEADINGSPACE/M_TRAILINGSPACE macro at that time..
> I did it in my code explicitly.
> 
> It was not used in standard code because only in my own protocol stack
> did I know that the damned object was not shared and writable..
> This has improved with time..
> 
> Having the size set to 0 however stopped users from hitting
> cases where the WRITABILITY of the ext objext has not been correctly 
> tracked as it always returns "not enough space" (sometimes -ve).
> 
> If we change this we need to audit the WRITABILTY..
> e.g. it is not checked in M_TRAILINGSPACE

It's not clear whether the caller of M_TRAILINGSPACE()/M_LEADINGSPACE()
is responsible for checking for writability, or the macros themselves.
It seems to make more sense that the caller would be responsible...
why would you call M_TRAILINGSPACE() unless you wanted to write
something in there?

I agree that we should audit for writability, however, the only
case where this could become an issue with this patch is if somewhere
a call to m_split() is made, followed by a call to M_TRAILINGSPACE(),
followed by a write to the mbuf data area because M_TRAILINGSPACE() > 0.
In fact, this would have to happen to 2 or more copies of the cluster
for there to be a problem.  I'd say this is very unlikely, but still
possible.

Any *other* use of the mbuf as writable after a call to m_split()
is already an existing bug and is not made any worse by this patch.

As a temporary saftey measure, I'll add M_WRITABLE(m) into the
M_TRAILINGSPACE() macro. However, I see this as a temporary hack;
the correct fix is to put the burden of writability on the caller.
After all, M_TRAILINGSPACE() doesn't modify the mbuf data!

That is, what we really need is a more general audit for code that
writes into mbufs that might be read-only -- and, as one special case
of tha, code that calls M_TRAILINGSPACE()/M_LEADINGSPACE() before writing
into an mbuf.

FYI, any easy way to do this would be to add const'ness to mtod():

    #define mtod(m, t)  ((const t)((m)->m_data))

and then look for GCC warnings. Any takers?? :-)

Updated patch below.

-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com

Index: sys/mbuf.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/mbuf.h,v
retrieving revision 1.89
diff -u -r1.89 mbuf.h
--- sys/mbuf.h	5 Feb 2002 02:00:56 -0000	1.89
+++ sys/mbuf.h	31 May 2002 20:24:41 -0000
@@ -344,6 +344,9 @@
 /*
  * Compute the amount of space available
  * before the current start of data in an mbuf.
+ *
+ * The M_WRITABLE() is a temporary, conservative safety measure: the burden
+ * of checking writability of the mbuf data area rests solely with the caller.
  */
 #define	M_LEADINGSPACE(m)						\
 	((m)->m_flags & M_EXT ?						\
@@ -354,10 +357,14 @@
 /*
  * Compute the amount of space available
  * after the end of data in an mbuf.
+ *
+ * The M_WRITABLE() is a temporary, conservative safety measure: the burden
+ * of checking writability of the mbuf data area rests solely with the caller.
  */
 #define	M_TRAILINGSPACE(m)						\
-	((m)->m_flags & M_EXT ? (m)->m_ext.ext_buf +			\
-	    (m)->m_ext.ext_size - ((m)->m_data + (m)->m_len) :		\
+	((m)->m_flags & M_EXT ?						\
+	    (M_WRITABLE(m) ? (m)->m_ext.ext_buf + (m)->m_ext.ext_size	\
+		- ((m)->m_data + (m)->m_len) : 0) :			\
 	    &(m)->m_dat[MLEN] - ((m)->m_data + (m)->m_len))
 
 /*
Index: kern/uipc_mbuf.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_mbuf.c,v
retrieving revision 1.91
diff -u -r1.91 uipc_mbuf.c
--- kern/uipc_mbuf.c	12 Apr 2002 00:01:50 -0000	1.91
+++ kern/uipc_mbuf.c	31 May 2002 20:24:45 -0000
@@ -560,6 +560,11 @@
  * Partition an mbuf chain in two pieces, returning the tail --
  * all but the first len0 bytes.  In case of failure, it returns NULL and
  * attempts to restore the chain to its original state.
+ *
+ * Note that the resulting mbufs might be read-only, because the new
+ * mbuf can end up sharing an mbuf cluster with the original mbuf if
+ * the "breaking point" happens to lie within a cluster mbuf. Use the
+ * M_WRITABLE() macro to check for this case.
  */
 struct mbuf *
 m_split(struct mbuf *m0, int len0, int wait)
@@ -609,7 +614,6 @@
 		n->m_flags |= M_EXT;
 		n->m_ext = m->m_ext;
 		MEXT_ADD_REF(m);
-		m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */
 		n->m_data = m->m_data + len;
 	} else {
 		bcopy(mtod(m, caddr_t) + len, mtod(n, caddr_t), remain);

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200205312025.g4VKP9t02205>