From owner-freebsd-net@FreeBSD.ORG Sun Nov 26 18:48:17 2006 Return-Path: X-Original-To: freebsd-net@freebsd.org Delivered-To: freebsd-net@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 094E816A40F for ; Sun, 26 Nov 2006 18:48:17 +0000 (UTC) (envelope-from sam@errno.com) Received: from ebb.errno.com (ebb.errno.com [69.12.149.25]) by mx1.FreeBSD.org (Postfix) with ESMTP id 95DAA43D77 for ; Sun, 26 Nov 2006 18:47:20 +0000 (GMT) (envelope-from sam@errno.com) Received: from [10.0.0.248] (trouble.errno.com [10.0.0.248]) (authenticated bits=0) by ebb.errno.com (8.13.6/8.12.6) with ESMTP id kAQIlqom095590 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 26 Nov 2006 10:47:53 -0800 (PST) (envelope-from sam@errno.com) Message-ID: <4569E158.5070800@errno.com> Date: Sun, 26 Nov 2006 10:47:52 -0800 From: Sam Leffler User-Agent: Thunderbird 1.5.0.7 (X11/20060920) MIME-Version: 1.0 To: Yar Tikhiy References: <20061126180306.GA64912@comp.chem.msu.su> In-Reply-To: <20061126180306.GA64912@comp.chem.msu.su> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: m_copy & if_simloop X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Nov 2006 18:48:17 -0000 Yar Tikhiy wrote: > Hi folks, > > A friend user reported to me that rwhod wouldn't work in CURRENT > due to broken outgoing packets. Here's an example: > > 16:15:28.212810 IP truncated-ip - 6865 bytes missing! (tos 0x0, ttl 64, id 28554, offset 0, flags [none], proto: UDP (17), length: 7169, bad cksum 11c (->c64b)!) 10.10.10.4.513 > 10.10.10.255.513: UDP, length 276 > 0x0000: 4500 1c01 6f8a 0000 4011 011c 0a0a 0a04 E...o...@....... > ^^^^ ^^^^ broken fields > 0x0010: 0a0a 0aff 0201 0201 011c 0000 0101 0000 ................ > 0x0020: 4565 9ef0 0000 0000 6467 0000 0000 0000 Ee......dg...... > 0x0030: 0000 0000 0000 0000 0000 0000 0000 0000 ................ > 0x0040: 0000 0000 0000 0000 0000 001a 0000 000e ................ > 0x0050: 0000 0005 4564 a5e7 7474 7976 3000 0000 ....Ed..ttyv0... > 0x0060: 726f 6f74 0000 0000 4565 9d4a 0000 01a6 root....Ee.J.... > 0x0070: 7474 7976 3100 0000 726f 6f74 0000 0000 ttyv1...root.... > 0x0080: 4565 9d4d 0000 000c 7474 7976 3200 0000 Ee.M....ttyv2... > 0x0090: 726f 6f74 0000 0000 4565 9d4f 0000 0099 root....Ee.O.... > 0x00a0: 7474 7976 3300 0000 726f 6f74 0000 0000 ttyv3...root.... > 0x00b0: 4565 9d52 0000 019e 7474 7976 3400 0000 Ee.R....ttyv4... > 0x00c0: 726f 6f74 0000 0000 4565 9d54 0000 019c root....Ee.T.... > 0x00d0: 7474 7976 3500 0000 726f 6f74 0000 0000 ttyv5...root.... > 0x00e0: 4565 9d59 0000 0198 7474 7976 3600 0000 Ee.Y....ttyv6... > 0x00f0: 726f 6f74 0000 0000 4565 9d5b 0000 0195 root....Ee.[.... > 0x0100: 7474 7976 3700 0000 726f 6f74 0000 0000 ttyv7...root.... > 0x0110: 4565 9d5e 0000 0000 7474 7970 3100 0000 Ee.^....ttyp1... > 0x0120: 7961 7200 0000 0000 4565 8361 0000 04b2 yar.....Ee.a.... > > BTW, the problem manifests itself only if the packet is longer than > 256 bytes. > > The problem seems to stem from the following. In ether_output(), > the broadcast packet is copied and looped back up the stack, now > via if_simloop(). The copy has been made with m_copy() since 4.4BSD. > I can't tell about the old days, but today m_copy() alias m_copym() > will copy mbufs but not mbuf clusters, which results in an effectively > read-only copy. Such a copy must not be passed up the stack because > the stack is free to change it and thus destroy the original. For > a long time, enough leading bytes were in plain mbuf(s) for the bug > to stay unnoticed. However, the pattern changed in CURRENT some > day and -- here we are. > > The problem can be cured by using m_dup() in place of m_copy() > (verified). > > Is my analysis correct? Sounds likely. The read-only'ness definitely. > > If so, here's an idea of a general fix. Several source files do > the following: > > struct mbuf *mcopy = m_copy(m, 0, M_COPYALL); > /* some even don't check mcopy for NULL here! */ > if_simloop(ifp, mcopy, family, hdrlen); > > It's common code, so just a flag to if_simloop() cound be introduced > meaning "m_dup() the packet properly". E.g.: > > if_simloop(ifp, m, family, hdrlen, M_DUP); > > In STABLE, M_COPYALL can be added to hdrlen instead to preserve the > ABI. M_COPYALL is defined as 1000000000 now, which allows for the > trick. > > Comments? Thanks! What you suggest seems ok. You might also look at m_unshare which was added for similar purpose but is not exactly what you want. It may be possible to combine m_copy+m_unshare code (not calls) to create the new mbuf chain more efficiently. Sam