From: "Chris"
To: "freebsd-stable@freebsd.org"
Date: Thu, 02 Aug 2001 04:34:29 +1000
Message-Id: <200108011836.f71IaFZ01795@aussie.org>
Subject: kernel upgrade causing truncated IPSEC packets [followup]

This is a followup to a message I posted in -net on the 29th of July.
I'm hoping that someone will at least be able to confirm that they also
see the problem.

As a quick summary, when I upgraded a few boxes to a recent kernel (their
old ones were from mid-April), our IPSEC VPN took a dive, and I'm trying
to find out why. I see the below problem on multiple machines.

The actual cause seems to be that the IPSEC-encapsulated packets are being
truncated before they get to the PPP process, even though tcpdump and
netstat on the tunnel device indicate that the full packet is being passed
through.

Additionally, I have now noticed that the truncation will happen even in
transport mode, if the outgoing packets are large enough (I originally
thought it was confined to tunnel mode).

For example, if I ping a remote address with which I have a transport mode
session established using a payload of 204 bytes, I get a reply.
If I ping it with a payload of 205 bytes or more, the truncation occurs
and I do not get a reply. You can see this clearly in the below output
which is obtained from the PPP process by using the 'set log local Async'
option.

Result of 'ping -s 204 -c 1 210.XXX.XXX.XXX'

Async: Write
Async:  7e 21 45 00 01 20 0b 88 00 00 40 33 XX ee d1 XX
Async:  XX XX d2 XX XX XX 32 04 00 00 08 de 67 5e 00 00
Async:  00 20 67 8e 44 39 6a 40 92 47 bf 62 c3 4f 0f 88
Async:  f7 1b 00 00 00 20 a1 b4 83 d6 1b b3 ac 60 27 65
Async:  3d 3a c3 af 80 5a 54 e0 1c 7d 5d 37 aa f0 ec 9c
Async:  0f 9f e5 02 0a 8b 39 05 ea 8a 7d 5e 90 ca 50 60
Async:  a5 80 a4 e2 85 f6 9a a0 47 32 19 c1 46 f7 f0 46
Async:  8f 10 3a 49 dd 4d 21 32 61 7b 35 03 ee 71 68 75
Async:  26 7a fd 18 d6 4e 1b 34 85 f9 bd 53 00 a2 8c ed
Async:  3a 6e 8e 98 96 7d 33 39 37 06 5a 7b 9a a6 32 23
Async:  ca f6 53 2c 56 f1 f3 43 02 43 2f 83 8a a1 b7 46
Async:  4d 71 db 7d 5d a8 97 db 9f aa 8c 72 10 eb 58 77
Async:  eb 4b 1f d2 a4 88 f9 77 e5 7a 3b 95 00 70 f2 7d
Async:  5d ee 79 69 14 eb 78 ff ae 4f c4 b4 d7 b8 6e 65
Async:  0d a6 0c 4a 1e 2b b4 b3 56 76 b1 28 82 de 6d c5
Async:  7f 1f 3c 43 58 58 e3 6b 90 c0 e2 6e 86 6b 61 b6
Async:  7a 93 8b d6 81 ff 60 fc 23 2a a0 c1 74 b2 a7 21
Async:  fd c8 50 c0 4a 47 9f 2c cc 41 f0 95 a2 90 ca 7c
Async:  98 51 70 c7 e4 19 7a 43 9e 7e
Async: Read
Async:  7e 21 45 00 01 20 c1 57 00 00 3b 33 XX 1e d2 XX
Async:  XX XX d1 XX XX XX 32 04 00 00 09 4b e8 da 00 00
Async:  00 16 7d 5d 5c 3d 9f f4 1a 23 28 73 53 f6 55 06
[rest of reply snipped]

As you can see, the entire packet went out and I got a reply. The IP
header indicates that the proper packet length is 0x120 (288) bytes, and
298 were sent (the rest being PPP overhead).
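For anyone who wants to redo the byte accounting on their own logs, here
is a rough Python sketch, not anything taken from ppp itself. It assumes
the frame layout visible in these dumps (0x7e flag, one-byte protocol
field 0x21, the IP packet, a 2-byte FCS, closing 0x7e flag, with 0x7d as
the escape octet), and it substitutes 0x00 for the octets I have replaced
with XX. The names parse_dump, ip_packet and account are made up for this
example. It is applied to the outgoing frame above and to the short frame
from the 205-byte case.

```python
# Sketch only: frame layout is assumed from the dumps in this message
# (0x7e flag, one-byte protocol 0x21, IP packet, 2-byte FCS, 0x7e flag).

FULL = """
7e 21 45 00 01 20 0b 88 00 00 40 33 XX ee d1 XX
XX XX d2 XX XX XX 32 04 00 00 08 de 67 5e 00 00
00 20 67 8e 44 39 6a 40 92 47 bf 62 c3 4f 0f 88
f7 1b 00 00 00 20 a1 b4 83 d6 1b b3 ac 60 27 65
3d 3a c3 af 80 5a 54 e0 1c 7d 5d 37 aa f0 ec 9c
0f 9f e5 02 0a 8b 39 05 ea 8a 7d 5e 90 ca 50 60
a5 80 a4 e2 85 f6 9a a0 47 32 19 c1 46 f7 f0 46
8f 10 3a 49 dd 4d 21 32 61 7b 35 03 ee 71 68 75
26 7a fd 18 d6 4e 1b 34 85 f9 bd 53 00 a2 8c ed
3a 6e 8e 98 96 7d 33 39 37 06 5a 7b 9a a6 32 23
ca f6 53 2c 56 f1 f3 43 02 43 2f 83 8a a1 b7 46
4d 71 db 7d 5d a8 97 db 9f aa 8c 72 10 eb 58 77
eb 4b 1f d2 a4 88 f9 77 e5 7a 3b 95 00 70 f2 7d
5d ee 79 69 14 eb 78 ff ae 4f c4 b4 d7 b8 6e 65
0d a6 0c 4a 1e 2b b4 b3 56 76 b1 28 82 de 6d c5
7f 1f 3c 43 58 58 e3 6b 90 c0 e2 6e 86 6b 61 b6
7a 93 8b d6 81 ff 60 fc 23 2a a0 c1 74 b2 a7 21
fd c8 50 c0 4a 47 9f 2c cc 41 f0 95 a2 90 ca 7c
98 51 70 c7 e4 19 7a 43 9e 7e
"""

TRUNCATED = """
7e 21 45 00 01 20 0b d9 00 00 40 33 XX 9d d1 XX
XX XX d2 XX XX XX 32 04 00 00 08 de 67 5e 00 00
00 21 62 8e 15 84 58 c8 4f 64 8e f4 d2 b2 0f 88
f7 1b 00 00 00 21 60 2c ea 2a a2 68 07 74 01 23
7e
"""


def parse_dump(text):
    """Hex text -> bytes. Redacted 'XX' octets become 0x00 (placeholder)."""
    return bytes(0 if tok == "XX" else int(tok, 16) for tok in text.split())


def ip_packet(frame):
    """Strip the flags, undo 0x7d escaping, drop protocol byte and FCS."""
    if frame[0] != 0x7E or frame[-1] != 0x7E:
        raise ValueError("frame is not delimited by 0x7e flags")
    out = bytearray()
    octets = iter(frame[1:-1])
    for b in octets:
        if b == 0x7D:                 # escape: next octet is XORed with 0x20
            b = next(octets) ^ 0x20
        out.append(b)
    if out[0] != 0x21:                # assumed: compressed protocol field, IP
        raise ValueError("expected protocol octet 0x21")
    return bytes(out[1:-2])           # assumed: last two octets are the FCS


def account(frame):
    """Return (octets on the wire, IP total-length field, IP octets present)."""
    ip = ip_packet(frame)
    declared = int.from_bytes(ip[2:4], "big")   # IP header bytes 2-3
    return len(frame), declared, len(ip)


for name, dump in (("204-byte ping", FULL), ("205-byte ping", TRUNCATED)):
    wire, declared, present = account(parse_dump(dump))
    print(f"{name}: {wire} on the wire, IP header says {declared}, "
          f"{present} present")
```

By this accounting the first frame carries all 288 octets the IP header
promises, while the short frame carries only 60 of them. The FCS is not
verified here, since the XX octets make that impossible.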
Result of 'ping -s 205 -c 1 210.XXX.XXX.XXX'

Async: Write
Async:  7e 21 45 00 01 20 0b d9 00 00 40 33 XX 9d d1 XX
Async:  XX XX d2 XX XX XX 32 04 00 00 08 de 67 5e 00 00
Async:  00 21 62 8e 15 84 58 c8 4f 64 8e f4 d2 b2 0f 88
Async:  f7 1b 00 00 00 21 60 2c ea 2a a2 68 07 74 01 23
Async:  7e
[This is all that there was]

By adding one byte to the size of the output packet, the IPSEC transport
session now fails, with the above packet being truncated. I cannot get a
successful transmission of any larger packet, either. As you can see from
the above IP header, 0x0120 bytes should have been in the packet
(identical to the previous successful example due to padding within the
IPSEC code) but only 65 were sent. Both tcpdump on tun0 and
'netstat -bI tun0' indicate that as far as the kernel is concerned, the
full packet went out, even though it did not.

FWIW here's the more verbose debug output for the second instance above -

tun0: TCP/IP: OUT AH: 209.XXX.XXX.XXX ---> 210.XXX.XXX.XXX, spi 0xbfbff170
tun0: Debug: m_enqueue: len = 1
tun0: Debug: m_dequeue: queue len = 1
tun0: Debug: proto_LayerPush: Using 0x0021
tun0: HDLC: hdlc_Output
tun0: HDLC:  21 45 00 01 20 1c 8b 00 00 40 33 XX eb d1 XX XX
tun0: HDLC:  XX d2 XX XX XX 32 04 00 00 09 c5 50 46 00 00 00
tun0: HDLC:  07 30 c8 2d d4 51 83 bb 6e ee e3 d5 f3 07 0c 2b
tun0: HDLC:  1f 00 00 00 07 b3 7f 42 e4 74 cd db c9 98 ca
tun0: Async: Write
tun0: Async:  7e 21 45 00 01 20 1c 8b 00 00 40 33 XX eb d1 XX
tun0: Async:  XX XX d2 08 a2 04 32 04 00 00 09 c5 50 46 00 00
tun0: Async:  00 07 30 c8 2d d4 51 83 bb 6e ee e3 d5 f3 07 0c
tun0: Async:  2b 1f 00 00 00 07 b3 7f 42 e4 74 cd db c9 98 ca
tun0: Async:  7e
tun0: Debug: link_PushPacket: Transmit proto 0x0021
tun0: Debug: m_enqueue: len = 1
tun0: Debug: m_dequeue: queue len = 1
tun0: Debug: link_Dequeue: Dequeued from queue 0, containing 0 more packets
tun0: Physical: write
tun0: Physical:  7e 21 45 00 01 20 1c 8b 00 00 40 33 XX eb d1 XX
tun0: Physical:  XX XX d2 XX XX XX 32 04 00 00 09 c5 50 46 00 00
tun0: Physical:  00 07 30 c8 2d d4 51 83 bb 6e ee e3 d5 f3 07 0c
tun0: Physical:  2b 1f 00 00 00 07 b3 7f 42 e4 74 cd db c9 98 ca
tun0: Physical:  7e
tun0: Debug: deflink: DescriptorWrite: wrote 65(65) to 2

This shows that as far as the PPP process is concerned, there were only
65 bytes to write (including PPP overhead), despite the kernel thinking
otherwise.

I am running the most recent PPP and a 4.3-STABLE kernel that was cvsupped
on the 17th of July. A kernel built today shows basically identical
behaviour. Several of the machines on the VPN do not use modems and are
unaffected by the problem.

Can anyone confirm my findings or offer suggestions?

----------------------------

Topology (IPs are illustrative) -

 o Local LAN has machine 'A' at 10.0.58.2/24
 o A's default gateway is the FBSD box 'B' at 10.0.58.1/24
 o B is dialled up using ppp to routable address 1.2.3.4
 o Central gateway 'C' is on the net at 5.6.7.8
 o C has an interface hosting a local LAN at 10.0.48.1/24

I use IPSEC AH/ESP transport mode between 'B' and 'C', and have set up a
native IPSEC tunnel (not using gif) between 10.0.58.0/24 and 10.0.48.0/24.
This has been in place and working for a good part of a year.

Since I put in the new kernel, the tunnel between A and C fails
completely, and the tunnel and transport mode between B and C is
intermittent (depending on the size of the packets). Swapping back to the
April kernel made the problem go away, so I do not expect the problem is
in the PPP process per se.

--
Chris