From: Edwin
To: Max Laier
Cc: Edwin, freebsd-hackers@freebsd.org, Giorgos Keramidas
Date: Sat, 23 Jul 2005 22:38:45 -0400
Subject: Re: help w/panic under heavy load - 5.4
Message-ID: <20050724023845.GA15209@asx01.verolan.com>
In-Reply-To: <200507240027.54127.max@love2party.net>

Max/et.al., replies to your message
in-line below...

If I understand correctly... (albeit an overly brief understanding :))

1. ethernet packet comes in - stuck into an mbuf
2. ether_demux calls ip_fastforward, passing the mbuf struct
3. mtod gives us a struct ip pointer into the mbuf data (a cast, not a
   copy)
4. ntohs is called to change ip->ip_len to host byte order.
   incidentally - the local ip_len should be set to ntohs(ip->ip_len) as
   well - it seems like neither one of those calls worked?
5. also - the call to set hlen to ip->ip_hl << 2 didn't work out well
   either - right? since hlen = -1057417216, and I think it's supposed
   to be 20 (5*4) - am I correct there as well?
6. due to ip->ip_len still being in network byte order, a little gremlin
   helps us think we have a 10240 byte packet and we need to fragment
   it...
7. in ip_fragment - ip->ip_len is still 10240 - so we assume that we need
   to make several fragments - however, the mbuf is correct (len = 40)
8. in ip_fragment - to create the 'second' fragment, we try to copy 1480
   bytes @ offset 1500 out of an mbuf that only has a valid data length
   of 40 bytes???

Are we really looking for the cause of ip->ip_len not being in the correct
byte order @ the right time then? In that case there are two possibilities
that I see:

1. either ntohs didn't work for some reason, or
2. it was already in host order, and the ntohs call flipped it back to
   network order

I don't think (1) is too realistic, so I suppose we are looking for what
flipped it in the first place?

If you feel that it's an ipfw/pfil issue - I can easily take the
IPFIREWALL* options out of the kernel and build a new one - just give me
about 15 minutes.

cheers.
/edwin

Max Laier (max@love2party.net) wrote:
> On Saturday 23 July 2005 20:41, Edwin wrote:
> > Kernel name: D1-0722 (for reference)
> >
> > mbsd05# kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.5
> > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at
> > /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent
> > than executable.
>
> Let's hope that's still correct ...
>

it is - the result of manual patch application and removal - only the
timestamps on the file are different (verified just now by diff against a
clean source tree, to make sure again).

> > 572         if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
> > (kgdb) l
> > 567         m->m_pkthdr.csum_flags |= CSUM_IP;
> > 568         /*
> > 569          * ip_fragment expects ip_len and ip_off in host byte
> > 570          * order but returns all packets in network byte order
> > 571          */
> > 572         if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
> > 573                 (~ifp->if_hwassist & CSUM_DELAY_IP))) {
> > 574                 goto drop;
> > 575         }
> > 576         KASSERT(m != NULL, ("null mbuf and no error"));
> > (kgdb) i loc
> > ip = (struct ip *) 0xc12f700e
> > m0 = (struct mbuf *) 0xc12f700e
> > ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2
> > '\002', sa_data = "\000\000\300\250\002\005\000\000\000\000\000\000\000"}}
> > dst = (struct sockaddr_in *) 0xc76bfc3c
> > ia = (struct in_ifaddr *) 0x0
> > ifa = (struct ifaddr *) 0x0
> > ifp = (struct ifnet *) 0xc0f91800
> > odest = {s_addr = 84060352}
> > dest = {s_addr = 84060352}
> > sum = 0
> > ip_len = 0
>
> This should not happen. ip_len is initialize from ntohs(ip->ip_len) and never
> touched again. Anyway, let's look some more ...

is it accurate to say that ip->ip_len is 10240 @ this point - but it should
be 40? at line 542 of ip_fastfwd.c 1.17.2.7, the ip->ip_len <= mtu test
should evaluate to true and take the send branch - but it evaluates to
false and takes the else branch (hence the ip_fragment section) - b/c
ip->ip_len is still in network order?

        if (ip->ip_len <= mtu ||
            (ifp->if_hwassist & CSUM_FRAGMENT && (ip->ip_off & IP_DF) == 0)) {
                /*
                 * Restore packet header fields to original values
                 */
                ip->ip_len = htons(ip->ip_len);
                ip->ip_off = htons(ip->ip_off);
                /*
                 * Send off the packet via outgoing interface
                 */
                error = (*ifp->if_output)(ifp, m,
                                (struct sockaddr *)dst, ro.ro_rt);
        } else {
                /*
                 * Handle EMSGSIZE with icmp reply needfrag for TCP MTU discovery
                 */
                if (ip->ip_off & IP_DF) {
                        ipstat.ips_cantfrag++;
                        icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG,
                                0, ifp);
                        goto consumed;
                } else {
                        /*
                         * We have to fragement the packet
                         */
                        m->m_pkthdr.csum_flags |= CSUM_IP;
                        /*
                         * ip_fragment expects ip_len and ip_off in host byte
                         * order but returns all packets in network byte order
                         */
                        if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
                                        (~ifp->if_hwassist & CSUM_DELAY_IP))) {
                                goto drop;
                        }
                        KASSERT(m != NULL, ("null mbuf and no error"));
                        /*

>
> > error = 84060352
> > hlen = -1057417216
> > mtu = 0
> > __func__ = "ip_fastforward"
> > (kgdb) p *ip
> > $1 = {ip_hl = 5, ip_v = 4, ip_tos = 0 '\0', ip_len = 10240, ip_id = 61249,
>
> ip_len should be 40 as ip_len is supposed to be in HOST BYTE ORDER at this
> point. Feeding 10240 to ntohs() give the correct value, so something
> obviously went wrong.
>
> Let's see how we got here:
> 355 does the byteorder flip to host byte order
> 366 pfil OUT
> 451 pfil IN
> 527 first check ip_len < if_mtu etc ...
>
> Obviously, the only thing that might mess with the byte order (unless I missed
> something along the way) is one of the pfil consumers.
>
> ***
> *** What firewall(s) are you running with?
> ***

ipfw enabled - it's a permit all (IPFIREWALL_DEFAULT_TO_ACCEPT) - output
from 'ipfw show':

fb54c# ipfw show
65535 26395 1874336 allow ip from any to any
fb54c#

here is the diff from the generic config:

mbsd05# diff /root/kernels/D1-0722 /root/kernels/GENERIC
21,22d20
< makeoptions DEBUG=-g
<
24c22
< #cpu I486_CPU
---
> cpu I486_CPU
26,27c24,25
< #cpu I686_CPU
< ident D1-0722
---
> cpu I686_CPU
> ident GENERIC
31,48d28
<
< options KDB
< options DDB
< options INVARIANTS
< options INVARIANT_SUPPORT
<
< options CPU_SOEKRIS
< options CPU_GEODE
<
< options HZ=1000
< options DEVICE_POLLING
<
< options IPFIREWALL
< options IPFIREWALL_VERBOSE
< options IPFIREWALL_VERBOSE_LIMIT
< options IPFIREWALL_DEFAULT_TO_ACCEPT
< options DUMMYNET
< options IPDIVERT
mbsd05#

>
> > ip_off = 0, ip_ttl = 63 '?', ip_p = 17 '\021', ip_sum = 31921, ip_src =
> > {s_addr = 67479744}, ip_dst = {s_addr = 84060352}} (kgdb) p *m
> > $2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc12f700e "E",
> > mh_len = 40, mh_flags = 3, mh_type = 1}, M_dat = {MH = {MH_pkthdr = {rcvif
> > = 0xc0f90000, len = 40, header = 0x0, csum_flags = 769, csum_data = 0, tags
>
> 40, there you have it - no need to fragment at all!
>
> > /usr/src/sys/netinet/ip_output.c:967
> > 967         m->m_next = m_copy(m0, off, len);
> > (kgdb) l
> > 962                 len = ip->ip_len - off;
> > 963                 m->m_flags |= M_LASTFRAG;
> > 964         } else
> > 965                 mhip->ip_off |= IP_MF;
> > 966         mhip->ip_len = htons((u_short)(len + mhlen));
> > 967         m->m_next = m_copy(m0, off, len);
> > 968         if (m->m_next == NULL) { /* copy failed */
> > 969                 m_free(m);
> > 970                 error = ENOBUFS; /* ??? */
> > 971                 ipstat.ips_odropped++;
>
> Just to make sure, we didn't touch the original packet at this point so the
> above values are still the ones we based the (wrong) decision on.
>
> --
> /"\  Best regards,                      | mlaier@freebsd.org
> \ /  Max Laier                          | ICQ #67774661
>  X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
> / \  ASCII Ribbon Campaign              | Against HTML Mail and News
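
P.S. to double-check the arithmetic in this thread (40 vs 10240, hlen = 20,
and the 1480 bytes @ offset 1500), here is a minimal userland sketch - not
kernel code. The function names (swap16, ip_header_len, fits_mtu,
frag_payload, second_frag_offset) are made up for the illustration, the
1500-byte MTU is an assumption (plain Ethernet), and the fragment
arithmetic follows my reading of ip_fragment, so treat it as a sanity
check rather than a statement of what the kernel does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Swap the two bytes of a 16-bit value. On a little-endian host such as
 * i386 this is the effect of ntohs()/htons(); it is written out by hand
 * so the result does not depend on the machine running the example.
 */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* IP header length in bytes from the 4-bit ip_hl field (32-bit words). */
static int ip_header_len(unsigned ip_hl)
{
    return (int)(ip_hl << 2);
}

/*
 * Simplified form of the size test quoted from ip_fastforward (hardware
 * fragmentation assist ignored): nonzero means "send as-is".
 */
static int fits_mtu(uint16_t ip_len_host, int mtu)
{
    return ip_len_host <= mtu;
}

/*
 * Payload bytes carried per fragment: what is left after the header,
 * rounded down to a multiple of 8 (fragment offsets are in 8-byte units).
 */
static int frag_payload(int mtu, int hlen)
{
    assert(mtu > hlen);
    return (mtu - hlen) & ~7;
}

/* Offset into the original packet where the second fragment's data starts. */
static int second_frag_offset(int mtu, int hlen)
{
    return hlen + frag_payload(mtu, hlen);
}
```

With ip_hl = 5 and an MTU of 1500, swap16(40) comes out as 10240 (0x0028
vs 0x2800), fits_mtu(40, 1500) is true while fits_mtu(10240, 1500) is
false, and second_frag_offset(1500, 20) is 1500 with a payload of 1480 -
i.e. a copy that starts far beyond the 40 valid bytes in the mbuf.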