From: Ian Lepore
To: Andre Oppermann, freebsd-arm
Subject: ARM network trouble after recent mbuf changes
Date: Mon, 26 Aug 2013 14:57:16 -0600
Message-ID: <1377550636.1111.156.camel@revolution.hippie.lan>

This new thread
pulls together info from several other threads and IRC conversations, to
summarize what we know right now for Andre in case the problem is
directly related to the mbuf changes.

It looks like ARM systems consistently get address translation faults
related to network operations during boot. Zbyszek Bodek bisected it
down to r254807; revisions before that one work, and that one and later
revisions don't. A representative dmesg appears below. The abort happens
in in_cksum(), sbappendaddr_locked(), or soreceive_generic(), depending
on various kernel config options and on which network operations happen
first.

Thomas Skibo reports:

    I've been experiencing this too on the Zedboard and I spent some
    time looking into it. In my case, arprequest() is writing past the
    end of an mbuf into the m_next field of the next one. Later,
    something tries to reference address 0x6401a8c0, which is actually
    the machine's IP address in network order.

    It looks like MH_ALIGN() used in arprequest() isn't working
    properly after the recent mbuf header changes.

    Here's the mbuf just after arprequest() has performed MH_ALIGN().
    The m_data pointer is 0xc2c41de8 and the length is 0x1c. That puts
    the data over the edge into the next mbuf. The m_pkthdr appears to
    have been placed at 0xc2c41d18 (I think). It looks like the
    compiler inserted padding at 1d14, so MHLEN isn't correct.

    XMD% mrd 0xc2c41d00 32
    C2C41D00: 00000000
    C2C41D04: 00000000
    C2C41D08: C2C41DE8 (m_data)
    C2C41D0C: 0000001C (m_len)
    C2C41D10: 00000201 (m_type,m_flags)
    C2C41D14: 00000000 (?)
    C2C41D18: 00000000 (pkthdr.rcvif)
    C2C41D1C: 00000000 (pkthdr.tags)
    C2C41D20: 0000001C (pkthdr.len)
    C2C41D24: 00000000
    C2C41D28: 00000000
    C2C41D2C: 00000000

Thomas also reports that removing the bitfield definitions, so that
flags and type are two separate integers, works around the problem.
Could this be related to how bitfields are handled in EABI?

A sample dmesg with the fault...

KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #1 r254935: Mon Aug 26 14:32:21 MDT 2013
ilepore@revolution.hippie.lan:/local/build/staging/freebsd/bb/obj/arm.armv6/local/build/staging/freebsd/bb/src/sys/BB arm
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
CPU: Cortex A8-r3 rev 2 (Cortex-A core)
Supported features: ARM_ISA THUMB2 JAZELLE THUMBEE ARMv4 Security_Ext
WB disabled EABT branch prediction enabled
LoUU:2 LoC:2 LoUIS:1
Cache level 1:
32KB/64B 4-way data cache WT WB Read-Alloc
32KB/64B 4-way instruction cache Read-Alloc
Cache level 2:
256KB/64B 8-way unified cache WT WB Read-Alloc Write-Alloc
real memory = 268435456 (256 MB)
avail memory = 256483328 (244 MB)
Texas Instruments AM3358 Processor, Revision ES1.0
random device not loaded; using insecure entropy
random: initialized
simplebus0: on fdtbus0
aintc0: mem 0x48200000-0x48200fff on simplebus0
aintc0: Revision 5.0
ti_scm0: mem 0x44e10000-0x44e11fff on simplebus0
am335x_prcm0: mem 0x44e00000-0x44e012ff on simplebus0
am335x_prcm0: Clocks: System 24.0 MHz, CPU 720 MHz
am335x_dmtimer0: mem 0x44e05000-0x44e05fff,0x44e31000-0x44e31fff,0x48040000-0x48040fff,0x48042000-0x48042fff,0x48044000-0x48044fff,0x48046000-0x48046fff,0x48048000-0x48048fff,0x4804a000-0x4804afff irq 66,67,68,69,92,93,94,95 on simplebus0
Timecounter "AM335x Timecounter" frequency 24000000 Hz quality 1000
Event timer "AM335x Eventtimer0" frequency 24000000 Hz quality 1000
gpio0: mem 0x44e07000-0x44e07fff,0x4804c000-0x4804cfff,0x481ac000-0x481acfff,0x481ae000-0x481aefff irq 96,97,98,99,32,33,62,63 on simplebus0
gpioc0: on gpio0
gpiobus0: on gpio0
uart0: mem 0x44e09000-0x44e09fff irq 72 on simplebus0
uart0: console (115384,n,8,1)
ti_edma30: mem 0x49000000-0x490fffff,0x49800000-0x498fffff,0x49900000-0x499fffff,0x49a00000-0x49afffff irq 12,13,14 on simplebus0
ti_edma30: EDMA revision 40014c00
sdhci_ti0: mem 0x48060000-0x48060fff irq 64 on simplebus0
mmc0: on sdhci_ti0
cpsw0: <3-port Switch Ethernet Subsystem> mem 0x4a100000-0x4a103fff irq 40,41,42,43 on simplebus0
cpsw0: CPSW SS Version 1.12 (0)
cpsw0: Initial queue size TX=128 RX=384
cpsw0: Ethernet address: 00:18:31:8e:c0:96
miibus0: on cpsw0
smscphy0: PHY 0 on miibus0
smscphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
iichb0: mem 0x44e0b000-0x44e0bfff irq 70 on simplebus0
iichb0: I2C revision 4.0
iicbus0: on iichb0
iic0: on iicbus0
am335x_pmic0: at addr 0x24 on iicbus0
am335x_pwm0: mem 0x48300000-0x483000ff,0x48300100-0x4830017f,0x48300180-0x483001ff,0x48300200-0x4830025f irq 86,58 on simplebus0
am335x_pwm1: mem 0x48302000-0x483020ff,0x48302100-0x4830217f,0x48302180-0x483021ff,0x48302200-0x4830225f irq 87,59 on simplebus0
am335x_pwm2: mem 0x48304000-0x483040ff,0x48304100-0x4830417f,0x48304180-0x483041ff,0x48304200-0x4830425f irq 88,60 on simplebus0
Timecounters tick every 10.000 msec
mmcsd0: 8GB at mmc0 48.0MHz/4bit/65535-block
am335x_pmic0: TPS65217B ver 1.1 powered by USB and AC
Sending DHCP Discover packet from interface cpsw0 (00:18:31:8e:c0:96)
cpsw0: link state changed to DOWN
cpsw0: link state changed to UP
Received DHCP Offer packet on cpsw0 from 172.22.42.240 (accepted)
Sending DHCP Request packet from interface cpsw0 (00:18:31:8e:c0:96)
Received DHCP Ack packet on cpsw0 from 172.22.42.240 (accepted) (got root path)
cpsw0 at 172.22.42.234 server 172.22.42.240 boot file /bb/boot/kernel/kernel
subnet mask 255.255.255.0 router 172.22.42.254 rootfs 172.22.42.240:/bb
Adjusted interface cpsw0
vm_fault(0xc05b0820, 57405000, 1, 0) -> 1
Fatal kernel mode data abort: 'Translation Fault (S)'
trapframe: 0xc05c1ae8
FSR=00000005, FAR=5740540c, spsr=20000093
r0 =c188d418, r1 =00000000, r2 =00000000, r3 =00000010
r4 =f02a16ac, r5 =ea2a16ac, r6 =c188d3e8, r7 =00000000
r8 =425443df, r9 =00000000, r10=00000014, r11=c188d40c
r12=57405400, ssp=c05c1b3c, slr=c049d748, pc =c049d720

[ thread pid 0 tid 100000 ]
Stopped at in_cksum+0x14: ldr r1, [r12, #0x00c]

-- Ian