Date:      Mon, 26 Aug 2013 15:41:40 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        Andre Oppermann <andre@FreeBSD.org>
Cc:        freebsd-arm <freebsd-arm@FreeBSD.org>
Subject:   Re: ARM network trouble after recent mbuf changes
Message-ID:  <1377553300.1111.157.camel@revolution.hippie.lan>
In-Reply-To: <521BC472.7040804@freebsd.org>
References:  <1377550636.1111.156.camel@revolution.hippie.lan> <521BC472.7040804@freebsd.org>

On Mon, 2013-08-26 at 23:11 +0200, Andre Oppermann wrote:
> On 26.08.2013 22:57, Ian Lepore wrote:
> > This new thread pulls together info from several other threads and irc
> > conversations, to summarize what we know right now for Andre in case the
> > problem is directly related to the mbuf changes.
> >
> > It looks like ARM systems consistently get address translation faults
> > related to network operations during boot.  Zbyszek Bodek bisected it
> > down to r254807; revisions before that work, beginning with that one
> > they don't.  A representative dmesg appears below.  The abort happens in
> > in_cksum(), or sbappendaddr_locked(), or soreceive_generic(), depending
> > on various kernel config options and what network operations happen
> > first.
> >
> > Thomas Skibo reports:
> >
> > I've been experiencing this too on the Zedboard and I spent some time
> > looking into it.
> >
> > In my case, arprequest() is overwriting past the end of an mbuf into the
> > m_next field of the next one.  Later, something tries to reference
> > address 0x6401a8c0 which is actually the machine's IP address in network
> > order.  It looks like MH_ALIGN() used in arprequest() isn't working
> > properly after the recent mbuf header changes.
> >
> > Here's the mbuf just after arprequest() has performed MH_ALIGN().  The
> > m_data pointer is 0xc2c41de8 and the length is 0x1c.  That puts the data
> > over the edge into the next mbuf.  The m_pkthdr appears to have been
> > placed at 0xc2c41d18 (I think).  It looks like the compiler inserted
> > padding at 1d14 so MHLEN isn't correct.
> >
> > XMD% mrd 0xc2c41d00 32
> > C2C41D00:   00000000
> > C2C41D04:   00000000
> > C2C41D08:   C2C41DE8 (m_data)
> > C2C41D0C:   0000001C (m_len)
> > C2C41D10:   00000201 (m_type,m_flags)
> > C2C41D14:   00000000  (?)
> > C2C41D18:   00000000 (pkthdr.rcvif)
> > C2C41D1C:   00000000 (pkthdr.tags)
> > C2C41D20:   0000001C (pkthdr.len)
> > C2C41D24:   00000000
> > C2C41D28:   00000000
> > C2C41D2C:   00000000
> >
> > Thomas also reports that removing the bitfield definitions, so that
> > flags and type are two separate integers, works around the problem.
> >
> > Could this be something related to how bitfields are handled in EABI?
> 
> Can you try this patch and check whether it makes a difference on the bitfield?
> 

Nope, that made no difference for me, same abort in the same place.
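
For anyone following along, the padding theory Thomas describes above can be
checked off-target. What follows is only a sketch under stated assumptions,
not the real struct mbuf: every toy_* name is hypothetical, 32-bit pointers
are modeled as uint32_t, and _Alignas(8) stands in for whatever member forces
8-byte alignment of the packet header on the real ABI. The alignment
arithmetic is modeled on MH_ALIGN(), and the "offsetof" length is just one
way to account for the padding, not a description of any committed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MSIZE 256u   /* assumed mbuf size, matching the 0x100 spacing in the dump */

/* Toy stand-in for the mbuf header: five 32-bit words, 20 bytes. */
struct toy_hdr {
    uint32_t next;       /* models m_next (32-bit pointer) */
    uint32_t nextpkt;    /* models m_nextpkt */
    uint32_t data;       /* models m_data */
    int32_t  len;        /* models m_len */
    uint32_t type:8,     /* bitfields sharing one 32-bit word */
             flags:24;
};

/* Toy stand-in for the packet header; one member demands 8-byte alignment. */
struct toy_pkthdr {
    uint32_t rcvif;
    uint32_t tags;
    int32_t  len;
    _Alignas(8) uint64_t wide;   /* assumption: forces 8-byte struct alignment */
};

struct toy_mbuf {
    struct toy_hdr    hdr;       /* offsets 0x00..0x13 */
    struct toy_pkthdr pkthdr;    /* lands at 0x18, leaving a hole at 0x14 */
    unsigned char     dat[];     /* start of the data area */
};

static_assert(sizeof(struct toy_hdr) == 20, "toy header should be 20 bytes");

/* Length computed by summing sizeof(): ignores the alignment hole. */
#define TOY_MHLEN_NAIVE \
    (TOY_MSIZE - sizeof(struct toy_hdr) - sizeof(struct toy_pkthdr))

/* Length computed from the real start of the data area. */
#define TOY_MHLEN_OFFSETOF \
    (TOY_MSIZE - offsetof(struct toy_mbuf, dat))

/*
 * Offset, from the start of the toy mbuf, of the byte just past the data
 * after an MH_ALIGN-style adjustment (data pushed to the end of the buffer,
 * rounded down to a 4-byte boundary).
 */
size_t toy_mh_align_end(size_t mhlen, size_t len)
{
    size_t start = offsetof(struct toy_mbuf, dat) +
        ((mhlen - len) & ~(sizeof(uint32_t) - 1));
    return start + len;
}
```

With these toy numbers and a 0x1c-byte payload, the naive length puts the
data at offset 0xe8 and its end at 260, four bytes past the 256-byte buffer.
The offsetof-based length ends exactly at 256. That is consistent with the
m_data value in the dump, but it does not prove the real layout matches.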

-- Ian




