Date:      Sat, 10 Jul 2004 12:50:17 +0200
From:      Daniel Lang <dl@leo.org>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        current@freebsd.org
Subject:   Re: panic: m_copym, length > size of mbuf chain
Message-ID:  <20040710105017.GA61243@atrbg11.informatik.tu-muenchen.de>
In-Reply-To: <Pine.NEB.3.96L.1040707122259.37929D-100000@fledge.watson.org>
References:  <20040707162154.GB45200@atrbg11.informatik.tu-muenchen.de> <Pine.NEB.3.96L.1040707122259.37929D-100000@fledge.watson.org>

Hi Robert,

Robert Watson wrote on Wed, Jul 07, 2004 at 12:24:59PM -0400:
[..]
> Just to try ruling out possibilities -- have you run an extensive set of
> hardware diagnostics?  Most server class hardware ships with a decent
> diagnostics disk, and I'm sure we can find some for you in the event your
> hardware didn't come with some.  While it's quite possibly a software
> problem, tracking hardware problems using software symptoms constitutes
> undesirable pain and so it wouldn't hurt to give that a spin.  I remember
> seeing your earlier e-mails about running with WITNESS increasing the
> chances of pain -- this could be a bug in WITNESS as you suggest, or it
> could be that WITNESS increases the opportunities for a variety of locking
> related races by increasing the cost of lock/unlock operations.
[..]

So I come back to the issue. As I already wrote, I think I can
rule out hardware problems now. I ran a very thorough test with
the Dell diagnostic utilities, which showed no problems.

Also, since John's patch I have not seen any WITNESS-related
problems again (so far). But I did get the m_copym panic again
(see subject). This time I filed a PR and did some more detailed
gdb analysis. It is all documented at:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/68889

I am puzzled: the stack frame on entering m_copym() has 0x0 as
its first argument (m), but in the previous frame, where m_copy()
is called, the struct mbuf * argument is valid.

Ok, I just realized that m_copy() and m_copym() are apparently
two different names. Is this a macro/#define discrepancy? It
seems that m_copym() is the function which is actually called
in this line of code.

Ah, I found it:

sys/mbuf.h:#define      m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT)

So the puzzle remains, since the arguments are passed through
unchanged, except that the M_DONTWAIT flag is added.
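
To illustrate what I mean, here is a minimal userland sketch (not
kernel code) of how the macro forwards its arguments. Only the
#define line is taken from sys/mbuf.h; the struct mbuf stand-in, the
stub m_copym(), the placeholder value of M_DONTWAIT, the call_record
bookkeeping and the helper check_macro_forwarding() are all
assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value only; the real M_DONTWAIT comes from the kernel headers. */
#define M_DONTWAIT 1

/* Stand-in for the kernel's struct mbuf, just so the sketch compiles. */
struct mbuf {
	int m_len;
};

/* Hypothetical bookkeeping: what the stub was actually called with. */
struct call_record {
	struct mbuf *m;
	int off;
	int len;
	int how;
};
static struct call_record last_call;

/* Stub, NOT the kernel function: it only records its arguments. */
static struct mbuf *
m_copym(struct mbuf *m, int off, int len, int how)
{
	last_call.m = m;
	last_call.off = off;
	last_call.len = len;
	last_call.how = how;
	return (m);
}

/* The macro as quoted from sys/mbuf.h. */
#define m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT)

/*
 * Hypothetical helper: call through the macro and verify that the
 * callee saw the same three arguments plus the added flag.
 */
static struct mbuf *
check_macro_forwarding(struct mbuf *m, int off, int len)
{
	(void)m_copy(m, off, len);
	assert(last_call.m == m);
	assert(last_call.off == off);
	assert(last_call.len == len);
	assert(last_call.how == M_DONTWAIT);
	return (last_call.m);
}
```

If the macro behaves like this in the kernel as well, a valid mbuf
pointer in the caller's frame should still be valid on entry to
m_copym(), which is why the 0x0 in the backtrace looks wrong to me.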

Is this a trashed stack?

Cheers,
 Daniel
-- 
IRCnet: Mr-Spock     - Cool people don't move, they just hang around. -
Daniel Lang * dl@leo.org * ++49 89 289 18532  * http://www.leo.org/~dl/
