Date:      Sun, 11 Jul 2004 12:05:53 +0200
From:      Daniel Lang <dl@leo.org>
To:        Don Lewis <truckman@FreeBSD.org>
Cc:        current@FreeBSD.org
Subject:   Re: panic: m_copym, length > size of mbuf chain
Message-ID:  <20040711100553.GA64553@atrbg11.informatik.tu-muenchen.de>
In-Reply-To: <200407102324.i6ANOlEs015698@gw.catspoiler.org>
References:  <Pine.NEB.3.96L.1040710092144.19581J-100000@fledge.watson.org> <200407102324.i6ANOlEs015698@gw.catspoiler.org>

Hi Don,

Referring to your first answer: the 'len' parameter in the
tcp_output.c frame is 1460 and the offset is 737. Their sum, 2197,
is clearly greater than 975, the value of so->so_snd.sb_cc.
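
To make the arithmetic explicit, here is a minimal sketch of the bounds check, assuming the requirement is simply that the requested copy range must lie within the bytes held in the send buffer. The helper name copy_fits is hypothetical and is not a kernel function.

```c
/*
 * Hypothetical helper, not part of the kernel: returns 1 if copying
 * len bytes starting at offset off stays within the sb_cc bytes that
 * are currently in the send buffer, and 0 otherwise.
 */
static int
copy_fits(long off, long len, long sb_cc)
{
	return (off >= 0 && len >= 0 && off + len <= sb_cc);
}
```

With the values from the panic frame, copy_fits(737, 1460, 975) returns 0, since 737 + 1460 = 2197 exceeds 975.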

So the assertion Robert suggested would have been
triggered. I am now considering adding SOCKBUF_DEBUG.
Meanwhile, with SACK disabled, the machine is still
up and running.

Don Lewis wrote on Sat, Jul 10, 2004 at 04:24:47PM -0700:
[..]
> > (2) Try adding some assertions just before the copy to m_copy() in
> >     tcp_output().  I'd suggest something like the following:
> 
> I'm very suspicious of the SACK code.  In the non-SACK case, len gets
> set here:
> 
> 	if (!sack_rxmit)
> 		len = ((long)ulmin(so->so_snd.sb_cc, sendwin) - off);
> 
> but when the system panics len+off > sb_cc.
Yes.

> It would be interesting to look at *tp and *p in the tcp_output stack
> frame.
> 
> If I had to guess, I'd say that either tp->snd_recover-tp->snd_una or
> p->end-tp->snd_una is greater than so->so_snd.sb_cc.

(kgdb) p *tp
$4 = {t_segq = {lh_first = 0x0}, t_segqlen = 0, t_dupacks = 16, unused = 0x0,
  tt_rexmt = 0xc3f50148, tt_persist = 0xc3f50160, tt_keep = 0xc3f50178,
  tt_2msl = 0xc3f50190, tt_delack = 0xc3f501a8, t_inpcb = 0xc4d592d0,
  t_state = 5, t_flags = 1049092, t_force = 0, snd_una = 2644477935,
  snd_max = 2644478910, snd_nxt = 2644478910, snd_up = 2644477935,
  snd_wl1 = 465530853, snd_wl2 = 2644477935, iss = 2644477934,
  irs = 465530852, rcv_nxt = 465530854, rcv_adv = 465596389, rcv_wnd = 65700,
  rcv_up = 465530854, snd_wnd = 17520, snd_cwnd = 26280,
  snd_bwnd = 1073725440, snd_ssthresh = 2920, snd_bandwidth = 3498991,
  snd_recover = 2644478412, t_maxopd = 1460, t_rcvtime = 333934,
  t_starttime = 330457, t_rtttime = 333904, t_rtseq = 2644478412,
  t_bw_rtttime = 330457, t_bw_rtseq = 0, t_rxtcur = 145, t_maxseg = 1460,
  t_srtt = 717, t_rttvar = 72, t_rxtshift = 0, t_rttmin = 3, t_rttbest = 749,
  t_rttupdated = 0, max_sndwnd = 17520, t_softerror = 0, t_oobflags = 0 '\0',
  t_iobc = 0 '\0', snd_scale = 0 '\0', rcv_scale = 0 '\0',
  request_r_scale = 0 '\0', requested_s_scale = 0 '\0', ts_recent = 0,
  ts_recent_age = 0, last_ack_sent = 465530854, cc_send = 0, cc_recv = 0,
  snd_cwnd_prev = 0, snd_ssthresh_prev = 0, snd_recover_prev = 0,
  t_badrxtwin = 0, snd_limited = 0 '\0', rcv_second = 0, rcv_pps = 0,
  rcv_byps = 0, sack_enable = 1, snd_numholes = 4, snd_holes = 0xc4280be0,
  rcv_laststart = 465530854, rcv_lastend = 465530854,
  rcv_lastsack = 2644478693, rcv_numsacks = 0, sackblks = {{start = 0,
      end = 0}, {start = 0, end = 0}, {start = 0, end = 0}, {start = 0,
      end = 0}, {start = 0, end = 0}, {start = 0, end = 0}}}

So snd_recover - snd_una = 2644478412 - 2644477935 = 477,
which is less than so->so_snd.sb_cc = 975.

(kgdb) p *p
$6 = {start = 2644478672, end = 2644478686, rxmit = 2644478672, next = 0x0}

p->end - snd_una = 2644478686 - 2644477935 = 751, again less than 975.
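
The two differences above can be reproduced with a small sketch that does the subtraction modulo 2^32, which is how TCP sequence-number distances are normally taken so that they stay correct across wraparound. The helper seq_delta and the constant names are hypothetical; the values are copied from the kgdb output.

```c
#include <stdint.h>

/*
 * Hypothetical helper: distance from 'earlier' to 'later' in TCP
 * sequence space, computed modulo 2^32.
 */
static uint32_t
seq_delta(uint32_t later, uint32_t earlier)
{
	return ((uint32_t)(later - earlier));
}

/* Values copied from the kgdb output of *tp and *p. */
static const uint32_t dump_snd_una     = 2644477935u;
static const uint32_t dump_snd_recover = 2644478412u;
static const uint32_t dump_hole_end    = 2644478686u;	/* p->end */
static const uint32_t dump_hole_rxmit  = 2644478672u;	/* p->rxmit */
```

seq_delta(dump_snd_recover, dump_snd_una) gives 477 and seq_delta(dump_hole_end, dump_snd_una) gives 751, both below 975. As a side observation, seq_delta(dump_hole_rxmit, dump_snd_una) gives 737, which happens to equal the offset seen in the tcp_output frame.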

Hmmm, while inspecting tcp_output.c for occurrences
of 'len', I stumbled across this code:

[..]
        /*
         * NOTE! on localhost connections an 'ack' from the remote
         * end may occur synchronously with the output and cause
         * us to flush a buffer queued with moretocome.  XXX
         *
         * note: the len + off check is almost certainly unnecessary.
         */
        if (!(tp->t_flags & TF_MORETOCOME) &&   /* normal case */
            (idle || (tp->t_flags & TF_NODELAY)) &&
            len + off >= so->so_snd.sb_cc &&
            (tp->t_flags & TF_NOPUSH) == 0) {
            goto send;
[..]

So there actually is a check here, but it does not seem to be
a big problem, since the length is adjusted later if

len + optlen + ipoptlen > tp->t_maxopd

And as confirmed above, len is 1460 throughout this frame.
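
A rough sketch of what that later adjustment amounts to, assuming it just caps len so that payload plus options fit into t_maxopd. The helper trim_to_maxopd is hypothetical and only mirrors the condition quoted above; it is not the kernel code.

```c
/*
 * Hypothetical helper mirroring the quoted condition: if payload plus
 * TCP and IP options would exceed t_maxopd, shrink the payload so the
 * total fits. Nothing here looks at how much data is in the send buffer.
 */
static long
trim_to_maxopd(long len, long optlen, long ipoptlen, long t_maxopd)
{
	if (len + optlen + ipoptlen > t_maxopd)
		len = t_maxopd - optlen - ipoptlen;
	return (len);
}
```

With t_maxopd = 1460 from the dump, and assuming optlen = ipoptlen = 0 (I have not checked these in the frame), trim_to_maxopd(1460, 0, 0, 1460) leaves len at 1460, so this adjustment alone would not bring len + off back under sb_cc.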

Best regards,
 Daniel
-- 
IRCnet: Mr-Spock  - My name is Pentium of Borg, division is futile, you
                                                will be approximated. - 
 Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/


