Date:      Fri, 28 Oct 2011 07:56:23 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        Kostik Belousov <kostikbel@gmail.com>, Lawrence Stewart <lstewart@freebsd.org>, freebsd-current@freebsd.org, Andre Oppermann <andre@freebsd.org>, freebsd-net@freebsd.org
Subject:   Re: 9.0-RC1 panic in tcp_input: negative winow.
Message-ID:  <201110280756.23317.jhb@freebsd.org>
In-Reply-To: <20111028054605.GF1667@garage.freebsd.pl>
References:  <20111022084931.GD1697@garage.freebsd.pl> <4EA9F76E.9010008@freebsd.org> <20111028054605.GF1667@garage.freebsd.pl>

On Friday, October 28, 2011 1:46:07 am Pawel Jakub Dawidek wrote:
> On Fri, Oct 28, 2011 at 11:29:34AM +1100, Lawrence Stewart wrote:
> > On 10/26/11 22:53, John Baldwin wrote:
> > > The assertion would be triggered when the next packet arrives (as I said
> > > above).  Try modifying your debugging output to also log if the ACK is
> > > delayed.  I suspect it is not delayed until the last one.  (Pushing out an
> > > ACK will reset rcv_adv to be beyond rcv_nxt in tcp_output(), so in the case
> > > of an immediate ACK, rcv_nxt > rcv_adv is only a transient condition all
> > > under a single lock invocation so never visible to other consumers of the
> > > protocol control block.)  If that is what you see, then that confirms what
> > > I guessed above and I will likely just remove the assertion in tcp_input()
> > > and patch the timewait code to handle this case.
> > >
> > 
> > Pawel, have you been able to confirm John's hypothesis? [...]
> 
> Yeah, sorry. I moved the debug to the points where we drop the t_inpcb
> lock and I still see rcv_nxt being greater than rcv_adv:
> 
> 	tcp_do_segment:2970 negative window: tp 0xfffffe00685ee3d0 rcv_nxt 1312878324 rcv_adv 1312878187

Yes, I still expect this.  What I want to see is if 'delack' is always true
in this case.

> This is just before the INP_WUNLOCK(tp->t_inpcb) under 'check_delack'
> label. I see this a lot (it was logged 545 times for 11 different tp
> pointers during 24h period).
> 
> 	tcp_do_segment:3009 negative window: tp 0xfffffe005cfc6000 rcv_nxt 1442546453 rcv_adv 1442545722
> 
> This is just before calling tcp_output(). This one was logged 65 times
> for 3 different tp pointers.
> I placed a debug also after tcp_output() call, but it is not logged, so
> once we return from tcp_output() everything is fine.

That is consistent with what I expect then, since in the delack case,
tcp_output() isn't called.
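
To make the distinction concrete, here is a toy model of the two paths.  It
is only a sketch, not the kernel code: struct toy_tcb, toy_send_ack() and
toy_receive() are hypothetical names, and the real tcp_output() applies
more conditions before it advances rcv_adv.  The comparison macro is written
in the style of SEQ_GT() so that it stays correct across sequence-number
wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is after b", in the style of the SEQ_GT() macro. */
#define SEQ_GT(a, b)	((int32_t)((a) - (b)) > 0)

/* Hypothetical stand-in for the two tcpcb fields discussed here. */
struct toy_tcb {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_adv;	/* right edge of the last advertised window */
};

/*
 * Hypothetical stand-in for the ACK path of tcp_output(): advertising
 * a window moves rcv_adv to rcv_nxt plus that window.  (Assumption: the
 * real code has further checks that are left out of this sketch.)
 */
static void
toy_send_ack(struct toy_tcb *tp, uint32_t window)
{
	tp->rcv_adv = tp->rcv_nxt + window;
}

/*
 * Hypothetical stand-in for segment processing: accept len bytes, then
 * either ACK immediately or leave the ACK for the delayed-ACK timer.
 * Returns true if rcv_nxt is still beyond rcv_adv on the way out, i.e.
 * at the point where the pcb lock would be dropped.
 */
static bool
toy_receive(struct toy_tcb *tp, uint32_t len, uint32_t window, bool delack)
{
	tp->rcv_nxt += len;
	if (!delack)
		toy_send_ack(tp, window);	/* transient state repaired here */
	return (SEQ_GT(tp->rcv_nxt, tp->rcv_adv));
}
```

In this model, starting from rcv_nxt 1000 and rcv_adv 1100 and accepting
200 bytes, toy_receive() returns true when delack is set (rcv_adv stays at
1100 while rcv_nxt moves to 1200) and false when the ACK goes out
immediately.  That matches the pattern in the logs: the condition shows up
at check_delack and before tcp_output(), but not after tcp_output() returns.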

-- 
John Baldwin
