From owner-freebsd-net@FreeBSD.ORG Fri Mar 29 12:10:02 2013
Date: Fri, 29 Mar 2013 12:10:02 GMT
Message-Id: <201303291210.r2TCA2d1038533@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Gleb Smirnoff
Subject: Re: misc/177456: An error of calculating TCP sequence number will resault in the machine to restart
Reply-To: Gleb Smirnoff
List-Id: Networking and TCP/IP with FreeBSD

The following reply was made to PR kern/177456; it has been noted by GNATS.
From: Gleb Smirnoff
To: HouYeFei&XiBoLiu
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: misc/177456: An error of calculating TCP sequence number will resault in the machine to restart
Date: Fri, 29 Mar 2013 16:08:03 +0400

  HouYeFei & XiBoLiu,

On Thu, Mar 28, 2013 at 11:55:04PM +0000, HouYeFei&XiBoLiu wrote:
H> >Number: 177456
H> >Category: misc
H> >Synopsis: An error of calculating TCP sequence number will resault in the machine to restart
H> >Confidential: no
H> >Severity: non-critical
H> >Priority: low
H> >Responsible: freebsd-bugs
H> >State: open
H> >Quarter:
H> >Keywords:
H> >Date-Required:
H> >Class: sw-bug
H> >Submitter-Id: current-users
H> >Arrival-Date: Fri Mar 29 00:00:00 UTC 2013
H> >Closed-Date:
H> >Last-Modified:
H> >Originator: HouYeFei&XiBoLiu
H> >Release: FreeBSD-9.0
H> >Organization:
H> H3C
H> >Environment:
H> FreeBSD www.unixnotes.net 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Sun May 4 12:36:15 HKT 2012 root@www.unixnotes.net:/usr/src/sys/i386/compile/unixnotes i386
H> >Description:
H> There is a large number of TCP links between Client and Server, each link can transmit large amounts of data. When the Client is low on memory, at the same time it wants to establish a new TCP connection to the server. The Client sends SYN message and startups retransmission timer, but retransmission of the first time
H>
H> sends failed because there is not enough mbuf.At this time, a sequence number is transmitted messages on the tcpcb (tp->snd_nxt) regression. Then
H>
H> a syn+ack message is received and processing the tp->snd_una sequence number is increased by 1, resault in tp->snd_nxt < th->snd_una. It is likely that
H>
H> the sending buffer has data to send, but actually is empty, call
H>
H> Tcp_output to send ack to the Server. But Tcp_output enter to the mbuf replication process, leading to access a null pointer.

I am trying to reproduce the problem, with no success yet. Can you
please clarify the sequence of failures that is required?
I understand your submission in the following way:

1. Client performs connect(2).
2. Client TCP stack generates a SYN packet, and this packet is lost in the network.
3. Client TCP stack tries to retransmit the SYN packet, but mbuf allocation fails.
4. Client TCP stack retransmits the SYN packet.
5. Server replies with SYN+ACK.

... and according to you something should go wrong ...

But in my tests nothing goes wrong. The client successfully retransmits
the SYN and the connection is established.

This is how I instrumented it. I have added a special TCP option and set it
before calling connect(2). tcp_output() then emulates the failures that you
described:

Index: tcp_output.c
===================================================================
--- tcp_output.c	(revision 248873)
+++ tcp_output.c	(working copy)
@@ -898,6 +898,13 @@ send:
 		else
 			TCPSTAT_INC(tcps_sndwinup);
 
+		/* Fail allocating second packet. */
+		if (tp->t_flags & TF_ZHOPA && tp->t_zhopa == 1) {
+			tp->t_zhopa = 2;
+			m = NULL;
+			error = ENOBUFS;
+			goto out;
+		} else
 		m = m_gethdr(M_NOWAIT, MT_DATA);
 		if (m == NULL) {
 			error = ENOBUFS;
@@ -1273,6 +1280,13 @@ timer:
 	if (V_path_mtu_discovery && tp->t_maxopd > V_tcp_minmss)
 		ip->ip_off |= htons(IP_DF);
 
+	/* Lose first packet. */
+	if (tp->t_flags & TF_ZHOPA && tp->t_zhopa == 0) {
+		tp->t_zhopa = 1;
+		m_freem(m);
+		error = 0;
+	} else
+
 	error = ip_output(m, tp->t_inpcb->inp_options, &ro,
 	    ((so->so_options & SO_DONTROUTE) ? IP_ROUTETOIF : 0), 0,
 	    tp->t_inpcb);

Am I doing something wrong? Can you provide your way to reproduce this?
Do you have a backtrace of the panic?

-- 
Totus tuus, Glebius.