Date: Sat, 8 Jun 2013 14:16:30 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: vasanth rao naik sabavat
Cc: freebsd-hackers@freebsd.org
Subject: Re: question in sosend_generic()

On Fri, 7 Jun 2013, vasanth rao naik sabavat wrote:

> When sending data out of the socket I don't see in the code where the
> sb_cc is incremented.

sb_cc reflects data appended to the socket buffer.  sosend_generic() is
responsible for arranging the copy-in and performing flow control, but the
protocol's own pru_send() routine performs the append.  E.g.,
tcp_usr_send() calls sbappendstream(), which actually adds the data to the
socket buffer.  Notice that not all protocols actually use the send socket
buffer -- for example, UNIX domain sockets directly cross-deliver to the
receiving socket's receive buffer.

> Is the socket send performed in the same thread of execution or the data
> is copied on to the socket send buffer and a different thread then sends
> the data out of the socket?

Protocols provide their own implementations to handle data moving down the
stack, so the specifics are protocol-dependent.  In TCP, the socket buffer
append occurs synchronously, in the same thread, as part of the pru_send()
downcall from the socket layer.  When data leaves the send socket buffer is
quite a different question.  For TCP, data may be sent immediately if the
various windows (e.g., flow control, congestion control) allow immediate
transmission, or it may remain enqueued in the send socket buffer until an
ACK is received indicating that the receiver is ready for more data (e.g.,
a growing window, ACK clocking, etc.).  In the steady sending state (e.g.,
filling the window) I would expect to see data sent, and later removed from
the socket buffer, only in an asynchronous context.  Typically, ACK
processing occurs in one of two threads: the device driver's interrupt
handling (i.e., the ithread), or the netisr thread for encapsulated or
looped-back traffic.

> Because, I see a call to sbwait(&so->so_snd) in the sosend_generic and I
> don't understand who would wake up this thread?

sbwait() implements blocking for flow/congestion control: when the socket
buffer fills, the sending thread must wait for space to open up.  Space
becomes available as a result of successful transmission -- e.g., the
truncation (sbdrop()) of the send socket buffer when a TCP ACK has been
received.  So the thread that triggers the wakeup will usually be the
ithread or netisr.
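To make the TCP case concrete, here is very roughly the shape of that
ACK-side path -- a paraphrased sketch, not the literal tcp_input()/
tcp_do_segment() code, which uses the locked sbdrop_locked()/
sowwakeup_locked() variants and handles partial ACKs, FIN, and so on:

	/*
	 * Sketch of ACK processing in the TCP input path (runs in the
	 * ithread or netisr): acknowledged bytes are dropped from the send
	 * socket buffer, and any thread sleeping in sbwait(&so->so_snd) is
	 * woken so sosend_generic() can retry its space check.
	 */
	int acked = th->th_ack - tp->snd_una;	/* bytes newly acknowledged */

	sbdrop(&so->so_snd, acked);	/* reclaim send socket buffer space */
	sowwakeup(so);			/* wake senders blocked in sbwait() */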
In the case of UNIX domain sockets, it's actually the receiving thread that
triggers the wakeup directly.

> If the data is not copied on to the socket buffers then it should
> technically send all data out in the same thread of execution and future
> socket send calls should see that space is always fully available. In
> that case I dont see a reason why we need to wait on the socket send
> buffer. As there would no one who will actually wake you up.

There are some false assumptions here.  The sending thread will always
append data [that fits] to the socket buffer, but may have to loop awaiting
space for the remaining data, depending on blocking/non-blocking status.
Space becomes available when the remote endpoint acknowledges receipt,
e.g., via a TCP ACK.  You might never wake up if the remote endpoint's flow
control never opens up space, the socket is blocking, and no timeout is
set.  If you fear that the recipient may block the sender, then you need
some timeout mechanism to decide how long you're willing to wait (one
common approach is sketched in the P.S. below).

>	if (space < resid + clen &&
>	    (atomic || space < so->so_snd.sb_lowat || space < clen)) {
>		if ((so->so_state & SS_NBIO) || (flags & MSG_NBIO)) {
>			SOCKBUF_UNLOCK(&so->so_snd);
>			error = EWOULDBLOCK;
>			goto release;
>		}
>		error = sbwait(&so->so_snd);
>		SOCKBUF_UNLOCK(&so->so_snd);
>		if (error)
>			goto release;
>		goto restart;
>	}
>
> In the above code snippet, for a blocking socket if the space is not
> available, then it may trigger a deadlock?

You can experience deadlocks between senders and receivers as a result of
cyclic waits for constrained resources (e.g., buffers).  However, that is a
property of application design, and applications that are killed will close
their sockets, releasing resources.  Most application designers attempt to
avoid deadlock by ensuring that there is always a path to progress, even a
slow one.

The deadlock you're suggesting does not, in general, exist -- it would be
silly to wait for something that could never happen.  Instead, we wait for
things that generally will happen (e.g., a TCP ACK) or for a timeout, which
would close the connection.  Notice that sbwait() is allowed to fail: if
the connection is severed due to a timeout or an RST, it returns
immediately with an error.

Robert
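P.S. If you do want to bound how long a blocking send(2) can sleep in
sbwait(), the usual application-level knob is a send timeout on the socket
(or a non-blocking socket plus your own retry policy).  A rough userland
sketch -- the function name send_with_timeout() and the five-second value
are just illustrative, and real code would also handle short writes:

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/time.h>
	#include <errno.h>

	/*
	 * Illustrative helper (not anything in the tree): bound how long
	 * send(2) may block waiting for send buffer space, then let the
	 * caller decide what to do when the peer isn't draining data.
	 */
	static ssize_t
	send_with_timeout(int s, const void *buf, size_t len)
	{
		struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
		ssize_t n;

		if (setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, &tv,
		    sizeof(tv)) == -1)
			return (-1);
		n = send(s, buf, len, 0);
		if (n == -1 && (errno == EWOULDBLOCK || errno == EAGAIN))
			; /* buffer stayed full -- retry, back off, or give up */
		return (n);
	}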