Date:      Fri, 15 Nov 1996 10:07:47 -0600 (CST)
From:      Karl Denninger  <karl@Mcs.Net>
To:        jdp@polstra.com (John Polstra)
Cc:        karl@Mcs.Net, scrappy@ki.net, jgreco@brasil.moneng.mei.com, hackers@freebsd.org
Subject:   Re: Sockets question...
Message-ID:  <199611151607.KAA20718@Mercury.mcs.net>
In-Reply-To: <199611150258.SAA12064@austin.polstra.com> from "John Polstra" at Nov 14, 96 06:58:11 pm

> > I have an application which sends a query to a back-end server -- and that
> > server can return literally hundreds of KB of data in response.
> > 
> > On very long responses, it just STOPS.
> > 
> > The writing end thinks it wrote all of the data, netstat -an shows nothing
> > in the socket buffers, the reader is waiting in read() for more data (it
> > never saw the end marker, and in fact never saw more than half of the
> > response!)
> 
> If the socket is not in non-blocking mode, and you do a write(2) of N
> bytes to it, and the write call returns anything other than -1 or N,
> then that is a bug.

The mode has not been modified; it should be in *blocking* mode.  The write
returns the correct number of bytes in each case.
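As an illustration of how the blocking-mode claim could be confirmed
rather than assumed, here is a minimal sketch that reads the descriptor
flags with fcntl(2).  The helper name fd_is_nonblocking is hypothetical
and is not part of the application discussed in this message.

```c
#include <fcntl.h>

/* Hypothetical helper: report whether O_NONBLOCK is set on fd.
 * Returns 1 if non-blocking, 0 if blocking, -1 if fcntl() fails. */
static int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}
```

Calling this on the socket just before the first write would rule out
a library or inherited descriptor having switched the mode.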

> If the sending socket returns N, but not all of the data gets to
> the receiving socket, then that is either a bug or a network problem.

Bingo.  Send more than about 400KB in a stream transaction and you WILL see
this.  It's very repeatable.

Basically, what we do is this:

The query goes to the server.  It specifies the moral equivalent of a
"select" call in ESQL (this is a home-brew database application).

The server computes the response, and squirts out ~8500 records in response,
each of which is ~140 bytes long.  This is 8500 write(2) calls, one for each
record.  All return with no errors.
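For illustration only, a per-record write loop that does not depend on
write(2) returning the full count could look roughly like the sketch
below.  The name write_all is hypothetical; this is not the code of the
application described here, which reports full-length returns from
every call.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd.
 * Retries after short writes and after EINTR.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;          /* real error */
        }
        p += n;                 /* skip what was accepted */
        len -= (size_t)n;       /* and send the remainder */
    }
    return 0;
}
```

A loop like this makes a short write harmless, but it would not explain
data vanishing after every write has already returned the full length.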

About 2700 of the records get to the other end.  The rest DISAPPEAR.  They
are NOT in the socket buffers (netstat -an shows *zero* bytes in the
queue).
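For scale: 8500 records of ~140 bytes is about 1.19 MB, and 2700 records
is about 378,000 bytes, which is in line with the ~400KB figure above.
One way to pin down where the stream stops is to count bytes on the
receiving side and compare the total with the sum of the write(2)
return values.  The sketch below assumes the sender closes the
connection when done; the name drain_count is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical diagnostic: read fd until EOF and count the bytes.
 * Returns the total number of bytes read, or -1 on a read error. */
static long long drain_count(int fd)
{
    char buf[4096];
    long long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n == 0)
            return total;       /* EOF: peer closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted; retry */
            return -1;
        }
        total += n;
    }
}
```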

I have *also* seen this if you zmodem a file to a terminal server port.  You
have a very good chance of long files not getting there intact.  I thought
this was a terminal server problem, but I'm no longer convinced -- this
looks like a problem in the network code.

Note that for the NETWORK case if we compile on an older machine running a
-CURRENT kernel (built around the July timeframe, has gcc 2.6.3 on it) the
application does NOT exhibit this problem.

Compile on the -current machine, and it DOES.

In both cases the application is linked -static.

> From write(2):
> 
>      When using non-blocking I/O on objects such as sockets that are subject
> 		^^^^^^^^^^^^
>      to flow control, write() and writev() may write fewer bytes than request-
>      ed; the return value must be noted, and the remainder of the operation
>      should be retried when possible.
> 
> OK, it doesn't come right out and say directly that it _won't_ do that
> when using blocking I/O.  But that is the implication.  And it is the
> historical behavior.  And it is the behavior that probably 90% of
> network applications are written to expect.

--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
http://www.mcs.net/~karl     | T1's from $600 monthly to FULL DS-3 Service
			     | 33 Analog Prefixes, 13 ISDN, Web servers $75/mo
Voice: [+1 312 803-MCS1 x219]| Email to "info@mcs.net" WWW: http://www.mcs.net/
Fax:   [+1 312 248-9865]     | 2 FULL DS-3 Internet links; 400Mbps B/W Internal


