Date:      Sat, 03 May 2008 18:28:36 +0100
From:      Tim Gebbett <tim@gebbettco.com>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        Peter Jeremy <peterjeremy@optushome.com.au>, rwatson@freebsd.org, Mark Hills <mark@pogo.org.uk>, freebsd-net@freebsd.org
Subject:   Re: read() returns ETIMEDOUT on steady TCP connection
Message-ID:  <481CA0C4.2040001@gebbettco.com>
In-Reply-To: <481B7232.60608@freebsd.org>
References:  <alpine.BSO.1.10.0804191437400.21362@zrgural.vwaro.pbz>	<20080420025010.GJ73016@server.vk2pj.dyndns.org>	<alpine.BSO.1.10.0804201238480.31900@zrgural.vwaro.pbz>	<480BBD7E.8010700@freebsd.org>	<alpine.BSO.1.10.0804210740100.1745@zrgural.vwaro.pbz>	<480C9AC6.8090802@freebsd.org>	<480E7901.5000804@freebsd.org>	<alpine.BSO.1.10.0804240030500.6125@zrgural.vwaro.pbz> <481B7232.60608@freebsd.org>

Hi Andre,

Just to introduce myself: I am now helping Mark Hills with testing.
Thank you for your suggestion; here are the results from a similar
system (RELENG_7) after increasing kern.ipc.nmbjumbop to 25600.

At 1600 streams using approx. 340 Mbit/s, netstat -m was reporting:
 
12550/250/12800/12800 4k (page size) jumbo clusters in use

After the read() returned ETIMEDOUT:

3857/10551/14408/25600 4k (page size) jumbo clusters in use

kern.ipc.nmbjumbop was then doubled again, from 25600 to 51200
(sysctl kern.ipc.nmbjumbop=51200).

After the read() returned ETIMEDOUT:

200/25400/25600/51200 4k (page size) jumbo clusters in use (current/cache/total/max)

netstat -m:

4140/26205/30345 mbufs in use (current/cache/total)
256/3482/3738/25600 mbuf clusters in use (current/cache/total/max)
256/3328 mbuf+clusters out of packet secondary zone in use (current/cache)
3882/21718/25600/51200 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
17075K/100387K/117462K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/7/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
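
For the next run it might help to sample the cluster counters alongside the
test, so the exhaustion point can be lined up with the moment read() fails.
A minimal sketch (the log file name and the 5-second interval are arbitrary
choices on my part):

    #!/bin/sh
    # Append a timestamp plus the jumbo-cluster and denied-request lines
    # from netstat -m every 5 seconds while the streaming test runs.
    while true; do
        date '+%s' >> /var/tmp/mbuf-log
        netstat -m | egrep 'jumbo clusters|denied' >> /var/tmp/mbuf-log
        sleep 5
    done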

Do you think we need to reel out further sysctls, and should I apply the
patch to see if "tcp_output: error 55" is still occurring?
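
For reference, this is roughly how I plan to watch for it once the patch is
in (assuming the messages end up in the kernel message buffer and
/var/log/messages, as they did in Mark's test):

    # enable the extra TCP debug logging that goes with the patch
    sysctl net.inet.tcp.log_debug=1
    # check whether error 55 is still being reported
    dmesg | grep 'tcp_output: error'
    # or follow it live while the streams are running
    tail -f /var/log/messages | grep 'tcp_output: error'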

Thanks again, Tim


Andre Oppermann wrote:
> Mark Hills wrote:
>> On Wed, 23 Apr 2008, Andre Oppermann wrote:
>>
>>> http://people.freebsd.org/~andre/tcp_output-error-log.diff
>>>
>>> Please apply this patch and enable the sysctl net.inet.tcp.log_debug=1
>>> and report any output.  You likely get some (normal) noise from syncache.
>>> What we are looking for is reports from tcp_output.
>>
>> Hi Andre, I've applied the patch and tested.
>>
>> Aside from syncache noise, I get a constant stream of 'error 55' 
>> (ENOBUFS?), once the number of connections gets to around 150 at 192kbps.
>>
>> TCP: [192.168.5.43]:52153 to [192.168.5.40]:8080; tcp_output: error 
>> 55 while sending
>>
>> 192.168.5.40 is the IP address of this host, running the server.
>>
>> I tried to correlate the point of the application receiving ETIMEDOUT 
>> with these messages, but that is tricky as it seems to be outputting 
>> a lot of messages, and multiple messages over each other (see below).
>>
>> Because of the mention of no buffer space available, I checked the 
>> values of net.inet.tcp.sendbuf* and recvbuf*, and increased the max 
>> values with no effect.
>>
>> When I get time I will modify the kernel to print errors which aren't 
>> ENOBUFS to see if there are any others. But in the meantime, this 
>> sounds like a problem to me. Is that correct?
>>
>> Mark
>>
>>
>> :8080; tcp_output: error 55 while sending
>> TCP: [192.168.5.42]:57384T CtPo:  
>> [[119922..116688..55..4402]]::85048400;1  ttoc p[_1o9u2t.p1u6t8:. 
>> 5e.r4r0o]r: 8080;5 5t cwp_hoiultep uste:n deirnrgor 55 while sending
>> TCP: [192.168.5.42]:57382 to [192.168.5.40]:8080; tcp_output: error 
>> 55 while sending
>> TCP: [192.168.5.42]:57381 to [192.168.5.40]:8080; tcp_output: error 
>> 55 while sending
>> TCP: [192.168.5.42]:57380 to [192.168.5.40]:8080; tcp_output: error 
>> 55 while sending
>
> After tracing through the code it seems you are indeed memory limited.
> Looking back at the netstat -m output:
>
>  12550/250/12800/12800 4k (page size) jumbo clusters in use
>  (current/cache/total/max)
>  0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>
> This shows that the supply of 4k jumbo clusters is pretty much exhausted.
> The cache may be allocated to different CPUs and the one making the request
> at a given point may be depleted and can't get any from the global pool.
> The big question is why the denied counter doesn't report anything.  I've
> looked at the code paths and don't see any obvious reason why it doesn't
> get counted.  Maybe Robert can give some insight here.
>
> Try doubling the amount of 4k page size jumbo mbufs.  They are the primary
> workhorse in the kernel right now:
>
>  sysctl kern.ipc.nmbjumbop=25600
>
> This should get further.  Still more may be necessary depending on workloads.
>
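
P.S. Regarding the per-CPU caches and the denied counter staying at zero:
would the raw UMA zone statistics be of any use? Next time it happens I can
grab something like the following (I believe the 4k cluster zone is called
mbuf_jumbo_page, but I haven't double-checked the name, hence the loose grep):

    # per-zone allocator statistics, including the failure column
    vmstat -z | head -1
    vmstat -z | egrep -i 'mbuf|jumbo'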




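P.P.S. If the larger limit turns out to be what we need, I assume making it
stick across reboots is just a matter of an /etc/sysctl.conf entry
(kern.ipc.nmbjumbop is writable at runtime here, so sysctl.conf should be
enough; the value is the one from the test above):

    # /etc/sysctl.conf
    kern.ipc.nmbjumbop=51200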