Date:      Fri, 13 May 2005 13:39:21 -0600 (MDT)
From:      Matt Ruzicka <matt@frii.com>
To:        Maxim Konovalov <maxim@macomnet.ru>
Cc:        freebsd-net@freebsd.org
Subject:   Re: [net] Re: Outbound TCP issue, potentially related to 'FreeBSD-SA-05:08.kmem [REVISED]'
Message-ID:  <Pine.BSF.4.58.0505131316460.4892@elara.frii.com>
In-Reply-To: <20050513230848.K89035@mp2.macomnet.net>
References:  <Pine.BSF.4.58.0505121627400.66727@elara.frii.com> <20050513134227.P616@odysseus.silby.com> <20050513230848.K89035@mp2.macomnet.net>

Hmm... I'm starting to feel a bit silly, maybe.

Running the netstat and grep shown below, we had between 800 and 1700
lines sitting in TIME_WAIT.

I then ran the netcat test script while checking for TIME_WAITs.  They
climbed into the 4800 range, and then I started getting port failures.
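
(For anyone reading along without Maxim's attachment: the test is just a
tight loop of short-lived outbound TCP connections, each of which lands
in TIME_WAIT and ties up an ephemeral port.  The sketch below is only my
approximation of that kind of loop, not his actual script, and the host
and port are placeholders.)

  #!/bin/sh
  # Rough stand-in for the netcat test script (not the original).
  # Each nc run opens and closes a TCP connection, leaving it in
  # TIME_WAIT and consuming one ephemeral port for 2*MSL.
  HOST=192.0.2.10   # placeholder target address
  PORT=80           # placeholder target port
  i=0
  while [ $i -lt 20000 ]; do
      echo | nc -w 1 $HOST $PORT > /dev/null 2>&1 || echo fail
      i=$((i + 1))
  done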

I then ran sysctl net.inet.ip.portrange.last=65535 at Mike's
recommendation.
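
For reference, the before/after of that change is just the ephemeral port
range sysctls (the default values mentioned in the comment are from
memory, so treat them as approximate):

  # show the current ephemeral port range (default was roughly 1024-5000)
  sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last
  # widen the range so outbound connections can draw on ~64000 ports
  sysctl net.inet.ip.portrange.last=65535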

I then re-ran the netcat script while checking for TIME_WAITs.  This time
they climbed into the 15000 range before things started failing, but now
the errors came from the netcat script itself:

  Can't get socket : No buffer space available

This leads me to believe we were in fact running out of ports, and that
while running this script we also overran our socket buffers.
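
Back-of-the-envelope it seems to fit, too.  Each closed connection sits in
TIME_WAIT for 2*MSL (net.inet.tcp.msl is 30000 ms if I'm remembering the
default right, so 60 seconds), which caps the sustained outbound rate at
roughly (usable ports / 60) per second; once the port range is widened,
the socket zone limit becomes the ceiling instead.  Rough numbers,
assuming stock defaults:

  ~4000 ports (1024-5000 range)  / 60 s in TIME_WAIT  ~=   66 conn/s sustained
  ~64500 ports (last=65535)      / 60 s in TIME_WAIT  ~= 1075 conn/s sustained
  ...but now the socket zone LIMIT (16424) gets hit first -> ENOBUFS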

I just got the second email from Maxim with the vmstat request as well.

Currently both boxes are sitting higher in TIME_WAITs than usual:

box 1:

-->netstat -an | grep -c TIME_WAIT                    2005/05/13 13:27:41
2455
-->vmstat -z | grep -i sock                           2005/05/13 13:29:41
socket:          224,    16424,   3874,  12564,  1107241


box 2:

-->netstat -an | grep -c TIME_WAIT                    2005/05/13 13:27:10
3541
-->vmstat -z | grep -i sock                           2005/05/13 13:29:50
socket:          224,    16424,   3987,   2760,   726779


When I check vmstat while getting errors from the netcat script, I get
this:

-->vmstat -z | grep -i sock                           2005/05/13 13:33:20
socket:          224,    16424,  16438,      0,  1150867
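
If I'm reading the vmstat -z columns right (ITEM, SIZE, LIMIT, USED, FREE,
REQUESTS), that's the socket zone running into its 16424 LIMIT, which
would explain the "No buffer space available" (ENOBUFS) from netcat.  I
assume that limit corresponds to kern.ipc.maxsockets.  A quick way to
watch it happen (nothing below changes any state) is something like:

  # zone ceiling; should match the 16424 LIMIT above, I believe
  sysctl kern.ipc.maxsockets
  # watch the socket zone and the TIME_WAIT count climb during the test
  while true; do
      vmstat -z | grep -i sock
      netstat -an | grep -c TIME_WAIT
      sleep 5
  done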


A minute or so later we are back to this:

-->netstat -an | grep -c TIME_WAIT                    2005/05/13 13:33:55
2302
-->vmstat -z | grep -i sock                           2005/05/13 13:34:27
socket:          224,    16424,   3282,  13156,  1155482


Here is my vmstat -z in a "normal" state.

ITEM            SIZE     LIMIT    USED    FREE  REQUESTS

PIPE:            160,        0,     80,    124,   282288
SWAPMETA:        160,   233016,      0,      0,        0
unpcb:           160,        0,     46,    104,    74835
ripcb:           192,    16424,      0,     21,        1
divcb:           192,    16424,      0,      0,        0
syncache:        160,    15359,      5,     71,   316134
tcpcb:           576,    16424,   3562,  12836,  1040575
udpcb:           192,    16424,     13,     93,    51641
socket:          224,    16424,   3621,  12817,  1167053
KNOTE:            64,        0,      0,    128,    50789
NFSNODE:         352,        0,  77943,     14,   415454
NFSMOUNT:        544,        0,      4,     10,        4
VNODE:           192,        0,  79602,    110,    79602
NAMEI:          1024,        0,      0,     32, 28750812
VMSPACE:         192,        0,    165,    155,   187746
PROC:            416,        0,    175,    168,   187759
DP fakepg:        64,        0,      0,      0,        0
PV ENTRY:         28,  2690958, 809707, 711801, 259399026
MAP ENTRY:        48,        0,  15192,  16216, 10277033
KMAP ENTRY:       48,    65615,   1037,    200,   305137
MAP:             108,        0,      7,      3,        7
VM OBJECT:        92,        0,  77799,    117,  4201147


And during the failures:

ITEM            SIZE     LIMIT    USED    FREE  REQUESTS

PIPE:            160,        0,     76,    128,   283432
SWAPMETA:        160,   233016,      0,      0,        0
unpcb:           160,        0,     42,    108,    75353
ripcb:           192,    16424,      0,     21,        1
divcb:           192,    16424,      0,      0,        0
syncache:        160,    15359,      2,     74,   331720
tcpcb:           576,    16424,  16375,     23,  1074316
udpcb:           192,    16424,     13,     93,    51949
socket:          224,    16424,  16430,      8,  1201620
KNOTE:            64,        0,      0,    128,    51096
NFSNODE:         352,        0,  78365,     21,   417728
NFSMOUNT:        544,        0,      4,     10,        4
VNODE:           192,        0,  80024,    112,    80024
NAMEI:          1024,        0,      0,     32, 28983336
VMSPACE:         192,        0,    150,    170,   202142
PROC:            416,        0,    160,    183,   202155
DP fakepg:        64,        0,      0,      0,        0
PV ENTRY:         28,  2690958, 661633, 859875, 263111591
MAP ENTRY:        48,        0,  13004,  18404, 10546263
KMAP ENTRY:       48,    65615,   1034,    203,   306496
MAP:             108,        0,      7,      3,        7
VM OBJECT:        92,        0,  78224,    484,  4438390



Am I pretty much just looking at a tuning issue at this point, then?
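
If so, I'm guessing the relevant knobs are roughly the ones below
(corrections welcome; these values are assumptions on my part rather than
tested settings):

  # wider ephemeral port range (already done above)
  sysctl net.inet.ip.portrange.last=65535
  # higher socket/tcpcb zone ceiling; I believe this has to be set as a
  # boot-time tunable, e.g. in /boot/loader.conf:
  #   kern.ipc.maxsockets=32768
  # shorter TIME_WAIT via a lower MSL (milliseconds; the default 30000
  # means 60 s in TIME_WAIT); only if a shorter 2*MSL is acceptable
  sysctl net.inet.tcp.msl=15000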



Matthew Ruzicka - Systems Administrator
Front Range Internet, Inc.
matt@frii.net - (970) 212-0728

Got SPAM?  Take back your email with MailArmory.  http://www.MailArmory.com

On Fri, 13 May 2005, Maxim Konovalov wrote:

> On Fri, 13 May 2005, 12:58-0600, Matt Ruzicka wrote:
>
> > Yes, it still does.  And actually the script Maxim attached to his last
> > email (using our IP's) has an interesting side effect of causing the
> > connections to fail.
> >
> > It doesn't fail right away, but within a few moments.
> >
> > -->./netcat-test                          2005/05/13 12:46:51
> > fail
> > fail
> > fail
> > fail
> > ...
>
> Please run
>
> netstat -an | grep -c TIME_WAIT
>
> when fails occur.
>
> --
> Maxim Konovalov
>


