Date:      Thu, 6 Nov 2008 21:43:07 +1100
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        pluknet <pluknet@gmail.com>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: CARP performance tuning question.
Message-ID:  <20081106104307.GC51239@server.vk2pj.dyndns.org>
In-Reply-To: <a31046fc0811050540o527d315dvef1b35142f5caa29@mail.gmail.com>
References:  <a31046fc0811050540o527d315dvef1b35142f5caa29@mail.gmail.com>

Whilst I don't doubt that you have a problem, your comments don't
correlate particularly well with the data you have provided and
this makes it difficult to immediately suggest a solution.

On 2008-Nov-05 16:40:32 +0300, pluknet <pluknet@gmail.com> wrote:
>At work we use device carp(4) under high load:

carp(4) is solely a failover mechanism.  It either generates or receives
somewhat under 1pps per carp interface and the state it maintains is
basically 'master' or 'backup'.  I suspect the 'load' is being caused
by pf(4), possibly in conjunction with pfsync(4).

>The problem is that the server experiences a bad interactivity (from
>70k states and very bad from 120-150k)
>i.e. when a network workload (and interrupts count) begin to increase.
>
>From top(1):
>CPU states:  0.0% user,  0.0% nice,  0.4% system, 76.3% interrupt, 23.3% idle
>  PID USERNAME        THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
>  13 root              1 -44 -163     0K     8K WAIT   407:43 57.86% swi1: net

I agree that swi1 is using a significant amount of CPU but top is
still reporting >23% idle so you shouldn't be getting poor interactive
performance.

>ATM pfctl -s info shows such numbers:
>
>State Table                          Total             Rate
>  current entries                   153972
>  searches                      6052078938         4800.8/s
>  inserts                        120373545           95.5/s
>  removals                       120219573           95.4/s

That shows the load on pf(4) but doesn't really reflect what the
system is doing as a whole.
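
As a rough cross-check of the quoted figures, the sketch below uses only the numbers shown above. The variable names are illustrative, and it assumes the Rate column is each counter divided by the time since pf was enabled, which would make those rates long-term averages rather than a view of the current load.

```python
# Counters copied from the quoted 'pfctl -s info' output.
searches = 6052078938
inserts = 120373545
removals = 120219573
current_entries = 153972
search_rate = 4800.8  # searches/s, as printed in the Rate column

# States still in the table = states ever inserted - states removed.
assert inserts - removals == current_entries

# Assumption: Rate = total / seconds since pf was enabled, so dividing
# the total by the rate recovers the averaging period.
averaging_seconds = searches / search_rate
averaging_days = averaging_seconds / 86400
print(f"rates are averaged over about {averaging_days:.1f} days")
```

Under that assumption the averaging period comes out at roughly two weeks, so a busy hour barely moves these rates.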

>It works currently under UP, but could be rebuilt to work under SMP
>(Xeon 5130) if that helps.

Unfortunately, I don't know if this will help or not because I'm not
sure what bottleneck you are hitting.

>Can someone give hints to decrease interrupt count and to help with
>the server stability at all?

Well, you haven't actually reported what the interrupt count is or
what instability you are seeing, so this is a bit difficult.

Can you please provide some more information:
- output from 'uname -a'
- output from 'vmstat -i; sleep 10; vmstat -i' under load
- output from 'netstat -i'
- 10-15 seconds of output from 'netstat -i 1' under load
- What is the box doing? Is it a straight filtering router?  Does it
  handle NAT?  Is it running apps itself (eg web, ftp, mail)?
- What speed are the interface(s) running at?
- What instability problems are you seeing?
- Please provide more details on what you mean by 'bad interactivity'.
- How complex is your pf ruleset?  How many rules?  Anything unusual?
- What scheduler are you using?
- What is the full output of 'pfctl -s info'?
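
The two 'vmstat -i' snapshots are useful because the difference between them gives the interrupt rate during the 10 seconds under load. A minimal sketch of that subtraction follows, assuming each data line ends in a total and a rate column; the function names and the sample figures are made up for illustration.

```python
def parse_vmstat_i(text):
    """Map each interrupt source to its cumulative total."""
    counts = {}
    for line in text.splitlines():
        # Source names may contain spaces, so split from the right.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # header or blank line
        name, total, _rate = parts
        counts[name.strip()] = int(total)
    return counts


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source between two snapshots."""
    first = parse_vmstat_i(before)
    second = parse_vmstat_i(after)
    return {name: (second[name] - first[name]) / seconds
            for name in second if name in first}


# Made-up sample data, not taken from the system under discussion.
BEFORE = """interrupt                          total       rate
irq16: em0                       1000000        100
Total                            1000000        100
"""
AFTER = """interrupt                          total       rate
irq16: em0                       1250000        125
Total                            1250000        125
"""

for name, rate in interrupt_rates(BEFORE, AFTER, 10).items():
    print(f"{name}: {rate:.0f} interrupts/s")
```

The rate column that vmstat prints is ignored here and only the totals are subtracted, so the result reflects the interval between the two snapshots.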

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.



