Date:      Sun, 27 Jan 2008 14:22:17 +0100
From:      Max Laier <max@love2party.net>
To:        Stefan Lambrev <stefan.lambrev@moneybookers.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: FreeBSD 7, bridge, PF and syn flood = very bad performance
Message-ID:  <200801271422.23340.max@love2party.net>
In-Reply-To: <479C4F31.7090804@moneybookers.com>
References:  <479A2389.2000802@moneybookers.com> <200801262227.36970.max@love2party.net> <479C4F31.7090804@moneybookers.com>

On Sunday 27 January 2008, Stefan Lambrev wrote:
> Good day,
>
> Max Laier wrote:
> > On Saturday 26 January 2008, Stefan Lambrev wrote:
> >> Max Laier wrote:
> >>> On Friday 25 January 2008, Stefan Lambrev wrote:
> >>>> Greetings,
> >>>>
> >>>> Has anyone seen PF with "keep state" in action while under a
> >>>> SYN flood attack?
> >>>> I tried to get some help on freebsd-pf@, because the test
> >>>> firewall that I built can hardly handle a 2-5MB/s SYN flood.
> >>>> Unfortunately I did not get any useful advice.
> >>>> The problem is that a quad-core bridge firewall running FreeBSD 7
> >>>> amd64 with PF is nearly useless and can't handle a "small" SYN DDoS.
> >>>>
> >>>> Here is the schema that I'm testing:
> >>>> web server (freebsd) - freebsd (bridged interfaces) - gigabit
> >>>> switch - clients + flooders
> >>>> In this configuration ~25MB/s syn flood (and I think this limit is
> >>>> because of my switch) is not a problem and the web server responds
> >>>> without a problem.
> >>>> With this configuration netperf -l 610 -p 10303 -H 10.3.3.1 shows
> >>>> a stable 116MB/s, so I guess there are no problems with
> >>>> cables, hardware, etc. :)
> >>>>
> >>>> But when I start pf (see the config file below) the traffic drops
> >>>> to 2-3MB/s and the web server is hardly accessible.
> >>>> It seems that device polling helps a lot in this situation, and at
> >>>> least the bridge firewall is accessible. Without "polling" the
> >>>> firewall is so heavily loaded
> >>>> that even commands like "date" take a few seconds to finish, with 2
> >>>> cores at ~100% idle at the same time.
> >>>>
> >>>> I have "flat profiles" from hwpmc, and I think it indicates a
> >>>> problem:
> >>>>
> >>>> (bridge, pf enabled, polling enabled, sched_ule - I also have
> >>>> profiles for other combinations if needed)
> >>>>   %   cumulative   self              self     total
> >>>>  time   seconds   seconds    calls  ms/call  ms/call  name
> >>>>  24.0  268416.00 268416.00        0  100.00%           _mtx_lock_sleep
> >>>
> >>> Can you build a kernel with LOCK_PROFILING and try to figure out
> >>> which lock is causing this?
> >>
> >> Yes, I can build a kernel with LOCK_PROFILING.
> >> But I have no idea how to use it :)
> >> Can you point me to some documentation?
> >
> > man LOCK_PROFILING
> >
> > basically:
> > # sysctl debug.lock.prof.enable=1 && sleep 60 && \
> >   sysctl debug.lock.prof.enable=0 && \
> >   sysctl debug.lock.prof.stats > log
> >
> > while under attack to sample one minute of lock statistics.
>
> Well I think the interesting lines from this experiment are:
> max     total     wait_total  count    avg     wait_avg  cnt_hold  cnt_lock  name
> 39      25328476  70950955    9015860  2       7         5854948   6309848   /usr/src/sys/contrib/pf/net/pf.c:6729 (sleep mutex:pf task mtx)
> 936935  10645209  350         50       212904  7         110       47        /usr/src/sys/contrib/pf/net/pf.c:980 (sleep mutex:pf task mtx)

Yeah, those two are mostly the culprits, but a quick fix is not really
available.  You can try raising "set timeout interval" to something bigger
(e.g. 60 seconds), which will decrease the average hold time of the second
lock instance at the cost of increased peak memory usage.
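
As a minimal sketch of that suggestion, the pf.conf fragment below raises
the purge interval. The value 60 is just the example figure from the
paragraph above, and 10 seconds is pf's documented default, as far as I
know; tune to taste.

```
# pf.conf fragment (sketch): purge expired states and fragments less often.
# 60 is the example value from this thread; the usual default is 10 seconds.
set timeout interval 60
```

After reloading the ruleset (e.g. "pfctl -f /etc/pf.conf", assuming the
default config path), "pfctl -s timeouts" should list the new interval.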

I have ideas for how to fix this, but it will take much, much more time
than I currently have for FreeBSD :-\  In general this requires a
bottom-up redesign of pf locking and of some data structures involved in
the state tree handling.

The first (= main) lock instance is also far from optimal (i.e. pf is a
congestion point in the bridge forwarding path).  For this I also have a
plan to make at least state table lookups run in parallel to some
extent, but again the lack of free time to spend coding prevents me from
doing it at the moment :-\
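
To spot such hot locks in a longer profile log, sorting by total wait time
helps. The helper below is only a sketch: "top_waiters" is a made-up name,
and it assumes the column layout quoted earlier in this thread (max, total,
wait_total, count, avg, wait_avg, cnt_hold, cnt_lock, name), with each
entry on a single line.

```sh
#!/bin/sh
# Rank lock-profiling entries by wait_total (third column), largest first.
# top_waiters is a hypothetical helper, not part of FreeBSD.
# Usage: top_waiters LOGFILE [N]   (N defaults to 10)
top_waiters() {
    log=$1
    n=${2:-10}
    # Keep only rows whose first and third fields are plain integers
    # (this skips the header line), then sort numerically on field 3.
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/' "$log" |
        sort -k3,3nr |
        head -n "$n"
}
```

With the "log" file written by the sysctl commands quoted above,
"top_waiters log 5" would print the five entries with the most accumulated
wait time.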

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
