Date:      Sat, 22 Nov 2014 17:09:09 +0000
From:      "Robert N. M. Watson" <rwatson@FreeBSD.org>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        Craig Rodrigues <rodrigc@freebsd.org>, FreeBSD Net <freebsd-net@freebsd.org>, "Bjoern A. Zeeb" <bz@freebsd.org>, Marko Zec <zec@fer.hr>
Subject:   Re: VIMAGE UDP memory leak fix
Message-ID:  <85C7A32E-121D-495F-93C7-9D2B2F134FF6@FreeBSD.org>
In-Reply-To: <CAJ-VmokymFoLhCFQqqqa0NXKw4EVwkKQ8ZUjATOTR5EjRyvjKw@mail.gmail.com>
References:  <CAG=rPVehky00X4MuQQ-_Oe5ezWg52ZZrPASAh9GBy7baYv78CA@mail.gmail.com> <20141121002937.4f82daea@x23> <A4D676B3-6C50-47F7-8CFD-50B44FF4BE98@FreeBSD.org> <9300CB5F-6140-4C49-B026-EB69B0E8B37E@FreeBSD.org> <20141121120201.6c77ea5b@x23> <A4211137-9CE8-45A6-BA73-DCD767236C48@FreeBSD.org> <20141121162042.449b22dc@x23> <072B7B0F-4DE3-4D37-BC94-1DEA38CF3B12@FreeBSD.org> <CAJ-VmokymFoLhCFQqqqa0NXKw4EVwkKQ8ZUjATOTR5EjRyvjKw@mail.gmail.com>


On 21 Nov 2014, at 17:40, Adrian Chadd <adrian@freebsd.org> wrote:

>>> Skimming through a bunch of hosts with moderately loaded hosts with
>>> reasonably high uptime I couldn't find one where
>>> net.inet.tcp.timer_race was not zero. Any suggestions how to best
>>> reproduce the race(s) in tcp_timer.c?
>>
>> They would likely occur only on very highly loaded hosts, as they
>> require race conditions to arise between TCP timers and TCP close. I
>> think I did manage to reproduce it at one stage, and left the counter
>> in to see if we could spot it in production, and I have had (multiple)
>> reports of it in deployed systems. I'm not sure it's worth trying to
>> reproduce them, given that knowledge -- we should simply fix them.
>
> Wasn't this just fixed by Julien @ Verisign?

I don't believe so, although it's the kind of thing Julien is very good
at fixing!

The issue here is that we can't call callout_drain() from the contexts
where we finalise TCP connection close and attempt to free the inpcb.
The 'easy' fix is to create a taskqueue thread to do the callout_drain()
in the event that we discover that callout_stop() isn't able to
guarantee that pending callouts are neither in execution nor scheduled.
We'd then defer the very tail of TCP teardown to that asynchronous
context rather than trying to do it to completion in the current (and
rather more sensitive) one. This would happen only very infrequently,
so it would have little overhead in practice, although one would want
to look carefully at the sync behaviour to make sure it wasn't frequent
enough that a backlog might build up.
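
To make the shape of that deferral concrete, here is a rough
single-threaded userspace model in C. It is only a sketch under stated
assumptions, not kernel code and not a proposed patch: struct conn,
timer_stop(), timer_drain(), conn_close() and deferred_teardown() are
all hypothetical stand-ins (for the connection state, callout_stop(),
callout_drain(), the close path and the taskqueue handler respectively),
and a plain linked list stands in for the taskqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-connection state with a single timer. */
struct conn {
    bool timer_running;         /* models a callout executing elsewhere */
    bool freed;                 /* set instead of calling free(), for inspection */
    struct conn *next_deferred; /* linkage on the deferred-teardown list */
};

/* Models the taskqueue backlog; a real version would need locking. */
static struct conn *deferred_head;
static int deferred_count;

/* Models callout_stop(): true only if the timer is known not to be running. */
static bool timer_stop(struct conn *c)
{
    return !c->timer_running;
}

/* Models callout_drain(): waits until the handler has finished. */
static void timer_drain(struct conn *c)
{
    c->timer_running = false;
}

/* Final teardown; only legal once no timer can still touch the state. */
static void conn_free(struct conn *c)
{
    assert(!c->timer_running);
    c->freed = true;
}

/* Close path: must not wait, so it either frees now or defers. */
static void conn_close(struct conn *c)
{
    if (timer_stop(c)) {
        conn_free(c);           /* common case: finish synchronously */
        return;
    }
    c->next_deferred = deferred_head;   /* rare case: hand off the tail */
    deferred_head = c;
    deferred_count++;
}

/* Deferred handler: runs in a context where waiting is allowed. */
static void deferred_teardown(void)
{
    while (deferred_head != NULL) {
        struct conn *c = deferred_head;

        deferred_head = c->next_deferred;
        deferred_count--;
        timer_drain(c);
        conn_free(c);
    }
}
```

In this model, closing a connection whose timer is idle frees it
immediately, while one whose timer is mid-execution is only freed once
deferred_teardown() runs. deferred_count is the backlog one would want
to watch.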

> As for the vimage stability side of things - I'd really like to see
> some VIMAGE torture tests written. Stuff like "do a high rate TCP
> connection test whilst creating and destroying VIMAGEs."

... and even for non-VIMAGE. :-)

Robert



