Date:      Fri, 21 Nov 2014 09:40:27 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        "Robert N. M. Watson" <rwatson@freebsd.org>
Cc:        Craig Rodrigues <rodrigc@freebsd.org>, FreeBSD Net <freebsd-net@freebsd.org>, "Bjoern A. Zeeb" <bz@freebsd.org>, Marko Zec <zec@fer.hr>
Subject:   Re: VIMAGE UDP memory leak fix
Message-ID:  <CAJ-VmokymFoLhCFQqqqa0NXKw4EVwkKQ8ZUjATOTR5EjRyvjKw@mail.gmail.com>
In-Reply-To: <072B7B0F-4DE3-4D37-BC94-1DEA38CF3B12@FreeBSD.org>
References:  <CAG=rPVehky00X4MuQQ-_Oe5ezWg52ZZrPASAh9GBy7baYv78CA@mail.gmail.com> <20141121002937.4f82daea@x23> <A4D676B3-6C50-47F7-8CFD-50B44FF4BE98@FreeBSD.org> <9300CB5F-6140-4C49-B026-EB69B0E8B37E@FreeBSD.org> <20141121120201.6c77ea5b@x23> <A4211137-9CE8-45A6-BA73-DCD767236C48@FreeBSD.org> <20141121162042.449b22dc@x23> <072B7B0F-4DE3-4D37-BC94-1DEA38CF3B12@FreeBSD.org>

On 21 November 2014 07:32, Robert N. M. Watson <rwatson@freebsd.org> wrote:
>
> On 21 Nov 2014, at 15:20, Marko Zec <zec@fer.hr> wrote:
>
>>> Bjoern and I chatted for the last twenty or so minutes about the
>>> code, and believe that as things stand, it is *not* safe to turn off
>>> UMA_ZONE_NOFREE for TCP due to a teardown race in TCP that has been
>>> known about and discussed for several years, but is some work to
>>> resolve and that we've not yet found time to do so. The XXXRW's in
>>> tcp_timer.c are related to this. We're pondering ways to fix it but
>>> think this is not something that can be rushed.
>>
>> OK fair enough - thanks a lot for looking into this!
>>
>> Skimming through a bunch of moderately loaded hosts with
>> reasonably high uptime, I couldn't find one where net.inet.tcp.timer_race
>> was not zero. Any suggestions how to best reproduce the race(s) in
>> tcp_timer.c?
>
> They would likely occur only on very highly loaded hosts, as they require
> race conditions to arise between TCP timers and TCP close. I think I did
> manage to reproduce it at one stage, and left the counter in to see if we
> could spot it in production, and I have had (multiple) reports of it in
> deployed systems. I'm not sure it's worth trying to reproduce them, given
> that knowledge -- we should simply fix them.
>

Wasn't this just fixed by Julien @ Verisign?

As for the vimage stability side of things - I'd really like to see
some VIMAGE torture tests written. Stuff like "do a high rate TCP
connection test whilst creating and destroying VIMAGEs."



-adrian