Date:      Thu, 14 Oct 2010 16:10:26 +0200
From:      Attilio Rao <attilio@freebsd.org>
To:        "Robert N. M. Watson" <rwatson@freebsd.org>
Cc:        FreeBSD Current <current@freebsd.org>, freebsd-net@freebsd.org, Sergey Kandaurov <pluknet@freebsd.org>, Jack F Vogel <jfv@freebsd.org>, Ryan Stone <rstone@sandvine.com>, Ryan Stone <rysto32@gmail.com>, Ed Maste <emaste@sandvine.com>
Subject:   Re: [PATCH] Netdump for review and testing -- preliminary version
Message-ID:  <AANLkTimLnRsa4v=A3Ui-1hKiVc5YLwkBND4NOmT4t%2BtB@mail.gmail.com>
In-Reply-To: <C73FFD46-80B0-44F0-9A19-2B047C285134@freebsd.org>
References:  <AANLkTikA5OUYD1A9pqCqVEZ5qk%2BVECq8x-fnRXnpp0KE@mail.gmail.com> <AANLkTikau6omhWrXVM13zonFEPCxXM%2B8EqJauovDu0OU@mail.gmail.com> <alpine.BSF.2.00.1010090121310.1232@fledge.watson.org> <AANLkTimisSojDg2z_f1_v71evfooVdPQ44eu2Thhrf3O@mail.gmail.com> <C73FFD46-80B0-44F0-9A19-2B047C285134@freebsd.org>

2010/10/14 Robert N. M. Watson <rwatson@freebsd.org>:
>
> On 13 Oct 2010, at 18:46, Ryan Stone wrote:
>
>> On Fri, Oct 8, 2010 at 9:15 PM, Robert Watson <rwatson@freebsd.org> wrote:
>>> +               /*
>>> +                * get and fill a header mbuf, then chain data as an extended
>>> +                * mbuf.
>>> +                */
>>> +               MGETHDR(m, M_DONTWAIT, MT_DATA);
>>>
>>> The idea of calling into the mbuf allocator in this context is just freaky,
>>> and may have some truly awful side effects.  I suppose this is the cost of
>>> trying to combine code paths in the network device driver rather than have
>>> an independent path in the netdump case, but it's quite unfortunate and will
>>> significantly reduce the robustness of netdumps in the face of, for example,
>>> mbuf starvation.
>>
>> Changing this will require very invasive changes to the network
>> drivers.  I know that the Intel drivers allocate their own mbufs for
>> their receive rings and I imagine that all other drivers have to do
>> something similar.  Plus the drivers are responsible for freeing mbufs
>> after they have been transmitted.  It seems to me that the cost of
>> making significant changes to the network drivers to support an
>> alternate lifecycle for netdump mbufs far outweighs the cost of losing
>> a couple of kernel dumps in extreme circumstances.
>
> My concern is less about occasional lost dumps than destabilising the
> dumping process: calls into the memory allocator can currently trigger a
> lot of interesting behaviours, such as further calls back into the VM
> system, which can then trigger calls into other subsystems. What I'm
> suggesting is that if we want the mbuf allocator to be useful in this
> context, we need to teach it about things not to do in the dumping /
> crash / ... context, which probably means helping uma out a bit in that
> regard. And a watchdog to make sure the dump is making progress.

I think that this would be way too complicated just to cope with a panic
within the VM/UMA (I'm not sure which other subsystems you are referring
to as being called into). Besides, if we have a panic in the VM, I'm
sure that normal dumps could also be affected.
With netdump, I'm not trying to fix all the bugs in our dumping
infrastructure because, as we already discussed, we know there are
quite a few of them; I'm trying at least to be no more fragile than
what we have today.
And again, while I think the "watchdog" idea is good, it applies to
normal dumps too; it is not specific to netdump.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


