Date:      Mon, 21 Sep 2015 10:21:21 +0200
From:      Julien Charbon <jch@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-net@freebsd.org, Palle Girgensohn <girgen@FreeBSD.org>
Subject:   Re: Kernel panics in tcp_twclose
Message-ID:  <55FFBE01.6060706@freebsd.org>
In-Reply-To: <20150918160605.GN67105@kib.kiev.ua>
References:  <26B0FF93-8AE3-4514-BDA1-B966230AAB65@FreeBSD.org> <55FC1809.3070903@freebsd.org> <20150918160605.GN67105@kib.kiev.ua>


 Hi Konstantin, Hi Palle,

On 18/09/15 18:06, Konstantin Belousov wrote:
> On Fri, Sep 18, 2015 at 03:56:25PM +0200, Julien Charbon wrote:
>>  Hi Palle,
>>
>> On 18/09/15 11:12, Palle Girgensohn wrote:
>>> We see daily panics on our production systems (web server, apache
>>> running MPM event, openjdk8. Kernel with VIMAGE. Jails using netgraph
>>> interfaces [not epair]).
>>>
>>> The problem started after the summer. Normal port upgrades seem to
>>> be the only difference. The problem occurs with 10.2-p2 kernel as
>>> well as 10.1-p4 and 10.1-p15.
>>>
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203175
>>>
>>> Any ideas?
>>
>>  Thanks for your detailed report.  I am not aware of any tcp_twclose()
>> related issues (without VIMAGE) since FreeBSD 10.0 (which does not mean
>> there are none).  A few interesting facts (at least for me):
>>
>>  - Your crash happens when unlocking an inp exclusive lock with
>> INP_WUNLOCK()
>>
>>  - Something is already wrong before calling turnstile_broadcast(), as
>> it is called with ts = NULL:
> In a kernel without WITNESS, this is a 99%-sure indication of an attempt
> to unlock a lock that is not owned.

 Thanks, this is useful.  So far I have not found any path where
tcp_twclose() can call INP_WUNLOCK() without holding the exclusive lock,
which is what makes this issue interesting.
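
 To illustrate Konstantin's point, here is a toy userspace model in C.
It is not the FreeBSD rwlock code, and every name in it (toy_lock,
toy_waitq, toy_wunlock, the result codes) is made up for this sketch.
It assumes the general shape "fast path if the caller owns the lock and
nobody waits, otherwise go and wake the waiters", and shows how an
unlock by a non-owner can fall through to the wake-up step with no wait
queue to hand over, unless an ownership check catches it first.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not FreeBSD kernel structures. */
#define TOY_UNLOCKED ((uintptr_t)0)

/* Stands in for the queue of blocked threads (a turnstile in the kernel). */
struct toy_waitq {
    int nwaiters;
};

struct toy_lock {
    uintptr_t owner;          /* id of the exclusive owner, or TOY_UNLOCKED */
    struct toy_waitq *waitq;  /* non-NULL only while threads are blocked */
};

enum toy_result {
    TOY_RELEASED,      /* fast path: caller owned the lock, no waiters */
    TOY_WOKE_WAITERS,  /* slow path: waiters existed and were woken */
    TOY_NULL_WAITQ,    /* slow path found no queue: a kernel would fault here */
    TOY_NOT_OWNER      /* rejected by the optional ownership check */
};

static enum toy_result
toy_wunlock(struct toy_lock *lk, uintptr_t self, int check_owner)
{
    /* Debug-style check, in the spirit of an assertion-enabled build. */
    if (check_owner && lk->owner != self)
        return TOY_NOT_OWNER;

    /* Fast path: owned by the caller and nobody is waiting. */
    if (lk->owner == self && lk->waitq == NULL) {
        lk->owner = TOY_UNLOCKED;
        return TOY_RELEASED;
    }

    /*
     * Slow path: the fast path failed, so this code assumes waiters
     * exist.  A non-owner with checks disabled ends up here too.
     */
    if (lk->waitq == NULL)
        return TOY_NULL_WAITQ;  /* analogous to passing ts = NULL onward */

    lk->waitq->nwaiters = 0;    /* "wake" everyone */
    lk->waitq = NULL;
    lk->owner = TOY_UNLOCKED;
    return TOY_WOKE_WAITERS;
}
```

 With a lock owned by thread 1 and no waiters, toy_wunlock(&lk, 2, 0)
returns TOY_NULL_WAITQ, while the same call with check_owner set returns
TOY_NOT_OWNER.  That early, precise failure is the kind of report a debug
kernel is expected to give.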

>>  I won't go too far here as I am not expert enough in VIMAGE, but one
>> question anyway:
>>
>>  - Can you correlate this kernel panic to a particular event?  For
>> example, a VIMAGE/VNET jail destruction.
>>
>>  I will test that on my side on a 10.2 machine.

 I did not find any issues while testing 10.2 + VIMAGE on my side.  So,
Palle, here is what I would suggest:

 - First, test with stable/10 to see whether this issue has already
been fixed in the stable branch.

 - Second, if the issue is still present in stable/10, compile a 10.2
kernel with these options:

options        DDB
options        DEADLKRES
options        INVARIANTS
options        INVARIANT_SUPPORT
options        WITNESS
options        WITNESS_SKIPSPIN

 This should show where the original fault is coming from.
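
 As a minimal sketch of how the options above could be placed in a
custom kernel configuration: the file name DEBUG102, its ident and the
amd64 path are assumptions, and "options VIMAGE" is included only
because the panicking kernel is reported to use it.  Any other local
options would need to be carried over as well.

```
# Hypothetical file: /usr/src/sys/amd64/conf/DEBUG102
include         GENERIC
ident           DEBUG102

# Already used on the affected systems, per the report
options         VIMAGE

# Debugging options suggested above
options         DDB
options         DEADLKRES
options         INVARIANTS
options         INVARIANT_SUPPORT
options         WITNESS
options         WITNESS_SKIPSPIN
```

 Such a file would typically be built and installed from /usr/src with
"make buildkernel KERNCONF=DEBUG102" followed by
"make installkernel KERNCONF=DEBUG102".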

 Thanks.

--
Julien

