Date:      Fri, 13 Nov 2015 05:28:56 +0700
From:      Eugene Grosbein <eugen@grosbein.net>
To:        Adrian Chadd <adrian@freebsd.org>, benno <benno@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Konstantin Belousov <kib@freebsd.org>
Cc:        "freebsd-mips@freebsd.org" <freebsd-mips@freebsd.org>
Subject:   Re: USB-related problem
Message-ID:  <564512A8.1000006@grosbein.net>
In-Reply-To: <CAJ-VmomOeSyQGxKPvaa9TiBENCcdvTJiTLvSi2FfbCH9c6CGAA@mail.gmail.com>
References:  <56348063.3090508@grosbein.net> <56370E1D.3040801@grosbein.net> <CAJ-Vmo=0vOAq8db_GeLWmdXr7xJdzUh44ZZJrQ9vVdpvzT9hiQ@mail.gmail.com> <563F5630.2000407@grosbein.net> <563F6F6F.1010807@grosbein.net> <CAJ-Vmo=fPSi7yZO5Xjodg8HPtTLd44Y9Y_8qg4EgTGwEpHO10A@mail.gmail.com> <563F91A8.9080702@grosbein.net> <CAJ-VmomUvoUerMS20qQsQujcjULVA=_jaLp9Mh3fU1fEpdwzZA@mail.gmail.com> <CAJ-Vmo=eUUZ928KgQbyOi8EdDFSmxhjvDOyAMvfXsqwDbO96ng@mail.gmail.com> <5640C0FD.2040803@grosbein.net> <CAJ-Vmo=6mztfvvBd91LPO5H418K8vW=%2BLk=6V5Z_y5DHu7v7HA@mail.gmail.com> <5640F315.5020303@grosbein.net> <CAJ-VmokpWM=d%2BtEFv8a8eU91UimVZ9W8da2QkKCTDjd%2B2ZM_LQ@mail.gmail.com> <56410214.3070901@grosbein.net> <CAJ-Vmo=QUUkTfQ7pvr_V%2BCQ8zQWOoqp3H8hD9LUR8C5U-5N=Ag@mail.gmail.com> <CAJ-VmomA27NcNCjx00DrZzFOU9bGYucUhbPXodH2uvNd8eJ3wg@mail.gmail.com> <564231DE.7090308@grosbein.net> <56423D82.5030203@grosbein.net> <CAJ-VmomOeSyQGxKPvaa9TiBENCcdvTJiTLvSi2FfbCH9c6CGAA@mail.gmail.com>

On 11.11.2015 01:57, Adrian Chadd wrote:
> + benno, jeff, kib
> 
> cool, so now we know where syncer is hanging. nice work!
> 
> That syncer loop that calls it waiting for the list to be empty is
> also a bit suss; it looks like it could also get stuck in a loop and
> never yield.

I've added new state variables throughout the kernel
(plus the KTR facility) to track this down and arrived at a result I cannot explain.

First, here is the call sequence leading to the kernel hang; it is always the same:

sys/kern/vfs_subr.c, sync_vnode() -> VOP_FSYNC(vp, MNT_LAZY, td) -> VOP_FSYNC_APV()
in generated vnode_if.c in the kernel compile directory.

Here we have KTR_START3(KTR_VOP, ...), which is the latest record that DDB shows for
"show ktr /v". A KTR_STOP3(KTR_VOP, "VOP", ...) record should follow it, but none does.
In fact, execution proceeds to vop_fsync(a) and never returns. Precisely:

vop_fsync(a) -> sys/fs/devfs/devfs_vnops.c, devfs_fsync() ->
"return (vop_stdfsync(ap))" - never returns.

It reaches sys/kern/vfs_default.c, vop_stdfsync() and hangs exactly at this KASSERT:
https://svnweb.freebsd.org/base/head/sys/kern/vfs_default.c?annotate=288451#l683

I've defined another global variable, "volatile unsigned vopsfsyncstate",
set it to 15 just before this KASSERT and to 126 just after it,
and it always equals 15 when the kernel hangs. I tested kernels compiled
with and without WITNESS (and WITNESS_SKIPSPIN); it made no difference.
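
For illustration only, the marker technique described above can be sketched
in userland C roughly as follows. This is not the actual kernel patch:
suspect_statement() and traced_section() are hypothetical names, and assert()
merely stands in for the KASSERT. Only the variable name and the values
15 and 126 come from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* Marker variable, named as in the message. "volatile" keeps the
 * compiler from reordering or dropping the stores. */
volatile unsigned vopsfsyncstate;

/* Hypothetical stand-in for the statement under suspicion. The real
 * kernel code is not reproduced here. */
static void
suspect_statement(int cond)
{
	assert(cond);	/* userland analogue of a KASSERT */
}

/* Bracket the suspect statement with two distinct marker values. */
static void
traced_section(int cond)
{
	vopsfsyncstate = 15;	/* reached the point just before */
	suspect_statement(cond);
	vopsfsyncstate = 126;	/* set only if the statement returned */
}
```

If execution stops inside the bracketed statement, the variable is left
at 15; inspecting it afterwards (from DDB in the kernel case) shows which
marker was reached last.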

I do not understand how or why it hangs here, but that is a fact.
Here I'm stuck.

Eugene Grosbein



