Date:      Thu, 6 Oct 2005 01:38:53 -0400
From:      Kris Kennaway <kris@obsecurity.org>
To:        Don Lewis <truckman@FreeBSD.org>
Cc:        mi+mx@aldan.algebra.com, re@FreeBSD.org, current@FreeBSD.org, openoffice@FreeBSD.org
Subject:   Re: 6.0 hangs (while building OOo)
Message-ID:  <20051006053853.GA58630@xor.obsecurity.org>
In-Reply-To: <200510050220.j952KD25025940@gw.catspoiler.org>
References:  <200510041328.05637.mi%2Bmx@aldan.algebra.com> <200510050220.j952KD25025940@gw.catspoiler.org>


On Tue, Oct 04, 2005 at 07:20:13PM -0700, Don Lewis wrote:
> On  4 Oct, Mikhail Teterin wrote:
> > On Tuesday 04 October 2005 13:08, Don Lewis wrote:
> >> Hung trying to lock a vnode ...
> >>
> >> What other processes are in the D state, and what is their wchan info?
> >
> > mi@roo:~ (301) ps -lax | awk 'match($10, "D")'
> >     0     2     0   0  -8  0     0     8 -      DL    ??    0:06,50 [g_event]
> >     0     3     0   0  -8  0     0     8 -      DL    ??    0:39,71 [g_up]
> >     0     4     0   0  -8  0     0     8 -      DL    ??    0:31,21 [g_down]
> >     0     5     0   0   8  0     0     8 -      DL    ??    0:00,00 [thread taskq]
> >     0     6     0   0   8  0     0     8 -      DL    ??    0:00,00 [kqueue taskq]
> >     0     7     0   0  96  0     0     8 idle   DL    ??    0:00,00 [aic_recovery0]
> >     0     8     0   0  96  0     0     8 idle   DL    ??    0:00,00 [aic_recovery0]
> >     0     9     0   0  96  0     0     8 idle   DL    ??    0:00,00 [aic_recovery1]
> >     0    10     0   0 -16  0     0     8 ktrace DL    ??    0:00,00 [ktrace]
> >     0    39     0   0 -16  0     0     8 -      DL    ??    0:09,21 [yarrow]
> >     0    44     0   0   8  0     0     8 usbevt DL    ??    0:00,01 [usb0]
> >     0    45     0   0   8  0     0     8 usbtsk DL    ??    0:00,00 [usbtask]
> >     0    46     0   0  96  0     0     8 idle   DL    ??    0:00,00 [aic_recovery1]
> >     0    47     0   0  -8  0     0     8 -      DL    ??    0:00,91 [fdc0]
> >     0    49     0   0 -16  0     0     8 psleep DL    ??    0:03,51 [pagedaemon]
> >     0    50     0   0  20  0     0     8 psleep DL    ??    0:00,00 [vmdaemon]
> >     0    51     0   0 171  0     0     8 pgzero DL    ??   12:19,32 [pagezero]
> >     0    52     0   0 -16  0     0     8 psleep DL    ??    0:06,55 [bufdaemon]
> >     0    53     0   0  20  0     0     8 syncer DL    ??    1:00,40 [syncer]
> >     0    54     0   0  -4  0     0     8 vlruwt DL    ??    0:03,16 [vnlru]
> >     0    55     0   0 -64  0     0     8 -      DL    ??    0:11,48 [schedcpu]
> >     0   115     0   0  -8  0     0     8 mdwait DL    ??    0:05,75 [md7]
> >     0 45773 45771   0  -4  0  1740  1208 ufs    D     p1    0:00,32 dmake
> >     0 45806 45788 350  -4  0  1548   632 ufs    D     p1    0:00,00 /bin/tcsh -fc zipdep.pl -u -j  ../../../
> >     0 65072 64985 271  -4  0  1248   480 ufs    D     p1    0:00,00 /bin/tcsh -fc if ( -e ../../../unxfbsd.p
> >     0 65327  8694   0  -4  0  1432   908 ufs    D+    p2    0:02,05 find work/ -name provider.o
>
> Mikhail and I have been looking at this offline and have discovered the
> following:
> 	The wedged processes are waiting for vnode locks in the file
>         name lookup path for the access() and lstat() syscalls.
>
> 	There are two locked directories that are wedging these
>         processes.
>
> 	We don't know what threads are holding the locks on these
>         directories, but we do know that it is none of the threads
>         associated with these processes, so it is not a classic deadlock
>         problem.
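
For triaging a listing like the one quoted above, here is a minimal sketch that groups D-state processes by wait channel. It assumes FreeBSD `ps -lax` column order (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND); the helper name `blocked_by_wchan` is hypothetical and not part of any tool mentioned in this thread.

```python
from collections import defaultdict

# 0-based column positions, assuming FreeBSD `ps -lax` layout:
# UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
MWCHAN, STAT, COMMAND = 8, 9, 12


def blocked_by_wchan(ps_output):
    """Map wait channel -> [(pid, command)] for processes whose STAT contains 'D'.

    Mirrors the awk filter match($10, "D") used in the quoted listing.
    """
    groups = defaultdict(list)
    for line in ps_output.splitlines():
        # Split into at most 13 fields so the command keeps its spaces.
        fields = line.strip().split(None, COMMAND)
        # Skip the header line and anything too short to be a process row.
        if len(fields) <= COMMAND or not fields[1].isdigit():
            continue
        if "D" in fields[STAT]:
            groups[fields[MWCHAN]].append((int(fields[1]), fields[COMMAND]))
    return dict(groups)
```

Fed the listing above, the `ufs` group would hold PIDs 45773, 45806, 65072 and 65327, which separates the four wedged user processes from the kernel threads that normally sit in the D state.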

'show lockedvnods' doesn't help?

There is code in -current that saves stack traces when lockmgr locks
are acquired, when DEBUG_LOCKS is enabled - except it sometimes panics
while trying to save the trace because of a code bug.  I remind jeffr
about this on a more-or-less daily basis, but he hasn't yet had time to
commit his fix.  It may still be useful if this is easily
reproducible.

> This problem appears to be some sort of vnode lock leak.

Leaked lockmgr locks usually panic when the thread exits.

Kris