Date:      Fri, 15 Dec 2017 09:36:52 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-threads@FreeBSD.org
Subject:   [Bug 224362] 'mutex is on list' assertion failed on pthread_mutex_lock/pthread_mutex_unlock
Message-ID:  <bug-224362-16@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224362

            Bug ID: 224362
           Summary: 'mutex is on list' assertion failed on
                    pthread_mutex_lock/pthread_mutex_unlock
           Product: Base System
           Version: 9.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: threads
          Assignee: freebsd-threads@FreeBSD.org
          Reporter: ice.nightcrawler@gmail.com

Created attachment 188857
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=188857&action=edit
Patch for libthr

I am getting spontaneous core dumps under heavy mutex use. My application has
many threads and locks.
I am quite sure that the locking/unlocking logic is correct: it performs well
on other thread implementations (such as NPTL on Linux).
On FreeBSD, however, it is terminated by an assertion after running for a long
time (from 4 to 8 hours):
Abort trap (6) with "Fatal error 'mutex is on list'"

Backtrace looks like this:
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
#0  0x000000080161cd2a in thr_kill () from /lib/libc.so.7
#1  0x000000080161cc96 in raise () from /lib/libc.so.7
#2  0x000000080161b489 in abort () from /lib/libc.so.7
#3  0x000000080109e63a in ?? () from /lib/libthr.so.3
#4  0x0000000801099e66 in ?? () from /lib/libthr.so.3
#5  0x000000080242d301 in Lock (this=0x80387ad90)

(gdb) p *(struct pthread_mutex*) 0x80ff60c40
$4 = {m_lock = {m_owner = 100925, m_flags = 0, m_ceilings = {0, 0}, m_spare =
{0, 0, 0, 0}}, m_flags = 2, m_owner = 0x8038cc400, m_count = 0, m_spinloops =
0, m_yieldloops = 0, m_qe = {tqe_next = 0x0, tqe_prev = 0x0}}

The "Lock" function is the last call made from my application; it calls
pthread_mutex_lock().
In this case it simply locks a mutex that was created some time earlier in the
same thread.

Sometimes the assertion is 'mutex is not on list' instead.

It looks to me as though libthr maintains a list of acquired mutexes for each
thread, and that the checks performed before locking/unlocking fail for an
unknown reason.
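
To illustrate the kind of bookkeeping described above, here is a minimal
sketch of a per-thread list of owned mutexes with the two consistency checks.
It is not the libthr source: the sketch_* names and structures are
hypothetical, the link fields only mimic the m_qe.tqe_next/tqe_prev pair seen
in the gdb output, and the functions return -1 where the real library would
abort with 'mutex is on list' or 'mutex is not on list'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a mutex's list linkage. */
struct sketch_mutex {
    struct sketch_mutex  *next;  /* plays the role of m_qe.tqe_next */
    struct sketch_mutex **prev;  /* plays the role of m_qe.tqe_prev:
                                    address of the pointer that points at us */
};

/* Hypothetical per-thread list of the mutexes the thread currently owns. */
struct sketch_thread {
    struct sketch_mutex  *first; /* first owned mutex, or NULL */
    struct sketch_mutex **last;  /* address of the final 'next' pointer */
};

static void sketch_thread_init(struct sketch_thread *t)
{
    t->first = NULL;
    t->last = &t->first;
}

/* A mutex counts as "on a list" if either link is set. */
static int sketch_on_list(const struct sketch_mutex *m)
{
    return m->prev != NULL || m->next != NULL;
}

/* Called after acquiring the lock. Returns -1 in the situation that
   corresponds to the 'mutex is on list' assertion. */
static int sketch_enqueue(struct sketch_thread *t, struct sketch_mutex *m)
{
    if (sketch_on_list(m))
        return -1;
    m->next = NULL;
    m->prev = t->last;
    *t->last = m;        /* append at the tail */
    t->last = &m->next;
    return 0;
}

/* Called before releasing the lock. Returns -1 in the situation that
   corresponds to the 'mutex is not on list' assertion. */
static int sketch_dequeue(struct sketch_thread *t, struct sketch_mutex *m)
{
    if (m->prev == NULL)
        return -1;
    if (m->next != NULL)
        m->next->prev = m->prev;
    else
        t->last = m->prev;
    *m->prev = m->next;  /* unlink */
    m->next = NULL;      /* clear links so the checks above stay valid */
    m->prev = NULL;
    return 0;
}
```

In a scheme like this, either check can only fail if the link fields and the
actual ownership state disagree, for example because the links were modified
without the lock being held or the mutex memory was overwritten.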

I have a patch for libthr's check logic (attached). Its only effect is that it
takes somewhat longer before the assertion fails again.

-- 
You are receiving this mail because:
You are the assignee for the bug.


