From: Sean McNeil
To: Doug White
Cc: current@freebsd.org
Subject: Re: Why won't slapd shutdown (kill -0)?
Date: Wed, 17 Nov 2004 12:56:48 -0800
Message-Id: <1100725008.21333.2.camel@server.mcneil.com>
In-Reply-To: <20041117102623.P25028@carver.gumbysoft.com>
References: <1100657472.74795.2.camel@server.mcneil.com> <20041117102623.P25028@carver.gumbysoft.com>

On Wed, 2004-11-17 at 10:28 -0800, Doug White wrote:
> On Tue, 16 Nov 2004, Sean McNeil wrote:
>
> > This has been happening for a long time with current and hasn't been
> > resolved.  When I start up slapd, I cannot stop it without kill -9'ing
> > it.  It would appear stuck in kse and probably has something to do with
> > kill -0:
>
> Mind expanding on this? The backtrace looks normal for a pthread process.
> kill -0 just tests signal delivery; the process is completely unaware that
> the probe occurred, though.  The process may also be unkillable if it's
> stuck in some sort of I/O wait.
>
> Is the server busy when you signal it?

Oh, OK.  I didn't look at /usr/local/etc/rc.subr too closely.  I have
additional information, though.

It appears that all the threads are destroyed, yet the process is still in
the thread processing loop.  The process is no longer active at all.  I just
had a similar problem with vlc: I closed it, yet it is hanging in the same
place as slapd with all the threads gone.
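
On the kill -0 point quoted above: signal 0 only asks the kernel to check that the PID exists and that the caller is allowed to signal it, so nothing is delivered to the target. Here is a minimal sketch in Python of how a stop script could poll a PID this way. It is an illustration only, not the rc.subr code, and `probe_pid` is a hypothetical helper name.

```python
import os


def probe_pid(pid: int) -> str:
    """Probe a process with signal 0; no signal is actually delivered."""
    try:
        os.kill(pid, 0)  # existence and permission check only
    except ProcessLookupError:
        return "gone"  # ESRCH: no such process
    except PermissionError:
        return "exists, not signalable by this user"  # EPERM
    return "alive"


if __name__ == "__main__":
    # Probing our own PID always succeeds.
    print(probe_pid(os.getpid()))
```

A process that never exits keeps answering "alive" to such a probe, which would be consistent with a stop script that waits until kill -9 is used.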
Here is the one from vlc:

(gdb) bt full
#0  _thr_sched_switch_unlocked (curthread=0x955000) at pthread_md.h:226
        psf = {psf_valid = 0, psf_flags = 0, psf_cancelflags = 29952806,
          psf_interrupted = 8, psf_timeout = 11279168, psf_signo = 0,
          psf_state = 11279168, psf_wait_data = {mutex = 0x8, cond = 0x8,
          lock = 0x8, sigwait = 0x8}, psf_wakeup_time = {tv_sec = 0,
          tv_nsec = 0}, psf_sigset = {__bits = {29950366, 8, 9860096, 0}},
          psf_sigmask = {__bits = {9752576, 1, 9860096, 0}},
          psf_seqno = 29995347}
        curkse = (struct kse *) 0x952000
        resume_once = 0
#1  0x0000000801c925e0 in _thr_sched_switch (curthread=0x955000)
    at /usr/src/lib/libpthread/thread/thr_kern.c:607
No locals.
#2  0x0000000801c85cb4 in _pthread_join (pthread=0x967400, thread_return=0x0)
    at /usr/src/lib/libpthread/thread/thr_join.c:133
        curthread = (struct pthread *) 0x955000
        tmp = (void *) 0x0
        crit = 0x0
        ret = 0
#3  0x0000000000431749 in __vlc_thread_join (p_this=0xad4800,
    psz_file=0x6a283c "src/playlist/playlist.c", i_line=130)
    at src/misc/threads.c:716
        i_ret = 1
#4  0x000000000040ee1a in playlist_Destroy (p_playlist=0xad4800)
    at src/playlist/playlist.c:130
No locals.
#5  0x000000000040c400 in VLC_CleanUp (i_object=0) at src/libvlc.c:831
        p_intf = (intf_thread_t *) 0xad4800
        p_playlist = (playlist_t *) 0xad4800
        p_vout = (vout_thread_t *) 0xad4800
        p_aout = (aout_instance_t *) 0xad4800
        p_announce = (announce_handler_t *) 0xad4800
        p_vlc = (vlc_t *) 0x94d400
#6  0x0000000000407415 in main (i_argc=1, ppsz_argv=0x7fffffffe940)
    at src/vlc.c:108
        i_ret = 0

and here is a full trace of slapd:

(gdb) bt full
#0  0x000000080142e914 in kse_release () at kse_release.S:2
No locals.
#1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0, sigseqno=0)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1843
        ts = {tv_sec = 7647232, tv_nsec = 7647232}
        ts_sleep = {tv_sec = 60, tv_nsec = 0}
        saved_flags = 0
#2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1039
        curkse = (struct kse *) 0x62a000
        curthread = (struct pthread *) 0x0
        td_wait = (struct pthread *) 0x62a068
        curframe = (struct pthread_sigframe *) 0x17f
        ret = 383
#3  0x000000080142afbf in _amd64_enter_uts ()
    at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40
No locals.
#4  0x0000000000000000 in ?? ()
No symbol table info available.
#5  0x000000000062f000 in ?? ()
No symbol table info available.
#6  0x000000000062a000 in ?? ()
No symbol table info available.
#7  0x0000000000000000 in ?? ()
No symbol table info available.
#8  0x0000000000000000 in ?? ()
No symbol table info available.
#9  0x0000000000000000 in ?? ()
No symbol table info available.
#10 0x0000000000000000 in ?? ()
No symbol table info available.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
#12 0x0000000000000001 in ?? ()
No symbol table info available.
#13 0x0000000801426dd0 in _thr_sched_switch_unlocked ()
    at /usr/src/lib/libpthread/thread/thr_kern.c:904
        free_kseq = {tqh_first = 0x0, tqh_last = 0x801534810}
        gc_ksegq = {tqh_first = 0x0, tqh_last = 0x801534840}
        next_uniqueid = 7
        active_kse_groupq = {tqh_first = 0x62f100, tqh_last = 0x748020}
        active_kse_count = 2
        free_threadq = {tqh_first = 0x0, tqh_last = 0x801534890}
        free_kse_count = 0
        active_kseq = {tqh_first = 0x62a000, tqh_last = 0x6c9220}
        free_kse_groupq = {tqh_first = 0x0, tqh_last = 0x801534820}
        kse_lock = {l_head = 0x6291c0, l_tail = 0x6291c0,
          l_type = LCK_ADAPTIVE, l_wait = 0x801426150 <_kse_lock_wait>,
          l_wakeup = 0x8014261e0 <_kse_lock_wakeup>}
        active_kseg_count = 2
        inited = 1
        free_thread_count = 0
        free_kseg_count = 0
        thr_hashtable = {{lh_first = 0x0} , {
    lh_first = 0x6c3c00}, {lh_first = 0x0}, {lh_first = 0x0}, {
    lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x1874400}, {
    lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {
    lh_first = 0x74b000}, {lh_first = 0x0} , {
    lh_first = 0x74b800}, {lh_first = 0x0}, {lh_first = 0x0}, {
    lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {
    lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x632000}, {
    lh_first = 0x0} , {lh_first = 0x29ab400}, {
    lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {
    lh_first = 0x0}, {lh_first = 0x2983c00}, {
    lh_first = 0x0} }
        thread_lock = {l_head = 0x6291e0, l_tail = 0x6291e0,
          l_type = LCK_ADAPTIVE, l_wait = 0x801426150 <_kse_lock_wait>,
          l_wakeup = 0x8014261e0 <_kse_lock_wakeup>}
        _tcb_mutex = 0x628380
Previous frame inner to this frame (corrupt stack?)

which looks like total garbage.

Looking at each thread, I see that there are only threads 1, 2, and 3:

(gdb) thread 1
[Switching to thread 1 (Thread 6 (LWP 100177))]#0  0x000000080142e914 in kse_release () at kse_release.S:2
2       RSYSCALL(kse_release)
(gdb) bt
#0  0x000000080142e914 in kse_release () at kse_release.S:2
#1  0x000000080141d926 in sig_daemon (arg=0x7fffffefef70)
    at /usr/src/lib/libpthread/thread/thr_sig.c:216
#2  0x0000000801426db5 in kse_sched_single (kmbx=0x7fffffefef70)
    at /usr/src/lib/libpthread/thread/thr_kern.c:902

(gdb) thread 2
[Switching to thread 2 (Thread 7 (sleeping))]#0  _thr_sched_switch_unlocked (
    curthread=0x632000) at pthread_md.h:226
226             if (ret == 0) {
Current language:  auto; currently c
(gdb) bt
#0  _thr_sched_switch_unlocked (curthread=0x632000) at pthread_md.h:226
#1  0x00000008014265e0 in _thr_sched_switch (curthread=0x632000)
    at /usr/src/lib/libpthread/thread/thr_kern.c:607
#2  0x0000000801419cb4 in _pthread_join (pthread=0x74b000, thread_return=0x0)
    at /usr/src/lib/libpthread/thread/thr_join.c:133
#3  0x0000000800719d09 in ldap_pvt_thread_join (thread=0x800609070,
    thread_return=0x62a068) at thr_posix.c:165

(gdb) thread 3
[Switching to thread 3 (LWP 100148)]#0  0x000000080142e914 in kse_release ()
    at kse_release.S:2
2       RSYSCALL(kse_release)
Current language:  auto; currently asm
(gdb) bt
#0  0x000000080142e914 in kse_release () at kse_release.S:2
#1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0, sigseqno=0)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1843
#2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
    at /usr/src/lib/libpthread/thread/thr_kern.c:1039
#3  0x000000080142afbf in _amd64_enter_uts ()
    at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40

> >
> > (gdb) bt
> > #0  0x000000080142e914 in kse_release () at kse_release.S:2
> > #1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0,
> > sigseqno=0)
> >     at /usr/src/lib/libpthread/thread/thr_kern.c:1843
> > #2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
> >     at /usr/src/lib/libpthread/thread/thr_kern.c:1039
> > #3  0x000000080142afbf in _amd64_enter_uts ()
> >     at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40
> > #4  0x0000000000000000 in ?? ()
> > #5  0x000000000062f000 in ?? ()
> > #6  0x000000000062a000 in ?? ()
> > #7  0x0000000000000000 in ?? ()
> > #8  0x0000000000000000 in ?? ()
> > #9  0x0000000000000000 in ?? ()
> > #10 0x0000000000000000 in ?? ()
> > #11 0x0000000000000000 in ?? ()
> > #12 0x0000000000000001 in ?? ()
> > #13 0x0000000801426dd0 in _thr_sched_switch_unlocked ()
> >     at /usr/src/lib/libpthread/thread/thr_kern.c:904
> > Previous frame inner to this frame (corrupt stack?)
> >
> >
>
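
Both traces show a thread parked in _pthread_join. A join returns only once the target thread has exited, so the joiner waits indefinitely if that exit never happens or is never noticed. The sketch below illustrates that behaviour with Python's threading module rather than libpthread, and it does not reproduce the KSE problem itself. `join_or_report` is a hypothetical helper that bounds the wait so a still-running target can be told apart from a finished one.

```python
import threading


def join_or_report(worker: threading.Thread, timeout: float) -> bool:
    """Wait up to `timeout` seconds; return True only if the worker has exited."""
    worker.join(timeout)
    return not worker.is_alive()


if __name__ == "__main__":
    stop = threading.Event()
    # This worker blocks until `stop` is set, standing in for a thread
    # that the joiner believes should already be gone.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    print("exited before stop:", join_or_report(worker, 0.2))
    stop.set()
    print("exited after stop:", join_or_report(worker, 2.0))
```

An unbounded join, as in the traces, gives no such signal, which is why the process has to be inspected with gdb to see where it is waiting.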