From: bugzilla-noreply@freebsd.org
To: ports-bugs@FreeBSD.org
Subject: [Bug 255445] lang/python 3.8/3.9 SIGSEV core dumps in libthr TrueNAS
Date: Tue, 27 Apr 2021 18:41:24 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Ports & Packages
X-Bugzilla-Component: Individual Port(s)
X-Bugzilla-Version: Latest
X-Bugzilla-Keywords: crash
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: yocalebo@gmail.com
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: python@FreeBSD.org
X-Bugzilla-Flags: maintainer-feedback?
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys bug_status keywords bug_severity priority component assigned_to reporter flagtypes.name
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Ports bug reports

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255445

    Bug ID: 255445
   Summary: lang/python 3.8/3.9 SIGSEV core dumps in libthr TrueNAS
   Product: Ports & Packages
   Version: Latest
  Hardware: amd64
        OS: Any
    Status: New
  Keywords: crash
  Severity: Affects Many People
  Priority: ---
 Component: Individual Port(s)
  Assignee: python@FreeBSD.org
  Reporter: yocalebo@gmail.com
     Flags: maintainer-feedback?(python@FreeBSD.org)

Many TrueNAS (previously FreeNAS) users are seeing the main middlewared
process (python) dump core, starting with our version 12.0 release.

Relevant OS information:
12.2-RELEASE-p6 FreeBSD 12.2-RELEASE-p6 f2858df162b(HEAD) TRUENAS amd64

Python versions that experience the core dump:
Python 3.8.7
Python 3.9.4

When initially researching this, I found a regression with threading and
Python 3.8 on FreeBSD and was able to resolve that particular problem by
backporting these commits:
https://github.com/python/cpython/commit/4d96b4635aeff1b8ad41d41422ce808ce0b971c8
and
https://github.com/python/cpython/commit/9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6

I backported those commits because all of the core dumps that I've analyzed
crash in the same spot (or very close to it). For example, here are 2
backtraces showing a NULL pointer dereference.

Core was generated by `python3.8: middlewared'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  cond_signal_common (cond=) at
/truenas-releng/freenas/_BE/os/lib/libthr/thread/thr_cond.c:457
warning: Source file is more recent than executable.
457             mp = td->mutex_obj;
[Current thread is 1 (LWP 100733)]
(gdb) list
452                     _sleepq_unlock(cvp);
453                     return (0);
454             }
455
456             td = _sleepq_first(sq);
457             mp = td->mutex_obj;
458             cvp->__has_user_waiters = _sleepq_remove(sq, td);
459             if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
460                     if (curthread->nwaiter_defer >= MAX_DEFER_WAITERS) {
461                             _thr_wake_all(curthread->defer_waiters,
(gdb) p *td
Cannot access memory at address 0x0

and another one:

Core was generated by `python3.8: middlewared'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  cond_signal_common (cond=) at
/truenas-releng/freenas/_BE/os/lib/libthr/thread/thr_cond.c:459
warning: Source file is more recent than executable.
459             if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
[Current thread is 1 (LWP 101105)]
(gdb) list
454             }
455
456             td = _sleepq_first(sq);
457             mp = td->mutex_obj;
458             cvp->__has_user_waiters = _sleepq_remove(sq, td);
459             if (PMUTEX_OWNER_ID(mp) == TID(curthread)) {
460                     if (curthread->nwaiter_defer >= MAX_DEFER_WAITERS) {
461                             _thr_wake_all(curthread->defer_waiters,
462                                 curthread->nwaiter_defer);
463                             curthread->nwaiter_defer = 0;
(gdb) p *mp
Cannot access memory at address 0x0

I'm trying to write a program that stress-tests threading (repeatedly
tearing down and recreating threads, and so on), but so far I have not been
able to trigger this particular problem. End users who have seen this core
dump sometimes go more than a month without a problem. I'm hoping someone
more knowledgeable can at least give me a pointer or help me figure this one
out. I have a VM with all the relevant core dumps available, so if someone
needs remote access to it to poke around, please let me know.
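For illustration only, the kind of thread churn described above could be
sketched roughly as follows. This is a hypothetical example, not the actual
test program and not a known reproducer: the names churn_once and stress and
the round and waiter counts are assumptions, it uses only Python's standard
threading module, and it is not known to reach the failing path in
cond_signal_common.

```python
import threading


def churn_once(n_waiters: int) -> int:
    """Create n_waiters threads that block on a condition, wake them, join them.

    Returns how many waiters observed the wake-up (hypothetical helper).
    """
    cond = threading.Condition()
    ready = threading.Semaphore(0)
    state = {"go": False, "woken": 0}

    def waiter() -> None:
        with cond:
            # Tell the main thread this waiter holds the lock and is about to wait.
            ready.release()
            cond.wait_for(lambda: state["go"])
            state["woken"] += 1

    threads = [threading.Thread(target=waiter) for _ in range(n_waiters)]
    for t in threads:
        t.start()

    # Wait until every waiter has taken the condition's lock at least once.
    for _ in threads:
        ready.acquire()

    # The lock is only free once the waiters are blocked in wait_for().
    with cond:
        state["go"] = True
        cond.notify_all()

    # Tear the threads down before the next round recreates them.
    for t in threads:
        t.join()
    return state["woken"]


def stress(rounds: int = 200, n_waiters: int = 8) -> int:
    """Repeat the create/wake/join cycle and return the total wake-ups seen."""
    total = 0
    for _ in range(rounds):
        total += churn_once(n_waiters)
    return total


if __name__ == "__main__":
    print(stress())
```

Each round builds a fresh condition variable and a fresh set of threads, so
the create/wake/teardown cycle is exercised repeatedly; raising rounds or
n_waiters lengthens the run.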
You can reach me at caleb [at] ixsystems.com

--
You are receiving this mail because:
You are the assignee for the bug.