From owner-freebsd-hackers@freebsd.org Tue Feb 26 20:25:10 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CE45B1502A89 for ; Tue, 26 Feb 2019 20:25:09 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic312-22.consmr.mail.ne1.yahoo.com (sonic312-22.consmr.mail.ne1.yahoo.com [66.163.191.203]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F15C571991 for ; Tue, 26 Feb 2019 20:25:07 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: yr9.G9kVM1lHADL0MxVesQWj4fSDbSpAd5arPU2e5V6sIPfACIdXlFI9jUQSFib SVhFLd353Veu6.a6CEhwyARoZ0vbcX.AqUol4n4W5lYGQ5sK2LW8vHZDlIC5fZSU0yDgxnWSW9NR OoEQh6gMmyhVYBf4h8rbU7SNUpID4Qlsufkz35WUh1QKTZ8jKnZyr7gr.CoWOZhd_caF9GJX8uiB y3vJauCch5eyGPl1OKu4tm4rYQHIzj_782YiNhgCNrtLMQtZYIQukxVzB1qXtXgjDZS5riL7FYUE COxtUHroGT359TFX7xSv1t3lcksQ6FPL3Y9x5OC93JAsB22U_O2ITXC445uEbSxNBZ4MFY0otYqU s5nx4Vr0wW5jUvCxU49JI8xhblkx0b.b0fvxH3YyPoswns19.qtNiwgCmxIfGRmMJ_Oc.c0hYv2g 8KTTjCn2TYW7r0.js.ICheknLPe2BSDe.d0P_EEVooeUKuNBp.0uxu9fbhUUEJ9pzPPifZV_yyTB kSmp7ldJlF7l1IeYg.bmLaVs87oei6O8sN3xk1y0gaMYn4x5JIZa_7L_Pe2KFRt8FKBI0S2wJ51y qjiVLkOwc9lzkZsNrHrtKM4.uSoKkFyt1uLwqBVl82aZz6nnYGtMFcuPL2YLM2IfdpNQEeeRRQDO rkljEuUttRopZgld8rpbGqHOmkdMatmeZW1kbzsYxBZK4j022E8i9LqTbt7hxmn8yzuf92MVibW6 _jp0GiOt0Xu9MJmzVf5lsmH9y7s6TLcrdLs9eomMb9jNMy5DWMOsvWPPLPY1WyBv48r6fFJzddSw tDMq7xtmcxBz3Ak3mUx6S8I_mKWm8SjPAG8s06lmVVLA47fakohJvUmE1rGxXtdgJ8aE.KziFMaP F41B0KfTjqAOt9R80T7ExQEy2rwkTh9BxDXbeT_PU6ttTJjUEIiypexE3dWgrHDxeUekxHB3JIXe cInsnm_6Nj3okhUOzcT12_pOCT2uhVED9Wk0ORfHC4QlS0xiJO96ZZQ.FwV.BnBIiAWrLZFV0ejy jY8Yoz7OdqzARlRf6Mw80nRQ1Nlg0uVgQmuJC.0cDhPQ42AE5gio7Rq.Qph4t.kI- Received: from sonic.gate.mail.ne1.yahoo.com by sonic312.consmr.mail.ne1.yahoo.com with HTTP; Tue, 26 Feb 2019 20:25:06 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp424.mail.ne1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 6b1053a590c7ef4d6bf27d69baf020cc; Tue, 26 Feb 2019 20:25:01 +0000 (UTC) From: Mark Millard Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: The FreeBSD code structure that sometimes break powerpc64's pmac_thermal, buf*daemon* threads, etc. (stuck sleeping) Date: Tue, 26 Feb 2019 12:25:00 -0800 References: <1C8071BF-5B3B-416D-8CE4-437D4555EEB5@yahoo.com> To: Justin Hibbits , FreeBSD PowerPC ML , freebsd-hackers Hackers In-Reply-To: <1C8071BF-5B3B-416D-8CE4-437D4555EEB5@yahoo.com> Message-Id: <4DFDC776-BE0E-4952-8414-6B7ACFEC815B@yahoo.com> X-Mailer: Apple Mail (2.3445.102.3) X-Rspamd-Queue-Id: F15C571991 X-Spamd-Bar: ++ X-Spamd-Result: default: False [2.13 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36646, ipnet:66.163.184.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.96)[0.964,0]; NEURAL_HAM_LONG(-0.06)[-0.059,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.38)[ip: (4.67), ipnet: 66.163.184.0/21(1.27), asn: 36646(1.01), country: US(-0.07)]; NEURAL_SPAM_MEDIUM(0.36)[0.357,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[203.191.163.66.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Feb 2019 20:25:10 -0000 [In one aspect of my comments I seem to have misapplied some generic background information: cross CPU sbinuptime() values are not involved for timercb vs. sleepq_timeout in the call chain. It sometimes goes backwards on the same CPU during the call chain.] On 2019-Feb-26, at 04:29, Mark Millard wrote: > [head -r334018 based. I temporarily changed what messaging > happens for KTR_PROC in order to gather information. As stands > it was reporting on pmac_thermal because I can listen for the > fans to know the stuck-sleeping has happened recently.] >=20 > A powerpc64 sleepq_timeout callout usage has the structure: >=20 > 0xe00000009af7c730: at sleepq_timeout+0x148 > 0xe00000009af7c7d0: at softclock_call_cc+0x234 > 0xe00000009af7c910: at callout_process+0x2e0 > 0xe00000009af7c9f0: at handleevents+0x22c > 0xe00000009af7caa0: at timercb+0x340 > 0xe00000009af7cba0: at decr_intr+0x140 > 0xe00000009af7cbd0: at powerpc_interrupt+0x268 >=20 > timercb does: >=20 > now =3D sbinuptime(); > . . . > handleevents(now, 0); >=20 > That in turn leads to: >=20 > if (now >=3D state->nextcallopt || now >=3D state->nextcall) { > state->nextcall =3D state->nextcallopt =3D SBT_MAX; > callout_process(now); > } >=20 > Which leads to: >=20 > if (tmp->c_time <=3D now) { > . . . > if (tmp->c_iflags & CALLOUT_DIRECT) { > . . . > softclock_call_cc(tmp, cc, > #ifdef CALLOUT_PROFILING > &mpcalls_dir, = &lockcalls_dir, NULL, > #endif > 1); >=20 > So softclock_call_cc and sleepq_timeout do not get a copy > of timercb 's now value. (I will refer to this now value > in contexts for which it is not accessible.) >=20 > sleepq_timeout uses: >=20 > td->td_sleeptimo > sbinuptime() >=20 > to indicate to not do something like: >=20 > sleepq_resume_thread(. . .) >=20 > but also does not use sleepq_set_timeout_sbt to > set up another sleepq_timeout callout or do some > other such under this condition. (So a what is > effectively a no-op ends up the last activity > before the thread is stuck asleep.) >=20 > (I will continue to write sbinuptime() to reference > the value used in that test.) >=20 > With multiple processors, it is observed that, > despite sbinuptime() being physically later: >=20 > sbinuptime() < now >=20 > does sometimes happen for the code involved, > for example: >=20 > sbinuptime(): 0x11abf13bd142 > now : 0x11accb5df419 >=20 > This is even though: >=20 > tmp->c_time =3D=3D td->td_sleeptimo >=20 > is observed. sbinuptime() < now sometimes leads > to: >=20 > sbinuptime() < tmp->c_time =3D=3D td->td_sleeptimo <=3D now >=20 > for example the following happened: >=20 > sbinuptime(): 0x11abf13bd142 > tmp->c_time : 0x11ac2096af71 [=3D=3D td->td_sleeptimo] > now : 0x11accb5df419 >=20 > As I understand, keeping the various > powerpc64 CPUs' sbinuptime() values fully > synchronized is unlikely so the problem would > still exist even if "closer but not identical" > across CPUs was achieved. The comments about various CPUs having somewhat mismatched sbinuptime() values does not seem to apply to timercb vs. softclock_call_cc in: 0xe00000009af7c7d0: at softclock_call_cc+0x234 0xe00000009af7c910: at callout_process+0x2e0 0xe00000009af7c9f0: at handleevents+0x22c 0xe00000009af7caa0: at timercb+0x340 But that means that sbinuptime() values are going backwards on the same CPU during that call chain's activity! Ouch. > [The testing context here is an old PowerMac G5 > 4-core (system total), which actually involves > 2 sockets, 2 cores per socket.] >=20 > Overall this structure does not seemed to be > designed to handle variations in sbinuptime() > values across CPUs. >=20 >=20 > The sleepq_timeout source code (before my > limited modifications for reporting more): >=20 > static void > sleepq_timeout(void *arg) > { > struct sleepqueue_chain *sc __unused; > struct sleepqueue *sq; > struct thread *td; > void *wchan; > int wakeup_swapper; >=20 > td =3D arg; > wakeup_swapper =3D 0; > CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)", > (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name); >=20 > thread_lock(td); >=20 > if (td->td_sleeptimo > sbinuptime() || td->td_sleeptimo =3D=3D = 0) { > /* > * The thread does not want a timeout (yet). > */ > } else if (TD_IS_SLEEPING(td) && TD_ON_SLEEPQ(td)) { > /* > * See if the thread is asleep and get the wait > * channel if it is. > */ > wchan =3D td->td_wchan; > sc =3D SC_LOOKUP(wchan); > THREAD_LOCKPTR_ASSERT(td, &sc->sc_lock); > sq =3D sleepq_lookup(wchan); > MPASS(sq !=3D NULL); > td->td_flags |=3D TDF_TIMEOUT; > wakeup_swapper =3D sleepq_resume_thread(sq, td, 0); > } else if (TD_ON_SLEEPQ(td)) { > /* > * If the thread is on the SLEEPQ but isn't sleeping > * yet, it can either be on another CPU in between > * sleepq_add() and one of the sleepq_*wait*() > * routines or it can be in sleepq_catch_signals(). > */ > td->td_flags |=3D TDF_TIMEOUT; > } >=20 > thread_unlock(td); > if (wakeup_swapper) > kick_proc0(); > } >=20 >=20 > I wonder if just not having: >=20 > td->td_sleeptimo > sbinuptime() || >=20 > in the if (. . .) is appropriate for the powerpc64 context in > use: presume callout_process 's tmp->c_time <=3D now is > sufficient? (Vs: race issues?) >=20 >=20 >=20 > Adjusted KTR_PROC output example ("show ktr /v" in ddb > captured, so newest to oldest order, flags and such are > in hex despite my not providing 0x prefixes): >=20 > 922 (0xc00000000d12f000:cpu3) 151713214399 = /usr/src/sys/kern/subr_sleepqueue.c.1027: sleepq_timeout thread not want = timeout yet: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) = td_sleeptimo=3D11ac2096af71 sbinuptime=3D11abf13bd142 > 921 (0xc00000000d12f000:cpu3) 151713214374 = /usr/src/sys/kern/subr_sleepqueue.c.1016: sleepq_timeout: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 920 (0xc00000000d12f000:cpu3) 151713214337 = /usr/src/sys/kern/kern_timeout.c.503: callout_process to call = softclock_clock_cc: thread 0xc00000000a8315a0 c_time=3D11ac2096af71 = now=3D11accb5df419 > 919 (0xc00000000a8315a0:cpu3) 151686051357 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 918 (0xc00000000a8315a0:cpu3) 151686051310 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dfffffed8 pr=3D0 flags=3D100 td_sleeptimo=3D11ac2096af71 = prec=3Dfffffed flags=3D501 > 917 (0xc00000000a8315a0:cpu3) 151686051306 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 916 (0xc00000000a8315a0:cpu3) 151686051267 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on pmac_therm (0xc0000000015082c3) > 915 (0xc00000000a8315a0:cpu3) 151686051162 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 914 (0xc00000000a8315a0:cpu3) 151686051144 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 913 (0xc000000001f6e000:cpu0) 151686050654 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 912 (0xc000000001f6e000:cpu0) 151686050634 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 911 (0xc00000000a8315a0:cpu3) 151684339557 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 910 (0xc00000000a8315a0:cpu3) 151684339525 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abe0139d4d = prec=3Dcccccbe flags=3D501 > 909 (0xc00000000a8315a0:cpu3) 151684339517 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 908 (0xc00000000a8315a0:cpu3) 151684339498 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 907 (0xc00000000a8315a0:cpu3) 151684339326 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 906 (0xc00000000a8315a0:cpu3) 151684339313 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 905 (0xc000000001f6e000:cpu3) 151684338924 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 904 (0xc000000001f6e000:cpu3) 151684338918 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 903 (0xc00000000a8315a0:cpu3) 151682628069 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 902 (0xc00000000a8315a0:cpu3) 151682628004 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abd3054758 = prec=3Dcccccbe flags=3D501 > 901 (0xc00000000a8315a0:cpu3) 151682628000 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 900 (0xc00000000a8315a0:cpu3) 151682627960 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 899 (0xc00000000a8315a0:cpu3) 151682627706 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 898 (0xc00000000a8315a0:cpu3) 151682627683 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 897 (0xc000000001f6e000:cpu0) 151682627254 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 896 (0xc000000001f6e000:cpu0) 151682627242 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > 895 (0xc00000000a8315a0:cpu0) 151680915222 = /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 894 (0xc00000000a8315a0:cpu0) 151680915139 = /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: = sbt=3Dcccccbe0 pr=3D0 flags=3D100 td_sleeptimo=3D11abc5f6f163 = prec=3Dcccccbe flags=3D501 > 893 (0xc00000000a8315a0:cpu0) 151680915133 = /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt = installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, = pmac_thermal) > 892 (0xc00000000a8315a0:cpu0) 151680915085 = /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, = pmac_thermal) on smu (0xe000000087fd1670) > 891 (0xc00000000a8315a0:cpu0) 151680914811 = /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 890 (0xc00000000a8315a0:cpu0) 151680914784 = /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 = (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal) > 889 (0xc000000001f6e000:cpu2) 151680914304 = /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread = 0xc00000000a8315a0 (pid 26, pmac_thermal) > 888 (0xc000000001f6e000:cpu2) 151680914284 = /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling = sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) > . . . =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)