Date: Thu, 27 Dec 2018 22:45:18 +0000
From: bugzilla-noreply@freebsd.org
To: testing@freebsd.org
Subject: [Bug 233646] Flakey test case: bin.sh.builtins.functional_test.kill1
Message-ID: <bug-233646-32464-oabj91MzZ1@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-233646-32464@https.bugs.freebsd.org/bugzilla/>
References: <bug-233646-32464@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233646

Jilles Tjoelker <jilles@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open

--- Comment #3 from Jilles Tjoelker <jilles@FreeBSD.org> ---
In the text below, wait(2) means any wait system call; sh(1) uses wait3(),
which appears as wait4() in ktrace.

The test case is meant to test that a job that has terminated and been
wait(2)ed for, but not wait(1)ed for, can be passed to kill(1) without error
(the command will do nothing). The part with the second background job, p2
and wait is intended to wait for the first background job to terminate and be
wait(2)ed for, without taking excessive time or wait(1)ing for it (which would
make the %1 specification invalid). If the first background job is slow to
terminate, the kill command will do something, but this is harmless. If the
first background job terminates but the kernel has not yet returned it via
wait(2), the kill command will kill a zombie, which per POSIX succeeds and
does nothing.

I noticed that the problem is quickly reproduced on head using a loop like

    while sh builtins/kill1.0; do :; done

using head's sh as well as stable/11's sh, while it can run for quite a while
on stable/11 using stable/11's sh as well as head's sh built against
stable/11.

Reproducing with ktrace -i seems hard, but reproducing with plain ktrace
works.

The ktrace extract below seems to indicate that the kernel is at fault,
returning an [ESRCH] error for killing a zombie:

 19837 sh       CALL  fork
 19837 sh       RET   fork 19838/0x4d7e
 19837 sh       CALL  wait4(0xffffffff,0x7fffffffe91c,0x1<WNOHANG>,0)
 19837 sh       RET   wait4 0
 19837 sh       CALL  fork
 19837 sh       RET   fork 19839/0x4d7f
 19837 sh       CALL  sigprocmask(SIG_BLOCK,0x7fffffffe820,0x7fffffffe810)
 19837 sh       RET   sigprocmask 0
 19837 sh       CALL  sigaction(SIGCHLD,0x7fffffffe850,0x7fffffffe830)
 19837 sh       RET   sigaction 0
 19837 sh       CALL  wait4(0xffffffff,0x7fffffffe80c,0x1<WNOHANG>,0)
 19837 sh       RET   wait4 19839/0x4d7f
 19837 sh       CALL  sigaction(SIGCHLD,0x7fffffffe830,0)
 19837 sh       RET   sigaction 0
 19837 sh       CALL  sigprocmask(SIG_SETMASK,0x7fffffffe810,0)
 19837 sh       RET   sigprocmask 0
 19837 sh       CALL  kill(0x4d7e,SIGTERM)
 19837 sh       RET   kill -1 errno 3 No such process

Process ID 19838 (0x4d7e) has not been returned by a wait4() call, so it must
either be still running or a zombie. In either case, a kill() on it must
succeed.

It appears that there is no test that specifically verifies that killing a
zombie process succeeds.

-- 
You are receiving this mail because:
You are the assignee for the bug.
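
The missing check described in the comment above (that kill() on a zombie succeeds) could be sketched roughly as follows. This is only an illustration in Python, not a test from the FreeBSD tree; the function name kill_zombie_then_reaped is hypothetical. It assumes a platform where os.waitid() and WNOWAIT are available, and it uses WNOWAIT so the parent can tell the child has exited without reaping it, which guarantees the child is a zombie when kill() is called.

```python
import errno
import os
import signal


def kill_zombie_then_reaped():
    """Hypothetical helper: return (zombie_kill_ok, errno_after_reap)."""
    pid = os.fork()
    if pid == 0:
        # Child: exit at once so that it turns into a zombie.
        os._exit(0)

    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    # After this call the child is guaranteed to be a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # Per POSIX, signalling a zombie succeeds and has no effect.
    try:
        os.kill(pid, signal.SIGTERM)
        zombie_kill_ok = True
    except ProcessLookupError:
        # This is the [ESRCH] failure seen in the ktrace extract.
        zombie_kill_ok = False

    # Reap the child. Only now is ESRCH the expected result.
    os.waitpid(pid, 0)
    try:
        # Signal 0 only checks for existence, so nothing is delivered
        # even in the unlikely case that the PID has been reused.
        os.kill(pid, 0)
        errno_after_reap = None
    except OSError as exc:
        errno_after_reap = exc.errno

    return zombie_kill_ok, errno_after_reap


if __name__ == "__main__":
    ok, err = kill_zombie_then_reaped()
    print("kill on zombie succeeded:", ok)
    print("errno after reaping is ESRCH:", err == errno.ESRCH)
```

On a kernel that behaves as POSIX requires, the first value should be true; a kernel showing the behaviour from the trace would report false at least some of the time, so running the helper in a loop may be needed to catch an intermittent failure.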