Date: Thu, 05 Sep 2002 21:41:05 +0100
From: Duncan Barclay <dmlb@dmlb.org>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: marcel@xclint.net, dmlb@dmlb.org
Subject: kern/42457: Hack to allow Linux Matlab to exit
Message-ID: <E17n3R3-0000Ms-00@slave.my.domain>

>Number:         42457
>Category:       kern
>Synopsis:       Hack to allow Linux Matlab to exit
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 05 13:50:01 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Duncan Barclay
>Release:        FreeBSD 4.6-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD slave.my.domain 4.6-PRERELEASE FreeBSD 4.6-PRERELEASE #2: Thu Sep 5 21:11:18 BST 2002 dmlb@slave.my.domain:/usr/src-CVSup/sys/compile/SLAVE i386

>Description:
Linux Matlab versions 6 and 6.1, and possibly 6.5, are known to hang on
exit when the Matlab Java VM is used. A kill -9 is required.

When using its JVM, Matlab creates a number of threads:

	matlab
	  matlab thread #1
	    matlab thread #1.1
	    matlab thread #1.2
	    matlab thread #1.3

On exit, threads #1.1, #1.2 and #1.3 die gracefully and are reaped by
thread #1. However, thread #1 itself is not reaped correctly: matlab
apparently issues

	linux_wait4(-1, &foo, 0, 0)

Called with these options, wait4 reaps processes but not threads.

Thread #1 is created with

	linux_clone(0xf00, *bar())

The low byte of the flags is zero, which specifies a thread that does
not want a signal sent to its parent when it dies. From Linux clone(2):

	The low byte of flags contains the number of the signal sent to
	the parent when the child dies. If this signal is specified as
	anything other than SIGCHLD, then the parent process must
	specify the __WALL or __WCLONE options when waiting for the
	child with wait(2). If no signal is specified, then the parent
	process is not signaled when the child terminates.

[note the last sentence]

FreeBSD always sends a signal to the parent when terminating a process.
From /sys/kern_exit.c:exit1():

	if (p->p_sigparent && p->p_pptr != initproc) {
		psignal(p->p_pptr, p->p_sigparent);
	} else {
		psignal(p->p_pptr, SIGCHLD);
	}

FreeBSD therefore sends matlab a SIGCHLD, even though the thread asked
for no signal. Matlab has a SIGCHLD handler that issues the above wait4.
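
As an aside, the flags value 0xf00 can be decoded with a small
user-space sketch. The constants below are copied from the Linux headers
(CSIGNAL, CLONE_VM, CLONE_FS, CLONE_FILES, CLONE_SIGHAND); the LX_ prefix
and the helper names are made up for this illustration and are not part
of the linuxulator.

```c
#include <assert.h>

/* Linux clone(2) flag values, copied here so the sketch builds anywhere. */
#define LX_CSIGNAL       0x000000ffUL /* low byte: exit signal number */
#define LX_CLONE_VM      0x00000100UL /* share address space */
#define LX_CLONE_FS      0x00000200UL /* share cwd, root, umask */
#define LX_CLONE_FILES   0x00000400UL /* share file descriptor table */
#define LX_CLONE_SIGHAND 0x00000800UL /* share signal handlers */

/* Hypothetical helper: exit signal requested by the clone() caller.
 * A result of 0 means "do not signal the parent on exit". */
static int
clone_exit_signal(unsigned long flags)
{
	int sig = (int)(flags & LX_CSIGNAL);

	assert(sig >= 0 && sig <= 0xff);
	return (sig);
}

/* Hypothetical helper: 1 if the child shares VM, fs info, files and
 * signal handlers with its parent, i.e. it looks like a thread. */
static int
clone_shares_everything(unsigned long flags)
{
	const unsigned long all = LX_CLONE_VM | LX_CLONE_FS |
	    LX_CLONE_FILES | LX_CLONE_SIGHAND;

	return ((flags & all) == all);
}
```

For 0xf00, clone_exit_signal() returns 0 and clone_shares_everything()
returns 1: a thread-like child with no exit signal requested.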

The following ktrace output shows the SIGCHLD and the resulting wait4,
with matlab as pid 6255 and thread #1 as pid 6304:

	  6304 matlab   CALL  linux_kill(0x186f,0x20)
	  6255 matlab   PSIG  SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
	  6304 matlab   RET   linux_kill 0
	  6304 matlab   CALL  exit(0)
	  6255 matlab   RET   linux_rt_sigsuspend -1 errno 4 Interrupted system call
	  6255 matlab   PSIG  SIGCHLD caught handler=0x28c97460 mask=0x80000000 code=0x0
	  6255 matlab   CALL  linux_wait4(0xffffffff,0xbfbfa1b0,0,0)

If the above code in kern_exit.c is replaced with

	if (p->p_sigparent && p->p_pptr != initproc) {
		psignal(p->p_pptr, p->p_sigparent);
	} else if (p->p_sigparent != 0) {
		psignal(p->p_pptr, SIGCHLD);
	}

so that no SIGCHLD is sent when p_sigparent is zero, then matlab reaps
the thread. ktrace output with matlab as pid 808 and thread #1 as pid
857:

	   857 matlab   CALL  linux_kill(0x328,0x20)
	   808 matlab   PSIG  SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
	   857 matlab   RET   linux_kill 0
	   857 matlab   CALL  exit(0)
	   808 matlab   RET   linux_rt_sigsuspend -1 errno 4 Interrupted system call
	   808 matlab   CALL  linux_sigreturn(0xbfbfa928)
	   808 matlab   RET   linux_sigreturn JUSTRETURN
	   808 matlab   CALL  linux_wait4(0x359,0,0x80000000,0)
	   808 matlab   RET   linux_wait4 857/0x359
	   808 matlab   CALL  munmap(0x2d75d000,0x1000)
	   808 matlab   RET   munmap 0
	   808 matlab   CALL  exit(0)

>How-To-Repeat:
Run matlab and type "exit" at the prompt.

>Fix:
The snippet of code above is suggested as a change to kern_exit.c, but
it is probably dangerous as it stands because it changes exit signalling
behaviour. Maintainers of kern_exit.c and the linuxulator are requested
to implement a more robust solution.

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
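
To make the behavioural difference between the stock and the suggested
branch in exit1() easier to see, here is a user-space model of which
signal the parent would get. It is only a sketch: the function names,
the parent_is_init argument (standing in for p->p_pptr == initproc) and
MODEL_SIGCHLD are invented for the illustration, and nothing here is
kernel code.

```c
#include <assert.h>

#define MODEL_SIGCHLD 20 /* SIGCHLD on FreeBSD */
#define MODEL_NOSIG    0 /* model value: no signal is posted */

/* Stock logic: the parent always gets a signal, SIGCHLD by default. */
static int
exit_signal_stock(int p_sigparent, int parent_is_init)
{
	assert(p_sigparent >= 0);
	if (p_sigparent && !parent_is_init)
		return (p_sigparent);
	return (MODEL_SIGCHLD);
}

/* Suggested logic: a zero p_sigparent suppresses the signal entirely. */
static int
exit_signal_patched(int p_sigparent, int parent_is_init)
{
	assert(p_sigparent >= 0);
	if (p_sigparent && !parent_is_init)
		return (p_sigparent);
	else if (p_sigparent != 0)
		return (MODEL_SIGCHLD);
	return (MODEL_NOSIG);
}
```

With p_sigparent == 0 the stock version yields SIGCHLD and the patched
version yields no signal; for any non-zero p_sigparent the two agree.
Note that the patched version also stays silent when such a process has
been reparented to init, which is one reason the change deserves review.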