Date:      Thu, 05 Sep 2002 21:41:05 +0100
From:      Duncan Barclay <dmlb@dmlb.org>
To:        FreeBSD-gnats-submit@FreeBSD.org
Cc:        marcel@xclint.net, dmlb@dmlb.org
Subject:   kern/42457: Hack to allow Linux Matlab to exit
Message-ID:  <E17n3R3-0000Ms-00@slave.my.domain>


>Number:         42457
>Category:       kern
>Synopsis:       Hack to allow Linux Matlab to exit
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 05 13:50:01 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Duncan Barclay
>Release:        FreeBSD 4.6-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD slave.my.domain 4.6-PRERELEASE FreeBSD 4.6-PRERELEASE #2: Thu Sep 5 21:11:18 BST 2002 dmlb@slave.my.domain:/usr/src-CVSup/sys/compile/SLAVE i386
>Description:
Linux Matlab versions 6 and 6.1, and possibly 6.5, are known to hang
on exit when the Matlab Java VM is in use. A kill -9 is then required
to terminate the process.

When using its JVM, Matlab creates a number of threads:
 matlab
   matlab thread #1
     matlab thread #1.1
     matlab thread #1.2
     matlab thread #1.3

On exit, threads #1.1, #1.2 and #1.3 die gracefully and are reaped by
thread #1. However, thread #1 itself is not reaped correctly: matlab
apparently issues a
	linux_wait4(-1, &foo, 0, 0)
which reaps processes, not threads.

Thread #1 is created with
	linux_clone(0xf00, *bar())
The low byte of the options mask (0xf00) is zero, which specifies a
thread that does not want its parent to be sent a signal when it dies.
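
As an illustration of how the 0xf00 mask splits into an exit signal and
sharing flags, here is a small sketch. The constant values are those of
Linux <sched.h>; the LX_ prefix and the function names are this sketch's
own and are not part of any kernel or libc interface.

```c
#include <stdio.h>

/* Values as in Linux <sched.h>, redefined locally so this builds on
 * any host. The LX_ names are hypothetical, chosen for this sketch. */
#define LX_CSIGNAL       0x000000ffu  /* mask for the exit signal */
#define LX_CLONE_VM      0x00000100u
#define LX_CLONE_FS      0x00000200u
#define LX_CLONE_FILES   0x00000400u
#define LX_CLONE_SIGHAND 0x00000800u

/* Exit signal requested in a clone() flags word; 0 means "none". */
unsigned clone_exit_signal(unsigned flags)
{
    return flags & LX_CSIGNAL;
}

/* Everything above the low byte: the resource-sharing flags. */
unsigned clone_sharing_flags(unsigned flags)
{
    return flags & ~LX_CSIGNAL;
}

/* Print a one-line breakdown of the four flags decoded here. */
void describe_clone_flags(unsigned flags)
{
    printf("flags=0x%x exit_signal=%u%s%s%s%s\n",
        flags, clone_exit_signal(flags),
        (flags & LX_CLONE_VM)      ? " CLONE_VM"      : "",
        (flags & LX_CLONE_FS)      ? " CLONE_FS"      : "",
        (flags & LX_CLONE_FILES)   ? " CLONE_FILES"   : "",
        (flags & LX_CLONE_SIGHAND) ? " CLONE_SIGHAND" : "");
}
```

For 0xf00 the exit signal decodes to 0 and all four sharing flags are
set, which matches the reading above that thread #1 asks for no signal
to be delivered to its parent.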

From Linux clone(2):
        The low byte of flags contains the number of the signal sent
        to the parent when the child dies. If this signal is specified
        as anything other than SIGCHLD, then the parent process must
        specify the __WALL or __WCLONE options when waiting for the
        child with wait(2). If no signal is specified, then the
        parent process is not signaled when the child terminates.
[note the last sentence]
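
The waiting rule quoted above can be modelled as a small predicate. This
is only a sketch of the Linux semantics as the man page describes them,
not the linuxulator's implementation; the function name and the LX_
constants are assumptions of the sketch (the values are the Linux i386
ones).

```c
#include <stdbool.h>

/* Linux i386 values; the LX_ names are hypothetical. */
#define LX_SIGCHLD 17
#define LX_WALL    0x40000000u  /* wait for all children */
#define LX_WCLONE  0x80000000u  /* wait for "clone" children only */

/*
 * Would a Linux wait call with these options consider a child whose
 * exit signal is exit_signal? A child counts as a "clone" child when
 * its exit signal is anything other than SIGCHLD, including 0.
 */
bool wait_selects_child(int exit_signal, unsigned options)
{
    bool is_clone_child = (exit_signal != LX_SIGCHLD);
    bool wants_clone = (options & LX_WCLONE) != 0;

    if (options & LX_WALL)
        return true;
    return is_clone_child == wants_clone;
}
```

Under this model a child created with exit signal 0 is skipped by
wait4(-1, &foo, 0, 0) and is only selected when __WCLONE or __WALL is
passed.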

FreeBSD always sends a signal to the parent when terminating
a process, from /sys/kern/kern_exit.c:exit1():

        if (p->p_sigparent && p->p_pptr != initproc) {
                psignal(p->p_pptr, p->p_sigparent);
        } else {
                psignal(p->p_pptr, SIGCHLD);
        }

FreeBSD therefore sends matlab a SIGCHLD. Matlab has a SIGCHLD handler
that issues the above wait4. This is shown in the following ktrace
output with matlab pid = 6255, and thread #1 pid = 6304.

  6304 matlab   CALL  linux_kill(0x186f,0x20)
  6255 matlab   PSIG  SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
  6304 matlab   RET   linux_kill 0
  6304 matlab   CALL  exit(0)
  6255 matlab   RET   linux_rt_sigsuspend -1 errno 4 Interrupted system call
  6255 matlab   PSIG  SIGCHLD caught handler=0x28c97460 mask=0x80000000 code=0x0
  6255 matlab   CALL  linux_wait4(0xffffffff,0xbfbfa1b0,0,0)
 
If the above code in kern_exit.c is replaced with

        if (p->p_sigparent && p->p_pptr != initproc) {
                psignal(p->p_pptr, p->p_sigparent);
        } else if (p->p_sigparent != 0) {
                psignal(p->p_pptr, SIGCHLD);
        }

so that no SIGCHLD is sent when p_sigparent is zero, then matlab reaps
the thread. This is shown in the following ktrace output with matlab
pid = 808, and thread #1 pid = 857.

   857 matlab   CALL  linux_kill(0x328,0x20)
   808 matlab   PSIG  SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
   857 matlab   RET   linux_kill 0
   857 matlab   CALL  exit(0)
   808 matlab   RET   linux_rt_sigsuspend -1 errno 4 Interrupted system call
   808 matlab   CALL  linux_sigreturn(0xbfbfa928)
   808 matlab   RET   linux_sigreturn JUSTRETURN
   808 matlab   CALL  linux_wait4(0x359,0,0x80000000,0)
   808 matlab   RET   linux_wait4 857/0x359
   808 matlab   CALL  munmap(0x2d75d000,0x1000)
   808 matlab   RET   munmap 0
   808 matlab   CALL  exit(0)
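
The hexadecimal arguments in the two traces can be decoded with a few
lines of C. This is a minimal sketch; ktrace_arg and as_pid are
hypothetical helper names, not part of ktrace or kdump.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a hex argument as printed in the traces, e.g. "0x186f". */
uint32_t ktrace_arg(const char *text)
{
    return (uint32_t)strtoul(text, NULL, 16);
}

/*
 * Reinterpret a 32-bit argument as a signed pid. The conversion of
 * out-of-range values is implementation-defined in C, but yields
 * two's-complement wraparound (0xffffffff -> -1) on common platforms.
 */
int32_t as_pid(uint32_t arg)
{
    return (int32_t)arg;
}
```

With these, 0x186f decodes to 6255 and 0x328 to 808 (the matlab pids
targeted by linux_kill), 0x20 is signal number 32, 0xffffffff is
pid -1, 0x359 is 857 (thread #1 in the second trace), and 0x80000000
is the value of the __WCLONE option bit.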
  

>How-To-Repeat:
	Run Matlab with its JVM in use and type "exit" at the prompt.
>Fix:

The modified snippet in the description is suggested as a change to
kern_exit.c, but it is probably dangerous as it stands because it
changes exit signalling behaviour.

Maintainers of kern_exit.c and the linuxulator are requested to
implement a more robust solution.
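
To make the behavioural change easier to review, the two versions of
the branch can be modelled in userland as follows. This is a sketch
only: the function names are hypothetical, the parent_is_init flag
stands in for the p->p_pptr == initproc comparison, and the return
value is the signal that would be passed to psignal() (0 meaning no
signal is sent).

```c
#include <signal.h>
#include <stdbool.h>

/* Model of the existing branch in exit1(): a signal is always sent. */
int exit_signal_original(int sigparent, bool parent_is_init)
{
    if (sigparent && !parent_is_init)
        return sigparent;
    return SIGCHLD;
}

/* Model of the suggested change: no signal when sigparent is 0. */
int exit_signal_patched(int sigparent, bool parent_is_init)
{
    if (sigparent && !parent_is_init)
        return sigparent;
    else if (sigparent != 0)
        return SIGCHLD;
    return 0;
}
```

The two models differ only when sigparent is 0. In that case the
patched version sends nothing, and that also holds when the parent is
init, which is one of the cases a more robust fix would need to
consider.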
>Release-Note:
>Audit-Trail:
>Unformatted:
