Date:      Wed, 10 Feb 2010 22:55:27 +0530
From:      Naveen Gujje <gujjenaveen@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   Re: System() returning ECHILD error on FreeBSD 7.2
Message-ID:  <39c945731002100925i2e466768peac89cdef15463f2@mail.gmail.com>

Naveen Gujje <gujjenaveen at gmail.com> wrote:
 >> signal(SIGCHLD, SigChildHandler);
 >>
 >> void
 >> SigChildHandler(int sig)
 >> {
 >>   pid_t pid;
 >>
 >>   /* get status of all dead procs */
 >>   do {
 >>     int procstat;
 >>     pid = waitpid(-1, &procstat, WNOHANG);
 >>     if (pid < 0) {
 >>       if (errno == EINTR)
 >>         continue;               /* ignore it */
 >>       else {
 >>         if (errno != ECHILD)
 >>           perror("getting waitpid");
 >>         pid = 0;                /* break out */
 >>       }
 >>     }
 >>     else if (pid != 0)
 >>       syslog(LOG_INFO, "child process %d completed", (int) pid);
 >>   } while (pid);
 >>
 >>   signal(SIGCHLD, SigChildHandler);
 >> }

>There are several problems with your signal handler.

>First, the perror() and syslog() functions are not re-entrant,
>so they should not be used inside signal handlers.  This can
>lead to undefined behaviour.  Please refer to the sigaction(2)
>manual page for a list of functions that are considered safe
>to be used inside signal handlers.

>Second, you are using functions that may change the value of
>the global errno variable.  Therefore you must save its value
>at the beginning of the signal handler, and restore it at the
>end.

>Third (not a problem in this particular case, AFAICT, but
>still good to know):  Unlike SysV systems, BSD systems do
>_not_ automatically reset the signal action when the handler
>is called.  Therefore you do not have to call signal() again
>in the handler (but it shouldn't hurt either).  Because of
>the semantic difference of the signal() function on different
>systems, it is preferable to use sigaction(2) instead in
>portable code.

Okay, I followed your suggestion and changed my SigChildHandler to

void
SigChildHandler(int sig)
{
  pid_t pid;
  int status;
  int saved_errno = errno;

  while (((pid = waitpid( (pid_t) -1, &status, WNOHANG)) > 0) ||
         ((-1 == pid) && (EINTR == errno)))
    ;

  errno = saved_errno;
}

and used sigaction(2) to register this handler. Still, system(3) returns
-1 with errno set to ECHILD.
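
For reference, the registration might look roughly like the sketch below. The names install_sigchld_handler, child_gets_reaped and the reaped_children counter are hypothetical and exist only so the behaviour can be observed. SA_RESTART is an assumed flag choice, since the flags actually used are not stated here. The sketch shows the registration only and does not by itself explain the ECHILD from system(3).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical counter: lets a caller see how many children were reaped. */
static volatile sig_atomic_t reaped_children = 0;

static void
sigchld_handler(int sig)
{
  int saved_errno = errno;      /* waitpid() may overwrite errno */
  int status;
  pid_t pid;

  (void) sig;
  for (;;) {
    pid = waitpid((pid_t) -1, &status, WNOHANG);
    if (pid > 0) {
      reaped_children++;        /* one more dead child collected */
      continue;
    }
    if (pid == -1 && errno == EINTR)
      continue;                 /* interrupted, try again */
    break;                      /* 0: none ready; -1: e.g. ECHILD */
  }
  errno = saved_errno;
}

/* Register the handler with sigaction(2); returns 0 on success, -1 on error. */
int
install_sigchld_handler(void)
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = sigchld_handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;     /* assumption: restart interrupted calls */
  return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Fork a child that exits at once. Returns 1 if the handler reaped it
 * within roughly two seconds, 0 if not, -1 if fork() failed.
 */
int
child_gets_reaped(void)
{
  struct timespec pause_10ms = { 0, 10 * 1000 * 1000 };
  int before = reaped_children;
  pid_t pid = fork();
  int i;

  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit(0);
  for (i = 0; i < 200 && reaped_children == before; i++)
    nanosleep(&pause_10ms, NULL);
  return reaped_children != before;
}
```

Calling install_sigchld_handler() once at startup and then child_gets_reaped() is enough to confirm that the handler runs and collects the child without calling any non-reentrant function.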

 >> And, in some other part of the code, we call system() to add an ethernet
 >> interface. This system() call is returning -1 with errno set to ECHILD,
 >> though the passed command is executed successfully.  I have noticed that,
 >> the problem is observed only after we register SigChildHandler. If I have a
 >> simple statement like system("ls") before and after the call to
 >> signal(SIGCHLD, SigChildHandler), the call before setting signal handler
 >> succeeds without errors and the call after setting signal handler returns -1
 >> with errno set to ECHILD.
 >>
 >> Here, I believe that within the system() call, the child exited before the
 >> parent got a chance to call _wait4 and thus resulted in ECHILD error.

>I don't think that can happen.

 >> But, for the child to exit without notifying the parent, SIGCHLD has to be
 >> set to SIG_IGN in the parent and this is not the case, because we are
 >> already setting it to SigChildHandler. If I set SIGCHLD to SIG_DFL before
 >> calling system() then I don't see this problem.
 >>
 >> I would like to know how setting SIGCHLD to SIG_DFL or SigChildHandler is
 >> making the difference.

>The system() function temporarily blocks SIGCHLD (i.e. it
>adds the signal to the process' signal mask).  However,
>blocking is different from ignoring:  The signal is held
>as long as it is blocked, and as soon as it is removed
>from the mask, it is delivered, i.e. your signal handler
>is called right before the system() function returns.
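
The blocking behaviour described above can be observed with a small sketch. Everything named here (sigchld_block_demo, count_sigchld, sigchld_calls) is hypothetical and only for illustration. The function blocks SIGCHLD, lets a child exit, confirms with sigpending() that the signal is held, and then unblocks it. Whether a pending SIGCHLD survives a wait call that reaps the child while the signal is still blocked may differ between systems, so the sketch reaps the child only after unblocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical counter: how often the SIGCHLD handler has run. */
static volatile sig_atomic_t sigchld_calls = 0;

static void
count_sigchld(int sig)
{
  (void) sig;
  sigchld_calls++;
}

/*
 * Reports the handler call count while SIGCHLD is blocked and right after
 * it is unblocked. Returns 0 if the signal was seen pending, -1 otherwise.
 */
int
sigchld_block_demo(int *calls_while_blocked, int *calls_after_unblock)
{
  struct sigaction sa, old_sa;
  sigset_t block, old_mask, pending;
  struct timespec pause_10ms = { 0, 10 * 1000 * 1000 };
  pid_t pid;
  int i, is_pending = 0;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = count_sigchld;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
    return -1;

  /* Block (not ignore) SIGCHLD: it is held, not discarded. */
  sigemptyset(&block);
  sigaddset(&block, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
    return -1;
  sigchld_calls = 0;

  pid = fork();
  if (pid == 0)
    _exit(0);                   /* child: exit immediately */

  /* Parent: wait (up to about 2 s) until SIGCHLD shows up as pending. */
  for (i = 0; pid > 0 && i < 200 && !is_pending; i++) {
    sigemptyset(&pending);
    sigpending(&pending);
    is_pending = sigismember(&pending, SIGCHLD) == 1;
    if (!is_pending)
      nanosleep(&pause_10ms, NULL);
  }
  *calls_while_blocked = sigchld_calls;

  /* Unblocking delivers the held signal before sigprocmask() returns. */
  sigprocmask(SIG_UNBLOCK, &block, NULL);
  *calls_after_unblock = sigchld_calls;

  if (pid > 0)
    waitpid(pid, NULL, 0);      /* collect the child ourselves */
  sigprocmask(SIG_SETMASK, &old_mask, NULL);
  sigaction(SIGCHLD, &old_sa, NULL);
  return (pid > 0 && is_pending) ? 0 : -1;
}
```

With a single child, the handler is expected not to run while the signal is blocked and to run once it is unblocked.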

Yes, I agree with you. Here, I believe, the point of blocking SIGCHLD
is to give the wait4() inside system() priority over any other waitXXX()
call in the parent process. But I still can't see why wait4() returns -1.

>And since you don't save the errno value, your signal
>handler overwrites the value returned from the system()
>function.  So you get ECHILD.

I had a debug print just after wait4() in system() and before we unblock
SIGCHLD. And it's clear that wait4() is returning -1 with errno as ECHILD.
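
To make the location of that debug print concrete, here is a simplified sketch of the sequence discussed in this thread: block SIGCHLD, fork, wait for that specific child, then unblock. sketch_system() is a hypothetical stand-in and not the FreeBSD libc source; the SIGINT/SIGQUIT handling that a real system(3) performs is omitted, and the /bin/sh path is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical, simplified stand-in for system(3).
 * Returns the child's wait status, or -1 with errno set.
 */
int
sketch_system(const char *command)
{
  sigset_t block, old_mask;
  int status = -1;
  int saved_errno;
  pid_t pid, waited;

  /* Hold SIGCHLD so no handler can reap the child before our own wait. */
  sigemptyset(&block);
  sigaddset(&block, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
    return -1;

  pid = fork();
  if (pid == 0) {
    /* Child: restore the mask and run the command through the shell. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    execl("/bin/sh", "sh", "-c", command, (char *) NULL);
    _exit(127);                 /* exec failed */
  }

  if (pid == -1) {
    saved_errno = errno;        /* fork failed */
  } else {
    /* Wait for exactly this child, retrying if interrupted. */
    do {
      waited = waitpid(pid, &status, 0);
    } while (waited == -1 && errno == EINTR);
    saved_errno = errno;

    /* The debug print sits here: after the wait, before unblocking. */
    if (waited == -1) {
      status = -1;
      fprintf(stderr, "wait failed, errno=%d\n", saved_errno);
    }
  }

  /* Unblocking may run a SIGCHLD handler; restore errno afterwards. */
  sigprocmask(SIG_SETMASK, &old_mask, NULL);
  errno = saved_errno;
  return status;
}
```

If the wait in this position fails with ECHILD, the child was no longer available to wait for when the call was made, so a handler that only runs after the unblock cannot be what consumed it.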

>Best regards
>   Oliver
