Date: Wed, 10 Feb 2010 22:55:27 +0530
From: Naveen Gujje <gujjenaveen@gmail.com>
To: freebsd-hackers@freebsd.org
Subject: Re: System() returning ECHILD error on FreeBSD 7.2
Message-ID: <39c945731002100925i2e466768peac89cdef15463f2@mail.gmail.com>
Naveen Gujje <gujjenaveen at gmail.com> wrote:

>> signal(SIGCHLD, SigChildHandler);
>>
>> void
>> SigChildHandler(int sig)
>> {
>>   pid_t pid;
>>
>>   /* get status of all dead procs */
>>   do {
>>     int procstat;
>>     pid = waitpid(-1, &procstat, WNOHANG);
>>     if (pid < 0) {
>>       if (errno == EINTR)
>>         continue; /* ignore it */
>>       else {
>>         if (errno != ECHILD)
>>           perror("getting waitpid");
>>         pid = 0; /* break out */
>>       }
>>     }
>>     else if (pid != 0)
>>       syslog(LOG_INFO, "child process %d completed", (int) pid);
>>   } while (pid);
>>
>>   signal(SIGCHLD, SigChildHandler);
>> }

>There are several problems with your signal handler.
>
>First, the perror() and syslog() functions are not re-entrant,
>so they should not be used inside signal handlers. This can
>lead to undefined behaviour. Please refer to the sigaction(2)
>manual page for a list of functions that are considered safe
>to be used inside signal handlers.
>
>Second, you are using functions that may change the value of
>the global errno variable. Therefore you must save its value
>at the beginning of the signal handler, and restore it at the
>end.
>
>Third (not a problem in this particular case, AFAICT, but
>still good to know): Unlike SysV systems, BSD systems do
>_not_ automatically reset the signal action when the handler
>is called. Therefore you do not have to call signal() again
>in the handler (but it shouldn't hurt either). Because of
>the semantic difference of the signal() function on different
>systems, it is preferable to use sigaction(2) instead in
>portable code.

Okay, I followed your suggestion and changed my SigChildHandler to

void
SigChildHandler(int sig)
{
  pid_t pid;
  int status;
  int saved_errno = errno;

  while (((pid = waitpid((pid_t) -1, &status, WNOHANG)) > 0) ||
         ((-1 == pid) && (EINTR == errno)))
    ;

  errno = saved_errno;
}

and used sigaction(2) to register this handler. Still, system(3) returns -1
with errno set to ECHILD.
>> And, in some other part of the code, we call system() to add an ethernet
>> interface. This system() call is returning -1 with errno set to ECHILD,
>> though the passed command is executed successfully. I have noticed that
>> the problem is observed only after we register SigChildHandler. If I have a
>> simple statement like system("ls") before and after the call to
>> signal(SIGCHLD, SigChildHandler), the call before setting signal handler
>> succeeds without errors and the call after setting signal handler returns -1
>> with errno set to ECHILD.
>>
>> Here, I believe that within the system() call, the child exited before the
>> parent got a chance to call _wait4 and thus resulted in ECHILD error.

>I don't think that can happen.

>> But, for the child to exit without notifying the parent, SIGCHLD has to be
>> set to SIG_IGN in the parent and this is not the case, because we are already
>> setting it to SigChildHandler. If I set SIGCHLD to SIG_DFL before calling
>> system() then I don't see this problem.
>>
>> I would like to know how setting SIGCHLD to SIG_DFL or SigChildHandler is
>> making the difference.

>The system() function temporarily blocks SIGCHLD (i.e. it
>adds the signal to the process' signal mask). However,
>blocking is different from ignoring: The signal is held
>as long as it is blocked, and as soon as it is removed
>from the mask, it is delivered, i.e. your signal handler
>is called right before the system() function returns.

Yes, I agree with you. Here, I believe, the point of blocking SIGCHLD is to
give preference to the wait4() inside system() over any other waitXXX() call
in the parent process. But I still can't see why wait4() returns -1.

>And since you don't save the errno value, your signal
>handler overwrites the value returned from the system()
>function. So you get ECHILD.

I had a debug print just after wait4() in system() and before we unblock
SIGCHLD. And it's clear that wait4() itself is returning -1 with errno set to
ECHILD.
>Best regards
>   Oliver