From owner-freebsd-bugs@FreeBSD.ORG Tue Apr 20 14:10:05 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 570BA106566C
	for ; Tue, 20 Apr 2010 14:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 2A8AC8FC16
	for ; Tue, 20 Apr 2010 14:10:04 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3KEA3uI020739
	for ; Tue, 20 Apr 2010 14:10:03 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3KEA3cx020738;
	Tue, 20 Apr 2010 14:10:03 GMT
	(envelope-from gnats)
Resent-Date: Tue, 20 Apr 2010 14:10:03 GMT
Resent-Message-Id: <201004201410.o3KEA3cx020738@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Lucius Windschuh
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0ACCA106564A
	for ; Tue, 20 Apr 2010 14:01:50 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id D514A8FC19
	for ; Tue, 20 Apr 2010 14:01:49 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o3KE1nN1057133
	for ; Tue, 20 Apr 2010 14:01:49 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o3KE1nBO057132;
	Tue, 20 Apr 2010 14:01:49 GMT
	(envelope-from nobody)
Message-Id: <201004201401.o3KE1nBO057132@www.freebsd.org>
Date: Tue, 20 Apr 2010 14:01:49 GMT
From: Lucius Windschuh
To:
	freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: bin/145884: script: Racy return value
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 20 Apr 2010 14:10:05 -0000

>Number:         145884
>Category:       bin
>Synopsis:       script: Racy return value
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Apr 20 14:10:03 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Lucius Windschuh
>Release:        9-CURRENT (i386)
>Organization:
>Environment:
FreeBSD t400 9.0-CURRENT FreeBSD 9.0-CURRENT #139 r206412MP: Fri Apr 9 11:14:32 CEST 2010 root@t400:/usr/obj/usr/src/sys/CURRENT i386
>Description:
script -qa $some_file $some_cmd is supposed to return the same value that
the execution of $some_cmd gave (TODO: this is not mentioned in the man
page). Unfortunately, script's return value varies randomly between zero
and the expected value.

I am not sure whether this is a kernel or a userland bug. As the proposed
patch shows, waiting for the child process to die instead of merely
checking for an already-dead child solves the issue. So either the child
is not yet dead when script tries to exit, or the kernel has not yet
marked the child as dead?

Another data point: "ktrace script -qa /tmp/foobar false" always returns
the right result, so running under ktrace seems to hide the race.

Besides this, I have been seeing this problem for quite a while (some
months, I think), which indicates that it was not introduced by a recent
commit.

This bug made portupgrade hardly usable, as it did not reliably notice
that the build process had failed.
>How-To-Repeat:
Execute this command many times:

  $ script -qa /tmp/foobar false && echo "This should not happen"

Sometimes you will see "This should not happen" which, well, should not happen.
:-)
>Fix:
See the attached file: remove WNOHANG.

I think this does the right thing: wait3() returns immediately if no child
process exists, so finish() returns once the last child has exited, which
is exactly the point at which script should return.

Patch attached with submission follows:

Index: src/usr.bin/script/script.c
===================================================================
--- src/usr.bin/script/script.c	(revision 206560)
+++ src/usr.bin/script/script.c	(working copy)
@@ -223,7 +223,7 @@
 	int die, e, status;
 
 	die = e = 0;
-	while ((pid = wait3(&status, WNOHANG, 0)) > 0)
+	while ((pid = wait3(&status, 0, 0)) > 0)
 		if (pid == child) {
 			die = 1;
 			if (WIFEXITED(status))

>Release-Note:
>Audit-Trail:
>Unformatted:
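
The difference between the two wait calls in the patch can be shown in isolation. What follows is only a minimal sketch, not code from script.c: reap_child() and demo_reap() are hypothetical names, and waitpid(-1, &status, options) is used as a portable stand-in for wait3(&status, options, NULL). The child is held alive on a pipe so that the non-blocking call is guaranteed to run while the child is still running, which is the situation the report suspects happens intermittently in finish().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper modelled on the loop in finish(): reap children using
 * the given waitpid() options and return the exit status of `child`, or -1
 * if `child` was not reaped as a normally exited process.
 */
int
reap_child(pid_t child, int options)
{
	pid_t pid;
	int status, e = -1;

	while ((pid = waitpid(-1, &status, options)) > 0)
		if (pid == child && WIFEXITED(status))
			e = WEXITSTATUS(status);
	return (e);
}

/*
 * Hypothetical demo: fork a child that exits with status 1 (like false(1)),
 * but only after the parent closes its end of a pipe. Stores the result of
 * the WNOHANG reap in *polled and of the blocking reap in *waited.
 * Returns 0 on success, -1 if pipe() or fork() failed.
 */
int
demo_reap(int *polled, int *waited)
{
	int fds[2];
	char c;
	pid_t child;

	if (pipe(fds) == -1)
		return (-1);
	if ((child = fork()) == -1) {
		close(fds[0]);
		close(fds[1]);
		return (-1);
	}
	if (child == 0) {
		close(fds[1]);
		/* Blocks until the parent closes the write end (EOF). */
		if (read(fds[0], &c, 1) < 0)
			_exit(2);
		_exit(1);
	}
	close(fds[0]);

	/* Child is still running: waitpid() returns 0, nothing is reaped. */
	*polled = reap_child(child, WNOHANG);

	/* Release the child, then block until it has actually exited. */
	close(fds[1]);
	*waited = reap_child(child, 0);
	return (0);
}
```

Calling demo_reap() from a small main() and printing the two values shows the non-blocking attempt coming back without a status while the blocking one yields the child's exit code. In script itself the timing is not controlled like this, which would explain why the observed return value changes from run to run.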