From owner-freebsd-questions@FreeBSD.ORG Tue May 15 17:11:00 2007
Message-Id: <200705151708.RAA08176@sopwith.solgatos.com>
To: freebsd-questions@freebsd.org
Date: Tue, 15 May 2007 10:08:35 +0100
From: Dieter
Subject: Looks like atrun has a race condition? (was: at job disappears?)

> FreeBSD 6.2
> AMD64 (single CPU)
> /var is FFS with soft-updates, on SATA.
>
> /var/cron/tabs/root contains:
>
> * * * * * /usr/libexec/atrun
>
> I had three at jobs queued.  They all call the same shell
> script with different arguments.  First one runs fine.
> Second one gets:
>
> atrun[3212]: cannot open input file: No such file or directory
>
> And then the third one runs fine.
>
> The machine is idle except for the at jobs.
> No reboot, no fsck.
> As far as I know, nothing should be mucking around in /var/at except
> atrun.  Nothing to explain a file disappearing into thin air.

Looking at the atrun source, I think there is a race condition.

When atrun starts running a job, the first thing it does is chmod the
job file to 400.  But in main() we have

	/* Delete older files */
	if ((run_time < now) && !(S_IXUSR & buf.st_mode) &&
	    (S_IRUSR & buf.st_mode))
		unlink(dirent->d_name);

A file with mode 400 has S_IRUSR set and S_IXUSR clear, so once its
run time has passed it satisfies this test.  main() doesn't know that
run_file() isn't finished with the file and blindly unlinks it.

Since run_file() unlinks the file when it is finished, I assume
the unlink in main() is to clean up files after a crash?
Perhaps main() should only unlink the file if it is really old,
say a week.
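
To make the "only if it is really old" idea concrete, here is a rough
sketch of what the check could look like.  The names is_stale_job() and
STALE_AGE_SECS are made up for illustration and do not exist in atrun;
the one-week value is just the guess from above.  I have not tried this
against the real source.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical grace period: one week, in seconds. */
#define STALE_AGE_SECS (7 * 24 * 60 * 60)

/*
 * Hypothetical helper, not part of atrun.  Returns true only when the
 * job file looks like a leftover (owner-readable, not owner-executable)
 * AND its scheduled run time is more than STALE_AGE_SECS in the past.
 */
bool
is_stale_job(time_t run_time, time_t now, mode_t mode)
{
	if (mode & S_IXUSR)		/* still queued, not started yet */
		return false;
	if (!(mode & S_IRUSR))		/* not in the "being run" state */
		return false;
	/* difftime() returns now - run_time in seconds as a double. */
	return difftime(now, run_time) > (double)STALE_AGE_SECS;
}
```

In main() the existing test would then presumably shrink to something
like "if (is_stale_job(run_time, now, buf.st_mode)) unlink(...)".
Note this adds no locking.  It only keeps main() away from job files
that were due less than a week ago, which should cover any job that
run_file() has just started.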