Date:      Tue, 15 May 2007 10:08:35 +0100
From:      Dieter <freebsd@sopwith.solgatos.com>
To:        freebsd-questions@freebsd.org
Subject:   Looks like atrun has a race condition?  (was: at job disappears?)
Message-ID:  <200705151708.RAA08176@sopwith.solgatos.com>

> FreeBSD 6.2
> AMD64 (single CPU)
> /var is FFS with soft-updates, on SATA.
> 
> /var/cron/tabs/root  contains:
> 
> 	*     *       *       *       *           /usr/libexec/atrun
> 
> I had three at jobs queued.  They all call the same shell
> script with different arguments.  First one runs fine.
> Second one gets:
> 
> 	atrun[3212]: cannot open input file: No such file or directory
> 
> And then the third one runs fine.
> 
> The machine is idle except for the at jobs.  No reboot, no fsck.
> As far as I know, nothing should be mucking around in /var/at except
> atrun.  Nothing to explain a file disappearing into thin air.

Looking at the atrun source, I think there is a race condition.

When atrun starts running a job, the first thing it does is
chmod the job file to 400.  But in main() we have

	/*  Delete older files
	 */
	if ((run_time < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
		unlink(dirent->d_name);

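A job file that has been chmod'ed to 400 is owner-readable and not
executable, so once its run time has passed it matches this test.
main() doesn't know that run_file() isn't finished with the file and
blindly unlinks it.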
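
To make that concrete, here is the same test pulled out into a small
helper.  This is only a sketch, not code from atrun: would_unlink() and
its parameters are names made up for illustration.  Given mode 0400 and
a run time in the past, it says "unlink", and that is the state a job
file is in while run_file() is still working on it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper (not part of atrun): mirrors the
 * "Delete older files" condition quoted from main().
 * Returns 1 if main() would unlink a file with this state.
 */
static int
would_unlink(time_t run_time, time_t now, mode_t mode)
{
	return (run_time < now) && !(S_IXUSR & mode) && (S_IRUSR & mode);
}
```

A queued job that has not started yet (mode 0700) does not match, and
neither does a job whose run time is still in the future.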

Since run_file() unlinks the file when it is finished, I assume the unlink
in main() is to clean up files after a crash?  Perhaps main() should only
unlink the file if it is really old, say a week.
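
A minimal sketch of that idea, assuming the one-week cutoff suggested
above.  STALE_AGE and stale_enough_to_unlink() are invented names, not
anything in the atrun source, and the cutoff value is a guess.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Assumed cutoff: one week, in seconds. */
#define STALE_AGE	(7 * 24 * 60 * 60)

/*
 * Hypothetical replacement for the cleanup test in main():
 * same mode checks as before, but the file must also be more
 * than STALE_AGE seconds past its scheduled run time.
 */
static int
stale_enough_to_unlink(time_t run_time, time_t now, mode_t mode)
{
	if (S_IXUSR & mode)		/* still queued, not yet started */
		return 0;
	if (!(S_IRUSR & mode))		/* not in the "started" state */
		return 0;
	return difftime(now, run_time) > STALE_AGE;
}
```

This only narrows the window.  A job that runs for longer than the
cutoff could still have its file removed from under it.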


