Date:      Thu, 28 Jan 1999 10:17:37 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Kevin Day <toasty@home.dragondata.com>
Cc:        dyson@iquest.net, wes@softweyr.com, hackers@FreeBSD.ORG
Subject:   Re: High Load cron patches - comments?
Message-ID:  <199901281817.KAA09891@apollo.backplane.com>
References:   <199901281748.LAA02377@home.dragondata.com>

:Here's my problem. 
:
:Cron turned into a massive forkbomb every minute, and especially every 10
:minutes. Not only did the system nearly go dead at those points, but at
:times, it took 5 minutes to catch up.
:
:Suppose you have to run 60 jobs per minute, and they all take around a
:second to execute. If you run them one second at a time, you're likely to
:...
:
:My only goal was to spread cron's jobs out a bit, so I didn't saturate my
:nfs server's ethernet every 10 mins. When users are allowed to submit their
:...
:While I think a way that took how busy the CPU is, rather than how busy cron
:is would be a better metric to go by, it's obviously not as simple as it
:...
:My patches have a feature where they'll continually increase the fork
:speed, if it's obvious that the backlog is getting to some silly
:proportions. Perhaps this is wrong, and it should just drop new jobs. In my
:case this probably wouldn't be bad, but I think that's definitely 'breaking'
:cron, and should be an optional feature.
:...
:What I came up with, sounds a lot like John Dyson's sample piece of code,
:except I used integer math, and he's using floating point. (He's also using
:...
:Kevin

    I think a rate-limited cron is a good solution, but I would also (if you
    haven't already) supply a max-parallel-jobs option.  Increasing the
    fork rate works to a degree, but you also have to make sure that cron
    (A) cannot kill the machine, and (B) cannot fall into a fork cascade
    failure by overloading the machine so much that jobs can't be
    retired faster than new jobs are queued.

    So, for example, you might have a feedback parameter X, but you should
    also have an absolute limit Y, which you set relatively high.

    Let's see... here's a good example.  Let's say that every 10 minutes cron
    decides to fork off 50 jobs simultaneously, but at midnight and noon
    cron wants to fork off 200 jobs simultaneously.

    Let's say that every 10 minutes, with nominal delaying tactics and no hard
    limits, you are able to limit the maximum number of parallel jobs to,
    say, 35.  Say you want relatively sharp feedback to bump up the fork
    rate and get the jobs done before the next 10-minute period occurs.

    These same parameters, however, could fail utterly at noon and midnight.
    At noon and midnight the rate parameters that worked for the 10-minute
    jobs might result, say, in 120 parallel jobs.

    This is where the hard limit comes in.  If you specify a hard limit
    that is nominally greater than the 10-minute parallel job load, but
    less than the midnight and noon job load, you effectively allow your
    nominal case through but force the jobs that get run at midnight
    and noon to 'spread out' a little more.

    You might specify a hard limit of, for example, 60 parallel jobs.  This
    is comfortably above the 35 parallel jobs that the fork-rate limit
    produces on the 10-minute jobs, but it prevents the midnight and noon
    jobs from overloading the system.

    In effect, your feedback parameter solves your NFS burstiness problem
    under 'normal' load conditions and the absolute limit handles the more 
    severe noon & midnight cases.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199901281817.KAA09891>