Date:      Mon, 17 May 1999 19:26:45 -0500
From:      Carol Deihl <carol@tinker.com>
To:        ndear@areti.net
Cc:        freebsd-isp@FreeBSD.ORG
Subject:   Re: Web Statistics break up program.
Message-ID:  <3740B3C5.947395D0@tinker.com>
References:  <199905172017.VAA08078@post.mail.areti.net>

FreeBSD already has the parts to do a nice job; it just requires
a little scripting to get it going. Here's what we do.

I wrote a little perl script called /etc/rotatelogs that
rotates the logs for our virtual domains just after midnight.
It uses the "newsyslog" program to do this, which takes care
of removing aged log files. (The script is below.) I leave the
just-rotated access log un-gzipped, since the stats program will
be looking at it soon. I use a separate newsyslog configuration
file for each virtual domain, in case I want to do something
special for a particular domain. Below I've included an
example config file. The scripts assume that these config files
are in a directory /etc/newsyslog.confs, named with the
domain name.

Then at slow times in the wee hours, I run other scripts that
produce the daily, weekly, and monthly stats. After they're done,
they gzip the log files.
We use wusage for our stats, but I presume the
principle would apply for analog. I've included our daily
script below, as an example.

The /etc/webstats.conf file that these scripts reference is
just a file with domain names, one per line, to tell which
domains to work on. That way you can easily stop the stats
production for any particular domain.
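
As a rough sketch of how such a domain list can be read (this is not
one of the original scripts; the function name read_domains is made up
for illustration), the file is taken one line at a time, and blank
lines or lines containing a "#" are skipped:

```perl
use strict;
use warnings;

# Return the domain names listed in a webstats.conf-style file:
# one name per line, skipping blank lines and lines containing '#'.
# (read_domains is a hypothetical helper, not part of the original scripts.)
sub read_domains {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!\n";
    my @domains;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /#/;        # comment line
        next if $line =~ /^\s*$/;    # blank line
        push @domains, $line;
    }
    close $fh;
    return @domains;
}
```

With this approach, putting a "#" in front of a domain name in the
file is enough to stop stats production for that domain.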

The rotatelogs script assumes that it's using a recent
version of newsyslog, that takes a path to the pid file (the file
that contains the process id of the web server daemon), so that
it can notify the web server that the log file(s) moved out
from underneath it. I know this version of newsyslog is in 2.2.8,
but IIRC wasn't in 2.2.1; don't know when the change was made.
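
For illustration only, here is roughly what "notifying the web server"
through a pid file amounts to. This is a sketch, not newsyslog's actual
code: the helper names are made up, and the use of SIGHUP as the
default signal is an assumption.

```perl
use strict;
use warnings;

# Read a process id from the first line of a pid file.
# (read_pid and notify_daemon are hypothetical helpers.)
sub read_pid {
    my ($pidfile) = @_;
    open(my $fh, '<', $pidfile) or die "cannot open $pidfile: $!\n";
    my $line = <$fh>;
    close $fh;
    unless (defined($line) && $line =~ /^\s*(\d+)/) {
        die "no pid found in $pidfile\n";
    }
    return $1;
}

# Send a signal to the process named in the pid file so that it
# reopens its log files. SIGHUP is assumed as the default here.
# Returns the number of processes successfully signalled.
sub notify_daemon {
    my ($pidfile, $signal) = @_;
    $signal = 'HUP' unless defined $signal;
    my $pid = read_pid($pidfile);
    return kill $signal, $pid;
}
```

Passing 0 as the signal only checks that the process exists, which is
a harmless way to try the sketch out.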

Here are our /etc/crontab entries to cause this to happen:

# rotate virtual server log files just after midnight to get nice stats
1      0       *       *       *       root    /etc/rotatelogs
# produce the webstats after logs are rotated
30     0       *       *       *       root    /etc/webstats.daily
# do weekly webstats on Sunday morning
1      7       *       *       7       root    /etc/webstats.weekly
# do monthly webstats on the first
30     4       1       *       *       root    /etc/webstats.monthly

Here's an example newsyslog config file for the domain "mydomain.com".
We keep two logs for each domain, the access log and the error log.
Note the "Z" in the error log line - that tells newsyslog to gzip
the file when it's done, since we don't need to access the
error log file to produce stats. We keep 200 old log files, but
you may choose fewer :-)
Also note there are only two lines in the config file, but I've broken
them apart for display here:

/path-to-httpd-access-log-for-mydomain.com
        mydomainuser.group 664 200 * 1 -
        /path-to-httpd-pid-file-for-mydomain.com
/path-to-httpd-error-log-for-mydomain.com
        mydomainuser.group 664 200 * 1 Z
        /path-to-httpd-pid-file-for-mydomain.com
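
Since each entry really lives on a single line in the file, a small
generator can help keep the per-domain files consistent. The following
is only a sketch: the function name and its argument names are
invented for illustration, the field values are copied from the
example above, and the paths are the same placeholders.

```perl
use strict;
use warnings;

# Build the two newsyslog entries for one domain, each on a single line.
# Field values (664, 200, *, 1) mirror the example config; only the
# flag differs: '-' for the access log, 'Z' (gzip) for the error log.
# (newsyslog_entries is a hypothetical helper.)
sub newsyslog_entries {
    my (%arg) = @_;
    my $fmt = "%s %s 664 200 * 1 %s %s\n";
    return sprintf($fmt, $arg{access_log}, $arg{owner}, '-', $arg{pid_file})
         . sprintf($fmt, $arg{error_log},  $arg{owner}, 'Z', $arg{pid_file});
}

print newsyslog_entries(
    owner      => 'mydomainuser.group',
    access_log => '/path-to-httpd-access-log-for-mydomain.com',
    error_log  => '/path-to-httpd-error-log-for-mydomain.com',
    pid_file   => '/path-to-httpd-pid-file-for-mydomain.com',
);
```

The output could be redirected into /etc/newsyslog.confs/mydomain.com
once real paths are filled in.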


Here are the scripts (some details removed to make it
easier to follow).

#!/usr/bin/perl
#
#       /etc/rotatelogs - rotate web logs
#
#       run from crontab at 00:01 daily
########################################

$debug = 0;     # 0 = really run, 1 = just print

# Carol used to have a special newsyslog that read the path-to-pid-file,
# but now this is in the standard one
#$newsyslog="/usr/local/sbin/newsyslog";

$newsyslog="/usr/sbin/newsyslog";

open(CONF, "/etc/webstats.conf")
        or die "cannot open /etc/webstats.conf: $!\n";

# Read a config file to say which domains to rotate
while ($dn = <CONF>) {
        chomp $dn;
        if ($dn =~ m/#/o) {
                next;   # skip comments
        }
        if (not $dn) {
                next;   # skip blank lines
        }
        $conffile = "/etc/newsyslog.confs/$dn";

        if ($debug) {
                print "$newsyslog -f $conffile\n";
        }
        else {
                # Run newsyslog to actually rotate the log files
                # example of this system call:
                # /usr/sbin/newsyslog -f /etc/newsyslog.confs/mydomain.com
                system "$newsyslog -f $conffile";
        }
}

close CONF;

exit 0;
####### end of /etc/rotatelogs
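
One possible refinement, offered as a sketch rather than as part of
the script above: calling system with a list of arguments avoids the
shell, and checking the result shows when a rotation failed. The
helper name run_cmd is made up for illustration.

```perl
use strict;
use warnings;

# Run a command given as a list (no shell is involved when more than
# one element is passed) and report failures on stderr.
# Returns 1 if the command exited with status 0, otherwise 0.
# (run_cmd is a hypothetical helper.)
sub run_cmd {
    my (@cmd) = @_;
    my $rc = system(@cmd);
    return 1 if $rc == 0;
    if ($? == -1) {
        warn "could not start '@cmd': $!\n";
    }
    else {
        warn "'@cmd' failed with exit status " . ($? >> 8) . "\n";
    }
    return 0;
}
```

In the loop above this would be used as
run_cmd($newsyslog, "-f", $conffile).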


#!/usr/bin/perl
#
#       /etc/webstats.daily - produce daily web stats, zip the log
#
#       run from crontab at 00:30 daily
#       Runs after logs have been rotated
########################################

$debug = 0;     # 0 = really run, 1 = just print

$wusage="/usr/local/bin/wusage";
$gzip="/usr/bin/gzip";

($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst)
        = localtime (time - 86400);
$year = $year + 1900;
$mon  = $mon + 1;
$enddate = "$mon/$mday/$year";

open(CONF, "/etc/webstats.conf")
        or die "cannot open /etc/webstats.conf: $!\n";

# Read a config file to say which domains to produce stats for
while ($dn = <CONF>) {
        chomp $dn;
        if ($dn =~ m/#/o) {
                next;   # skip comments
        }
        if (not $dn) {
                next;   # skip blank lines
        }
        $logfile = "/path-to-log-files-for-$dn/httpd-access.log.0";

        if ($debug) {
                print "cd /path-to-stats-conf-for-$dn\n";
                print "$wusage -c ./daily.conf -e $enddate -l $logfile; $gzip $logfile\n";
        }
        else {
                chdir "/path-to-stats-conf-for-$dn";
                system "$wusage -c ./daily.conf -e $enddate -l $logfile; $gzip $logfile";
        }
}

close CONF;

exit 0;
#########  end of /etc/webstats.daily
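
The end-date arithmetic in the daily script can be tried on its own.
The sketch below wraps it in a function (the name end_date is made up
for illustration) and feeds it a fixed run time instead of the current
time:

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Date of the day before $now, formatted M/D/YYYY like $enddate in
# the daily script. (end_date is a hypothetical helper.)
sub end_date {
    my ($now) = @_;
    my (undef, undef, undef, $mday, $mon, $year) = localtime($now - 86400);
    return sprintf "%d/%d/%d", $mon + 1, $mday, $year + 1900;
}

# Example: a run at 00:30 on 18 May 1999 reports on 17 May 1999.
my $run_time = timelocal(0, 30, 0, 18, 4, 1999);
print end_date($run_time), "\n";
```

Because the script runs shortly after midnight, stepping back 86400
seconds lands in the previous calendar day, which is the day the
just-rotated log covers.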


"Nicholas J. Dear" wrote:
> We currently use Analog to generate our stats, but now customers are asking
> for weekly, or monthly stats to be generated, rather than just one accumulative
> lot. Is there any way to automatically break up the stats, and have each put into
> its own html stats file?
> ...
> Also, something to delete logs after a certain period would be useful. I've
> looked at rotatelogs which comes with Apache, but it doesn't seem to do exactly
> what we need.
-- 
Carol Deihl - carol@tinker.com
Shrier and Deihl - Unix Network Admin and Internet Software Development

