Date:      Sun, 28 Dec 2003 00:16:28 -0500
From:      Chuck Swiger <>
To:        Joachim Dagerot <>
Subject:   Re: What logs etc do I need to check frequently?
Message-ID:  <>
In-Reply-To: <>
References:  <>

Joachim Dagerot wrote:
> As you with good memories know, I lost 3000 pictures of my first sons
> first year this month. I did have a RAID-5 system with fresh disks,
> however, shit happens and I have a feeling that this could have been
> avoided if I read my log files better.

I'm sorry that you lost data.

While you might have noticed the problem with the RAID-5 array in time to do
something about it, what will actually protect you from losing data in the
future is making good backups-- not poring over the system log files, and not
configuring RAID.

> So basically, 
> a) I get a mail each time my a cron-event fires, this happens every 30
> min so the mailbox are quite loaded, not very funny going through.

If you can, change the cron task so that it generates no output unless there
is a problem you should know about.  Failing that, append "> /dev/null 2>&1"
to the command in your crontab entry.  That discards both normal output and
error messages, so cron has nothing to mail you.
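
A rough sketch of the first suggestion, assuming a POSIX shell.  The name
run_quiet and the paths in the comments are made up for illustration; this is
not a standard tool.  It stays silent when the job succeeds and prints the
captured output only when the job exits with a non-zero status, so cron only
has something to mail when there is a problem:

```sh
# run_quiet: run a command, keep its output quiet on success,
# and print the output only if the command fails.
# (Hypothetical helper; the name is an assumption, not a system utility.)
run_quiet() {
    # Capture stdout and stderr together.
    out=$("$@" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        printf '%s\n' "command failed with status $status: $*"
        printf '%s\n' "$out"
    fi
    return "$status"
}

# Example crontab entry (hypothetical paths), if the function is saved in a
# script that ends with:  run_quiet "$@"
#   */30 * * * * /usr/local/bin/run_quiet.sh /path/to/your/job
```

This only works if the job reports failure through its exit status; a job that
always exits 0 would still need the "> /dev/null 2>&1" approach or a fix to
the job itself.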

> a1) Is it possible to only get a mail with critical information, where
> and what do I need to do to achieve this?

My comments above should help you reduce the amount of junk mail you get from
cron.

> b1) Where will information about ongoing disk-problems appear?  How can
> I see that there is a flaky disk in a non-rebooted system?

/var/log/messages.  The system will complain quite noticeably in the face of
hardware errors, and should log one or more lines for every bad sector it runs
into.
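
One way to check for such complaints without reading the whole file, sketched
for a POSIX shell.  The function name scan_log and the default pattern are
assumptions: the exact wording of disk error messages depends on the driver
and OS version, so adjust the pattern to what your system actually logs:

```sh
# scan_log: print lines from a log file that match a pattern.
# (Hypothetical helper; the default pattern is a guess, not an
# authoritative list of disk error messages.)
scan_log() {
    log=$1
    pattern=$2
    [ -n "$log" ] || log=/var/log/messages
    [ -n "$pattern" ] || pattern='error|fail|timeout'
    # -E: extended regex, -i: ignore case
    grep -E -i -- "$pattern" "$log"
}
```

Run from cron, "scan_log /var/log/messages" prints nothing when there are no
matches, so mail would only arrive when something suspicious was logged.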

On the other hand, depending on the hard drive to fail gradually is risky:
hard drives can fail catastrophically without giving significant warning.
Some failure modes-- stiction in particular-- can sometimes be worked around
on a temporary basis long enough to recover data without heroic measures
(i.e., paying a data recovery company a few grand).

It's important to realize that while RAID modes which provide fault-tolerance
do improve availability (i.e., they can save your data if a drive goes), RAID
is not a substitute for backups.  In particular, neither RAID-5 nor RAID-1
helps a bit if someone deletes or overwrites a file: the array faithfully
applies the change to every disk....
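
As a minimal illustration of a backup that does survive a deleted file, here
is a sketch using tar from a POSIX shell.  The function name backup_dir and
the example paths are assumptions; a real backup scheme should also put the
archive on a different disk or machine than the data:

```sh
# backup_dir: write a dated, compressed tar archive of a directory
# and print the archive's path.
# (Hypothetical helper for illustration only.)
backup_dir() {
    src=$1      # directory to back up, e.g. /home/me/pictures
    dest=$2     # where archives go, ideally on another disk
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/$(basename "$src")-$stamp.tar.gz"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Make sure the archive can be read back before trusting it.
    tar -tzf "$archive" > /dev/null || return 1
    printf '%s\n' "$archive"
}
```

Because each run creates a new dated archive, a file deleted today can still
be restored from yesterday's archive, which is exactly what RAID cannot do.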

> In addition to the questions above, is there something else I need to
> tune/install/setup/configurare to get a very reliable system that
> report critical data to me but where non-critical data is filtered
> out?

/etc/syslog.conf defines the configuration of system logging, and it is worth 
reviewing that to understand what is being logged and where.
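
A quick way to do that review, sketched for a POSIX shell: strip the comments
and blank lines so only the rules that are actually in effect remain.  The
function name active_rules is made up for this example:

```sh
# active_rules: show only the non-comment, non-blank lines of a
# syslog configuration file.
# (Hypothetical helper; defaults to /etc/syslog.conf.)
active_rules() {
    conf=$1
    [ -n "$conf" ] || conf=/etc/syslog.conf
    # -v inverts the match: drop lines that are empty or start with '#'
    grep -Ev '^[[:space:]]*(#|$)' "$conf"
}
```

Each remaining line pairs a selector (which facility and priority) with an
action (which file, user, or host receives the message); see syslog.conf(5)
for the exact syntax on your system.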

