Date:      Sat, 5 Jun 1999 11:19:00 -0700 (PDT)
From:      "Michael A. Alderete" <alderete@yahoo.com>
To:        freebsd-stable@freebsd.org, freebsd-questions@freebsd.org
Subject:   [HELP!] Crashing FreeBSD 3.1 server with file system corruption
Message-ID:  <19990605181900.24793.rocketmail@web125.yahoomail.com>

I have a FreeBSD-based web/ftp server that is crashing
regularly, and the crashes are causing (or caused by?)
filesystem corruption.

I'm suspecting (and hoping) that it's just a
configuration problem, or a known bug with an easy
workaround. I'd hate to learn that there was something
inherently wrong here!

Here are the details; hopefully someone will recognize
the issue:

* The server hardware is an Intel N440BX motherboard,
with two Intel Pentium III 450 processors. 256 megs of
RAM. Built-in graphics, and a PCI NE2000 Ethernet card
(the built-in EtherExpress interface wouldn't work for
me, another story, another time).

* The disk subsystem is a SCSI RAID controller from
DPT. It's a PCI card and has 4 drives attached,
configured in a RAID 5 with one drive as a hot standby.

* FreeBSD is version 3.1-STABLE-05051999, with the
kernel recompiled for SMP support and MAXUSERS=256.

* The server worked fine for me as I was installing
and upgrading it, adding and compiling additional
software, uploading megabytes of data to the ftp
directories, etc.

* The server also had no difficulties when I sent out
a company-internal e-mail with a request to bang on
it. That's a load of about 30 users at any given
time (HTTP only, though).

* We put it into production use as our main web server
on Wednesday night. It handled quite a high load as
people checked out the new site design, and downloaded
our latest beta archive.
 
* Thursday night was the first crash; we don't know
what caused it. The server rebooted, but fsck failed
on /home, and so it didn't come up automatically.
Manually running fsck fixed the errors, with a few
files recovered to lost+found.

* The next morning while examining the files in
lost+found, doing a cp of one of the files, the server
crashed again. This time it rebooted itself, found no
file system problems, and came up.

* Last night (Saturday at 5:30am, actually) it failed
again. This time there are file system errors on /usr
and /home. fsck fixed the problems on /usr and most of
them on /home, but now we're getting an error about a
bad sector. /home obviously refuses to mount.

* In all of these cases, we see nothing in
/var/log/messages to indicate what's going wrong.
Presumably something happens to make it think it can't
write to disk, and so it doesn't try to write to the
log...? Even the directory change that caused the second
crash showed no evidence of being carried out: the
directory datestamp remained unchanged, and the file
copy did not exist when we rebooted.

Anything known about problems in FreeBSD-STABLE with
SMP configurations and RAID sub-systems? Or other
obvious (or subtle) problems?

Thanks much!
Michael
===

---
Michael A. Alderete
<michael@alderete.com>
<http://www.alderete.com/>



