Date:      Sun, 11 Nov 2001 15:12:09 +0100
From:      "Jose M. Alcaide" <jose@we.lc.ehu.es>
To:        hardware@FreeBSD.org
Subject:   requesting opinions about strange 3ware problem
Message-ID:  <20011111151209.A319@v-ger.we.lc.ehu.es>

Warning: this is a long message.

At the end of last July, I installed a 3ware Escalade 7410 RAID controller
with four IBM DTLA-307075 (75 GB) disks attached in a RAID-5 configuration.
Everything worked fine for two months, until I installed the new version
of the 3dm utility (I had not installed it before, because its previous
version did not support the Escalade 7xxx series). Once 3dm was installed,
everything looked OK. However, some hours later (at 3:02 AM), an error was
detected on one of the ports of the RAID unit ("drive error... check for
cables or media errors"). I rebooted the system, entered the 3ware BIOS and
started a rebuild of the RAID-5 unit, which completed about two or
three hours later. I don't believe in coincidences, so I suspected (and I
still suspect) that the 3dm utility "did something" that triggered the
error.

Two or three days later, _also_ at 3:02 AM, an error (same type) was
detected again, but this time on another port/drive. Very strange. I
rebuilt the unit again and disabled 3dm, and I also updated the firmware
of the Escalade 7410 (BTW, after updating the firmware the rebuilds were
significantly faster - about 45 minutes).

After two weeks without detecting any error I was convinced that the
problem had gone away, but unfortunately it happened again, _also_ at
3:02 AM. Obviously the time is a key datum: at 3:00 cron starts the
"periodic daily" script, which in turn runs /etc/security. This script
does a find(1) in filesystems mounted without the "nosuid" flag. However,
the filesystem residing on the twed0 device is mounted with nosuid, so
/etc/security does not access the RAID-5 unit. The other filesystems
(/, /var, /usr) reside on an ATA disk (ad0). So the symptom seems to be
that a high number of transfers per second (tps) on the ATA disk triggers
the error in the RAID-5 unit. Weird.
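
To make the reasoning above concrete, here is a minimal Python sketch of
the selection described: given text in the style of mount(8) output, it
lists the mount points that a "no nosuid" scan would walk. It is only an
illustration, not the actual /etc/security code. The device names, mount
points and the /raid path in the sample are hypothetical, and the line
format is assumed to be "device on mountpoint (type, options...)".

```python
import re

# Sample text in the style of mount(8) output. All device names and
# mount points here are hypothetical; they only mirror the layout
# described in the message (system on ad0, RAID-5 unit on twed0).
SAMPLE_MOUNT_OUTPUT = """\
/dev/ad0s1a on / (ufs, local)
/dev/ad0s1f on /usr (ufs, local, soft-updates)
/dev/ad0s1e on /var (ufs, local, soft-updates)
/dev/twed0s1e on /raid (ufs, local, nosuid, soft-updates)
procfs on /proc (procfs, local)
"""

# Assumed line shape: "<device> on <mountpoint> (<type>, <opt>, ...)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) \(([^)]*)\)$")


def scanned_mount_points(mount_output, fstype="ufs"):
    """Return sorted mount points of the given type lacking 'nosuid'."""
    result = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not look like mount entries
        _device, mount_point, raw_options = match.groups()
        options = [opt.strip() for opt in raw_options.split(",")]
        if options[0] != fstype:
            continue  # first field inside the parentheses is the type
        if "nosuid" in options:
            continue  # nosuid filesystems are left out of the scan
        result.append(mount_point)
    return sorted(result)


if __name__ == "__main__":
    for mount_point in scanned_mount_points(SAMPLE_MOUNT_OUTPUT):
        print(mount_point)
```

With the sample above, only the filesystems on ad0 are selected, which
matches the observation that the nightly scan never touches the
filesystem on twed0.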

The system is based on a Gigabyte GA-7ZX motherboard, which uses the VIA
KT133 chipset. I read several messages describing a bug in the BIOS
initialization of the 686B south bridge, which can cause data corruption
in PCI transfers. Maybe this bug could be related to my problem, but
then... why did the errors start to happen just after I installed 3dm?

I also suspected a power problem, so I installed a new power
supply (a PC Power&Cooling Turbo-Cool 450ATX), to no avail: the RAID
errors continue.

I am really desperate, so I will be thankful for any opinions, ideas,
suggestions...

  Thanks in advance
  JMA
-- 
****** Jose M. Alcaide  //  jose@we.lc.ehu.es  //  jmas@FreeBSD.org ******
** "Beware of Programmers who carry screwdrivers" --  Leonard Brandwein **




