Date: Sun, 11 Nov 2001 15:12:09 +0100
From: "Jose M. Alcaide" <jose@we.lc.ehu.es>
To: hardware@FreeBSD.org
Subject: requesting opinions about strange 3ware problem
Message-ID: <20011111151209.A319@v-ger.we.lc.ehu.es>
Warning: this is a long message.

At the end of last July, I installed a 3ware Escalade 7410 RAID controller with four IBM DTLA-307075 (75 GB) disks attached in a RAID-5 configuration. Everything worked fine for two months, until I installed the new version of the 3dm utility (I had not installed it earlier because the previous version did not support the Escalade 7xxx series).

Once 3dm was installed, everything looked OK. However, some hours later (at 3:02 AM), an error was detected on one of the ports of the RAID unit ("drive error... check for cables or media errors"). I rebooted the system, entered the 3ware BIOS and started a rebuild of the RAID-5 unit, which completed about two or three hours later. I don't believe in coincidences, so I suspected (and I still suspect) that the 3dm utility "did something" that triggered the error.

Two or three days later, _also_ at 3:02 AM, an error of the same type was detected again, but this time on another port/drive. Very strange. I rebuilt the unit again and disabled 3dm, and I also updated the firmware of the Escalade 7410 (BTW, after updating the firmware the rebuilds were significantly faster: about 45 minutes). After two weeks without any errors I was convinced that the problem had gone away, but unfortunately it happened again, _also_ at 3:02 AM.

Obviously the time is a key datum: at 3:00 cron starts the "periodic daily" script, which in turn runs /etc/security. This script does a find(1) in filesystems mounted without the "nosuid" flag. However, the filesystem residing on the twed0 device is mounted with nosuid, so /etc/security does not access the RAID-5 unit. The other filesystems (/, /var, /usr) reside on an ATA disk (ad0). So the symptom is that a high number of tps on the ATA disk triggers the error in the RAID-5 unit. Weird.

The system is based on a Gigabyte GA-7ZX motherboard, which uses the VIA KT133 chipset.
I have read several messages about a bug in the BIOS initialization of the 686B south bridge, which can cause data corruption in PCI transfers. This bug might be related to my problem, but then... why did the errors start to happen just after I installed 3dm?

I also suspected a power problem, so I installed a new power supply (a PC Power&Cooling Turbo-Cool 450ATX), to no avail: the RAID errors continue.

I am really desperate, so I will be thankful for any opinions, ideas, suggestions...

Thanks in advance,
JMA
-- 
****** Jose M. Alcaide // jose@we.lc.ehu.es // jmas@FreeBSD.org ******
** "Beware of Programmers who carry screwdrivers" -- Leonard Brandwein **

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hardware" in the body of the message