From: "Mikhail P."
Reply-To: miha@ghuug.org
Organization: Ghana Unix Users Group
To: hackers@freebsd.org
Date: Fri, 8 Oct 2004 19:37:15 +0000
Message-Id: <200410081937.15068.miha@ghuug.org>
Subject: ad0: FAILURE - WRITE_DMA
List-Id: Technical Discussions relating to FreeBSD

Hi,

This question has probably been discussed numerous times, but I'm still unsure what really causes ATA failures.

I have a pretty basic server here with two IDE drives, 200GB each.
The system runs FreeBSD-5.2.1-p9.

That server was set up about 9 months ago, and about 3 months ago my logs quickly filled up with:

    ad0: FAILURE - WRITE_DMA status=51 error=10 LBA=268435455

The server was still running, but I was unable to write to certain files/folders on the drive. Whenever I tried to access $HOME/.fetchmailrc, for example, it wouldn't read/write the file and the system would log a message similar to the one above. After a couple of reboots I started getting more and more of these, and the server became unusable, so I had to shut down all services and mount the drives read-only to back up the data.

At first I thought this could be related to poor cooling, so that the drives could easily overheat in the long run. After a successful backup, I purchased two new drives along with two aluminum drive fans. The new drives were the same model as the old ones: ad0 ATA/ATAPI rev 6, which is Seagate's 200GB drive. I reloaded the OS on the new drives, then restored all data from the old drives.

All seemed to be fine for 2 months, but today I woke up and noticed these messages again.

So the whole situation leads me to a few questions:

- Are there some issues with the ATA driver/system [or filesystem?] on FreeBSD-5.2.1?
- What can I do to stop these frequent failures?
- How do I diagnose the drives remotely, to see whether it is really a hardware issue or something else? I don't have local access to the server; it is sitting overseas.

It seems to me that if I continue running the system as it is now, I will have failed drives every 1-2 months! That does not sound like a normal situation.

I am running FreeBSD-5.2.1-p9, the filesystem is UFS2, and all partitions [except for /] have softupdates "on". The kernel is built from GENERIC, with only the ipfw options added.

regards,
M.
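
For reference, the numbers in the log line quoted above can be unpacked mechanically. The following is a minimal sketch, not taken from the FreeBSD sources: it assumes that status= and error= are hexadecimal dumps of the ATA status and error registers, and it uses the conventional ATA bit meanings. The names parse_failure, decode, STATUS_BITS and ERROR_BITS are made up for this illustration, and only some of the error bits are listed.

```python
import re

# Conventional ATA status register bits (assumption: status= is printed in hex).
STATUS_BITS = {
    0x80: "BUSY",
    0x40: "READY",
    0x20: "DEVICE_FAULT",
    0x10: "DSC",          # device seek complete
    0x08: "DRQ",
    0x04: "CORRECTED",
    0x02: "INDEX",
    0x01: "ERROR",
}

# A subset of the ATA error register bits (assumption: error= is printed in hex).
ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNCORRECTABLE",
    0x10: "ID_NOT_FOUND",
    0x04: "ABORTED",
}

# Matches lines shaped like the one in the message; hypothetical helper pattern.
LINE_RE = re.compile(
    r"(?P<dev>\w+): FAILURE - (?P<cmd>\w+) "
    r"status=(?P<status>[0-9a-fA-F]+) error=(?P<error>[0-9a-fA-F]+) "
    r"LBA=(?P<lba>\d+)"
)


def decode(value, names):
    """Return the names of all known bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]


def parse_failure(line):
    """Split one FAILURE log line into device, command, decoded bits and LBA."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    lba = int(m.group("lba"))
    return {
        "device": m.group("dev"),
        "command": m.group("cmd"),
        "status": decode(int(m.group("status"), 16), STATUS_BITS),
        "error": decode(int(m.group("error"), 16), ERROR_BITS),
        "lba": lba,
        # 2**28 - 1 is the largest sector address expressible in 28-bit LBA.
        "at_lba28_limit": lba == 2**28 - 1,
    }


if __name__ == "__main__":
    print(parse_failure("ad0: FAILURE - WRITE_DMA status=51 error=10 LBA=268435455"))
```

Under those assumptions, status=51 reads as READY, DSC and ERROR, and error=10 reads as "ID not found", meaning the drive rejected the requested sector address. The reported LBA, 268435455, equals 2^28 - 1, which with 512-byte sectors sits at roughly the 137 GB mark of a 200GB drive. Whether that boundary has anything to do with the failures is not established by the log line alone.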