Date:      Fri, 18 Feb 2005 09:03:35 -0600 (CST)
From:      "Reid Linnemann" <lreid@cs.okstate.edu>
To:        freebsd-current@freebsd.org
Subject:   ad WRITE_DMA timing out frequently
Message-ID:  <20050218150335.7B4A9A063E@csa.cs.okstate.edu>


I've recently brought a machine up from 5.3-STABLE to 6-CURRENT. It
usually just sits in the corner and runs services, but lately I've come
home from work or woken up to find that it is completely unresponsive,
and I have to hard reset the machine. It happens at least once a day,
and it's becoming more and more frequent. When I look at the console, I
always have the same 4 messages before the failure:

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2085599
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2085599
kernel: ad0: FAILURE - WRITE_DMA timed out
kernel: g_vfs_done():ad0s1d[WRITE(offset=52772864, length=16384)]error = 5

It seems to me that a sector on the disk might be dead in the ad0s1d
partition (/var), but before I take further steps I want to be certain
that the behavior I'm experiencing is unrelated to the migration to
6-CURRENT.
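
For anyone who wants to cross-check the numbers in the console messages
above, here is a minimal sketch, not a diagnostic tool. It assumes
512-byte sectors and assumes that the LBA in the TIMEOUT line is the
first sector of the same request that g_vfs_done() later reports; neither
is confirmed by the messages themselves. The regular expressions and the
helper names (parse_timeout, parse_vfs_done) are hypothetical and are
based only on the four lines quoted here.

```python
import re

SECTOR_BYTES = 512  # assumption: 512-byte sectors

# Patterns derived only from the console lines quoted in this message.
TIMEOUT_RE = re.compile(
    r"(ad\d+): TIMEOUT - (\w+) retrying \((\d+) retr(?:y|ies) left\) LBA=(\d+)"
)
VFS_DONE_RE = re.compile(
    r"g_vfs_done\(\):(\w+)\[(\w+)\(offset=(\d+), length=(\d+)\)\]\s*error = (\d+)"
)


def parse_timeout(line):
    """Return (device, command, lba) from a TIMEOUT line, or None."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    return m.group(1), m.group(2), int(m.group(4))


def parse_vfs_done(line):
    """Return the failed request described by a g_vfs_done() line, or None."""
    m = VFS_DONE_RE.search(line)
    if m is None:
        return None
    offset, length, error = (int(m.group(i)) for i in (3, 4, 5))
    return {
        "provider": m.group(1),
        "op": m.group(2),
        # Byte offset and length are relative to the provider (ad0s1d here).
        "first_sector": offset // SECTOR_BYTES,
        "sector_count": length // SECTOR_BYTES,
        "errno": error,  # 5 is EIO, a generic I/O error
    }


timeout = parse_timeout(
    "ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2085599"
)
failed = parse_vfs_done(
    "kernel: g_vfs_done():ad0s1d[WRITE(offset=52772864, length=16384)]error = 5"
)

# Only meaningful if the TIMEOUT LBA really is the start of this request.
implied_provider_start = timeout[2] - failed["first_sector"]

print(failed)
print("implied start LBA of", failed["provider"], "=", implied_provider_start)
```

With the lines above, the failed write covers 32 sectors starting at
sector 103072 of ad0s1d. If the implied start LBA matches where ad0s1d
actually begins on the disk, that would suggest both messages describe
the same request, which is what a single bad spot would look like.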

I started poking around /var to see if anything was amiss, and I found
that mail messages are being stacked up in /var/spool/clientmqueue, even
though nothing should be using the msp queue (I've redirected periodic
outputs to logfiles).  In the last daily run mailed to root in January,
I found records in the submit queue that looked like this:

j0EDINHh049826     2489 Fri Jan 14 07:18 MAILER-DAEMON
                 (Deferred: Permission denied)

There were nearly 500 of them.

Even after redirecting periodic output to logs and clearing out the
client mail queue, this continues to happen, and I have a hunch that it
may be related to the WRITE_DMA timeouts, as it's the only weird
behavior I can see on /var. If anyone can help me shed some light on
this, I'd appreciate it. I've had two IDE drives die in this machine
already, and I'm going to be severely depressed if I've killed a third.

-Reid


