Date:      Mon, 20 Feb 2006 20:51:58 -0500
From:      Jerry Bell <jbell@stelesys.com>
To:        "V.I.Victor" <idmc_vivr@intgdev.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Every 12-hrs -- "ad0: TIMEOUT - WRITE DMA"
Message-ID:  <43FA723E.5000100@stelesys.com>
In-Reply-To: <W6713813530185481140467298@webmail8>
References:  <W6713813530185481140467298@webmail8>

I had a dying drive that showed up just like this. It turned out that 
the daily scripts that scan for file changes, etc., and my backup 
script were tickling a bad sector of the disk.  Have you run 
smartctl -t long /dev/ad0 to have the drive perform a full self-test?  
You normally have to let that run for a while, then take another look at 
the SMART error log to see if anything showed up.  Mine ended up having 
an error that the drive could not correct on its own. 

As to why you're able to write a 2 gig file without a problem: if some 
binary, config file, man page, etc. is sitting on those bad spots, 
writing a new file wouldn't touch those blocks.  Any time a security 
script iterates through those files, it would tickle those blocks, 
causing an error.
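
To put numbers on the "periodic job" idea, here is a rough Python sketch 
that takes the four log lines from the original report (quoted below) and 
prints the gap between consecutive timeouts along with the LBA of each. 
The year 2006 is an assumption taken from the date of this message, since 
the syslog timestamps do not carry one, and the names LOG and parse are 
just made up for the illustration.

```python
from datetime import datetime

# Log lines copied from the original report quoted further down.
LOG = """\
Feb 12 12:08:05 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2701279
Feb 13 00:08:51 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2701279
Feb 13 12:09:38 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2963331
Feb 14 00:10:24 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2705947
"""


def parse(line, year=2006):
    """Return (timestamp, lba) for one 'ad0: TIMEOUT' log line.

    The year is an assumption; syslog timestamps do not include it.
    """
    # The first 15 characters hold the timestamp, e.g. "Feb 12 12:08:05".
    stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    lba = int(line.rsplit("LBA=", 1)[1])
    return stamp, lba


events = [parse(line) for line in LOG.splitlines() if "TIMEOUT" in line]

# Seconds between each timeout and the one before it.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]

print(f"{events[0][0]}  LBA={events[0][1]}")
for (stamp, lba), gap in zip(events[1:], gaps):
    print(f"{stamp}  LBA={lba}  (+{gap / 3600:.2f} h)")
```

Each gap comes out at a little over 12 hours, which is what you would 
expect if something scheduled is touching the same area of the disk, 
though it does not say which job is responsible.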

Another possibility is that you have a bad IDE cable.

Hopefully that is of some use.

Jerry
http://www.networkstrike.com

V.I.Victor wrote:
> On Sun, 19 Feb 2006, Mike Tancsa wrote:
>
>   
>> On Sun, 19 Feb 2006 22:21:04 +0000, in sentex.lists.freebsd.questions
>> you wrote:
>>
>>     
>>> On Thu, 16 Feb 2006, Mike Tancsa wrote:
>>>
>>>       
>>>>> For the last 4-days, our (otherwise OK) 5.4-RELEASE machine has been
>>>>> reporting:
>>>>>
>>>>> Feb 12 12:08:05 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2701279
>>>>> Feb 13 00:08:51 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2701279
>>>>> Feb 13 12:09:38 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2963331
>>>>> Feb 14 00:10:24 : ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=2705947
>>>>>
>>>>> So -- can anyone help track this down?
>>>>>           
>>>> It sounds like a hardware issue. Install
>>>> /usr/ports/sysutils/smartmontools and "ask" the drive to see whats up.
>>>>         
>>> I installed 'smartmontools' but haven't used as yet. I've been waiting to
>>> see what happens -- the "problem" simply stopped. There've been no "ad0:
>>> TIMEOUT" messages for 3-days.
>>>       
>> The errors get logged in the drive so you dont have to wait for more
>> errors to happen. Start it running now so you can see if any of the
>> "bad" counters are changing as well as to ask the drive what it was.
>> My guess is you have some bad sectors the drive remapped.
>>     
>
> OK. No problems found... And -- still -- no more "ad0: TIMEOUTs"
>
> But, I'm not really surprised. As mentioned in the original post, a
> 2-gig file had been created that presumably "moved-past" any bad
> sector patches; approx. midway during the TIMEOUT report period.
>
> Plus -- since the drive is (was) storing email, writing logs, etc.
> 24-hrs a day, it seems improbable that bad-sectors would only show-up
> every 12-hrs.
>
> Although I'm uncomfortable with "magic-fixes," I wonder if there's
> more than a coincidental connection between setting the date and the
> reports starting and stopping.
>


