Date:      Tue, 30 Sep 2008 19:29:26 +0200
From:      Mel <fbsd.questions@rachie.is-a-geek.net>
To:        freebsd-questions@freebsd.org
Cc:        Jeremy Chadwick <koitsu@freebsd.org>, Reid Linnemann <lreid@cs.okstate.edu>
Subject:   Re: SATA READ_DMA timeouts - SOLVED?
Message-ID:  <200809301929.27126.fbsd.questions@rachie.is-a-geek.net>
In-Reply-To: <48E259B4.3040100@cs.okstate.edu>
References:  <48E1465A.5040903@cs.okstate.edu> <20080930023736.GA22907@icarus.home.lan> <48E259B4.3040100@cs.okstate.edu>

On Tuesday 30 September 2008 18:54:12 Reid Linnemann wrote:
> Jeremy Chadwick wrote:
> > (I'm not subscribed to freebsd-questions, so please CC me on replies.
> > I'm also not sure how I ended up getting this mail in the first place;
> > it looks like someone BCC'd my koitsu@freebsd.org address).
>
> Yes, I BCC'd you since you are maintaining a page on the wiki
> documenting SATA DMA problems.
>
> > Furthermore, one of the most common reports on the FreeBSD lists is the
> > exact opposite -- users complaining that "their disks are SATA300 but
> > only operate at SATA150" (caused by that jumper).  Users are told to
> > remove the jumper, and are reminded that the reason the jumper is
> > enabled by default is said chipset incompatibilities.
> >
> > That said, your mail confuses me for one reason:
> >
> > Were you receiving DMA errors with the jumper REMOVED (e.g. SATA300
> > operation), or with the jumper ENABLED (SATA150 operation)?  Your below
> > description does not state what exactly you did with the jumper to make
> > your drives work reliably, only "that the jumper capability on your
> > disks was available".
>
> I should have been more clear.
>
> My disks came with no cap on the SATA150 jumper, although FreeBSD
> reported that they were in SATA150 mode. The system would be unusable
> from READ_DMA timeouts if the system was ever powered off and brought
> back up. I had to do some voodoo of booting in single user mode with
> ACPI turned off to repair filesystems and rebuild my gmirror, then load
> ACPI and drop back into multi-user mode. I even had to do this if the
> system was powered off gracefully. So far, since I capped the jumpers
> this has not been the case. I still get them periodically if I do
> something like rebuild a gmirror component, so I can no longer say my
> problem is completely resolved.

Is this on 7.x? Sounds very similar to my experience described in:
http://www.freebsd.org/cgi/query-pr.cgi?pr=122572&cat=kern

The machine is now operational and working in UDMA33 mode with two gmirror'ed
SATA disks, running 6.3-p4. Unfortunately, I can't risk "trying 7.x" anymore,
since it's emergency storage for the main fileserver, so data loss is
unacceptable :/. I do not know the jumper state at the moment. I will report
back if a maintenance window opens up soon that lets me check the jumpers.

Ata info:
# atacontrol list
ATA channel 0:
    Master: acd0 <HL-DT-STDVD-ROM GDR-T10N/1.02> ATA/ATAPI revision 5
    Slave:       no device present
ATA channel 1:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
    Slave:       no device present
ATA channel 3:
    Master:  ad6 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
    Slave:       no device present

# atacontrol cap ad4

Protocol              Serial ATA II
device model          WDC WD6400AAKS-65A7B0
serial number         WD-WMASY1885186
firmware revision     01.03B01
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       1250263728 sectors
dma supported
overlap not supported

Feature                      Support  Enable    Value           Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes       -      31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       no       no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      yes     128/0x80        128/0x80

# atacontrol mode ad4
current mode = UDMA33
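
The UDMA33 result above is the kind of thing that is easy to miss on a box
with several disks. A minimal sh sketch for flagging it follows. It assumes
the "current mode = ..." line format shown above, and it assumes that a disk
negotiating a SATA link reports a mode name starting with "SATA" (such as
SATA150 or SATA300). The helper names parse_mode and is_sata_mode are
made up for illustration, and the sample line is fed in with printf so the
sketch does not need atacontrol to run.

```shell
#!/bin/sh
# Hypothetical helper: extract the mode name from a line such as
# "current mode = UDMA33" read on standard input.
parse_mode() {
    sed -n 's/^current mode = //p'
}

# Hypothetical helper: succeed if the mode name looks like a SATA link
# speed. Assumption: such modes are reported as SATA150, SATA300, etc.
is_sata_mode() {
    case "$1" in
        SATA*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Sample input copied from the output above. On a live system this would
# be something like: atacontrol mode ad4 | parse_mode
mode=$(printf 'current mode = UDMA33\n' | parse_mode)

if is_sata_mode "$mode"; then
    echo "$mode: SATA link speed"
else
    echo "$mode: fallback (non-SATA) transfer mode"
fi
```

With the sample line this reports UDMA33 as a fallback mode, which matches
what ad4 shows here.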


-- 
Mel

Problem with today's modular software: they start with the modules
    and never get to the software part.


