From: Terry Lambert
Date: Mon, 15 Apr 2002 16:48:42 -0700
To: msch@snafu.de
Cc: freebsd-current@freebsd.org
Subject: Re: ATA errors on recent -current
Message-ID: <3CBB66DA.F9C94ED0@mindspring.com>

Matthias Schuendehuette wrote:
> I still have an old FreeBSD Test-Installation (45GB are big enough :-)
> with a 4.4-STABLE as of Okt 23, 2001...
>
> It boots off the DTLA, uses tagged-queuing and connects using UDMA100...
> ... and doesn't have any problems!!
>
> So, to bring some of you down to earth again, the DTLA may be a
> horrible disk and I'm one of the last to praise ATA at all (My machine
> has two SCSI host adaptors, five SCSI-Disks and several other SCSI
> Devices), but it once worked!

I think we all already agree, though, that the tagged command queuing
problem comes from a code change. That doesn't identify it very
closely (or you would have included a patch ;^)).

It may be that the OS is slower in older revisions (one would hope
that was the case), and that now that the code is faster, it's too
fast for the hardware.

It may also be that the switches between write caching on and off by
default in various versions have removed stall points in the write
code path which would otherwise have protected the drive from being
overwhelmed by the host OS.

There are a lot of ways timing problems could have been introduced
that don't require Soren's code to be wrong, and that don't make it
impossible to blame the problem on the hardware.

On the theory that it is an off-by-one error, introduced either by
increased concurrency in an error path, or a direct off-by-one, I've
suggested dropping the effective number of tagged commands supported
by the drive. That way, if you exceed this number because of whatever
coding error, you won't exceed the capacity of the drive.

Since you have one of these beasts, could you maybe try changing the
number of tagged command queue entries you permit to be used at one
time?

> I really, really don't want to blame Søren, he's doing a great job and
> everybody, who makes something makes occasionally some errors, but (at
> least for me) it doesn't seem to be a fundamental technical problem,
> because *it once worked* - sorry, but it's true.
>
> And maybe it isn't related to tagged queuing and the DTLA at all - if I
> correctly understand Giorgos' mail...

As I said: it could be drive settings, independent of whether the code
itself is correct. I've given three suggestions to verify this, one
way or the other:

1) Turn the drive DMA speed down
2) Pretend the maximum tagged command queue depth is smaller than it is
3) Toggle the write caching on the drive

Until you try all three of these and report back, you can't say that
the problem is Soren's.

-- 
Terry
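
For what it's worth, the idea behind suggestion 2 can be sketched in a
few lines of C. This is only an illustration, not code from the ata
driver: the struct, the function names, and the "margin" knob are all
hypothetical, and the figure of 32 tags mentioned afterwards is an
assumption about what the drive advertises. The host simply refuses to
keep more commands in flight than the advertised depth minus a safety
margin, so an overshoot of one somewhere else still fits in the drive.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-drive bookkeeping; these are NOT the ata driver's
 * real structures or names.
 */
struct tag_queue {
	int advertised;		/* queue depth the drive reports */
	int margin;		/* tags deliberately left unused */
	int outstanding;	/* tagged commands currently in flight */
};

/* Depth the host allows itself: advertised minus margin, never below 1. */
static int
tq_effective_depth(const struct tag_queue *q)
{
	int depth = q->advertised - q->margin;

	return (depth < 1 ? 1 : depth);
}

/* Try to issue one more tagged command; refuse at the reduced limit. */
static bool
tq_try_submit(struct tag_queue *q)
{
	assert(q->outstanding >= 0);
	if (q->outstanding >= tq_effective_depth(q))
		return (false);
	q->outstanding++;
	return (true);
}

/* Account for one completed command. */
static void
tq_complete(struct tag_queue *q)
{
	if (q->outstanding > 0)
		q->outstanding--;
}
```

With advertised = 32 and margin = 1, tq_try_submit() succeeds 31 times
and then fails until tq_complete() is called. If the errors go away
with a margin of 1 and come back with a margin of 0, that would point
toward an off-by-one rather than a raw timing problem.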