Date:      Fri, 10 Sep 1999 17:49:48 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        "Justin T. Gibbs" <gibbs@caspian.plutotech.com>
Cc:        scsi@freebsd.org, gibbs@freebsd.org, anderson@cs.duke.edu
Subject:   Re: data corruption when using aic7890 
Message-ID:  <14297.30000.669959.29706@grasshopper.cs.duke.edu>
In-Reply-To: <199909102049.OAA03111@caspian.plutotech.com>
References:  <14297.27236.577546.795593@grasshopper.cs.duke.edu> <199909102049.OAA03111@caspian.plutotech.com>


Justin T. Gibbs writes:
 > >This does seem to have an effect, so you might have the right knob to
 > >twiddle!
 > >
 > >Unfortunately, the change seems to make things even worse.  The
 > >errors are occurring much more frequently now.  Also, the errors are
 > >occurring later in the page.  Whereas before the errors would almost
 > >always occur in the first 500 bytes of the page, now they're occurring
 > >near the end of the page (some around 2500 bytes, most near 3900).
 > 
 > What are the dynamics of your test program?  Are you sure that this
 > is a problem with reads and not with writes?  If it is a problem
 > with reads, WR_DFTHRSH is what you should be tweaking, since the
 > directions are relative to the bus master (i.e. the aic7xxx part).
 > 
 > --
 > Justin

We're using a home-grown program called 'hunt' (as in hunt for
errors).  I've left the source & an i386 binary for you on freefall in
~gallatin/hunt.

Run it with the arguments -touch=<pagesize> -fileio=<file> -size=<physmem in pages>
So on a 512MB x86, you'd say './hunt.i386 -touch=4096 -fileio=zot -size=131072'
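
The -size value in that example is just physical memory divided by the
page size.  A minimal sketch of the arithmetic (the function name
physmem_pages is made up for illustration and is not part of hunt):

```c
#include <stdint.h>

/* Number of whole pages that fit in physmem_bytes of memory.
 * Hypothetical helper, not taken from the hunt source. */
static uint64_t physmem_pages(uint64_t physmem_bytes, uint64_t pagesize)
{
    return physmem_bytes / pagesize;
}
```

For 512MB and 4096-byte pages this gives 536870912 / 4096 = 131072,
which matches the -size argument above.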

It sequentially writes out the data & reads it back multiple times.
The data set is large enough (>physical memory) so that it's not being
cached.  So at least one successful read of the entire file indicates
that it was written properly.  When an error is encountered, that page
of the file is re-written with the correct data.

It prints '<' when it starts initializing, '>' after the file is
completely written, and '.' for each successful read of the file.
On an error, you'll see some information regarding what was read &
what was expected. 
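
For anyone who doesn't want to fetch the source, here is a rough sketch
of the write/read-back loop described above.  It is not the actual hunt
code.  The function names, the fixed 4096-byte page, and the data
pattern are assumptions: the pattern (each 32-bit word holds its own
byte offset in the file) is only inferred from the sample error further
down, where page 16167 * 4096 + 0xff8 = 0x03f27ff8.  It uses plain
stdio, so it only exercises the disk path if the file is larger than
physical memory, and the 32-bit offsets wrap for files over 4GB.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u
#define WORDS_PER_PAGE (PAGE_SIZE / sizeof(uint32_t))

/* Assumed pattern: each word holds its own byte offset in the file. */
static void fill_page(uint32_t *buf, uint32_t page)
{
    for (uint32_t i = 0; i < WORDS_PER_PAGE; i++)
        buf[i] = page * PAGE_SIZE + i * (uint32_t)sizeof(uint32_t);
}

/* Index of the first bad word in the page, or -1 if the page is clean. */
static long check_page(const uint32_t *buf, uint32_t page)
{
    for (uint32_t i = 0; i < WORDS_PER_PAGE; i++)
        if (buf[i] != page * PAGE_SIZE + i * (uint32_t)sizeof(uint32_t))
            return (long)i;
    return -1;
}

/* Write npages pages to path, then read the file back `passes` times.
 * Returns the number of bad pages seen, or -1 on an I/O failure. */
static long hunt_sketch(const char *path, uint32_t npages, int passes)
{
    uint32_t buf[WORDS_PER_PAGE];
    long errors = 0;
    FILE *f = fopen(path, "w+b");

    if (f == NULL)
        return -1;

    putchar('<');                       /* starting initialization */
    for (uint32_t p = 0; p < npages; p++) {
        fill_page(buf, p);
        if (fwrite(buf, PAGE_SIZE, 1, f) != 1) { fclose(f); return -1; }
    }
    fflush(f);
    putchar('>');                       /* file completely written */

    for (int pass = 0; pass < passes; pass++) {
        long before = errors;

        rewind(f);
        for (uint32_t p = 0; p < npages; p++) {
            long bad;

            if (fread(buf, PAGE_SIZE, 1, f) != 1) { fclose(f); return -1; }
            bad = check_page(buf, p);
            if (bad < 0)
                continue;

            errors++;
            printf("\nerror %ld page %u expected [0x%08x] saw [0x%08x]\n",
                   errors, (unsigned)p,
                   (unsigned)(p * PAGE_SIZE + (uint32_t)bad * 4u),
                   (unsigned)buf[bad]);

            /* Re-write the bad page with the correct data. */
            fill_page(buf, p);
            if (fseek(f, (long)p * (long)PAGE_SIZE, SEEK_SET) != 0 ||
                fwrite(buf, PAGE_SIZE, 1, f) != 1) { fclose(f); return -1; }
            fflush(f);                  /* required before reading again */
        }
        if (errors == before)
            putchar('.');               /* one clean read of the file */
    }
    putchar('\n');
    fclose(f);
    return errors;
}
```

Calling hunt_sketch("zot", 131072, 10) would roughly correspond to the
512MB example above with ten read passes.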

So, using an NCR875 controller I see:
<>!..............................................

On the 7890, I'm seeing: 

<>!..........##error 1 page 16167 expected [0x03f27ff8] saw [0x056deff8]


My tweak didn't seem to help either.  The default settings seem to
be the most reliable.  Should I continue to tweak this variable, or do
you have other ideas?

I really appreciate your help.  We've got 20 of these machines & I'd
really like to avoid having to purchase 20 scsi controllers to replace 
the on-board ones.

Thanks!

Drew
------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590





