Date:      Sat, 28 Aug 1999 11:03:45 -0700 (PDT)
From:      Matthew Jacob <mjacob@feral.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        "Justin T. Gibbs" <gibbs@plutotech.com>, hackers@FreeBSD.ORG
Subject:   Re: Should cam_imask be part of bio_imask ?
Message-ID:  <Pine.BSF.4.05.9908281102180.8884-100000@semuta.feral.com>
In-Reply-To: <199908281800.LAA05485@apollo.backplane.com>

> :I strongly doubt that this is a CAM isr problem- the error pattern isn't
> :entirely clear from what you said, but it looks more like a FIFO or CACHE
> :LINE sized type of problem- it looks to be < 16 bytes, but not a short
> :count. Because this isn't one of the wacky systems I spent most of my
> :career on at Sun where the first and usual suspect was a system memory
> :cache line because IO wasn't cache coherent on Suns between the Sun
> :3/{50,60,75,150} and the advent of the SuperSparc Viking Chipset, I'd guess a
> :FIFO somewhere in the I/O movement path.
> :
> :Justin- any changes lately where flushing a FIFO in the Adaptec at the end
> :of transfer might have been spoodged?
> :
> :-matt
> 
>     The problem is definitely aligned in some way.  Here's a diff of
>     a hexdump of one error.  Sometimes I lose a whole page, sometimes two
>     pages, sometimes 16 bytes, but the error is always page aligned.
> 
> 1536c1536
> < 0005ff0 3333 2033 3434 3434 7c20 207c 3030 3030
> ---
> > 0005ff0 7365 3d20 3120 093b 2309 6720 6f6c 6162
> 
>     A cache-line problem would fit the symptoms.  I know it isn't the 
>     hardware... this 1xCPU PPro/200 system has been with me for several
>     years and this test didn't fail like this a month ago.  When I updated 
>     the machine last (unfortunately w/ about a month's worth of changes), 
>     my buildworlds started failing with odd errors.
> 
>     I then switched away from the failing buildworlds (which take an hour)
>     and started doing cp -r's and then diff -r's (takes only 20 min), and as
>     you can see I'm still seeing the problem.
> 
>     Maybe this is DMA related.  Perhaps the cache is not getting cleared?
>     Maybe an MMU optimization someone threw in recently?

That's possible too. I'll admit I'm a bit hazy on i386 specifics; it has
always "just worked" wrt I/O, so for all I know there's a required
I/O flush command when you switch mappings. Gawd, I hate these kinds of
problems.
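
For narrowing down the "page aligned" pattern, here is a minimal sketch (not
something posted in the thread) of how the mismatched byte ranges between a
source file and its copy could be listed and checked against page boundaries.
The names mismatch_runs, describe and PAGE_SIZE are hypothetical, and the
4096-byte page size is an assumption for i386. In the hexdump diff quoted
above, the 16-byte line at offset 0x5ff0 ends exactly at 0x6000, which is a
multiple of 4096.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on i386


def mismatch_runs(a: bytes, b: bytes):
    """Return half-open (start, end) byte ranges where a and b differ.

    Only the common prefix length of the two buffers is compared.
    """
    runs = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            runs.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        runs.append((start, n))  # run extends to the end of the data
    return runs


def describe(run, page_size=PAGE_SIZE):
    """Summarize one mismatch run and its alignment to page boundaries."""
    start, end = run
    return {
        "start": start,
        "length": end - start,
        "starts_on_page": start % page_size == 0,
        "ends_on_page": end % page_size == 0,
    }
```

Fed the contents of an original file and its bad copy, this would show
whether each damaged range starts or ends on a page boundary and how long it
is (16 bytes, a page, two pages), which may help separate a FIFO-sized loss
from a page-sized one.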








