Date:      Sun, 26 Aug 2012 17:03:46 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Ian Lepore <freebsd@damnhippie.dyndns.org>
Cc:        Hans Petter Selasky <hans.petter.selasky@bitfrost.no>, freebsd-arm@freebsd.org, freebsd-mips@freebsd.org, freebsd-arch@freebsd.org
Subject:   Re: Partial cacheline flush problems on ARM and MIPS
Message-ID:  <6D83AF9D-577B-4C83-84B7-C4E3B32695FC@bsdimp.com>
In-Reply-To: <1346002922.1140.56.camel@revolution.hippie.lan>
References:  <1345757300.27688.535.camel@revolution.hippie.lan> <3A08EB08-2BBF-4B0F-97F2-A3264754C4B7@bsdimp.com> <1345763393.27688.578.camel@revolution.hippie.lan> <FD8DC82C-AD3B-4EBC-A625-62A37B9ECBF1@bsdimp.com> <1345765503.27688.602.camel@revolution.hippie.lan> <CAJ-VmonOwgR7TNuYGtTOhAbgz-opti_MRJgc8G%2BB9xB3NvPFJQ@mail.gmail.com> <1345766109.27688.606.camel@revolution.hippie.lan> <CAJ-VmomFhqV5rTDf-kKQfbSuW7SSiSnqPEjGPtxWjaHFA046kQ@mail.gmail.com> <F8C9E811-8597-4ED0-9F9D-786EB2301D6F@bsdimp.com> <1346002922.1140.56.camel@revolution.hippie.lan>

On Aug 26, 2012, at 11:42 AM, Ian Lepore wrote:

> On Thu, 2012-08-23 at 22:00 -0600, Warner Losh wrote:
>> The bottom line is that you can't mix things like that when cache
>> lines are involved.  The current code that tries is doomed to failure.
>> Doomed. You just can't control all flushes, as Ian's missive
>> demonstrates, and trying to accommodate code that does this I don't
>> think can possibly work.  All the interrupt masking, copying in and
>> out, etc I fear is doomed to utter and abject failure.
>>
> Until last weekend I was in the camp that thought the partial cacheline
> flush problem was solvable with sufficiently clever code.  Now I agree
> that we're doomed to failure and it's time to try another direction.
>
> We're going to have some implementation work to do in arm and mips
> busdma, but I think the larger part of the task is going to be defining
> more rigorously how a driver must interact with the busdma system to
> function correctly on all types of platforms, and then update existing
> drivers to conform.
>
> The busdma manpage currently has some vague words about the usage and
> sequencing of sync ops, such as "If read and write operations are not
> preceded and followed by the appropriate synchronization operations,
> behavior is undefined."  I think we should more explicitly spell out
> what the appropriate sequences are.  In particular:
>
>      * The PRE and POST operations must occur in pairs; a PREREAD must
>        be followed eventually by a POSTREAD and a PREWRITE must be
>        followed by a POSTWRITE.

PREREAD means "I am about to tell the device to put data here; have
whatever might be pending in the CPU complex get out of the way."
Usually this means 'invalidate the cache for that range', but not
always.  POSTREAD means 'The device's DMA is done, I'd like to start
accessing the memory now.'  If the memory will be thrown away without
being looked at, does the driver necessarily need to issue the
POSTREAD?  I think so, but I don't know if that's a new requirement.

>      * The CPU is not allowed to access the mapped memory after a PRE
>        sync and before the corresponding POST sync.

Correct.

>      * The DMA hardware is not allowed to access the mapped memory
>        after a POST sync and before the next PRE sync.

Correct.

>      * Read and write sync operators may be combined in a single call,
>        PRE and POST operators may not be.  E.G., PREREAD|PREWRITE is
>        allowed, PREREAD|POSTREAD is not.  We should note that while
>        read and write operations may be combined, on some platforms
>        PREREAD|PREWRITE is needlessly expensive when only a read is
>        being performed.

Correct.
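
To make the rules above concrete, here is a rough userspace model of
them in C.  It is not busdma code and not taken from any FreeBSD
source: the SYNC_* flag values, struct sync_model, and the function
names are all made up for illustration.  It also assumes that issuing
a second PRE of the same kind before its POST is a driver bug, which
the list above does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, used by this model only. */
enum {
	SYNC_PREREAD   = 1 << 0,
	SYNC_POSTREAD  = 1 << 1,
	SYNC_PREWRITE  = 1 << 2,
	SYNC_POSTWRITE = 1 << 3
};

#define	SYNC_PRE_MASK	(SYNC_PREREAD | SYNC_PREWRITE)
#define	SYNC_POST_MASK	(SYNC_POSTREAD | SYNC_POSTWRITE)

/* Tracks which PRE ops are still waiting for their matching POST. */
struct sync_model {
	int pending;	/* subset of SYNC_PRE_MASK */
};

/* Returns true and updates the state if op is legal, false otherwise. */
static bool
sync_model_apply(struct sync_model *m, int op)
{
	int pre = op & SYNC_PRE_MASK;
	int post = op & SYNC_POST_MASK;
	int need = 0;

	if (op == 0 || (op & ~(SYNC_PRE_MASK | SYNC_POST_MASK)) != 0)
		return (false);		/* empty or unknown flags */
	if (pre != 0 && post != 0)
		return (false);		/* PRE and POST may not be combined */
	if (pre != 0) {
		if ((m->pending & pre) != 0)
			return (false);	/* assumption: repeated PRE is a bug */
		m->pending |= pre;
		return (true);
	}
	/* Every POST must close an outstanding PRE of the same kind. */
	if (post & SYNC_POSTREAD)
		need |= SYNC_PREREAD;
	if (post & SYNC_POSTWRITE)
		need |= SYNC_PREWRITE;
	if ((m->pending & need) != need)
		return (false);
	m->pending &= ~need;
	return (true);
}

/* The CPU owns the memory only while no PRE is outstanding. */
static bool
cpu_may_access(const struct sync_model *m)
{
	return (m->pending == 0);
}

/* The device owns the memory between a PRE and its POST. */
static bool
dma_may_access(const struct sync_model *m)
{
	return (m->pending != 0);
}
```

In this model PREREAD|PREWRITE followed by POSTREAD|POSTWRITE is
accepted, PREREAD|POSTREAD is rejected, and a POSTREAD with no
outstanding PREREAD is rejected.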

> We also need some rules about working with buffers obtained from
> bus_dmamem_alloc() and external buffers passed to bus_dmamap_load().  I
> think the rule should be that a buffer obtained from bus_dmamem_alloc(),
> or more formally any region of memory mapped by a bus_dmamap_load(), is
> a single logical object which can only be accessed by one entity at a
> time.  That means that there cannot be two concurrent DMA operations
> happening in different regions of the same buffer, nor can DMA and CPU
> access be happening concurrently even if in different parts of the
> buffer.

There's something subtle that I'm missing.  Why would two DMA operations
be disallowed?  The rest makes good sense.

> I've always thought that allocating a dma buffer feels like a big
> hassle.  You sometimes have to create a tag for the sole purpose of
> setting the maxsize to get the buffer size you need when you call
> bus_dmamem_alloc().  If bus_dmamem_alloc() took a size parm you could
> just use your parent tag, or a generic tag appropriate to all the IO
> you're doing for a given device.  If you need a variety of buffers for
> small control and command and status transfers of different sizes, you
> end up having to manage up to a dozen tags and maps and buffers.  It's
> all very clunky and inconvenient.  It's just the sort of thing that
> makes you want to allocate a big buffer and subdivide it. Surely we
> could do something to make it easier?

You'd wind up creating a quick tag on the fly for the bus_dmamem_alloc()
if you wanted to do this.  Cleanup then becomes unclear.
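
On the "allocate a big buffer and subdivide it" temptation: the hazard
in this thread only shows up when two sub-buffers land in the same
cache line.  Below is a small userspace sketch in C of checking for
that, and of carving a buffer so it cannot happen.  Everything in it
is hypothetical: the 32-byte MODEL_CACHE_LINE is an assumed value (the
real line size is platform-specific), offsets are assumed to be
relative to a cache-line-aligned base, and none of the names come from
busdma.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	MODEL_CACHE_LINE	((size_t)32)	/* assumed line size */

/* Round n up to a multiple of the (power-of-two) line size. */
static size_t
line_round_up(size_t n)
{
	return ((n + MODEL_CACHE_LINE - 1) & ~(MODEL_CACHE_LINE - 1));
}

/* True if the two byte ranges touch at least one common cache line. */
static bool
regions_share_line(size_t off_a, size_t len_a, size_t off_b, size_t len_b)
{
	size_t first_a, last_a, first_b, last_b;

	if (len_a == 0 || len_b == 0)
		return (false);
	first_a = off_a / MODEL_CACHE_LINE;
	last_a = (off_a + len_a - 1) / MODEL_CACHE_LINE;
	first_b = off_b / MODEL_CACHE_LINE;
	last_b = (off_b + len_b - 1) / MODEL_CACHE_LINE;
	return (first_a <= last_b && first_b <= last_a);
}

/* Hands out sub-buffers that each start on their own cache line. */
struct carver {
	size_t size;	/* total bytes in the big buffer */
	size_t next;	/* next free offset, always line-aligned */
};

/* Returns true and stores the offset in *off, or false if out of room. */
static bool
carve(struct carver *c, size_t len, size_t *off)
{
	size_t padded = line_round_up(len);

	if (len == 0 || padded < len || padded > c->size - c->next)
		return (false);
	*off = c->next;
	c->next += padded;
	return (true);
}
```

Padding every sub-buffer out to a whole line trades memory for
isolation; it does nothing about the tag and cleanup questions above.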

Warner




