Date:      Mon, 7 Sep 2015 12:55:27 -0700
From:      Leonardo Fogel <leonardofogel@yahoo.com.br>
To:        freebsd-drivers@freebsd.org
Subject:   Re: Memory barrier
Message-ID:  <1441655727.36257.YahooMailBasic@web120802.mail.ne1.yahoo.com>
In-Reply-To: <20150906180311.GS2072@kib.kiev.ua>

> > Case 1:
> >    bus_write_1(region_0, ...);
> >    /* barrier here */
> >    DELAY(some_time);
> >
> > Case 2:
> >    bus_write_1(region_0, ...);
> >    /* barrier here */
> >    bus_write_1(region_2, ...);
> >
> > In the first one, I want the write to reach the device before the thread busy-waits.
> >
> > In the second one, I want the write to a device (e.g. power management) to
> >  complete before the write to another starts/completes.
>
> I believe that the bus_write semantic includes the required serialization.
> E.g., on x86 all CPU write buffers are flushed before the write instruction
> is declared completed, because this is the semantic of the uncacheable
> memory.  For powerpc, the system automatically inserts powerpc_iomb() after
> the write, which is full sync.  I am not aware of other architectures.

I've found the implementation of the bus_space_barrier for the ARM architecture (the one in which I'm interested):

   generic_bs_barrier(bus_space_tag_t t, bus_space_handle_t bsh, bus_size_t offset,
       bus_size_t len, int flags)
   {

           /*
            * dsb() will drain the L1 write buffer and establish a memory access
            * barrier point on platforms where that has meaning.  On a write we
            * also need to drain the L2 write buffer, because most on-chip memory
            * mapped devices are downstream of the L2 cache.  Note that this needs
            * to be done even for memory mapped as Device type, because while
            * Device memory is not cached, writes to it are still buffered.
            */
           dsb();
           if (flags & BUS_SPACE_BARRIER_WRITE) {
                   cpu_l2cache_drain_writebuf();
           }
   }

The ARM architecture specifies two _data_ barrier instructions: DMB and DSB. DMB orders memory accesses with respect to each other; DSB does that too and, in addition, holds back later instructions until the earlier accesses have completed. So, DSB is the answer to Case 1, and DMB or DSB is the answer to Case 2.

The implementation above shows something I was not aware of: it also drains the L2 write buffer. In older implementations, the "PL310 Store Buffer did not have any automatic draining mechanism." (ARM CoreLink Level 2 Cache Controller (L2C-310 or PL310), r3 releases, Software Developers Errata Notice.) In newer implementations, writes to device memory are "Put in store buffer, not merged, immediately drained to L3." (CoreLink Level 2 Cache Controller L2C-310 Technical Reference Manual, Revision: r3p3.)
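
To make the two-stage draining concrete, here is a toy userland model of Case 1 on an older PL310 without automatic draining. It is only a sketch and not kernel code: every model_* name and MODEL_* flag is hypothetical, and the buffers are reduced to a single slot each. The only thing it takes from the source is the shape of generic_bs_barrier() quoted above (dsb() first, then the L2 drain when the write flag is set).

```c
#include <stdint.h>

/* Toy userland model, not kernel code.  All names are hypothetical. */
#define MODEL_BARRIER_READ	0x01
#define MODEL_BARRIER_WRITE	0x02

struct model_path {
	uint8_t	l1_buf;		/* store posted in the L1 write buffer */
	int	l1_valid;
	uint8_t	l2_buf;		/* store held in the L2 store buffer */
	int	l2_valid;
	uint8_t	device;		/* value the device has actually seen */
};

/* Stand-in for bus_write_1(): the store is only posted. */
static void
model_write_1(struct model_path *p, uint8_t val)
{
	p->l1_buf = val;
	p->l1_valid = 1;
}

/* Stand-in for dsb(): push the posted store out of L1, into L2. */
static void
model_dsb(struct model_path *p)
{
	if (p->l1_valid) {
		p->l2_buf = p->l1_buf;
		p->l2_valid = 1;
		p->l1_valid = 0;
	}
}

/* Stand-in for cpu_l2cache_drain_writebuf(): L2 store buffer to device. */
static void
model_l2_drain(struct model_path *p)
{
	if (p->l2_valid) {
		p->device = p->l2_buf;
		p->l2_valid = 0;
	}
}

/* Same shape as the generic_bs_barrier() quoted above. */
static void
model_barrier(struct model_path *p, int flags)
{
	model_dsb(p);
	if (flags & MODEL_BARRIER_WRITE)
		model_l2_drain(p);
}

/* Case 1: returns what the device holds when the busy-wait would start. */
static uint8_t
model_case1(uint8_t val, int flags)
{
	struct model_path p = { 0 };

	model_write_1(&p, val);
	model_barrier(&p, flags);
	/* DELAY(some_time) would start here. */
	return (p.device);
}
```

In this model the device sees the value before the delay only when the write flag is passed; with a read-only barrier the store is still sitting in the L2 store buffer. In a real driver I would expect the equivalent call to be bus_barrier(res, offset, len, BUS_SPACE_BARRIER_WRITE) between the bus_write_1() and the DELAY(), but I have not tested that on hardware.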

Leonardo



