Date:      Fri, 21 Jul 2000 20:32:00 +0200 (CEST)
From:      Gérard Roudier <groudier@club-internet.fr>
To:        Mike Smith <msmith@FreeBSD.ORG>
Cc:        freebsd-alpha@FreeBSD.ORG
Subject:   Re: fxp0 hangs on a PC164 using STABLE 
Message-ID:  <Pine.LNX.4.10.10007211948560.1465-100000@linux.local>
In-Reply-To: <200007202209.PAA00823@mass.osd.bsdi.com>


On Thu, 20 Jul 2000, Mike Smith wrote:

> > It is my opinion. You may disagree but it will be hard for anybody to
> > convince me that I am wrong. ;-)
>
> On x86, it's very hard for you to be right; the CPU specification and bus
> bridge behaviour both guarantee retirement of writes in order of issuance.
> This combined with strong cache coherency makes barriers irrelevant on
> this platform.

Let a PCI device perform:
        STORE A
        STORE B

Let the CPU perform and expect:
        LOAD B
        LOAD A

Let CPU speculative execution carry these out on the system BUS as:
        LOAD A
        LOAD B

My reading of the Intel docs didn't convince me that such reordering
is not possible.

Typically A is some indicator of an IO completion pushed to a completion
queue and B is the associated status data.
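
The load ordering discussed above can be sketched in C. This is only an illustration under stated assumptions: the names `dma_pair`, `pair_snapshot` and `load_b_then_a` are hypothetical, and C11 fences are specified for ordering between threads, so whether they are sufficient against a DMA-capable device depends on the platform. A real driver would probably use the read barrier its platform provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word area that the device fills: A first, then B. */
struct dma_pair {
        _Atomic uint32_t a;     /* stored first by the device  */
        _Atomic uint32_t b;     /* stored second by the device */
};

/* Values observed by the CPU, in the order they were loaded. */
struct pair_snapshot {
        uint32_t b;
        uint32_t a;
};

/*
 * Load B, then A. The acquire fence between the two loads asks both the
 * compiler and the CPU not to satisfy the load of A before the load of B.
 * Assumption: the fence also orders the loads as seen from the device side.
 */
static struct pair_snapshot
load_b_then_a(struct dma_pair *p)
{
        struct pair_snapshot s;

        assert(p != NULL);
        s.b = atomic_load_explicit(&p->b, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s.a = atomic_load_explicit(&p->a, memory_order_relaxed);
        return (s);
}
```

Without the fence, nothing in the C source prevents the two loads from being issued as LOAD A, LOAD B, which is the case in question.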
> As far as other platforms are concerned, however, you're quite correct.

Are you still so sure? ;-)

> There does need to be an extension to the busspace API to define a range
> of host memory with a tag/handle pair for barrier activity.

Hmmm... Barrier semantics vary so much between architectures that a
unified semantic that also addresses device drivers' concerns (not only
CPU<->CPU) is either close to impossible or will just be extremely poor,
in my opinion.

I will give a single example for Alpha: the wmb instruction on Alpha
orders memory-like stores and non-memory-like stores _independently_. That
means that the mb instruction must be used as a store barrier when both
kinds of stores are involved at the same time.
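
As a rough sketch of that rule, the choice of barrier could be written as a small C helper. Everything here is hypothetical (the `STORE_*` flags, the enum and the function name are invented for illustration), and it only encodes the rule stated above, not a complete model of Alpha ordering.

```c
#include <assert.h>

/* Hypothetical flags: kinds of stores that must be ordered against each other. */
#define STORE_MEMORY_LIKE       0x1u    /* for instance, ordinary RAM */
#define STORE_NON_MEMORY_LIKE   0x2u    /* for instance, device register space */

enum alpha_store_barrier { ALPHA_WMB, ALPHA_MB };

/*
 * Pick the store barrier for a set of store kinds. wmb orders each kind
 * only among itself, so it is enough when a single kind is involved;
 * when both kinds are involved, mb is selected instead.
 */
static enum alpha_store_barrier
alpha_pick_store_barrier(unsigned kinds)
{
        const unsigned both = STORE_MEMORY_LIKE | STORE_NON_MEMORY_LIKE;

        assert(kinds != 0 && (kinds & ~both) == 0);
        if (kinds == both)
                return (ALPHA_MB);
        return (ALPHA_WMB);
}
```

A driver that updates a structure in host memory and then writes a device register would fall into the "both kinds" case.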

We can, as a second example, consider:

1) Situations where a real barrier (called serialisation) is needed for
   IA32 and an mb is needed for Alpha.
2) Situations where only an mb is needed for Alpha but the IA32 does
   not require serialisation.

People who are happy with nightmares can also compare all the kinds of
barriers and implicit ordering rules that exist and try to elaborate a
unified semantic that is both simple and rich. ;-)

And people who are happy with nightmares to the power of two can also read
the documentation of various existing host bridges. ;-)

Most existing software BUS abstractions mostly address consistency in the
absence of concurrency. A common problem with PCI device / software driver
pairs is consistency under concurrency, which requires not only CPU
barriers but also care about the flushing of posted writes. Basically,
both the PCI device and the software driver may have to use different
kinds of barriers for consistency under concurrency to work as expected,
flushing posted write transactions when needed (using read transactions,
as we know).
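
The driver side of "flushing by reading" can be sketched as below. This is a minimal illustration with hypothetical names: `reg_write_flush` is invented, and the register window is a plain `volatile` array so the code can run anywhere. A real FreeBSD driver would go through `bus_space_write_4()` and `bus_space_read_4()` rather than a raw pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write a device register, then read it back. On PCI, a read from the
 * device is not allowed to pass writes posted earlier in the same
 * direction, so the read completes only after the write has reached
 * the device.
 */
static uint32_t
reg_write_flush(volatile uint32_t *regs, size_t idx, uint32_t val)
{
        assert(regs != NULL);
        regs[idx] = val;        /* may sit in a bridge posting buffer */
        return (regs[idx]);     /* read back to push the write through */
}
```

The same idea applies in the other direction: reading a device register can be used to make sure the data the device posted to host memory has arrived before the CPU looks at it.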

The drivers I maintain will always contain whatever is needed for them to be
as correct as I want them to be, modulo my knowledge and competence on
the platforms addressed, obviously.

  Gérard.






