Date:      Mon, 10 Jan 2005 11:07:52 -0500
From:      "William H. Magill" <magill@mcgillsociety.org>
To:        David Gilbert <dgilbert@dclg.ca>
Cc:        freebsd-alpha@freebsd.org
Subject:   Re: processor type.
Message-ID:  <C81C63C4-6321-11D9-B8BD-000393768D2C@mcgillsociety.org>
In-Reply-To: <16866.32790.398095.651691@canoe.dclg.ca>
References:  <16866.32790.398095.651691@canoe.dclg.ca>

On 10 Jan, 2005, at 08:16, David Gilbert wrote:
> I see in the compiler lines crawling by that gcc is asked to optimize
> for 'EV5' while being compatible with 'EV4'.  My Alpha is an EV4 ---
> I'm wondering if I would see better performance with a different flag
> there, but the gcc manual doesn't even acknowledge the existence of
> the options that are in use, let alone the available options.

I'm not a programmer type, but it was pretty well known that GCC
generated anything from pretty poor to downright bad code on the
Alphas, no matter which one, when compared to the DEC C compiler
(known as ccc on the Linux tools CD).

However, I understand that GCC picked up (some? many? all?) of
the Alpha optimization enhancements offered by Compaq shortly before
the Intel/HP deal.

And I do know from long Tru64 experience that optimizing for a
particular EVx chip can make a big difference.

The man page for the DEC C compiler under Tru64 5.1A:
Compaq C V6.3-028 on Compaq Tru64 UNIX V5.1 (Rev. 732)
Compiler Driver V6.3-026 (sys) cc Driver
states:
(Note: There are two relevant options for ccc: -arch and -tune.)

-arch option
       Specifies which version of the Alpha architecture to generate
       instructions for. All Alpha processors implement a core set of
       instructions and, in some cases, the following extensions: BWX
       (byte/word-manipulation extension), MVI (multimedia extension),
       FIX (square root and floating-point convert extension), and CIX
       (count extension). (The Alpha Architecture Reference Manual
       describes the extensions in detail.)

       The option specified by the -arch option determines which
       instructions the compiler can generate:

       generic
           Generate instructions that are appropriate for all Alpha
           processors. This option is the default.

       host
           Generate instructions for the processor that the compiler is
           running on (for example, EV6 instructions on an EV6
           processor).

       ev4,ev5
           Generate instructions for the EV4 processor (21064, 21064A,
           21066, and 21068 chips) and EV5 processor (some 21164 chips).
           (Note that chip number 21164 is used for both EV5 and EV56
           processors.)

           Applications compiled with this option will not incur any
           emulation overhead on any Alpha processor.

       ev56
           Generate instructions for EV56 processors (some 21164 chips).

           This option permits the compiler to generate any EV4
           instruction, plus any instructions contained in the BWX
           extension.

           Applications compiled with this option may incur emulation
           overhead on EV4 and EV5 processors.

       ev6 Generate instructions for EV6 processors (21264 chips).

           This option permits the compiler to generate any EV6
           instruction, plus any instructions contained in the following
           extensions: BWX, MVI, and FIX.

           Applications compiled with this option may incur emulation
           overhead on EV4, EV5, EV56, and PCA56 processors.

       ev67
           Generate instructions for EV67 processors (21264A chips).

           This option is the same as the ev6 option except that it also
           permits the compiler to generate any instructions contained
           in the CIX extension.

           If your application uses CIX instructions, it may incur
           emulation overhead on all processors that are older than
           EV67.

       pca56
           Generate instructions for PCA56 processors (21164PC chips).

           This option permits the compiler to generate any EV4
           instruction, plus any instructions contained in the BWX and
           MVI extensions.

           Applications compiled with this option may incur emulation
           overhead on EV4, EV5, and EV56 processors.

       A program compiled with any of the options will run on any Alpha
       processor.  Beginning with DIGITAL UNIX V4.0 and continuing with
       subsequent versions, the operating system kernel includes an
       instruction emulator. This capability allows any Alpha chip to
       execute and produce correct results from Alpha instructions--even
       if some of the instructions are not implemented on the chip.
       Applications using emulated instructions will run correctly, but
       may incur significant emulation overhead at run time.

       The psrinfo -v command can be used to determine which type of
       processor is installed on any given Alpha system.

       Note the following differences between the -arch evx and -tune evx
       options (where x designates a specific processor):

         +  -arch evx implies -tune evx, but -tune evx does not imply
            -arch evx.

         +  -arch evx can generate unguarded evx-specific instructions.
            If you run that application on a pre-evx processor, those
            instructions may get emulated (and emulated instructions can
            be up to 1000 times slower than actual instructions).

         +  -tune evx can generate evx-specific instructions, but those
            are always amask-guarded. That expands the code size but
            avoids instruction emulation.

         +  If you want the best performance possible on an evx
            processor and are not concerned about performance on earlier
            processors, the best choice would be -arch evx (which
            implies -tune evx).

         +  If you want good performance on an evx processor but also
            want the application to run reasonably fast on earlier
            processors, the best choice would probably be -tune evx.
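
As a rough illustration of the emulation rules in the -arch text above,
here is a small Python sketch. The extension table is transcribed from
the man page; the idea that each processor implements exactly the
extensions of its same-named -arch option is an assumption of this
sketch, and the function names are made up for the example (they are
not part of any compiler or tool).

```python
# Extensions each -arch option may emit, per the man page text above.
# Assumption: a processor implements exactly the extensions listed for
# the -arch option that carries its name.
ARCH_EXTENSIONS = {
    "ev4": frozenset(),
    "ev5": frozenset(),
    "ev56": frozenset({"BWX"}),
    "pca56": frozenset({"BWX", "MVI"}),
    "ev6": frozenset({"BWX", "MVI", "FIX"}),
    "ev67": frozenset({"BWX", "MVI", "FIX", "CIX"}),
}


def emulated_extensions(arch_option, processor):
    """Extensions the code may use that the processor lacks (hypothetical helper)."""
    return ARCH_EXTENSIONS[arch_option] - ARCH_EXTENSIONS[processor]


def processors_with_overhead(arch_option):
    """Processors that may hit the kernel emulator for code built with arch_option."""
    return sorted(
        cpu for cpu in ARCH_EXTENSIONS if emulated_extensions(arch_option, cpu)
    )


if __name__ == "__main__":
    for option in ARCH_EXTENSIONS:
        print(option, "->", processors_with_overhead(option))
```

In this model, code built with -arch ev4 or ev5 never touches the
emulator, which matches the man page's statement for that option.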

===============
-tune option
       Instructs the optimizer to tune the application for a specific
       version of the Alpha hardware. This will not prevent the
       application from running correctly on other versions of Alpha but
       it may run more slowly than generically-tuned code on those
       versions.

       The option argument can be one of the following, which selects
       instruction tuning appropriate for the listed processor(s):

       generic
           All Alpha processors.  This is the default.

       host
           The processor on which the code is compiled.

       ev4 The 21064, 21064A, and 21068 processors.

       ev5,ev56
           The 21164 processor.  (Both EV5 and EV56 are numbered 21164.)

       ev6 The 21264 processor.

       ev67
           The 21264A processor.

       See also the -arch option for an explanation of the differences
       between -tune and -arch.
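
The -arch notes quoted earlier put emulated instructions at up to 1000
times slower than native ones. A back-of-the-envelope Python sketch of
what that could mean for total run time follows. It assumes, purely for
illustration, that every native instruction costs the same and that the
1000x worst case applies to every emulated instruction;
relative_runtime is a hypothetical helper, not a real tool.

```python
def relative_runtime(emulated_fraction, penalty=1000.0):
    """Run time relative to fully native code.

    emulated_fraction: share of executed instructions that get emulated.
    penalty: cost of one emulated instruction in native-instruction
             units (1000 is the man page's "up to" figure, so this is
             a worst-case estimate).
    """
    if not 0.0 <= emulated_fraction <= 1.0:
        raise ValueError("emulated_fraction must be between 0 and 1")
    return (1.0 - emulated_fraction) + emulated_fraction * penalty


if __name__ == "__main__":
    for fraction in (0.0, 0.0001, 0.001, 0.01):
        print(f"{fraction:.4%} emulated -> {relative_runtime(fraction):.3f}x")
```

Under these assumptions, emulating just one instruction in a thousand
roughly doubles the run time, which is why -tune (guarded code, no
emulation) can be the safer choice when the binary must also run on
older chips.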


T.T.F.N.
William H. Magill
# Beige G3 [Rev A motherboard - 300 MHz 768 Meg] OS X 10.2.8
# Flat-panel iMac (2.1) [800MHz - Super Drive - 768 Meg] OS X 10.3.7
# PWS433a [Alpha 21164 Rev 7.2 (EV56)- 64 Meg] Tru64 5.1a
# XP1000  [Alpha 21264-3 (EV6) - 256 meg] FreeBSD 5.3
# XP1000  [Alpha 21264-A (EV 6.7) - 384 meg] FreeBSD 5.3
magill@mcgillsociety.org
magill@acm.org
magill@mac.com
whmagill@gmail.com


