IPA(5)                                                                  IPA(5)


NAME
     IPA - Inter-Procedural Analysis

DESCRIPTION
     This man page describes Inter-Procedural Analysis (IPA) in the MIPSpro
     compilers and how to use it most efficiently.

IPA Overview
     Most compiler optimizations work within a single procedure (for
     example, function or subroutine) at a time.  This helps keep the
     problems manageable, and is a key aspect of supporting separate
     compilation, because it allows the compiler to restrict attention to
     the current source file.

     This intra-procedural focus also presents serious restrictions.  By
     avoiding dependence on information from other procedures, an optimizer
     is forced to make worst-case assumptions about the possible effects of
     those procedures.  For instance, at boundary points including all
     procedure calls, the compiler must typically save (and/or restore) the
     state of all variables to or from memory.

     By contrast, IPA algorithms analyze more than a single procedure
     (preferably the entire program) at once.  The optimizations performed
     by the MIPSpro compilers' IPA utility include:

     * Inlining:  Calls to a procedure are replaced by a suitably modified
       copy of the called procedure's body inline, even if the callee is in
       a different source file.

     * Common block array padding:  Global arrays in Fortran may be padded
       (that is, their size increased) in order to reduce cache conflicts.

     * Constant propagation:  Formal parameters which always have a
       particular constant value can be replaced by the constant, allowing
       additional optimization.  Global variables which are initialized to
       constant values and never modified can be replaced by the constant.

     * Dead function elimination:  Functions which are never called can be
       removed from the program image, improving memory utilization.

     * Dead variable elimination:  Variables which are never actually used
       can be eliminated, along with any code that initializes them.

     * Global name optimizations:  Global names in shared code must
       normally be referenced via addresses in a global table, in case they
       are defined or preempted by another dynamic shared object (DSO).  If
       the compiler knows it is compiling a main program and that the name
       is defined in another of the source files comprising the main
       program, an absolute reference can be substituted, eliminating a
       memory reference.  For example, the following code can be used to
       load X:

            lw $4,%got_disp(X),$gp
            ld $5,0,$4

       It could be changed to the following:

            lui $4,%hi(X)
            ld $5,%lo(X),$4

     Similarly, IPA can optimize the data items that can be referenced by
     simple offsets from the GP register instead of depending on the user
     to provide an ideal value of the -G option (see cc(1) or f77(1)).
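
     As a rough source-level picture of two of the optimizations listed
     above, the following C sketch shows the effect of constant
     propagation and dead function elimination.  The names scale_by,
     scale_by_4, and unused_helper are hypothetical, and IPA transforms
     the compiler's intermediate representation rather than rewriting
     source, so this only illustrates the effect:

```c
/* Before IPA: every call site in the program passes factor == 4, but
 * a compiler that sees only this file cannot know that. */
int scale_by(int x, int factor)
{
    return x * factor;
}

/* Never called anywhere in the program: a candidate for dead
 * function elimination. */
int unused_helper(int x)
{
    return x - 1;
}

/* After IPA (conceptually): the formal parameter has been replaced by
 * the constant 4, so later optimization can simplify the multiply. */
int scale_by_4(int x)
{
    return x * 4;
}
```

     For any x, scale_by(x, 4) and scale_by_4(x) return the same value;
     the specialized form simply gives the optimizer more to work with.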

IPA and Compilation
     Because IPA must usually process code from multiple source files to be
     effective, it has significant effects on compilation.  These can be
     classified as affecting either the program build process itself, or
     the attributes of the resulting program.

   The Build Process
     The standard Unix C or Fortran compilation model involves two steps.
     First, each source file comprising a program is compiled independently
     of the others, producing a relocatable object file with the .o
     extension.  Then, the resulting object files are linked, along with
     any standard libraries, by ld(1).

     IPA works by postponing much of the compilation process until the link
     step, when all of the program components can be analyzed together.
     Specifically, the following occurs:

     * The compile step does initial processing of the source file, placing
       an intermediate representation of the procedures it contains into
       the output .o file instead of normal relocatable code.  Such object
       files are called WHIRL objects to distinguish them from normal
       relocatable object files.

       This choice is invoked by the -IPA: option.  This processing is
       really two phases:  the normal language processing, plus an IPA
       summary phase which collects local information to be used by IPA
       later.  Because most optimization and code generation options are
       transmitted via the IPA summary phase, they must be present on the
       command line for the compile step.

     * The link step, although it is still invoked by a single ld(1)
       command, becomes several steps.

       First, the linker invokes IPA, which analyzes and transforms all of
       the input WHIRL objects together, writing modified versions of the
       objects.  Then it invokes the compiler on each of the modified
       objects, producing a normal relocatable object.  Finally, it invokes
       the linker again for a normal linkage step, producing an executable
       program or DSO.

       The temporary files created during this expanded link step are all
       created in a temporary subdirectory of the output program's
       directory, unless the environment variable TMPDIR is specified, in
       which case the temporary subdirectory is created under $TMPDIR; like
       other temporary files produced by the compilation process, they are
       normally removed on completion (unless the -keep option is specified
       with the compiler command).

     IPA increases program build time in two ways.  First, although the IPA
     step may not be very expensive itself, it usually increases the size
     of the code by inlining, and therefore expands the time required by
     the rest of the compiler.  More importantly, because IPA analysis can
     propagate information from one module into optimization decisions in
     arbitrary other modules, even minor changes to a single component
     module cause most of the program compilation to be redone.  As a
     result, IPA is best suited for use on programs with a stable source
     base.

     Because full IPA is not always suitable, the MIPSpro compilers also
     support inlining without IPA in cases where both the call and the
     called subprogram are in the same file.  This feature, called the
     standalone inliner, is invoked by default when inlining is specified
     in C++, or may be explicitly invoked using the -INLINE options
     (described below).

   Program Attributes
     Like most optimization performed by the compiler, IPA should not
     change the program's behavior.  Nevertheless, it can affect the
     resulting program in subtle ways.

     The most important effects involve external symbols and DSOs.  Unlike
     other IPA implementations, the MIPSpro compiler does not require that
     all of the components of a program be subjected to IPA analysis.  The
     ld(1) command which invokes IPA may include object files (.o) and
     archives (.a) which have been compiled normally without IPA analysis,
     as well as referencing DSOs which, however they were compiled, cannot
     contribute detailed information because the DSO may be replaced with a
     different implementation after the program is built.

     To analyze the program's use (and non-use) of variables, IPA must
     determine what those unanalyzed objects might reference.  It does so
     by examining their external symbol references.  If an external symbol
     is never referenced by one of the unanalyzed objects, and its address
     is never taken in the code being analyzed, IPA can assume that the
     only possible way for the unanalyzed code to access the named object
     is if it is passed as a parameter to an unanalyzed subprogram.
     Otherwise, it must make worst-case assumptions.
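
     The distinction can be pictured with a small C sketch.  All names
     here are hypothetical, and unanalyzed_fill merely stands in for a
     routine from an object file that was compiled without IPA:

```c
/* Stand-in for a routine from an object that was not compiled with
 * IPA; IPA sees only its external symbol references. */
void unanalyzed_fill(int *buf, int n)
{
    int i;

    for (i = 0; i < n; i++)
        buf[i] = i;
}

/* Address never taken and not referenced by the unanalyzed object:
 * every access to it is visible to IPA. */
int call_count;

/* Reachable by unanalyzed code, but only because it is passed as a
 * parameter in analyzed_caller() below. */
int work_buf[4];

int analyzed_caller(void)
{
    call_count++;
    unanalyzed_fill(work_buf, 4);   /* the only path to work_buf */
    return work_buf[3] + call_count;
}
```

     Under the rule described above, call_count can be analyzed
     precisely, while work_buf must be treated as possibly modified
     across the call to unanalyzed_fill.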

     This approach is safe for normal relocatable objects and archives,
     because they cannot be changed without rebuilding (and reanalyzing)
     the program.  For DSOs, however, there is a danger that a modified
     version will make additional references.  To prevent this from causing
     changed program behavior, IPA changes all of the external symbols
     which it assumes to be unreferenced to HIDDEN or INTERNAL export class,
     which prevents them from being referenced inadvertently by a future
     version of the DSO.  Thus, the DSO case is safe as well.

     The DSO treatment does have an important side effect, though.  All
     referenced DSOs should be referenced on the ld(1) command line.
     Otherwise, their external references will not be seen by IPA, and the
     symbols they require may not be visible at runtime, causing failure.
     If you cannot give references to all DSOs used (for example, because
     you will be accessing them with dlopen(3) and don't want them
     automatically loaded with your program), then you must provide an
     explicit exported symbol list with the ld(1) -exported_symbol or
     -exports_file options.  See dso(5) for more information.

     Because IPA combines code from multiple source files, it is
     recommended that the same compilation options be used for every source
     file that is compiled with IPA analysis.  If it is important for a
     single file to be given unusual compilation options that would be
     inappropriate for the rest of the program (such as rounding mode
     control), it is recommended that that file be compiled without IPA
     analysis.

IPA Options
     IPA is controlled from the command line with two option groups.
     -INLINE:  controls inlining by the standalone inliner.  It should be
     used on your compile command line.  -IPA: controls general IPA
     choices.  If you use separate compile and link steps (that is, you
     compile with -c and then link the resulting .o files with a separate
     cc, f77, or ld command), you must specify -IPA on the compile step,
     with or without the individual options described below, and again on
     the link step, together with any of the following options you want.

     The command line format used for group options is

          -groupname:option[=value][:option2[=value2]]...

     The groupname (IPA or INLINE) is followed by a colon-separated list of
     options, each of which is an option name possibly followed by an equal
     sign and a value.  The option names may generally be abbreviated by
     truncating them to a unique prefix (which may change when new options
     are added to the group).

     Some options are specified with a setting that will enable or disable
     the feature.  To enable a feature, specify the option alone or with
     =1, =ON, or =TRUE; to disable the feature, specify the option with =0,
     =OFF, or =FALSE.  This man page only shows the ON or OFF settings, but
     the other settings are equally valid.
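
     The group option format can be sketched as a small C parser.  This
     is only an illustration of the syntax described above, not the
     compiler driver's code: the function names split_group and
     switch_value are hypothetical, matching ON/OFF/TRUE/FALSE without
     regard to case is an assumption, and abbreviation of option names
     to a unique prefix is not handled.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive string equality (ASCII only). */
static int eq_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Interpret an enable/disable value: returns 1 (enable), 0 (disable),
 * or -1 if the value is not a recognized setting.  A NULL value means
 * the option was given alone, which enables the feature. */
int switch_value(const char *value)
{
    if (value == NULL)
        return 1;
    if (eq_nocase(value, "1") || eq_nocase(value, "ON") ||
        eq_nocase(value, "TRUE"))
        return 1;
    if (eq_nocase(value, "0") || eq_nocase(value, "OFF") ||
        eq_nocase(value, "FALSE"))
        return 0;
    return -1;
}

/* Split "-group:opt[=val][:opt2[=val2]]..." in place.  Stores the
 * group name in *group and up to max option names and values (a value
 * is NULL when no '=' is present).  Returns the number of options
 * found, or -1 if arg does not start with '-'. */
int split_group(char *arg, char **group, char *names[], char *values[],
                int max)
{
    int n = 0;
    char *p;

    if (arg[0] != '-')
        return -1;
    *group = arg + 1;
    p = strchr(arg, ':');
    if (p == NULL)
        return 0;               /* bare group name: default settings */
    *p++ = '\0';
    while (p != NULL && *p != '\0' && n < max) {
        char *next = strchr(p, ':');
        char *eq;

        if (next != NULL)
            *next++ = '\0';     /* terminate this option */
        eq = strchr(p, '=');
        if (eq != NULL)
            *eq++ = '\0';       /* separate the name from its value */
        names[n] = p;
        values[n] = eq;
        n++;
        p = next;
    }
    return n;
}
```

     Given the argument -INLINE:never=errfunc:must=accessor,solver:all,
     split_group reports the group INLINE and three options; the last
     one, all, has no value, so switch_value treats it as enabled.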

   INLINE Options
     The -INLINE option group controls the application of subroutine
     inlining done by the standalone inliner, or by the main inliner if
     -IPA options are enabled.
     Normally, the calls to be replaced by an inlined copy of the called
     subprogram are chosen by heuristics internal to the inliner.

     Most of the options in this group provide control over those choices.
     The individual controls in this group are:

     =(setting)
               Enables or disables stand-alone inlining processing. setting
               can be either ON or OFF.

               This is ignored with a warning for compiles which invoke
               main IPA processing.  When both =ON and =OFF are given on the
               command line (for a compile which will not invoke main IPA
               processing), =OFF is processed and =ON is overridden with a
               warning.  If used within a specfile read by the stand-alone
               inliner, =OFF skips inline processing within the stand-alone
               inliner and =ON is ignored with a warning.

     all       Changes the default inlining heuristic to attempt to inline
               all routines which are not excluded by a never option or a
               pragma suppressing inlining, either for the routine or for a
               specific callsite.  This option conflicts with none; all
               takes precedence if both are specified. This option may
               cause performance problems because the size of the
               procedures may be very large after inlining. If this option
               must be used, then increase the Olimit by using the
               -OPT:Olimit=n option to enable procedures to be optimized.
               See the opt(5) man page for details.  This may affect
               compilation speed.

     alloca[=setting]
               Enables save/restore of stack when inlining calls with
               alloca.  setting can be either ON or OFF.  If the callee has
               an alloca, then inlining it would normally remove the
               function boundary at which the dynamic space would have been
               released.  This option saves and restores the stack pointer
               before and after the inlined code. Having the option ON is
               essential for correctness (for example, in code where the
               callee is inside a loop).  The default is ON.
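
               The loop case mentioned above can be sketched in C as
               follows.  The function names are hypothetical, and the
               sketch assumes alloca is declared in <alloca.h>, as it
               is on many Unix systems:

```c
#include <alloca.h>
#include <string.h>

/* Callee: uses alloca() for scratch space.  The space is released
 * automatically when this function returns. */
static int checksum_block(int seed)
{
    unsigned char *scratch = alloca(4096);
    int i, sum = 0;

    memset(scratch, seed & 0xff, 4096);
    for (i = 0; i < 4096; i++)
        sum += scratch[i];
    return sum;
}

/* Caller: invokes the callee in a loop.  If checksum_block() were
 * inlined here without saving and restoring the stack pointer, each
 * iteration would leave another 4096 bytes allocated in this frame. */
long run_many(int iterations)
{
    long total = 0;
    int i;

    for (i = 0; i < iterations; i++)
        total += checksum_block(1);
    return total;
}
```

               With the function boundary intact, run_many(100000)
               reuses the same 4096 bytes on every iteration.  An
               inlined copy without the save and restore would instead
               try to accumulate roughly 400 Mbytes of stack.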

     dfe[=setting]
               Performs dead function elimination in the standalone
               inliner.  setting can be either ON or OFF.  This option
               removes any functions that are inlined everywhere they are
               called and are not visible outside the current module.
               C/C++: the default is TRUE when not compiled with -g, FALSE
               otherwise.

     file=filename
               Provides cross-file inlining from within the stand-alone
               inliner. The option searches for routines provided in a
               -INLINE:must list option, in the filename specified. The
               filename provided by this option must be generated using the
               -IPA -c option.  The file generated contains information
               used to perform cross-file inlining.

               For example, suppose the following two files exist:  foo.f
               and bar.f. The following is part of the code from foo.f:

                    program main

                     ....

                    call bar()

                    end

               The following is partial code from bar.f:

                    subroutine bar()

                     ...

                    end

               To inline bar into main using the stand-alone inliner,
               compile bar.f in the following way:

                    f77 -n32 -IPA -c bar.f

               This produces the file, bar.o.  To inline bar into foo.f,
               use:

                    f77 -n32 foo.f -INLINE:must=bar_:file=bar.o

     keep_pu_order[=setting]
               Preserves source subprogram ordering.  The default is OFF.

     library=name
               Identifies archive libraries where the inliner should search
               for subprograms.  This option is similar to -INLINE:file=,
               described previously, where the files (.o) in the archived
               library (.a) must have been generated using the -IPA -c
               options.  For example, suppose function foo is defined in
               foofile.f, and foofile.f was compiled with -IPA -c and the
               resulting object placed in an archive with this command:

                    ar rv foolib.a foofile.o

               The following command will inline the function foo from
               foofile.o (which is automatically extracted from foolib.a)
               into bar.c via the crossfile inlining mechanism described
               above:

                    cc -n32 -INLINE:must=foo:library=foolib.a bar.c

     list[=setting]
               Lists inlining actions to stderr as they occur.  The default
               is OFF.

     max_pu_size_inline[=n]
               Limits the size of inlined subprograms to n.  The default is
               5000.  Inlining is disabled if, after inlining, the caller
               exceeds this size.

     must=name1[,name2]
               Directs IPA to always attempt to inline any routines with
               names name1, name2, independent of the default inlining
               heuristic.  For C++, the names given must be the "mangled"
               names (see NOTES).  For Fortran, the name given may be
               either the original name, or the external name with the
               underscore appended by the compiler.  In all cases, the
               option applies to all routines encountered with the given
               name, whether static or extern.  In C, a pragma suppressing
               inlining at a particular callsite takes precedence over this
               option.

     never=name1[,name2]
               Directs IPA to never attempt to inline any routines with the
               names name1, name2, independent of the default inlining
               heuristic.  For C++, the names given must be the "mangled"
               names (see NOTES).  For Fortran, the name given may be
               either the original name, or the external name with the
               underscore appended by the compiler.  In all cases, the
               option applies to all routines encountered with the given
               name, whether static or extern.  A pragma requesting
               inlining at a particular callsite takes precedence over this
               option.

     none      Changes the default inlining heuristic so the compiler does
               not attempt to inline any routines which are not specified
               by a must option or a pragma requesting inlining, either for
               the routine or for a specific callsite.  This option
               conflicts with all; all takes precedence if both are
               specified.

     preempt[=setting]
               Enables inlining of functions marked preemptible in the
               standalone inliner.  The default is OFF.  Such inlining
               prevents another definition of such a function in another
               DSO from pre-empting the definition of the function being
               inlined.

     specfile=filename
               Opens filename to read additional options.  The
               specification file contains zero or more lines with inliner
               options in the form expected on the command line.

               For example, the file might contain a single line, similar
               to the following:

                    INLINE:never=errfunc:must=accessor,solver

               It can also contain multiple lines, as in the following:

                    INLINE:all
                    INLINE:never=errfunc

               The specfile option cannot occur in a specification file, so
               specification files cannot invoke other specification files.

     static[=setting]
               Performs inlining of static functions in the standalone
               inliner.  setting can be either ON or OFF.  C/C++: The
               default is TRUE at -O2 optimization levels, FALSE otherwise.

   IPA Options
     The IPA options control the interprocedural analyses and
     transformations performed.  Using the group name without any options
     (for example, -IPA) invokes IPA with the default settings.

     When -apokeep, -pcakeep, or -pfakeep is specified in conjunction with
     -ipa or -IPA, the default settings for IPA suboptions are used, with
     the exception of the inline=setting suboption, which is set to OFF.

     The individual controls in this group are:

     addressing[=setting]
               Enables or disables the analysis of address operator usage.
               setting can be either ON or OFF.  -IPA:alias=ON is a
               prerequisite.  The default is OFF.

     aggr_cprop[=setting]
               Enables or disables aggressive interprocedural constant
               propagation.  setting can be either ON or OFF.  Attempts to
               avoid passing constant parameters, replacing the
               corresponding formal parameters by the constant values.  By
               default, less aggressive interprocedural constant
               propagation is done.  The default is OFF.

     alias[=setting]
               Enables or disables alias/mod/ref analysis. setting can be
               ON or OFF.  The default is OFF.

     autognum[=setting]
               Determines the optimal value of the -Gnum option.  setting
               can be ON or OFF.  This option identifies a size bound below
               which data can be allocated relative to the global pointer
               and accessed cheaply.  This optimization is turned off when
               -multigot is specified in the linker command line.  The
               default is ON.  See also -IPA:Gnum.

     cgi[=setting]
               Enables or disables constant global variable identification.
               setting can be ON or OFF.  This option marks non-scalar
               global variables which are never modified as constant, and
               propagates their constant values to all files.  The default
               is ON.

     common_pad_size=n
               Specifies the amount by which to pad common block array
               dimensions.  By default, the compiler automatically chooses
               the amount of padding to improve cache behavior for common
               block array accesses.

     cprop[=setting]
               Enables or disables interprocedural constant propagation.
               setting can be ON or OFF.  This option identifies formal
               parameters which always have a specific constant value.  The
               default is ON.  See also -IPA:aggr_cprop.

     depth=n   This option is identical to maxdepth=n.

     dfe[=setting]
               Enables or disables dead function elimination.  setting can
               be ON or OFF.  This option removes subprograms which are
               never called from the program.  The default is ON.

     dve[=setting]
               Enables or disables dead variable elimination.  setting can
               be ON or OFF.  This option removes variables which are never
               referenced from the program. The default is ON.

     echo[=setting]
               Echo (to stderr) the compile commands and the final link
               command which are invoked from IPA.  setting can be ON or
               OFF.  This can help monitor progress of a large system
               build.  The default is OFF.

     forcedepth=n
               Sets inline depths.  Instead of the default inlining
               heuristics, this option directs IPA to attempt to inline
               all functions at a depth of (at most) n in the callgraph,
               where functions which make no calls are at depth 0, those
               which call only depth 0 functions are at depth 1, and so on.
               This ignores the default heuristic limits on inlining.
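
               The depth definition above can be written as a short
               recursive computation.  This C sketch is an illustration
               only: the adjacency-matrix representation and the name
               call_depth are hypothetical, and it assumes the call
               graph contains no recursion, a case this page does not
               describe.

```c
#define MAX_FUNCS 16

/* calls[i][j] != 0 means function i calls function j.
 * A function that makes no calls has depth 0; otherwise its depth is
 * one more than the deepest function it calls.
 * Assumes the call graph is acyclic (no recursion). */
int call_depth(int calls[MAX_FUNCS][MAX_FUNCS], int nfuncs, int f)
{
    int j, depth = 0;

    for (j = 0; j < nfuncs; j++) {
        if (calls[f][j]) {
            int d = call_depth(calls, nfuncs, j) + 1;

            if (d > depth)
                depth = d;
        }
    }
    return depth;
}
```

               For a program in which main calls solve and leaf, and
               solve calls leaf, the depths are leaf 0, solve 1, and
               main 2, so a depth limit of 1 would cover leaf and
               solve but not main.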

     Gfactor=n Sets percentage for GOT multiplication.  n is the percentage
               used to multiply the estimated external GOT entries for
               estimating the total .got size. A value of n=200 means that
               IPA will multiply the estimated external GOT entries by 2 to
               get the estimated total .got size.  The default is 200.
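
               The arithmetic is simple enough to state as a C sketch.
               The function name is hypothetical, and the use of
               truncating integer division is an assumption:

```c
/* Scale the estimated number of external GOT entries by a Gfactor
 * percentage to estimate the total .got size. */
unsigned long estimate_got_entries(unsigned long external_entries,
                                   unsigned int gfactor_percent)
{
    return external_entries * gfactor_percent / 100;
}
```

               With the default of 200, an estimate of 1500 external
               entries yields an estimated total of 3000.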

     Gnum=n    User-specified Gnum.  The default is no limit.

     Gspace=n  User-specified size (in bytes) for the area where IPA can
               allocate data that can be referenced relative to the global
               pointer.  The default is 64 Kbytes, which is the maximum
               valid value.

     gp_partition[=setting]
               Enables partitioning for archiving different GP-groups, as
               specified by the user externally or determined by IPA
               internally. setting can be ON or OFF.  This option
               enables -IPA:picopt in the presence of -multigot.  The
               default is OFF.

     inline[=setting]
               Performs inter-file subprogram inlining during main IPA
               processing.  setting can be ON or OFF.  The default is ON.
               This does not affect the standalone inliner.

     intrinsics=n
               Sets the number of Fortran intrinsic functions in the GOT
               area. This number is added to the estimated external GOT
               entries to get the estimated total .got size. IPA has
               difficulty in estimating the number of Fortran intrinsic
               functions that will be added by the Lowerer after the IPA
               phase.

     keeplight[=setting]
               Directs IPA to not send -keep to the compiler, in order to
               save space. setting can be ON or OFF.  The default is OFF.

     linear[=setting]
               Sets linearization of array references.  setting can be ON
               or OFF.  When inlining Fortran subroutines, IPA tries to map
               formal array parameters to the shape of the actual
               parameter, but it may not always be able to do so.  When it
               cannot map the parameter, it linearizes the array reference.
               By default, IPA does not inline such callsites because
               linearization may cause performance problems.  The
               default is OFF.
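
               To make "linearizes" concrete, the following C sketch
               computes the one-dimensional offset of a Fortran-style
               two-dimensional element, using Fortran's column-major
               layout and 1-based subscripts.  The function name is
               hypothetical, and this only illustrates the index
               arithmetic, not IPA's implementation:

```c
/* Zero-based linear offset of element (i, j) of a column-major array
 * whose first dimension has extent dim1; i and j are 1-based. */
long linear_offset(long i, long j, long dim1)
{
    return (i - 1) + (j - 1) * dim1;
}
```

               For example, if the caller passes an array declared
               A(10,10) and the callee declares the formal as B(5,20),
               the shapes do not match: B(1,3) is at linear offset 10,
               which is the same storage as A(1,2).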

     map_limit=n
               Controls when IPA enables sp_partition.  n is the maximum
               size (in bytes) of input files mapped before IPA enables
               -IPA:sp_partition.

     maxdepth=n
               Directs IPA to not attempt to inline functions at a depth of
               more than n in the callgraph, where functions which make no
               calls are at depth 0, those which call only depth 0
               functions are at depth 1, and so on.  Inlining remains
               subject to overriding limits on code expansion.  See also
               forcedepth, space, and plimit.

     max_job=n Limits the maximum parallelism when invoking the compiler
               after IPA to (at most) n compilations running at once.  The
               default is 2 on a uniprocessor host, 4 on a multiprocessor
               host.

     multi_clone=n
               Specifies the maximum number of clones that can be created
               from a single procedure.  By default, this value is 0.
               Aggressive procedure cloning may provide opportunities for
               interprocedural optimization, but it also may significantly
               increase the code size.
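
               As a source-level illustration of what a clone is, the
               C sketch below shows one procedure and two specialized
               copies.  The names are hypothetical, and the criteria
               IPA uses to decide when to clone are not described on
               this page:

```c
/* Original procedure: mode selects the behavior at run time. */
int transform(int x, int mode)
{
    if (mode == 0)
        return x + 1;
    return x * x;
}

/* Clone for call sites that always pass mode == 0. */
int transform_mode0(int x)
{
    return x + 1;
}

/* Clone for call sites that always pass a nonzero mode. */
int transform_mode1(int x)
{
    return x * x;
}
```

               Each clone drops the test on mode, at the cost of an
               extra copy of the procedure in the program image.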

     node_bloat=n
               When used in conjunction with IPA:multi_clone, this
               specifies the maximum percentage growth of the total number
               of procedures relative to the original program.

     partition_group=[symbolname[%symbol]|filename%F]...
               Specifies EXTERNAL symbols belonging to the same group.  All
               unspecified symbols are considered by IPA as belonging to
               the COMMON group, which has the properties of always being
               in memory and available for inlining.  Following the
               symbolname, the user can specify the properties for that
               symbol by adding a percentage symbol (%), followed by the
               property wanted.

               symbol can be I or G.  I indicates that the symbol is used
               only within the partition; G indicates that the symbol
               should be marked as GP-relative (for DATA symbols only).

               Instead of specifying the symbol, the user can specify a
               gp_partition per file, as in the following:

                    partition_group=file_name%F

               Then every defined EXTERNAL symbol that exists in that file
               will have the same group. file_name must be specified in the
               same way that the file is specified in the link-line.  See
               the following example:

                    cc -IPA:gp_partition=on:partition_group=/usr/tmp/p007.o%F:partition_group=./add.o%F \
                         /usr/tmp/p007.o ./add.o

     picopt[=setting]
               Performs PIC optimizations.  This involves turning
               preemptible symbols to non-preemptible symbols whenever
               possible, either through IPA's own analysis or through user
               specifications (required to build DSOs).  The following are
               major benefits of this action:

               * Enables other IPA optimizations such as inlining, constant
                 propagation, DFE, etc.

               * Turns indirect calls to direct calls.

               * Eliminates the generation of GP-prologs, thus generating
                 fewer instructions.

               The preceding benefits are automatically accomplished by IPA
               for building executables.  To obtain similar benefits for
               building DSOs, the user must specify which symbols are
               preemptible.  This can be done by using the ld -exports_file
               option. See the ld man page for details.  The default is ON.

     plimit=n  Stops inlining into a particular subprogram once it reaches
               size n in the intermediate representation.  The default is
               2500.

     relopt[=setting]
               Enables optimizations similar to those achieved with the
               compiler options -O3 and -c, where objects are built with
               the assumption that the compiled objects will be linked into
               a call-shared executable later. setting can be ON or OFF.
               In effect, optimizations based on position-dependent code
               (non-PIC) are performed on those objects.  The default is
               OFF.

     space=n   Stops inlining when the program size has increased by n%.
               For example, n=20 limits code expansion due to inlining to
               approximately 20%.  The default is 100%.
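
               The way the plimit and space limits bound inlining can
               be sketched as a single check in C.  This is a guess at
               the shape of the test, not the inliner's actual logic:
               the function name is hypothetical, and the size units
               and the way the real inliner combines these limits with
               its other heuristics are not described on this page.

```c
/* Returns 1 if inlining may continue, 0 if a limit has been reached.
 * caller_size: current size of the subprogram being inlined into
 * plimit:      value of the plimit option
 * orig_prog:   program size before inlining
 * cur_prog:    program size now
 * space_pct:   value of the space option, in percent */
int may_continue_inlining(long caller_size, long plimit,
                          long orig_prog, long cur_prog, long space_pct)
{
    if (caller_size >= plimit)
        return 0;       /* this subprogram has reached plimit */
    if ((cur_prog - orig_prog) * 100 >= orig_prog * space_pct)
        return 0;       /* program has grown by space_pct percent */
    return 1;
}
```

               With space=20 and an original program size of 10000,
               this check stops inlining once the program reaches
               12000, whatever the size of the individual caller.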

     sp_partition[=setting]
               Enables partitioning to save disk and address space.
               setting can be ON or OFF.  Mainly used for building huge
               programs (for example, PTC). Partitioning should normally be
               done by IPA internally.  The default is OFF.

     specfile=filename
               Opens filename to read more options.  A specfile contains
               zero or more of the options allowed by IPA.  For example,
               with -IPA:specfile=option_file, the file option_file can
               contain any -IPA options in the form they would take on the
               command line, such as the following:

      -IPA:gp_partition=on:partition_group=p007.o%F:partition_group=add.o%F

               Because specfile= is not legal within a specfile, a specfile
               cannot point at other specfiles.

     use_intrinsic[=setting]
               Enables loading the intrinsic version of standard library
               functions.  This option causes inlining of the malloc
               library.  This improves small object allocations, but is not
               mp safe.  setting can be ON or OFF.  The default is OFF.

NOTES
     Both IPA and standalone inlining are disabled when -g is specified on
     the compiler's command line.

   C/C++ Mangled Name
     For specifying routine names to the -INLINE:never=name and
     -INLINE:must=name options for C++ programs, the mangled internal name
     must be used.  Because C++ allows overloading (that is, the use of the
     same name for multiple objects which can be distinguished by type), it
     uses internal names which are constructed from the original name and
     an encoded version of the object's type.  To find this mangled name,
     do the following:

     * Compile the source module where the name is defined, as in this
       example:

          CC -c source.cxx -o source.o

     * Use the original name to find the mangled name in the object file
       using nm(1) and grep(1).  For example, if the original name was
       mysub, use the following:

          nm -B source.o | grep mysub
          0f89f950 T mysub__10yourclassFv
          0f89facc T mysub__10myclassFv

     * The previous commands might produce several potential matches,
       particularly if overloading is actually occurring.  If so, use the
       c++filt filter to determine which one is the one you want:

          /usr/lib/c++/c++filt
          mysub__10myclassFv
          myclass::mysub(void)

       You can continue entering possible names until one of them matches
       the name you want.

SEE ALSO
     cc(1), f77(1), ld(1)
     dso(5)

     MIPSpro C and C++ Pragmas, publication 007-3587-001

     C Language Reference Manual, publication 007-0701-120

     Compiler Information File (CIF) Reference Manual

     MIPSpro Fortran 77 Programmer's Guide

     MIPSpro Fortran 90 Commands and Directives Reference Manual

     MIPSpro 64-Bit Porting and Transition Guide