IPA(5)

NAME
IPA - Inter-Procedural Analysis

DESCRIPTION
This man page describes Inter-Procedural Analysis (IPA) in the MIPSpro compilers and how to use it most efficiently.

IPA Overview
Most compiler optimizations work within a single procedure (for example, a function or subroutine) at a time. This helps keep the problems manageable, and is a key aspect of supporting separate compilation, because it allows the compiler to restrict attention to the current source file.

This intra-procedural focus also presents serious restrictions. By avoiding dependence on information from other procedures, an optimizer is forced to make worst-case assumptions about the possible effects of those procedures. For instance, at boundary points including all procedure calls, the compiler must typically save (and/or restore) the state of all variables to or from memory.

By contrast, IPA algorithms analyze more than a single procedure (preferably the entire program) at once. The optimizations performed by the MIPSpro compilers' IPA utility include:

* Inlining: Calls to a procedure are replaced by a suitably modified copy of the called procedure's body inline, even if the callee is in a different source file.
* Common block array padding: Global arrays in Fortran may be padded (that is, their size increased) in order to reduce cache conflicts.
* Constant propagation: Formal parameters which always have a particular constant value can be replaced by the constant, allowing additional optimization. Global variables which are initialized to constant values and never modified can be replaced by the constant.
* Dead function elimination: Functions which are never called can be removed from the program image, improving memory utilization.
* Dead variable elimination: Variables which are never actually used can be eliminated, along with any code that initializes them.
* Global name optimizations: Global names in shared code must normally be referenced via addresses in a global table, in case they are defined or preempted by another dynamic shared object (DSO). If the compiler knows it is compiling a main program and that the name is defined in another of the source files comprising the main program, an absolute reference can be substituted, eliminating a memory reference. For example, the following code can be used to load X:

    lw $4,%got_disp(X),$gp
    ld $5,0,$4

  It could be changed to the following:

    lui $4,%hi(X)
    ld $5,%lo(X),$4

  Similarly, IPA can optimize the data items that can be referenced by simple offsets from the GP register instead of depending on the user to provide an ideal value of the -G option (see cc(1) or f77(1)).

IPA and Compilation
Because IPA must usually process code from multiple source files to be effective, it has significant effects on compilation. These can be classified as affecting either the program build process itself, or the attributes of the resulting program.

The Build Process
The standard Unix C or Fortran compilation model involves two steps. First, each source file comprising a program is compiled independently of the others, producing a relocatable object file with the .o extension. Then, the resulting object files are linked, along with any standard libraries, by ld(1).

IPA works by postponing much of the compilation process until the link step, when all of the program components can be analyzed together. Specifically, the following occurs:

* The compile step does initial processing of the source file, placing an intermediate representation of the procedures it contains into the output .o file instead of normal relocatable code. Such object files are called WHIRL objects to distinguish them from normal relocatable object files. This choice is invoked by the -IPA: option.
This processing is really two phases: the normal language processing, plus an IPA summary phase which collects local information to be used by IPA later. Because most optimization and code generation options are transmitted via the IPA summary phase, they must be present on the command line for the compile step. * The link step, although it is still invoked by a single ld(1) command, becomes several steps. First, the linker invokes IPA, which analyzes and transforms all of the input WHIRL objects together, writing modified versions of the objects. Then it invokes the compiler on each of the modified objects, producing a normal relocatable object. Finally, it invokes the linker again for a normal linkage step, producing an executable program or DSO. The temporary files created during this expanded link step are all created in a temporary subdirectory of the output program's directory, unless the environment variable TMPDIR is specified, in which case the temporary subdirectory is created under $TMPDIR; like other temporary files produced by the compilation process, they are normally removed on completion (unless the -keep option is specified with the compiler command). IPA increases program build time in two ways. First, although the IPA step may not be very expensive itself, it usually increases the size of the code by inlining, and therefore expands the time required by the rest of the compiler. More importantly, because IPA analysis can propagate information from one module into optimization decisions in arbitrary other modules, even minor changes to a single component module cause most of the program compilation to be redone. As a result, IPA is best suited for use on programs with a stable source base. Because full IPA is not always suitable, the MIPSpro compilers also support inlining without IPA in cases where both the call and the called subprogram are in the same file. 
This feature, called the standalone inliner, is invoked by default when inlining is specified in C++, or may be explicitly invoked using the -INLINE options (described following).

Program Attributes
Like most optimizations performed by the compiler, IPA should not change the program's behavior. Nevertheless, it can affect the resulting program in subtle ways. The most important effects involve external symbols and DSOs.

Unlike other IPA implementations, the MIPSpro compiler does not require that all of the components of a program be subjected to IPA analysis. The ld(1) command which invokes IPA may include object files (.o) and archives (.a) which have been compiled normally without IPA analysis, as well as referencing DSOs which, however they were compiled, cannot contribute detailed information because the DSO may be replaced with a different implementation after the program is built.

To analyze the program's use (and non-use) of variables, IPA must determine what those unanalyzed objects might reference. It does so by examining their external symbol references. If an external symbol is never referenced by one of the unanalyzed objects, and its address is never taken in the code being analyzed, IPA can assume that the only possible way for the unanalyzed code to access the named object is if it is passed as a parameter to an unanalyzed subprogram. Otherwise, it must make worst-case assumptions.

This approach is safe for normal relocatable objects and archives, because they cannot be changed without rebuilding (and reanalyzing) the program. For DSOs, however, there is a danger that a modified version will make additional references. To prevent this from causing changed program behavior, IPA changes all of the external symbols which it assumes to be unreferenced to HIDDEN or INTERNAL export class, which prevents them from being referenced inadvertently by a future version of the DSO. As a result, the DSO case is safe as well.
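The symbol reasoning above can be illustrated with a small set-based sketch. This is only one reading of the text, not IPA's actual implementation: the function name classify_symbols, its parameters, and the sample symbol names are all hypothetical.

```python
def classify_symbols(defined, referenced_by_unanalyzed, address_taken):
    """Sketch of the external-symbol reasoning described in the text.

    defined: symbols defined in the objects that IPA analyzes
    referenced_by_unanalyzed: external references found in normal .o/.a
        files and in DSOs named on the link line
    address_taken: symbols whose address is taken in the analyzed code

    Returns (assumed_unreferenced, precisely_analyzable). Both names are
    illustrative labels, not terms used by the compiler.
    """
    # Symbols that no unanalyzed object refers to by name. Per the text,
    # these are the ones whose export class is changed to HIDDEN/INTERNAL.
    assumed_unreferenced = {
        sym for sym in defined if sym not in referenced_by_unanalyzed
    }
    # Of those, only symbols whose address is never taken avoid the
    # worst-case assumptions.
    precisely_analyzable = {
        sym for sym in assumed_unreferenced if sym not in address_taken
    }
    return assumed_unreferenced, precisely_analyzable


if __name__ == "__main__":
    hidden, precise = classify_symbols(
        defined={"counter", "table", "helper"},
        referenced_by_unanalyzed={"helper"},
        address_taken={"table"},
    )
    print(sorted(hidden), sorted(precise))
```

In this sketch, a symbol referenced by an unanalyzed object stays exported and is treated with worst-case assumptions, which mirrors the "Otherwise" case in the text.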
The DSO treatment does have an important side effect, though. All referenced DSOs should be referenced on the ld(1) command line. Otherwise, their external references will not be seen by IPA, and the symbols they require may not be visible at runtime, causing failure. If you cannot give references to all DSOs used (for example, because you will be accessing them with dlopen(3) and don't want them automatically loaded with your program), then you must provide an explicit exported symbol list with the -exported_symbol or -exports_file ld(1) options. See dso(5) for more information.

Because IPA combines code from multiple source files, it is recommended that the same compilation options be used for every source file that is compiled with IPA analysis. If it is important for a single file to be given unusual compilation options that would be inappropriate for the rest of the program (such as rounding mode control), it is recommended that that file be compiled without IPA analysis.

IPA Options
IPA is controlled from the command line with two option groups.

-INLINE:
    Controls inlining by the standalone inliner. It should be used on your compile command line.

-IPA:
    Controls general IPA choices. If you use separate compile and link steps (that is, using -c when you compile and then linking the .o files produced with a separate cc, f77, or ld command), then you need to use -IPA for the compile step, with or without the individual options described below, and also for the link step with any of the following options desired.

The command line format used for group options is:

    -groupname:option[=value][:option2[=value2]]...

The groupname (IPA or INLINE) is followed by a colon-separated list of options, each of which is an option name possibly followed by an equal sign and a value. The option names may generally be abbreviated by truncating them to a unique prefix (which may change when new options are added to the group).
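To make the group option format concrete, here is a minimal sketch of a parser for it. It is not part of the MIPSpro tools: the function name parse_group_option is hypothetical, it does not resolve abbreviated option names, and it assumes that values never contain a colon.

```python
def parse_group_option(arg):
    """Split '-groupname:option[=value][:option2[=value2]]...' into parts.

    Returns (groupname, [(option, value_or_None), ...]). A list of pairs
    is used instead of a dict because the same option name may be given
    more than once in a single group.
    """
    if not arg.startswith("-"):
        raise ValueError("a group option must start with '-'")
    group, _, rest = arg[1:].partition(":")
    if not group:
        raise ValueError("missing group name")
    options = []
    if rest:
        for item in rest.split(":"):
            if not item:
                raise ValueError("empty option in group list")
            name, eq, value = item.partition("=")
            # A bare option name (no '=') is recorded with value None.
            options.append((name, value if eq else None))
    return group, options


if __name__ == "__main__":
    print(parse_group_option("-INLINE:never=errfunc:must=accessor,solver"))
    print(parse_group_option("-IPA"))
```

Note that a comma-separated value such as accessor,solver is kept as a single string; splitting it into individual names is left to whoever interprets that particular option.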
Some options are specified with a setting that will enable or disable the feature. To enable a feature, specify the option alone or with =1, =ON, or =TRUE; to disable the feature, specify the option with =0, =OFF, or =FALSE. This man page only shows the ON or OFF settings, but the other settings are equally valid. INLINE Options The INLINE option calls the standalone inliner option group, which controls the application of subroutine inlining done by the standalone inliner, or by the main inliner if -IPA options are enabled. Normally, the calls to be replaced by an inlined copy of the called subprogram are chosen by heuristics internal to the inliner. Most of the options in this group provide control over those choices. The individual controls in this group are: =(setting) Enables or disables stand-alone inlining processing. setting can be either ON or OFF. This is ignored with a warning for compiles which invoke main IPA processing. When both are used in the command line (for a compile which will not invoke main IPA processing), =OFF is processed and =ON is overridden with a warning. If used within a specfile read by the stand-alone inliner, =OFF skips inline processing within the stand-alone inliner and =ON is ignored with a warning. all Changes the default inlining heuristic to attempt to inline all routines which are not excluded by a never option or a pragma suppressing inlining, either for the routine or for a specific callsite. This option conflicts with none; all takes precedence if both are specified. This option may cause performance problems because the size of the procedures may be very large after inlining. If this option must be used, then increase the Olimit by using the -OPT:Olimit=n option to enable procedures to be optimized. See the opt(5) man page for details. This may affect compilation speed. alloca[=setting] Enables save/restore of stack when inlining calls with alloca. setting can be either ON or OFF. 
    If the callee has an alloca, then inlining it would normally remove the function boundary at which the dynamic space would have been released. This option saves and restores the stack pointer before and after the inlined code. Having the option ON is essential for correctness (for example, in code where the callee is inside a loop). The default is ON.

dfe[=setting]
    Performs dead function elimination in the standalone inliner. setting can be either ON or OFF. This option removes any functions that are inlined everywhere they are called and are not visible outside the current module. C/C++: the default is TRUE when not compiled with -g, FALSE otherwise.

file=filename
    Provides cross-file inlining from within the stand-alone inliner. The option searches for routines provided in a -INLINE:must list option, in the filename specified. The filename provided by this option must be generated using the -IPA -c option. The file generated contains information used to perform cross-file inlining.

    For example, suppose the following two files exist: foo.f and bar.f. The following is part of the code from foo.f:

        program main
        ....
        call bar()
        end

    The following is partial code from bar.f:

        subroutine bar()
        ...
        end

    To inline bar into main using the stand-alone inliner, compile bar.f in the following way:

        f77 -n32 -IPA -c bar.f

    This produces the file bar.o. To inline bar into foo.f, use:

        f77 -n32 foo.f -INLINE:must=bar_:file=bar.o

keep_pu_order[=setting]
    Preserves source subprogram ordering. The default is OFF.

library=name
    Identifies archive libraries where the inliner should search for subprograms. This option is similar to -INLINE:file=, described previously, where the files (.o) in the archived library (.a) must have been generated using the -IPA -c options. For example, suppose function FOO is defined in foofile.f.
The foofile.f was compiled with -IPA -c and transformed into an archive with this command: ar rv foolib.a foofile.o The following command will inline the function foo from foofile.o (which is automatically extracted from foolib.a) into bar.c via the crossfile inlining mechanism described above: cc -n32 -INLINE:must=foo:library=foolib.a bar.c list[=setting] List inlining actions as they occur to stderr. The default is OFF. max_pu_size_inline[=n] Limits the size of inlined subprograms to n. The default is 5000. Inlining is disabled if, after inlining, the caller exceeds this size. must=name1[,name2] Directs IPA to always attempt to inline any routines with names name1, name2, independent of the default inlining heuristic. For C++, the names given must be the "mangled" names (see NOTES). For Fortran, the name given may be either the original name, or the external name with the underscore appended by the compiler. In all cases, the option applies to all routines encountered with the given name, whether static or extern. In C, a pragma suppressing inlining at a particular callsite takes precedence over this option. never=name1[,name2] Directs IPA to never attempt to inline any routines with the names name1, name2, independent of the default inlining heuristic. For C++, the names given must be the "mangled" names (see NOTES). For Fortran, the name given may be either the original name, or the external name with the underscore appended by the compiler. In all cases, the option applies to all routines encountered with the given name, whether static or extern. A pragma requesting inlining at a particular callsite takes precedence over this option. none Changes the default inlining heuristic so the compiler does not attempt to inline any routines which are not specified by a must option or a pragma requesting inlining, either for the routine or for a specific callsite. This option conflicts with all; all takes precedence if both are specified. 
preempt[=setting]
    Enables inlining of functions marked preemptible in the standalone inliner. The default is OFF. Such inlining prevents another definition of such a function in another DSO from pre-empting the definition of the function being inlined.

specfile=filename
    Opens filename to read additional options. The specification file contains zero or more lines with inliner options in the form expected on the command line. For example, the file might contain a single line, similar to the following:

        INLINE:never=errfunc:must=accessor,solver

    It can also contain multiple lines, as in the following:

        INLINE:all
        INLINE:never=errfunc

    The specfile option cannot occur in a specification file, so specification files cannot invoke other specification files.

static[=setting]
    Performs inlining of static functions in the standalone inliner. setting can be either ON or OFF. C/C++: The default is TRUE at -O2 optimization levels, FALSE otherwise.

IPA Options
The IPA options control the interprocedural analyses and transformations performed. Using the group name without any options (for example, -IPA) invokes IPA with the default settings. When -apokeep, -pcakeep, or -pfakeep are specified in conjunction with -ipa or -IPA, the default settings for IPA suboptions are used with the exception of the inline=setting suboption, which is set to OFF.

The individual controls in this group are:

addressing[=setting]
    Enables or disables the analysis of address operator usage. setting can be either ON or OFF. -IPA:alias=ON is a prerequisite. The default is OFF.

aggr_cprop[=setting]
    Enables or disables aggressive interprocedural constant propagation. setting can be either ON or OFF. Attempts to avoid passing constant parameters, replacing the corresponding formal parameters by the constant values. By default, less aggressive interprocedural constant propagation is done. The default is OFF.

alias[=setting]
    Enables or disables alias/mod/ref analysis. setting can be ON or OFF. The default is OFF.
autognum[=setting]
    Determines the optimal value of the -Gnum option. setting can be ON or OFF. This option identifies a size bound below which data can be allocated relative to the global pointer and accessed cheaply. This optimization is turned off when -multigot is specified in the linker command line. The default is ON. See also -IPA:Gnum.

cgi[=setting]
    Enables or disables constant global variable identification. setting can be ON or OFF. This option marks non-scalar global variables which are never modified as constant, and propagates their constant values to all files. The default is ON.

common_pad_size=n
    Specifies the amount by which to pad common block array dimensions. By default, the compiler automatically chooses the amount of padding to improve cache behavior for common block array accesses.

cprop[=setting]
    Enables or disables interprocedural constant propagation. setting can be ON or OFF. This option identifies formal parameters which always have a specific constant value. The default is ON. See also -IPA:aggr_cprop.

depth=n
    This option is identical to maxdepth=n.

dfe[=setting]
    Enables or disables dead function elimination. setting can be ON or OFF. This option removes subprograms which are never called from the program. The default is ON.

dve[=setting]
    Enables or disables dead variable elimination. setting can be ON or OFF. This option removes variables which are never referenced from the program. The default is ON.

echo[=setting]
    Echoes (to stderr) the compile commands and the final link command which are invoked from IPA. setting can be ON or OFF. This can help monitor progress of a large system build. The default is OFF.

forcedepth=n
    Sets inline depths. Instead of the default inlining heuristics, this option directs IPA to attempt to inline all functions at a depth of (at most) n in the callgraph, where functions which make no calls are at depth 0, those which call only depth 0 functions are at depth 1, and so on.
    This ignores the default heuristic limits on inlining.

Gfactor=n
    Sets percentage for GOT multiplication. n is the percentage used to multiply the estimated external GOT entries for estimating the total .got size. A value of n=200 means that IPA will multiply the estimated external GOT entries by 2 to get the estimated total .got size. The default is 200.

Gnum=n
    User-specified Gnum. The default is no limit.

Gspace=n
    User-specified size (in bytes) for the area where IPA can allocate data that can be referenced relative to the global pointer. The default is 64 Kbytes, which is the maximum valid value.

gp_partition[=setting]
    Enables partitioning for archiving different GP-groups, as specified by the user externally or determined by IPA internally. setting can be ON or OFF. This option basically enables -IPA:picopt in the presence of -multigot. The default is OFF.

inline[=setting]
    Performs inter-file subprogram inlining during main IPA processing. setting can be ON or OFF. The default is ON. This does not affect the standalone inliner.

intrinsics=n
    Sets the number of Fortran intrinsic functions in the GOT area. This number is added to the estimated external GOT entries to get the estimated total .got size. IPA has difficulty in estimating the number of Fortran intrinsic functions that will be added by the Lowerer after the IPA phase.

keeplight[=setting]
    Directs IPA to not send -keep to the compiler, in order to save space. setting can be ON or OFF. The default is OFF.

linear[=setting]
    Sets linearization of array references. setting can be ON or OFF. When inlining Fortran subroutines, IPA tries to map formal array parameters to the shape of the actual parameter. It may not always be able to map it. If it cannot map the parameter, it linearizes the array reference. By default, it will not inline such callsites because they may cause performance problems. The default is OFF.

map_limit=n
    Controls when IPA enables sp_partition.
    n is the maximum size (in bytes) of input files mapped before IPA does -IPA:sp_partition.

maxdepth=n
    Directs IPA to not attempt to inline functions at a depth of more than n in the callgraph, where functions which make no calls are at depth 0, those which call only depth 0 functions are at depth 1, and so on. Inlining remains subject to overriding limits on code expansion. See also forcedepth, space, and plimit.

max_job=n
    Limits the maximum parallelism when invoking the compiler after IPA to (at most) n compilations running at once. The default is 2 on a uniprocessor host, 4 on a multiprocessor host.

multi_clone=n
    Specifies the maximum number of clones that can be created from a single procedure. By default, this value is 0. Aggressive procedure cloning may provide opportunities for interprocedural optimization, but it also may significantly increase the code size.

node_bloat=n
    When used in conjunction with -IPA:multi_clone, this specifies the maximum percentage growth of the total number of procedures relative to the original program.

partition_group=[symbolname[%symbol]|filename%F]...
    Specifies EXTERNAL symbols belonging to the same group. All unspecified symbols are considered by IPA as belonging to the COMMON group, which has the properties of always being in memory and available for inlining.

    Following the symbolname, the user can specify the properties for that symbol by adding a percent sign (%), followed by the property wanted. symbol can be I or G. I indicates that the symbol is used only within the partition; G indicates that the symbol should be marked as GP-relative (for DATA symbols only).

    Instead of specifying the symbol, the user can specify a gp_partition per file, as in the following:

        partition_group=file_name%F

    Then every defined EXTERNAL symbol that exists in that file will have the same group. file_name must be specified in the same way that the file is specified in the link-line.
    See the following example:

        cc -IPA:gp_partition=on:partition_group=/usr/tmp/p007.o%F:partition_group=./add.o%F /usr/tmp/p007.o ./add.o

picopt[=setting]
    Performs PIC optimizations. This involves turning preemptible symbols to non-preemptible symbols whenever possible, either through IPA's own analysis or through user specifications (required to build DSOs). The following are major benefits of this action:

    * Enables other IPA optimizations such as inlining, constant propagation, DFE, etc.
    * Turns indirect calls to direct calls.
    * Eliminates the generation of GP-prologs, thus generating fewer instructions.

    The preceding benefits are automatically accomplished by IPA for building executables. To obtain similar benefits for building DSOs, the user must specify which symbols are preemptible. This can be done by using the ld -exports_file option. See the ld man page for details. The default is ON.

plimit=n
    Stops inlining into a particular subprogram once it reaches size n in the intermediate representation. The default is 2500.

relopt[=setting]
    Enables optimizations similar to those achieved with the compiler options -O3 and -c, where objects are built with the assumption that the compiled objects will be linked into a call-shared executable later. setting can be ON or OFF. In effect, optimizations based on position-dependent code (non-PIC) are performed on those objects. The default is OFF.

space=n
    Stops inlining when the program size has increased by n%. For example, n=20 limits code expansion due to inlining to approximately 20%. The default is 100%.

sp_partition[=setting]
    Enables partitioning for disk/address-saving purposes. setting can be ON or OFF. Mainly used for building huge programs (for example, PTC). Partitioning should normally be done by IPA internally. The default is OFF.

specfile=filename
    Opens filename to read more options. A specfile contains zero or more of the options allowed by IPA.
    In the following example, -IPA:specfile=option_file, the option_file can be used to specify anything for -IPA as if it is specified in the command line, as in this example:

        -IPA:gp_partition=on:partition_group=p007.o%F:partition_group=add.o%F

    Because specfile= is not legal within a specfile, a specfile cannot point at other specfiles.

use_intrinsic[=setting]
    Enables loading the intrinsic version of standard library functions. This option causes inlining of the malloc library. This improves small object allocations, but is not mp safe. setting can be ON or OFF. The default is OFF.

NOTES
Both IPA and standalone inlining are disabled when -g is specified on the compiler's command line.

C/C++ Mangled Name
For specifying routine names to the -INLINE:never=name and -INLINE:must=name options for C++ programs, the mangled internal name must be used. Because C++ allows overloading (that is, the use of the same name for multiple objects which can be distinguished by type), it uses internal names which are constructed from the original name and an encoded version of the object's type. To find this mangled name, do the following:

* Compile the source module where the name is defined, as in this example:

    CC -c source.cxx -o source.o

* Use the original name to find the mangled name in the object file using nm(1) and grep(1). For example, if the original name was mysub, use the following:

    nm -B source.o | grep mysub
    0f89f950 T mysub__10yourclassFv
    0f89facc T mysub__10myclassFv

* The previous commands might produce several potential matches, particularly if overloading is actually occurring. If so, use the c++filt filter to determine which one is the one you want:

    /usr/lib/c++/c++filt
    mysub__10myclassFv
    myclass::mysub(void)

  You can continue entering possible names until one of them matches the name you want.
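The nm(1) and grep(1) step above can also be scripted. The following is a minimal sketch that filters text in the `nm -B` layout shown above; the function name mangled_candidates is hypothetical, and the sample data is copied from the example in this section rather than produced by a real compiler run.

```python
def mangled_candidates(nm_output, original_name):
    """Return symbols from BSD-style nm output that contain original_name.

    Each line is expected to look like '<address> <type> <symbol>';
    undefined symbols may omit the address. Only the symbol field is
    matched, which is slightly stricter than grep on the whole line.
    """
    matches = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        symbol = fields[-1]
        if original_name in symbol:
            matches.append(symbol)
    return matches


# Sample text copied from the nm example in this section.
SAMPLE_NM_OUTPUT = """\
0f89f950 T mysub__10yourclassFv
0f89facc T mysub__10myclassFv
"""

if __name__ == "__main__":
    for candidate in mangled_candidates(SAMPLE_NM_OUTPUT, "mysub"):
        print(candidate)
```

Each candidate still has to be checked with c++filt, as described above, before it is passed to -INLINE:must= or -INLINE:never=.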
SEE ALSO
cc(1), f77(1), ld(1), dso(5)

MIPSpro C and C++ Pragmas, publication 007-3587-001
C Language Reference Manual, publication 007-0701-120
Compiler Information File (CIF) Reference Manual
MIPSpro Fortran 77 Programmer's Guide
MIPSpro Fortran 90 Commands and Directives Reference Manual
MIPSpro 64-Bit Porting and Transition Guide