OPT(5)

NAME
     OPT - Miscellaneous compiler optimizations option group

SYNOPSIS
     -OPT: ...

DESCRIPTION
     This man page describes the general optimization options accepted
     by the f90(1), f77(1), CC(1), cc(1), and c89(1) commands.

     The -OPT: option group controls miscellaneous optimizations. This
     option overrides default optimizations.

     You can specify more than one suboption to the -OPT: option either
     by using colons to separate each suboption or by specifying
     multiple options on the command line. For example, the following
     command lines are equivalent:

          f90 -OPT:cis=ON:cray_ivdep=OFF b.f
          f90 -OPT:cis=ON -OPT:cray_ivdep=OFF b.f

     Some -OPT: suboptions are specified with a setting that either
     enables or disables the feature. To enable a feature, specify the
     suboption either alone or with =1, =ON, or =TRUE. To disable a
     feature, specify the suboption with either =0, =OFF, or =FALSE.
     For example, the following command lines are equivalent:

          f90 -OPT:cis:cray_ivdep=OFF:div_split=FALSE a.f
          f90 -OPT:cis=1:cray_ivdep=0:div_split=OFF a.f

     For brevity, this man page shows only the ON or OFF settings to
     suboptions, but 0, 1, TRUE, and FALSE are also allowed as
     settings.

     There are other options to the compiler commands that control
     optimization. For information on the general optimization levels,
     see the -O option on the command man page for your compiler. For
     information about inlining and interprocedural optimization, see
     the -INLINE: ... option or the ipa(5) man page. For information on
     loop nest optimization, see the lno(5) man page.

     The -OPT: option accepts the following suboptions:

     Suboptions     Action

     alias=name
               Specifies the pointer aliasing model to be used. By
               specifying one of the following for name, the compiler
               is able to make assumptions throughout the compilation:

               ANY or COMMON_SCALAR
                    ANY specifies that any two memory references can be
                    aliased unless the compiler can determine
                    otherwise. This is the default pointer aliasing
                    model.
                    COMMON_SCALAR specifies that scalar variables that
                    are defined in a common block along with array
                    variables are not referenced or modified by any
                    accesses of the array variables. (C and C++ only)

               cray_pointer or no_cray_pointer
                    cray_pointer asserts that a pointee's storage is
                    never overlaid on another variable's storage. The
                    pointee is stored in memory before a call to an
                    external procedure and is read out of memory at its
                    next reference. It is also stored before a RETURN
                    or END statement of a subprogram.

                    no_cray_pointer asserts that a pointee's storage
                    can be overlaid on another variable's storage. This
                    is the default. (Fortran 90 or FORTRAN 77)

               disjoint
                    Activates an additional rule to improve alias
                    analysis. This assures that two pointers based on
                    different symbols will never access the same memory
                    location. Distinct pointer expressions are assumed
                    to never point to overlapping storage. (Fortran 90
                    or FORTRAN 77)

               TYPED or NO_TYPED
                    TYPED specifies that pointers of distinct base
                    types are assumed to point to distinct,
                    non-overlapping objects. NO_TYPED specifies that
                    pointers to different base types may point to the
                    same object. (C and C++ only)

               UNNAMED or NO_UNNAMED
                    UNNAMED specifies that pointers are assumed never
                    to point to named objects. NO_UNNAMED specifies
                    that pointers may point to named objects. (C and
                    C++ only)

               RESTRICT or NO_RESTRICT
                    RESTRICT specifies that distinct pointers are
                    assumed to point to distinct, non-overlapping
                    objects. NO_RESTRICT specifies that distinct
                    pointers may point to overlapping storage. (C and
                    C++ only)

               parm or no_parm
                    parm asserts that Fortran parameters do not alias
                    to any other variable. This is the default. no_parm
                    asserts that Fortran parameters can alias to any
                    other variable. (Fortran 90 or FORTRAN 77)

     cis[ = ( ON|OFF )]
               Converts SIN/COS pairs with the same argument to a
               single call that calculates both values at once. The
               default is ON.
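     The aliasing models above matter because an assumption of
     non-overlapping storage lets the compiler keep a loaded value in a
     register instead of rereading memory. The following is a minimal
     sketch in portable C of what that means, not MIPSpro output. The
     names add_twice, add_twice_cached, run, and alias_demo are
     hypothetical, and add_twice_cached is a hand-written stand-in for
     code a compiler might produce once it is told that the two
     pointers never overlap.

```c
#include <assert.h>
#include <stdio.h>

/* Rereads *src before each use, so the result is correct even when
   dst and src refer to the same object. */
static void add_twice(int *dst, const int *src)
{
    *dst += *src;
    *dst += *src;
}

/* Illustrative stand-in for optimized code: reads *src once and
   reuses the value, which is only valid if dst and src are distinct. */
static void add_twice_cached(int *dst, const int *src)
{
    const int s = *src;
    *dst += s;
    *dst += s;
}

/* Runs f on two distinct objects (aliased == 0) or on one shared
   object (aliased != 0) and returns the final destination value. */
static int run(void (*f)(int *, const int *), int aliased)
{
    int a = 1, b = 1;
    f(&a, aliased ? &a : &b);
    return a;
}

/* Hypothetical driver: call it from main() to print both cases. */
static void alias_demo(void)
{
    /* Distinct objects: both versions agree. */
    assert(run(add_twice, 0) == run(add_twice_cached, 0));
    printf("distinct: %d %d\n", run(add_twice, 0), run(add_twice_cached, 0));

    /* Shared object: the cached version gives a different answer. */
    printf("aliased:  %d %d\n", run(add_twice, 1), run(add_twice_cached, 1));
}
```

     With distinct objects both functions return 3. When the same
     object is passed for both arguments, add_twice returns 4 and
     add_twice_cached returns 3, which is why a non-default alias model
     should be selected only when the source really honors it.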
     const_copy_limit=n
               (cc, c89, and f77 only) Do not do const/copy propagation
               if there are more than n expressions in a subprogram.
               The default is n=10000.

     cray_ivdep[ = ( ON|OFF )]
               Instructs the compiler to ignore all vector dependencies
               when encountering IVDEP directives. The default is OFF.

     div_split[ = ( ON|OFF )]
               Enables or disables the calculation of x/y as
               x*(1.0/y). The default is div_split=OFF. This is
               typically enabled by the -OPT:IEEE_arithmetic=3 option.
               Also see the -OPT:recip option.

               This option should be used with caution because it
               produces less accurate results.

     fast_bit_intrinsics[ = ( ON|OFF )]
               ON turns off the check for the bit count being within
               range for Fortran bit intrinsics (for example, BTEST and
               ISHFT). The default is OFF.

     fast_complex[ = ( ON|OFF )]
               fast_complex=ON enables fast calculations for values
               declared as type complex. When set to ON, complex
               absolute value (norm) and complex division use fast
               algorithms that overflow for an operand (divisor, in the
               case of division) that has an absolute value that is
               larger than the square root of the largest representable
               floating-point number (or underflow for a value that is
               smaller than the square root of the smallest
               representable floating-point number). The default is
               OFF.

               fast_complex=ON is enabled if -OPT:roundoff=3 is in
               effect.

     fast_exp[ = ( ON|OFF )]
               fast_exp=ON enables optimization of exponentiation by
               replacing the run-time call for exponentiation by
               multiplication and/or square root operations for certain
               compile-time constant exponents (integers and halves).
               This can produce differently rounded results than the
               run-time function. fast_exp=ON is in effect unless
               -OPT:roundoff=1 is in effect. The default is ON.

     fast_io[ = ( ON|OFF )]
               (C/C++ only) Enables inlining of printf(), fprintf(),
               sprintf(), scanf(), fscanf(), sscanf(), and printw().
               This option is in effect only when the candidates for
               inlining are marked as intrinsic in the stdio.h and
               curses.h files. The default is OFF.
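     The overflow described under fast_complex can be reproduced in
     portable C. This is a sketch under stated assumptions, not the
     algorithm the MIPSpro library uses. div_fast is the textbook
     quotient formula, div_scaled is Smith's scaling method (swapped in
     here as a well-known range-safe alternative), and the type cplx
     and the driver cdiv_demo are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { double re, im; } cplx;

/* Textbook quotient (a+bi)/(c+di). The c*c + d*d term overflows once
   |c| or |d| exceeds about sqrt(DBL_MAX), roughly 1.3e154. */
static cplx div_fast(cplx x, cplx y)
{
    /* volatile forces each intermediate to be rounded to double */
    volatile double den = y.re * y.re + y.im * y.im;
    volatile double nre = x.re * y.re + x.im * y.im;
    volatile double nim = x.im * y.re - x.re * y.im;
    cplx q = { nre / den, nim / den };
    return q;
}

/* Smith's method: divide through by the larger divisor part so that
   no intermediate squares the divisor. */
static cplx div_scaled(cplx x, cplx y)
{
    double abs_re = y.re < 0 ? -y.re : y.re;
    double abs_im = y.im < 0 ? -y.im : y.im;
    cplx q;

    if (abs_re >= abs_im) {
        double r = y.im / y.re;
        double den = y.re + y.im * r;
        q.re = (x.re + x.im * r) / den;
        q.im = (x.im - x.re * r) / den;
    } else {
        double r = y.re / y.im;
        double den = y.re * r + y.im;
        q.re = (x.re * r + x.im) / den;
        q.im = (x.im * r - x.re) / den;
    }
    return q;
}

/* Hypothetical driver: divides a large value by itself both ways. */
static void cdiv_demo(void)
{
    cplx z = { 1e200, 1e200 };   /* each part is above sqrt(DBL_MAX) */
    cplx f = div_fast(z, z);
    cplx s = div_scaled(z, z);

    assert(f.re != f.re);        /* NaN is the only value unequal to itself */
    assert(s.re == 1.0 && s.im == 0.0);
    printf("fast:   %g %g\n", f.re, f.im);
    printf("scaled: %g %g\n", s.re, s.im);
}
```

     The fast form yields NaN for the real part because both the
     numerator and the denominator overflow to infinity, while the
     scaled form returns exactly 1 + 0i. Code that stays well inside
     the square-root bounds does not see the difference.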
     fast_nint[ = ( ON|OFF )]
               fast_nint=ON uses hardware features to implement NINT
               and ANINT (both single- and double-precision versions).
               The default is OFF, but fast_nint=ON is enabled by
               default if -OPT:roundoff=3 is in effect. fast_nint=ON is
               also enabled when fast_trunc=ON is in effect.

               FORTRAN 77 and Fortran 90 only: This violates the
               FORTRAN 77 and Fortran 90 standards for certain argument
               values because it rounds as specified by the IEEE
               standard, rather than as specified by the Fortran
               standards (for example, FORTRAN 77 specifies that
               NINT(1.5) is 2, and NINT(2.5) is 3, while IEEE rounds
               both of these to 2).

               If fast_trunc is also enabled, NINT and ANINT are
               implemented with round instructions (that is, fast_nint
               takes precedence for these intrinsics).

     fast_sqrt[ = ( ON|OFF )]
               fast_sqrt=ON calculates square roots using the identity
               sqrt(x) = x*rsqrt(x), where rsqrt is the reciprocal
               square root operation. The default is OFF. This option
               is ignored unless -mips4 and -r8000 are also in effect.

               WARNING: This option results in sqrt(0.0) producing a
               NaN result. Use it only when zero sqrt operands are not
               valid.

     fast_trunc[ = ( ON|OFF )]
               fast_trunc=ON inlines the NINT, ANINT, AINT, and AMOD
               Fortran intrinsics, both single- and double-precision
               versions. The default is OFF. fast_trunc=ON is enabled
               automatically if -OPT:roundoff=1 (or greater) is in
               effect.

               FORTRAN 77 and Fortran 90 only: Although this is
               compliant with the FORTRAN 77 and Fortran 90 standards,
               it reduces the valid argument range.

               If fast_nint is also enabled, NINT and ANINT are
               implemented with round instructions (that is, fast_nint
               takes precedence for these intrinsics).

     fold_reassociate[ = ( ON|OFF )]
               fold_reassociate=ON allows optimizations involving
               reassociation of floating-point quantities. The default
               is OFF. fold_reassociate=ON is enabled automatically
               when -O3 is in effect or when -OPT:roundoff=2 or greater
               is in effect.
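     The NINT difference described under fast_nint comes down to how
     halfway cases are rounded. The following sketch emulates both
     rules in portable C. It is an illustration only, not the code the
     compiler emits. The names nint_fortran, nint_ieee, and nint_demo
     are hypothetical, and nint_ieee assumes the default
     round-to-nearest mode and arguments of moderate magnitude (well
     below 2^51).

```c
#include <assert.h>
#include <stdio.h>

/* Fortran rule: halfway cases round away from zero. */
static double nint_fortran(double x)
{
    return x >= 0.0 ? (double)(long long)(x + 0.5)
                    : (double)(long long)(x - 0.5);
}

/* IEEE rule: halfway cases round to the nearest even integer. Adding
   and subtracting 2^52 + 2^51 makes the floating-point adder do the
   rounding; volatile keeps the sum from being simplified away. */
static double nint_ieee(double x)
{
    const double magic = 6755399441055744.0;   /* 2^52 + 2^51 */
    volatile double t = x + magic;
    return t - magic;
}

/* Hypothetical driver: prints both roundings for the halfway cases
   named in the text. */
static void nint_demo(void)
{
    const double cases[] = { 1.5, 2.5 };
    int i;

    for (i = 0; i < 2; i++)
        printf("x=%.1f fortran=%.0f ieee=%.0f\n",
               cases[i], nint_fortran(cases[i]), nint_ieee(cases[i]));

    /* The two rules agree away from halfway cases. */
    assert(nint_fortran(2.25) == nint_ieee(2.25));
}
```

     Both rules give 2 for 1.5, but for 2.5 the Fortran rule gives 3
     and the IEEE rule gives 2, matching the example in the fast_nint
     description.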
     fold_unsafe_relops[ = ( ON|OFF )]
               fold_unsafe_relops=ON folds relational operators in the
               presence of possible integer overflow. The default is
               ON.

     fold_unsigned_relops[ = ( ON|OFF )]
               fold_unsigned_relops=ON folds unsigned relational
               operators in the presence of possible integer overflow.
               The default is OFF.

     got_call_conversion[ = ( ON|OFF )]
               got_call_conversion=ON causes loads of function
               addresses to be moved out of loops. The load is set up
               with the proper relocation so that the address is
               resolved at program start-up time. The default is OFF
               when -O2 or lower is in effect. got_call_conversion=ON
               is enabled when -O3 is in effect.

               NOTE: This option should be disabled when compiling
               shared objects that contain function addresses that may
               be preempted by rld(1). For more information, see the
               dso(5) man page.

     IEEE_arithmetic=n
               Specifies the level of conformance to ANSI/IEEE
               754-1985, the IEEE Standard for Binary Floating-point
               Arithmetic, which describes a standard for, among other
               things, NaN and inf operands, arithmetic round off, and
               overflow. n can be one of the following:

               n    Description

               1    Inhibits optimizations that produce less accurate
                    results than required by ANSI/IEEE 754-1985. This
                    is the default.

               2    Allows compiler optimizations that can produce less
                    accurate inexact results (but accurate exact
                    results) on the target hardware. For example,
                    -OPT:recip is enabled to use the hardware recip
                    instruction. Also, expressions that would have
                    produced a NaN or an inf may produce different
                    answers, but otherwise answers are the same as
                    those obtained when IEEE_arithmetic=1 is in effect.
                    Examples: 0*X may be changed to 0, and X/X may be
                    changed to 1 even though this is inaccurate when X
                    is +inf, -inf, or NaN.

               3    Performs arbitrary, mathematically valid
                    transformations, even if they can produce
                    inaccurate results for operations specified in
                    ANSI/IEEE 754-1985. These transformations can cause
                    overflow or underflow for a valid operand range.
                    An example is the conversion of x/y to x*recip(y)
                    for MIPS IV targets.

               Also see the -OPT:roundoff=n option.

     IEEE_comparisons[ = ( ON|OFF )]
               Forces all comparisons to yield results that conform to
               ANSI/IEEE 754-1985, the IEEE Standard for Binary
               Floating-point Arithmetic, which describes a standard
               for NaN and inf operands. The default is
               IEEE_comparisons=OFF.

               IEEE_comparisons=OFF produces non-IEEE results for
               comparisons. For example, x=x is treated as TRUE without
               executing a test.

               NOTE: This option has been deprecated and will be
               removed in a future release. The preferred alternative
               is the -OPT:IEEE_NaN_inf= option. (Fortran 90 only)

     IEEE_NaN_inf[ = ( ON|OFF )]
               IEEE_NaN_inf=ON forces all operations that might have
               IEEE-754 NaN or infinity operands to yield results that
               conform to ANSI/IEEE 754-1985, the IEEE Standard for
               Binary Floating-point Arithmetic, which specifies the
               standard for NaN and inf operands. The default is OFF.

               IEEE_NaN_inf=OFF produces non-IEEE results for various
               operations. For example, x=x is treated as TRUE without
               executing a test and x/x is simplified to 1 without
               dividing. Turning this option on may suppress many such
               common optimizations and hurt performance.

     inline_intrinsics[ = ( ON|OFF )]
               inline_intrinsics=OFF turns all Fortran intrinsics that
               have a library function into a call to that function.
               The default is ON.

     liberal_ivdep[ = ( ON|OFF )]
               Specifies that the compiler should use UNICOS semantics
               when a !DIR$ IVDEP directive (Fortran) or a #pragma
               ivdep statement (C) is encountered. With UNICOS
               semantics, the compiler ignores all loop iteration
               dependencies. The default is OFF. Also see the
               -OPT:cray_ivdep[ = ( ON|OFF )] option.

     Olimit=n
               Specifies that any routine bigger than n should not be
               optimized. If -O2 or greater is in effect and a routine
               is so big that the compile speed may be slow, the
               compiler generates a message indicating the Olimit value
               that is needed to optimize. You can recompile with that
               value of n.
               The -OPT:Olimit=0 option is not recommended for general
               use.

     pad_common[ = ( ON|OFF )]
               pad_common=ON reorganizes common blocks to improve the
               cache behavior of accesses to members of the common
               block. This may involve adding padding between members
               and/or breaking a common block into a collection of
               common blocks. The default is OFF.

               This option should not be used unless the common block
               definitions (including EQUIVALENCE) are consistent among
               all sources comprising a program. In addition,
               pad_common=ON should not be specified if common blocks
               are initialized with DATA statements. If specified,
               pad_common=ON must be used for all source files in the
               program.

               pad_common=ON is supported for Fortran only. It should
               not be used if a common block is referenced from C code.

     procedure_reorder[ = ( ON|OFF )]
               procedure_reorder=ON must be specified in conjunction
               with -LD_LAYOUT:reorder_file=feedback_file to enable
               linker cording. Linker cording is the linker's ability
               to optimize the layout of functions based upon a
               feedback file; this minimizes page faults and I-cache
               misses. The default is OFF.

               For more information on the -LD_LAYOUT option, see the
               ld(1) man page. For an example that shows reordering of
               code regions, see the MIPSpro N32/64 Compiling and
               Performance Tuning Guide.

     recip[ = ( ON|OFF )]
               recip=ON specifies that faster, but potentially less
               accurate, reciprocal operations should be performed. The
               default is OFF. If -O3 or -OPT:IEEE_arithmetic=2 or
               above is in effect, recip=ON is enabled. The recip=ON
               setting is effective only if -r8000 is in effect.

     reorg_common[ = ( ON|OFF )]
               reorg_common=ON reorganizes common blocks to improve the
               cache behavior of accesses to members of the common
               block. The reorganization is done only if the compiler
               detects that it is safe to do so.

               reorg_common=ON is enabled when -O3 is in effect and
               when all files that reference the common block are
               compiled at -O3.
               reorg_common=OFF is set when the file that contains the
               common block is compiled at -O2 (or below).

     roundoff=n
               Specifies the level of acceptable departure from source
               language floating-point, round-off, and overflow
               semantics. n can be one of the following:

               n    Description

               0    Inhibits optimizations that might affect the
                    floating-point behavior. This is the default when
                    optimization levels -O0, -O1, and -O2 are in
                    effect.

               1    Allows simple transformations that might cause
                    limited round-off or overflow differences.
                    Compounding such transformations could have more
                    extensive effects.

               2    Allows more extensive transformations, such as the
                    reordering of reduction loops. This is the default
                    level when -O3 is in effect.

               3    Enables any mathematically valid transformation.

               To obtain best performance in conjunction with software
               pipelining, specify roundoff=2 or roundoff=3. This is
               because reassociation is required for many
               transformations to break recurrences in loops.

               Note that the optimizations enabled by this option can
               rearrange expressions across parentheses or even
               statement boundaries.

               Also see the descriptions for the -OPT:IEEE_arithmetic,
               -OPT:fast_complex, -OPT:fast_trunc, and -OPT:fast_nint
               options.

     rsqrt[ = ( ON|OFF )]
               rsqrt=ON specifies that faster, but potentially less
               accurate, square root operations should be performed.
               The default is OFF. If -OPT:IEEE_arithmetic=2 (or above)
               or -O3 is in effect, rsqrt=ON is enabled.

     space[ = ( ON|OFF )]
               space=ON specifies that code space is to be given
               priority in tradeoffs with execution time in
               optimization choices. The default is OFF.

     speculative_ptr_deref[ = ( ON|OFF )]
               This option allows speculative loads of memory locations
               that differ by a small offset from some referenced
               memory location. For example, speculate a[i+1] if a[i]
               is referenced; and speculate p->field2 if p->field1 is
               referenced.

               The feature is turned ON by default at -O2 and above.
               However, the legal offset ranges are different.
               At -O2, the range is 32 (-16 ..
               +16). At -O3, the range is 128 (-64 .. +64).

               This optimization may result in an exception if the
               speculated location is on a different page than that of
               the referenced memory location. The chance of this
               happening with these legal offset ranges is very remote.

     swp[ = ( ON|OFF )]
               swp=ON enables software pipelining. swp=ON is enabled
               when -O3 is in effect. The default is OFF.

     unroll_analysis[ = ( ON|OFF )]
               unroll_analysis=ON analyzes resource usage and
               recurrences in bodies of innermost loops that do not
               qualify for being fully unrolled. Such loops are
               unrolled only to the extent to which there is a
               potential benefit in doing so. A loop could be unrolled,
               for example, to decrease the shortest possible schedule
               length per iteration. The default is ON.

               unroll_analysis=ON can have the negative effect of
               unrolling loops less than the upper limit dictated by
               the -OPT:unroll_times_max and -OPT:unroll_size
               specifications.

     unroll_size=n
               Specifies the maximum size (in instructions) of an
               unrolled loop. Specify an integer for n. When
               -OPT:space=OFF is in effect, the default is 80. When
               -OPT:space=ON is in effect, the default is 20.

               This option indirectly determines which loops can be
               fully unrolled. Also see the -OPT:unroll_times_max
               option.

     unroll_times_max=n
               Specifies the maximum number of times a loop will be
               unrolled if it is not going to be fully unrolled.
               Specify an integer for n. The default is 8 when -r8000
               or -r10000 is in effect, and the default is 4 in all
               other cases. Also see the -OPT:unroll_size option.

     warning[ = ( ON|OFF )]
               Enables or suppresses warning messages from the compiler
               about the level and types of optimizations performed.
               The default is ON.

     wrap_around_unsafe_opt[ = ( ON|OFF )]
               wrap_around_unsafe_opt=OFF disables both the induction
               variable replacement and linear function test
               replacement optimizations. By default, these
               optimizations are enabled at -O3. This option is
               disabled by default at -O0.

               Setting this option to OFF can degrade performance.
               It is provided, however, as a diagnostic tool to
               identify the situation described previously.

SEE ALSO
     cc(1), CC(1), cord(1), f77(1), f90(1), fpmode(1), hinv(1), ld(1),
     make(1), pixie(1), pmake(1), prof(1), rld(1), smake(1)

     math(3M)

     auto_p(5), dso(5), gp_overflow(5), ipa(5), lno(5), pe_environ(5)

     C Language Reference Manual

     MIPSpro C and C++ Pragmas

     MIPSpro N32/64 Compiling and Performance Tuning Guide

     Compiler Information File (CIF) Reference Manual

     MIPSpro Fortran 77 Programmer's Guide

     MIPSpro Fortran 90 Commands and Directives Reference Manual

     MIPSpro 64-Bit Porting and Transition Guide