scfft(3S)




SCFFT(3S)                                                            SCFFT(3S)


NAME
     SCFFT, DZFFT, CSFFT, ZDFFT - Computes a real-to-complex or complex-to-
     real Fast Fourier Transform (FFT)

SYNOPSIS
     Single precision -> Single precision complex

          Fortran:
               CALL SCFFT (isign, n, scale, x, y, table, work, isys)

          C/C++:
               #include <scsl_fft.h>
               int scfft (int isign, int n, float scale, float *x,
               scsl_complex *y, float *table, float *work, int *isys);

          C++ STL:
               #include <complex.h>
               #include <scsl_fft.h>
               int scfft (int isign, int n, float scale, float *x,
               complex<float> *y, float*table, float *work, int *isys);

     Double precision  -> Double precision complex

          Fortran:
               CALL DZFFT (isign, n, scale, x, y, table, work, isys)

          C/C++:
               #include <scsl_fft.h>
               int dzfft (int isign, int n, double scale, float *x,
               scsl_zomplex *y, double *table, double *work, int *isys);

          C++ STL:
               #include <complex.h>
               #include <scsl_fft.h>
               int dzfft (int isign, int n, double scale, double *x,
               complex<double> *y, double *table, double *work, int *isys);

     Single precision complex -> Single precision

          Fortran:
               CALL CSFFT (isign, n, scale, x, y, table, work, isys)

          C/C++:
               #include <scsl_fft.h>
               int csfft (int isign, int n, float scale, scsl_complex *x,
               float *y, float *table, float *work, int *isys);

          C++ STL:
               #include <complex.h>
               #include <scsl_fft.h>
               int csfft (int isign, int n, float scale, complex<float> *x,
               float *y, float *table, float *work, int *isys);


     Double precision complex -> Double precision

          Fortran:
               CALL ZDFFT (isign, n, scale, x, y, table, work, isys)

          C/C++:
               #include <scsl_fft.h>
               int zdfft (int isign, int n, double scale, scsl_zomplex *x,
               double *y, double *table, double *work, int *isys);

          C++ STL:
               #include <complex.h>
               #include <scsl_fft.h>
               int zdfft (int isign, int n, double scale, complex<double> *x,
               double *y, double *table, double *work, int *isys);

IMPLEMENTATION
     These routines are part of the SCSL Scientific Library and can be loaded
     using either the -lscs or the -lscs_mp option.  The -lscs_mp option
     directs the linker to use the multi-processor version of the library.

     When linking to SCSL with -lscs or -lscs_mp, the default integer size is
     4 bytes (32 bits). Another version of SCSL is available in which integers
     are 8 bytes (64 bits).  This version allows the user access to larger
     memory sizes and helps when porting legacy Cray codes.  It can be loaded
     by using the -lscs_i8 option or the -lscs_i8_mp option. A program may use
     only one of the two versions; 4-byte integer and 8-byte integer library
     calls cannot be mixed.

     The C and C++ prototypes shown above are appropriate for the 4-byte
     integer version of SCSL. When using the 8-byte integer version, the
     variables of type int become long long and the <scsl_fft_i8.h> header
     file should be included.

DESCRIPTION
     SCFFT/DZFFT computes the FFT of the real array X, and it stores the
     results in the complex array Y.  CSFFT/ZDFFT computes the corresponding
     inverse complex-to-real transform.

     It is customary in FFT applications to use zero-based subscripts; the
     formulas are simpler that way.  For these routines, suppose that the
     arrays are declared as follows:

          Fortran:

               REAL    X(0:n-1)
               COMPLEX Y(0:n/2)


          C/C++:

               float        x[n];


               scsl_complex y[n/2+1];


          C++ STL:

               float          x[n];
               complex<float> y[n/2+1];


     Then the output array is the FFT of the input array, using the following
     formula for the FFT:


                         n-1
          Y(k) = scale * Sum [ X(j)*w**(isign*j*k) ] for k = 0, ..., n/2
                         j=0

          where:

     w =       exp(2*pi*i/n),

     i =       + sqrt(-1),

     pi =      3.14159...,

     isign =   +1 or -1.

     Different authors use different conventions for which of the transforms,
     isign = +1 or isign = -1, is the forward or inverse transform, and what
     the scale factor should be in either case.  You can make these routines
     compute any of the various possible definitions, however, by choosing the
     appropriate values for isign and scale.

     The relevant fact from FFT theory is this:  If you call SCFFT with any
     particular values of isign and scale, the mathematical inverse function
     is computed by calling CSFFT with -isign and 1/(n*scale).  In particular,
     if you use isign = +1 and scale = 1.0 in SCFFT for the forward FFT, you
     can compute the inverse FFT by using CSFFT with isign = -1 and
     scale = 1.0/n.

     See the NOTES section of this man page for information about the
     interpretation of the data types described in the following arguments.

     These routines have the following arguments:

     isign     Integer.  (input)
               Specifies whether to initialize the table array or to do the
               forward or inverse Fourier transform, as follows:


               If isign = 0, the routine initializes the table array and
               returns.  In this case, the only arguments used or checked are
               isign, n, and table.

               If isign = +1 or -1, the value of isign is the sign of the
               exponent used in the FFT formula.

     n         Integer.  (input)
               Size of transform.  If n is not positive, the routine returns
               without calculating the transform.

     scale     Scale factor.  (input)
               SCFFT: Single precision.
               DZFFT: Double precision.
               CSFFT: Single precision.
               ZDFFT: Double precision.
               Each element of the output array is multiplied by scale after
               taking the Fourier transform, as defined in the preceding
               formula.

     x         Input array of values to be transformed. (input)
               SCFFT: Single precision array of dimension n.
               DZFFT: Double precision array of dimension n.
               CSFFT: Single precision complex array of dimension n/2 + 1.
               (input)
               ZDFFT: Double precision complex array of dimension n/2 + 1.
               (input)

     y         Output array of transformed values. (output)
               SCFFT: Single precision complex array of dimension n/2 + 1.
               (input)
               DZFFT: Double precision complex array of dimension n/2 + 1.
               (input)
               CSFFT: Single precision array of dimension n.
               ZDFFT: Double precision array of dimension n.

               The output array, y, is the FFT of the the input array, x,
               computed according to the preceding formula.  The output array
               may be equivalenced to the input array in the calling program.
               Be careful when dimensioning the arrays, in this case, to allow
               for the fact that the complex array contains two (real) words
               more than the real array.

     table     Array of dimension (n + NFR) (input or output)
               SCFFT, CSFFT: Single precision array.
               DZFFT, ZDFFT: Double precision array.

               Table of factors and roots of unity.  See the description of
               the isys argument for the value of NFR.


               If isign = 0, the routine initializes table (table is output
               only).

               If isign = +1 or -1, the values in table are assumed to be
               initialized already by a prior call with isign = 0 (table is
               input only).

     work      Array of dimension n + 2
               SCFFT, CSFFT: Single precision array.
               DZFFT, ZDFFT: Double precision array.

               Work array used for intermediate calculations.  Its address
               space must be different from that of the input and output
               arrays.

     isys      Integer array dimensioned 0..isys(0).
               An array that gives implementation-specific information.  All
               features and functions of the FFT routines specific to any
               particular implementation are confined to this isys array.

               In the Origin series implementation, isys(0)=0 and isys(0)=1
               are supported.  In SCSL versions prior to 1.3, only isys(0)=0
               was allowed. For isys(0)=0, NFR=15, and for isys(0)=1, NFR=256.
               The NFR words of storage in the table array contain a
               factorization of the length of the transform.

               The smaller value of NFR for isys(0)=0 is historical. It is too
               small to store all the required factors for the highest
               performing FFT, so when isys(0)=0, extra space is allocated
               when the table array is initialized. To avoid memory leaks,
               this extra space must be deallocated when the table array is no
               longer needed. The SCFFTF routine is used to release this
               memory. Due to the potential for memory leaks, the use of
               isys(0)=0 should be avoided.

               For isys(0)=1, the value of NFR is large enough so that no
               extra memory needs to be allocated, and there is no need to
               call SCFFTF to release memory. If called, it does nothing.

               NOTE: isys(0)=1 means that isys is an integer array with two
               elements. The second element, isys(1), will not be accessed.

NOTES
     The following data types are described in this documentation:

          Term Used                     Data type

     Fortran:

          Array dimensioned 0..n-1      x(0:n-1)


          Array of dimensions (m,n)     x(m,n)

          Array of dimensions (m,n,p)   x(m,n,p)

          Integer                       INTEGER (INTEGER*8 for -lscs_i8[_mp])

          Single precision              REAL

          Double precision              DOUBLE PRECISION

          Single precision complex      COMPLEX

          Double precision complex      DOUBLE COMPLEX

     C/C++:

          Array dimensioned 0..n-1      x[n]

          Array of dimensions (m,n)     x[m*n] or x[n][m]

          Array of dimensions (m,n,p)   x[m*n*p] or x[p][n][m]

          Integer                       int (long long for -lscs_i8[_mp])

          Single precision              float

          Double precision              double

          Single precision complex      scsl_complex

          Double precision complex      scsl_zomplex

     C++ STL:

          Array dimensioned 0..n-1      x[n]

          Array of dimensions (m,n)     x[m*n] or x[n][m]

          Array of dimensions (m,n,p)   x[m*n*p] or x[p][n][m]

          Integer                       int (long long for -lscs_i8[_mp])

          Single precision              float

          Double precision              double

          Single precision complex      complex<float>

          Double precision complex      complex<double>


CAUTIONS
     Transform sizes with a prime factor exceeding 232-1 are not supported for
     the 8-byte integer version of the library.

     In addition to the work array, the FFT routines also dynamically allocate
     scratch space from the stack. The amount of space allocated can be
     slightly bigger than the size of the largest processor cache. For single
     processor runs, the default stack size is large enough that these
     allocations generally cause no problems. But for parallel runs, you need
     to ensure that the stack size of slave threads is big enough to hold this
     scratch space. Failure to reserve sufficient stack space will cause
     programs to dump core due to stack overflows.  The stack size of MP
     library slave threads is controlled via the MP_SLAVE_STACKSIZE
     environment variable or the mp_set_slave_stacksize() library routine. See
     the mp(3C), mp(3F) and pe_environ(5) reference pages for more information
     on controlling the slave stack size. For pthreads applications, the
     thread's stack size is specified as one of many creation attributes
     provided in the pthread_attr_t argument to pthread_create(3P).  The
     stacksize attribute should be set explicitly to a non-default value using
     the pthread_attr_setstacksize(3P) call, described in the
     pthread_attr_init(3P) man page.

     Care must be exercised if copies of the table array are used: even though
     a copy exists, the original must persist. As an example, the following
     code will not work:

          #include <scsl_fft.h>
          float x[1024];
          scsl_complex y[513];
          float table[1024+256];
          float work[1024+2];
          int isys[2];
          isys[0] = 1;
          {
             float table_orig[1024+256];

             scfft(0, 1024, 1.0f, x, y, table_orig, work, isys)
             bcopy(table_orig, table, (1024+256)*sizeof(float));
          }
          scfft(1, 1024, 1.0f, x, y, table, work, isys)


     In this example, because table_orig is a stack variable that does not
     persist outside of the code block delimited by the braces, the data in
     the copy, table, are not guaranteed to be valid. However, the following
     code will work because table_orig is persistent:

          #include <scsl_fft.h>
          float x[1024];
          scsl_complex y[513];
          float table_orig[1024+256];
          float table[1024+256];


          float work[1024+2];
          int isys[2];
          isys[0] = 1;
          scfft(0, 1024, 1.0f, x, y, table_orig, work, isys)
          bcopy(table_orig, table, (1024+256)*sizeof(float));
          scfft(1, 1024, 1.0f, x, y, table, work, isys)


EXAMPLES
     These examples use the table and workspace sizes appropriate to Origin
     series.

     Example 1:  Initialize the TABLE array in preparation for doing an FFT of
     size 1024.  In this case only the arguments isign, n, and table are used.
     You can use dummy arguments or zeros for the other arguments in the
     subroutine call.

     Fortran:

          REAL TABLE(1024+256)
          INTEGER ISYS(0:1)
          ISYS(0) = 1
          CALL SCFFT(0, 1024, 0.0, DUMMY, DUMMY, TABLE, DUMMY, ISYS)


     C/C++:

          #include <scsl_fft.h>
          float table[1024+256];
          int isys[2];
          isys[0] = 1;
          scfft(0, 1024, 0.0f, NULL, NULL, table, NULL, isys);


     C++ STL:

          #include <complex.h>
          #include <scsl_fft.h>
          float table[1024+256];
          int isys[2];
          isys[0] = 1;
          scfft(0, 1024, 0.0f, NULL, NULL, table, NULL, isys);


     Example 2:  X is a real array dimensioned (0...1023), and Y is a complex
     array dimensioned (0...:512).  Take the FFT of X and store the results in
     Y.  Before taking the FFT, initialize the TABLE array, as in example 1.

     Fortran:

          REAL X(0:1023)
          COMPLEX Y(0:512)


          REAL TABLE(1024+256)
          REAL WORK(1024+2)
          INTEGER ISYS(0:1)
          ISYS(0) = 1
          CALL SCFFT(0, 1024, 1.0, X, Y, TABLE, WORK, ISYS)
          CALL SCFFT(1, 1024, 1.0, X, Y, TABLE, WORK, ISYS)


     C/C++:

          #include <scsl_fft.h>
          float x[1024];
          scsl_complex y[513];
          float table[1024+256];
          float work[1024+2];
          int isys[2];
          isys[0] = 1;
          scfft(0, 1024, 1.0f, x, y, table, work, isys)
          scfft(1, 1024, 1.0f, x, y, table, work, isys)


     C++ STL:

          #include <complex.h>
          #include <scsl_fft.h>
          float x[1024];
          complex<float> y[513];
          float table[1024+256];
          float work[1024+2];
          int isys[2];
          isys[0] = 1;
          scfft(0, 1024, 1.0f, x, y, table, work, isys)
          scfft(1, 1024, 1.0f, x, y, table, work, isys)


     Example 3:  With X and Y as in example 2, take the inverse FFT of Y and
     store it back in X.  The scale factor 1/1024 is used.  Assume that the
     TABLE array is initialized already.

     Fortran:

          CALL CSFFT(-1, 1024, 1.0/1024.0, Y, X, TABLE, WORK, ISYS)


     C/C++ and C++ STL:

          csfft(-1, 1024, 1.0f/1024.0f, y, x, table, work, isys)


     Example 4:  Perform the same computation as in example 2, but equivalence
     the input and output arrays to save storage space.  Use the 8-byte
     integer version of SCSL.


     Fortran:

          REAL X(0:1023)
          COMPLEX Y(0:512)
          EQUIVALENCE ( X(1), Y(1) )
          REAL TABLE(1024+256)
          REAL WORK(1024+2)
          INTEGER*8 ISYS(0:1)
          ISYS(0) = 1_8
          CALL SCFFT(0_8, 1024_8, 1.0, X, Y, TABLE, WORK, ISYS)
          CALL SCFFT(1_8, 1024_8, 1.0, X, Y, TABLE, WORK, ISYS)


     C/C++:

          #include <scsl_fft_i8.h>
          float *x;
          scsl_complex y[513];
          float table[1024+256];
          float work[1024+2];
          long long isys[2];
          isys[0] = 1LL;
          x = (float *) &y[0];
          scfft(0LL, 1024LL, 1.0f, x, y, table, work, isys)
          scfft(1LL, 1024LL, 1.0f, x, y, table, work, isys)


     C++ STL:

          #include <complex.h>
          #include <scsl_fft_i8.h>
          float *x;
          complex<float> y[513];
          float table[1024+256];
          float work[1024+2];
          long long isys[2];
          isys[0] = 1LL;
          x = (float *) &y[0];
          scfft(0LL, 1024LL, 1.0f, x, y, table, work, isys)
          scfft(1LL, 1024LL, 1.0f, x, y, table, work, isys)


     Example 5:  Perform the same computation as in example 2, but assume that
     the lower bound of each Fortran array is 1, rather than 0.  The
     subroutine calls are not changed.

     Fortran:

          REAL X(1024)
          COMPLEX Y(513)
          CALL SCFFT(0, 1024, 1.0, X, Y, TABLE, WORK, ISYS)
          CALL SCFFT(1, 1024, 1.0, X, Y, TABLE, WORK, ISYS)


SEE ALSO
     INTRO_FFT(3S), INTRO_SCSL(3S), CCFFT(3S), CCFFTM(3S), SCFFTM(3S)


                                                                       Page 11