afIntro(3dm)                                                  afIntro(3dm)

NAME
     afIntro, AFintro - Introduction to the Silicon Graphics Audio File
     Library (AF)

SYNOPSIS
     #include <dmedia/audiofile.h>

     -laudiofile

DESCRIPTION
     The Silicon Graphics Audio File Library (AF) provides a uniform
     programming interface to standard digital audio file formats.

     Thirteen audio file formats are currently supported by the library:

          Extended AIFF-C
          standard AIFF (older version)
          NeXT/Sun SND/AU
          WAVE (RIFF)
          Berkeley/IRCAM/CARL SoundFile
          MPEG1 audio bitstream
          Sound Designer II
          Audio Visual Research
          Amiga IFF/8SVX
          SampleVision
          VOC
          SoundFont2
          Raw (headerless)

     Note that future releases of the library will add support for
     additional file formats and data formats, and this has significant
     ramifications for how it should be used.

     The Audio File Library is released as a Dynamic Shared Object (DSO).
     This means that as the library starts supporting new file formats,
     programs written at an earlier time can automatically support these
     new formats. It requires a bit of caution to write a program so that
     it operates correctly as new Audio File Libraries are released. See
     afGetFileFormat(3dm) for an example of this. Make sure to read the
     CAVEATS section of each Audio File Library function man page; this is
     where warnings about correct use appear.

     SGI has adopted AIFF-C as its default digital audio file format. This
     means that the default file configuration for writing (see
     afNewFileSetup(3dm)) is set to match the default parameters of this
     format. For backward compatibility, the Audio File Library fully
     supports the older AIFF standard in addition to AIFF-C. See aiff(4)
     for more detailed information.

     Key goals of the Audio File Library are file format transparency and
     data format transparency. The same calls for opening a file,
     reading/writing basic header information (e.g., sample rate, sample
     format), and reading/writing sample data will work with any supported
     audio file format.

     The basic library routines for reading audio samples from files and
     writing samples to files, afReadFrames(3dm) and afWriteFrames(3dm),
     contain built-in codec support for compressed audio data.

     The library currently supports read-only and write-only file access.
     To edit an existing file, you must create a new file and copy audio
     and other data from the original file. Example code which shows how
     to copy the logical components of a file is available in the
     /usr/share/src/dmedia/soundfile directory.

LOGICAL COMPONENTS OF AN AUDIO FILE
     The Audio File Library API breaks audio files up into these logical
     components: tracks, instrument configurations, and miscellaneous data
     chunks.

     tracks
          consist of audio sample data, parameters which characterize the
          data (sample rate, mono/stereo, compression type), and marker
          structures which store sample frame locations in the track.
          Markers are used for indicating loop point locations within
          audio tracks, for example.

     instrument configurations
          are collections of parameters which can be used to configure
          samplers to play back audio track data. These parameters include
          loops, gain levels, and keyboard mapping information. Though
          these were originally designed only for use with AIFF-C files,
          they have been expanded to provide access to instrument and/or
          sample parameters in all file types which support such things.

     miscellaneous data chunks
          include text strings (author, copyright, name, annotation) and
          pieces of auxiliary information (MIDI exclusive data,
          application-specific data).

     The kinds of miscellaneous data which may be stored in an audio file
     depend on the file format. The AIFF-C format, for example, has been
     designed so that it can be extended in the future to support
     different kinds of auxiliary data without breaking backward
     compatibility.

     The library API has been designed to accommodate extended file
     formats which contain multiple instrument configurations with
     arbitrary numbers of loops, and additional types of miscellaneous
     data.

     Future releases of the AF may support formats with multiple audio
     tracks. Audio File Library routines such as afReadFrames(3dm)
     manipulate data for a specified track in an audio file, which is
     specified by an integer identifier. The current implementation of the
     library only supports a single audio track per file, regardless of
     the file format, so for now, library routines expect the constant
     value AF_DEFAULT_TRACK whenever an audio track identifier argument is
     required.

DATA FORMAT TRANSPARENCY
     In addition to allowing transparent access to a variety of audio file
     formats, the AF also allows transparent conversion of the audio data
     format between the audio track (file) and the audio buffer (as
     returned by afReadFrames(3dm)) and for the reverse case
     (afWriteFrames(3dm)). Data Format consists of the following
     parameters:

          Byte Order
          Sample Format
          Sample Width
          Channel Count
          Sampling Rate
          Compression Type

     The format of the data that is loaded by the AF into the
     application's audio buffer is known as the virtual format of the
     data. By default, this format is identical to the track format with
     two important exceptions:

          1) The byte order always defaults to big-endian.

          2) The data will always be in uncompressed format.

     Both of these are done in order to assure backwards compatibility
     with the older AF. It is possible to set the virtual byte order to
     little-endian using afSetVirtualByteOrder(3dm), but in the current AF
     the virtual compression will always be 'none'.

     The virtual format of the audio data may be changed at any time, and
     may be done without interruption of any ongoing audio i/o.

     Here are the current compression and decompression engines supported
     by the Audio File Library. Note that many others may be supported in
     the future.

     The Audio File Library implements software codecs for the following:

          CCITT G.711 mu-law/A-law
               (64 kb/s encoding for 8 kHz 16-bit data)
          CCITT G.722 ADPCM
               (48, 56, and 64 kb/s encoding for 16 kHz 16-bit data)
          CCITT G.726 ADPCM
               (16, 24, 32, or 40 kb/s for 8 kHz 16-bit data)
          CCITT G.728 LD-CELP
               (16 kb/s for 8 kHz 16-bit data)
          GSM 06.10
               (13 kb/s for 8 kHz 16-bit data)
          IMA ADPCM
               (for 16-bit data at any sample rate)
          MS ADPCM
               (for 16-bit data at any sample rate)

     The codecs operate in real time at the recommended sample rates.
     Real-time operation for stereo data may require setting a high
     non-degrading process priority (see schedctl(2) or npri(1)).

     The Audio File Library also includes built-in support for MPEG
     bitstream and Aware MultiRate audio compression. In order to enable
     the MultiRate compression support, you need to purchase a special
     license from Silicon Graphics, Inc. (see AwareIntro(3dm) and
     afAware(3dm) for more information).

     Apple proprietary compression algorithms (ACE2, ACE8, MAC3, MAC6)
     mentioned in the AIFF-C spec are not supported by the Audio File
     Library.

PCM MAPPING
     In the Audio File Library, the PCM mapping solves the following
     problem: When you want to convert integer data to floating point data
     or vice versa, how do you specify the numeric mapping of one format
     to another? For example, do you map to the integer range or to
     [-1.0,1.0]? How do you deal with the asymmetry of integers about 0?

     An application may want to take (possibly compressed) integer files
     with 8, 16, 24, or 32 bit data in them, and read those files into a
     buffer in a floating point format. Sometimes the integer data in the
     file is signed, and sometimes it is unsigned. Similarly, an
     application may want to write floating point buffers which are
     converted to signed or unsigned integers in real time as they are
     written to the file.

     The user may have written his or her code to expect the floating
     point sample values within a certain range, perhaps -1.0 to 1.0,
     perhaps 0.0 to 1.0, perhaps -1000.0 to 1000.0. If an integer->float
     conversion is specified only by a slope, it is impossible to achieve
     some of those mappings (since the intercept is fixed). Therefore
     there is the concept of an intercept. That way, for example, integer
     data which ranged from [0,65535] could be mapped to the range
     [-1.0,1.0].

     However, this is still not enough information. The integer range is
     not symmetric and the floating point range effectively is, and so we
     need to be more specific about how the endpoints of the mapping line
     up. The addition of a minimum and maximum clip value allows the
     application to determine how it wishes to map values outside of a
     symmetrical range onto another symmetrical range.

     This technique assumes that there exists one PCM value which
     corresponds to 'zero volts', and that there exists a differential PCM
     value which, when added to or subtracted from the 'zero volt' value,
     produces a value corresponding to 'full-voltage'. Also the model
     (optionally) includes the notion of a maximum and minimum PCM value;
     the library will always clip any PCM values it inputs or outputs to
     these values:

          slope        the 'full-voltage' differential PCM value
          intercept    the 'zero-volt' PCM value
          minclip      the minimum 'legal' PCM value
          maxclip      the maximum 'legal' PCM value

     The idea of 'voltage' is merely a canonical form and is in no way
     intended to correspond with the hardware. Note that it does not
     matter what the 'full-voltage' level is numerically. The user simply
     specifies the parameters of their data, and the library uses the PCM
     mapping to map the user's values to a desired internal mapping, or
     perhaps to another user PCM mapping.

     If maxclip <= minclip, this implies no clipping is to be done. It
     means all PCM values are legal, even if they are outside the range of
     'full-voltage'.
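
     The four-parameter model above can be sketched as a small,
     self-contained C routine. This is only an illustration of the
     arithmetic, not the library's implementation: the pcm_mapping struct
     and the clip_pcm() and map_pcm() helpers are hypothetical names
     introduced here, and samples are carried as doubles for simplicity.

```c
#include <assert.h>

/* Hypothetical container for the four PCM mapping parameters. */
typedef struct {
    double slope;      /* 'full-voltage' differential PCM value */
    double intercept;  /* 'zero-volt' PCM value */
    double minclip;    /* minimum 'legal' PCM value */
    double maxclip;    /* maximum 'legal' PCM value */
} pcm_mapping;

/* Clip v to [minclip, maxclip]; maxclip <= minclip means no clipping. */
static double clip_pcm(double v, const pcm_mapping *m)
{
    if (m->maxclip > m->minclip) {
        if (v < m->minclip) v = m->minclip;
        if (v > m->maxclip) v = m->maxclip;
    }
    return v;
}

/* Map one sample from the "in" mapping to the "out" mapping via 'volts'. */
double map_pcm(double in_pcm, const pcm_mapping *in, const pcm_mapping *out)
{
    double volts;

    assert(in->slope != 0.0);  /* a zero slope cannot be inverted */
    volts = (clip_pcm(in_pcm, in) - in->intercept) / in->slope;
    return clip_pcm(out->intercept + out->slope * volts, out);
}
```

     For example, unsigned 16-bit data can be described as slope 32768,
     intercept 32768, minclip 0, maxclip 65535, and a floating point
     buffer as slope 1.0, intercept 0.0 with clipping disabled (minclip
     and maxclip both 0.0). Under those assumed values an input of 0 maps
     to -1.0 and an input of 32768 maps to 0.0.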

     In the Audio File Library, you specify a PCM mapping for the virtual
     format and optionally also for the track format.

     This pseudo-code shows exactly how the AF maps each sample value
     "in_pcm" to a sample value "out_pcm." For an AFfilehandle opened for
     input, "in" is the track and "out" is your buffer. For an
     AFfilehandle opened for output, "in" is your buffer and "out" is the
     track:

          /* transform in_pcm to volts */
          if (in_maxclip > in_minclip)
          {
              if (in_pcm < in_minclip) in_pcm = in_minclip;
              if (in_pcm > in_maxclip) in_pcm = in_maxclip;
          }
          volts = (in_pcm - in_intercept) / in_slope;

          /* transform volts to out_pcm */
          out_pcm = out_intercept + out_slope * volts;
          if (out_maxclip > out_minclip)
          {
              if (out_pcm < out_minclip) out_pcm = out_minclip;
              if (out_pcm > out_maxclip) out_pcm = out_maxclip;
          }

AUDIO FILE LIBRARY PROGRAMMING INTERFACE
     The basic data types for the Audio File Library are the AFfilesetup
     and AFfilehandle structures.

     AFfilesetup is an opaque structure which is used to configure a new
     audio file before it is created by afOpenFile(3dm). Parameters stored
     in an AFfilesetup include the file format, sample format and sample
     rate, and types and sizes of miscellaneous data chunks to be stored
     in the file. afOpenFile(3dm) allocates file header space according to
     the parameters stored in the AFfilesetup(3dm) argument.

     An AFfilehandle is an opaque structure which is used to read audio
     data and auxiliary information from an existing file, or write audio
     data and auxiliary information to a new file. The AFfilehandle
     structure maintains state information such as the current location of
     the logical audio sample read/write pointer and the logical locations
     of the data read/write pointers for the various miscellaneous data
     chunks.

     Following is a list of the functions included in the Audio File
     Library.

     Functions for performing basic operations on an AFfilehandle
     structure, and for allocating and freeing AFfilesetup and
     AFfilehandle structures:

     afCloseFile(3dm)
          close and deallocate an AFfilehandle
     afFreeFileSetup(3dm)
          deallocate an AFfilesetup struct
     afIdentifyFD(3dm)
          determine the audio file format for a Unix file descriptor
     afIdentifyNamedFD(3dm)
          determine the audio file format for a Unix file descriptor and
          supplied filename
     afNewFileSetup(3dm)
          create an AFfilesetup struct
     afOpenFD(3dm)
          create an AFfilehandle for a Unix file descriptor
     afOpenFile(3dm)
          create an AFfilehandle for a named file
     afReadMisc(3dm)
          read buffer of miscellaneous data
     afReadFrames(3dm)
          read a buffer of sample frames from an audio track
     afSeekMisc(3dm)
          seek to given location in miscellaneous data
     afSetConversionParams(3dm)
          set, via dmParams(3dm), the parameters used for format and rate
          conversion in an audio track. This includes the rate conversion
          algorithm, if any, and the dithering algorithm.
     afSetErrorHandler(3dm)
          supply an error reporting routine for the library
     afSetTrackPCMMapping(3dm)
          set the PCM mapping (slope, intercept, min clip, max clip) for
          the audio data in a track, overriding the track's default
     afSyncFile(3dm)
          update file header without closing file
     afWriteFrames(3dm)
          write a buffer of sample frames to an audio track
     afWriteMisc(3dm)
          write buffer of miscellaneous data

     Functions for obtaining information from an AFfilehandle structure:

     afGetAESChannelData(3dm)
          read AES channel status for an audio track
     afGetByteOrder(3dm)
          get the byte order (big- or little-endian) for samples in a
          track
     afGetChannels(3dm)
          get the number of interleaved channels in a track
     afGetCompression(3dm)
          get compression type for a track
     afGetCompressionParams(3dm)
          get compression type and algorithm-specific compression
          parameters for a track
     afGetDataOffset(3dm)
          get the offset in bytes from the beginning of the file to the
          beginning of the audio data
     afGetFD(3dm)
          get Unix file descriptor from an AFfilehandle
     afGetFileFormat(3dm)
          get the file format for an AFfilehandle
     afGetFormatParams(3dm)
          get (via dmParams(3dm)) the sample format, channels, byte order,
          etc. for a track
     afGetFrameCount(3dm)
          get the total number of frames in a track
     afGetInstIDs(3dm)
          get list of instrument map id's
     afGetInstParamLong(3dm)
          get value of an instrument map parameter
     afGetLoopCount(3dm)
          get the loop count (number of repetitions) for a given loop
     afGetLoopEnd(3dm)
          get the marker id for a loop's end frame
     afGetLoopEndFrame(3dm)
          get a loop's end frame directly
     afGetLoopIDs(3dm)
          get a list of loop id's for an instrument config
     afGetLoopMode(3dm)
          get the loop mode for a given loop
     afGetLoopStart(3dm)
          get the marker id for a loop's start frame
     afGetLoopStartFrame(3dm)
          get a loop's start frame directly
     afGetLoopTrack(3dm)
          get the track id for a given loop
     afGetMarkComment(3dm)
          get text comment for a marker
     afGetMarkIDs(3dm)
          get list of markers for an audio track
     afGetMarkName(3dm)
          get name of a marker
     afGetMarkPosition(3dm)
          get track location for a given marker
     afGetMiscIDs(3dm)
          get list of miscellaneous data chunks
     afGetMiscSize(3dm)
          get size of a miscellaneous data chunk
     afGetMiscType(3dm)
          get type of data in miscellaneous data chunk
     afGetPCMMapping(3dm)
          get the PCM mapping (slope, intercept, min clip, max clip) for
          the audio data in a track
     afGetRate(3dm)
          get the sample rate for an audio track
     afGetSampleFormat(3dm)
          get the sample format and resolution (sample width) for a track
     afGetTrackBytes(3dm)
          get the total raw byte count for the audio data in a track
     afGetTrackIDs(3dm)
          get list of track id's for an AFfilehandle

     Functions for setting and getting the virtual (audio data buffer)
     format:

     afGetVirtualByteOrder(3dm)
          get the byte order (big- or little-endian) of the audio data
          buffer
     afSetVirtualByteOrder(3dm)
          set the byte order (big- or little-endian) of the audio data
          buffer
     afGetVirtualChannels(3dm)
          get the number of interleaved channels in the audio data buffer
     afSetVirtualChannels(3dm)
          set the number of interleaved channels in the audio data buffer
     afSetVirtualFormatParams(3dm)
          set (via dmParams(3dm)) the virtual sample format, channels,
          byte order, etc. for a track
     afGetVirtualFrameSize(3dm)
          get the frame size in bytes for the audio data buffer
     afGetVirtualPCMMapping(3dm)
          get the PCM mapping (slope, intercept, min clip, max clip) for
          the audio data buffer
     afSetVirtualPCMMapping(3dm)
          set the PCM mapping (slope, intercept, min clip, max clip) for
          the audio data buffer
     afGetVirtualRate(3dm)
          get the sampling rate of the audio data buffer
     afSetVirtualRate(3dm)
          set the sampling rate for the audio data buffer. Data will be
          automatically rate-converted to match the virtual rate setting.
     afGetVirtualSampleFormat(3dm)
          get the sample format and resolution (sample width) of the audio
          data buffer
     afSetVirtualSampleFormat(3dm)
          set the sample format and resolution of the audio data buffer

     Functions for setting initialization parameters in an AFfilesetup
     structure (which is used to configure an audio file when the file is
     opened):

     afInitAESChannelData(3dm)
          reserve space for AES channel status in a new file
     afInitByteOrder(3dm)
          configure the byte order for the audio data in a track
     afInitChannels(3dm)
          configure number of channels for a new track
     afInitCompression(3dm)
          configure compression type for a track
     afInitCompressionParams(3dm)
          configure compression type and algorithm-specific parameters
     afInitFileFormat(3dm)
          configure the format for a new file
     afInitFormatParams(3dm)
          configure (via dmParams(3dm)) the sample format, channels, byte
          order, etc. for a new track
     afInitDataOffset(3dm)
          configure the offset in bytes from the beginning of the file for
          the audio data
     afInitFrameCount(3dm)
          configure the expected frame count for the audio data
     afInitInstIDs(3dm)
          configure instrument config id's for a new file
     afInitLoopIDs(3dm)
          configure loop id's for an instrument map
     afInitMarkComment(3dm)
          configure text comment of a marker
     afInitMarkIDs(3dm)
          configure marker id's for an audio track
     afInitMarkName(3dm)
          configure name of a marker
     afInitMiscIDs(3dm)
          configure miscellaneous data chunk id's
     afInitMiscSize(3dm)
          configure size of a miscellaneous chunk
     afInitMiscType(3dm)
          configure type of data for a miscellaneous chunk
     afInitPCMMapping(3dm)
          configure the PCM mapping (slope, intercept, min clip, max clip)
          for the audio data in a track
     afInitRate(3dm)
          configure sample rate for a new track
     afInitSampleFormat(3dm)
          configure sample format for a new track
     afInitTrackIDs(3dm)
          configure track id's for a new audio file

     Functions for setting values in an audio file after it has been
     opened:

     afSetAESChannelData(3dm)
          write AES channel status to an audio track
     afSetLoopCount(3dm)
          set loop count (number of repetitions) for a specified loop
     afSetLoopEnd(3dm)
          set marker for a loop's end frame
     afSetLoopEndFrame(3dm)
          set a loop's end frame directly
     afSetLoopMode(3dm)
          set loop mode for a specified loop
     afSetLoopStart(3dm)
          set marker for a loop's start frame
     afSetLoopStartFrame(3dm)
          set a loop's start frame directly
     afSetLoopTrack(3dm)
          set the track id for a given loop
     afSetMarkPosition(3dm)
          set the track location for a given marker

     Functions for querying static parameters associated with the Audio
     File Library:

     afQuery(3dm)
          return the value of a parameter as an AUpvlist struct
     afQueryLong(3dm)
          return the value of a parameter as a long integer
     afQueryDouble(3dm)
          return the value of a parameter as a double-precision floating
          point number
     afQueryPointer(3dm)
          return the value of a parameter as a generic pointer (void *)

CAVEATS FOR USING THE HANDLE'S FILE DESCRIPTOR
     The file descriptor returned by afGetFD(3dm) is not intended to allow
     users to read, write, and seek in the file without the knowledge of
     the Audio File Library. Doing so will cause the library to give
     unpredictable results unless the user saves and restores the file
     position whenever they modify it. This can be done using
     afSaveFilePosition(3dm) and afRestoreFilePosition(3dm). The same
     precautions must also be used with the file descriptor given to
     afOpenFD().

     Developers can get the offset of the audio data in an audio file via
     the afGetDataOffset(3dm) function.

CAVEATS ABOUT THE MANNER IN WHICH THE AF ACCESSES FILES
     SGI gives no guarantees about the number or nature of UNIX system
     calls that will result from a given AF call. In particular,
     afReadFrames() and afWriteFrames() could actually read or write any
     amount of data from the file, or could read or write more than once
     in varying chunk sizes.
     Also, afOpenFile(), afSeekFrame(), afSyncFile(), afCloseFile(), and
     other AF functions could result in any amount of data being read from
     or written to the file. The AF will not write to a file opened for
     read access or read from a file opened for write access.

     Users who are attempting to optimize the I/O in their program by
     managing I/O system call behavior should be aware that at this time
     we offer no guarantees about when the AF will perform system calls.

CAVEATS FOR MULTITHREADED PROGRAMMING
     The Audio File Library is NOT a multi-thread and/or multi-processor
     safe library, in the following sense:

          Users can make multiple, simultaneous, uncoordinated AF calls on
          different AFfilehandles from different threads and the library
          will operate fine. Each AFfilehandle completely encapsulates the
          state needed to do operations on that AFfilehandle (except for
          error handling, which is explained next).

          Users cannot make multiple, simultaneous, uncoordinated AF calls
          from different threads to set or access the library's global
          state, namely the error handler function. If two threads
          simultaneously try to set the global error handler (even the
          same error handler), the behavior is undefined. See below for an
          alternative.

     Furthermore, if the user writes an error handler, then makes
     multiple, simultaneous, uncoordinated AF calls on different
     filehandles from different threads, and both AF calls issue an error
     simultaneously, then two instances of the user's error handler will
     be called in a simultaneous, uncoordinated manner in two threads. If
     this situation is possible in a user's program, the user should use
     semaphores in their error handler in order to make sure their handler
     doesn't try to report or deal with two errors at the same time.

     Note that any AF function can cause an AF error to occur. Do not
     assume a function will not produce an error just because it is
     simple.
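
     A serialized error handler of the kind described above might look
     like the following sketch. It uses a POSIX mutex in place of an
     IRIX semaphore, and it assumes the handler receives a long error
     code and a message string; check afSetErrorHandler(3dm) for the
     actual handler type. The names locked_error_handler(),
     errors_seen(), and last_error_code() are hypothetical and not part
     of the AF.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical shared state, guarded by err_lock. */
static pthread_mutex_t err_lock = PTHREAD_MUTEX_INITIALIZER;
static long err_count = 0;
static long err_last_code = 0;

/*
 * Error handler that lets only one thread report at a time.
 * Assumed shape: (long code, const char *message). It would be
 * installed once, from a single thread, before any other threads
 * start making AF calls.
 */
void locked_error_handler(long code, const char *msg)
{
    assert(msg != NULL);

    pthread_mutex_lock(&err_lock);
    err_count++;
    err_last_code = code;
    fprintf(stderr, "AF error %ld: %s\n", code, msg);
    pthread_mutex_unlock(&err_lock);
}

/* Number of errors reported so far. */
long errors_seen(void)
{
    long n;

    pthread_mutex_lock(&err_lock);
    n = err_count;
    pthread_mutex_unlock(&err_lock);
    return n;
}

/* Code of the most recently reported error, or 0 if none. */
long last_error_code(void)
{
    long code;

    pthread_mutex_lock(&err_lock);
    code = err_last_code;
    pthread_mutex_unlock(&err_lock);
    return code;
}
```

     The lock covers both the bookkeeping and the write to stderr, so two
     threads reporting at once produce two whole messages rather than
     interleaved output.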

     A new form of MT-safe error handling mechanism is now available; if
     an application wishes to use it, it should call
     afSetErrorHandler(3dm) with a NULL value to disable the old error
     handler system, and call dmGetError(3dm) when a function returns an
     error value. The application must also add -ldmedia to the link
     arguments if it calls this routine.

     Now the most important caveat:

          Users cannot make multiple, simultaneous, uncoordinated AF calls
          on the same AFfilehandle from different threads, even if the
          order of execution of those calls does not matter to the user.
          Doing so will very likely cause a core dump, or at least
          corruption of the AFfilehandle.

     This behavior will never be changed, as we refuse to make our
     developers pay the price of semaphore locking code at the beginning
     and end of every afReadFrames and afWriteFrames call. Most users do
     not need, and in fact really do not want, semaphore protection that
     is built in to the AF calls themselves.

FILES
     Audio File Library header file:

          /usr/include/dmedia/audiofile.h

     Audio File Library code examples:

          /usr/share/src/dmedia/soundcommands/*
          /usr/share/src/dmedia/soundfile/*

     The programs playaiff(1) and playaifc(1) are now installed as links
     to the program sfplay(1); recordaiff(1) and recordaifc(1) are now
     installed as links to the program sfrecord(1). These programs are
     based on calls to the Audio File Library and Audio Library.

     The file aifcconvert.c is actually the source for several programs
     which are installed in /usr/sbin: aifc2aiff(1), aiff2aifc(1),
     aifccompress(1), and aifcdecompress(1).

BUGS
     The AIFF-C "comments chunk" described in the format spec is not yet
     supported by the library. AIFF-C files which contain comment-marker
     data will parse, but there is not yet a way to access comment-marker
     information through the Audio File API.

DOCUMENTATION
     Digital Audio/MIDI Programming Guide

     Audio Interchange File Format AIFF-C Specification, Apple Computer
     Inc.

     Addendum to the Audio Interchange File Format AIFF-C Specification,
     Silicon Graphics Inc.

     CCITT Recommendation G.711

     CCITT Recommendation G.722

     aware(5), Introduction to Aware audio compression, Aware Inc.

     ISO/IEC MPEG Specification

     afAware(3dm), Audio File Library interface to Aware audio compression

RELATED LIBRARIES
     ALintro(3dm), Introduction to the SGI Audio Library

     CDaudio(3), Introduction to the SGI Audio Compact Disc library

     DATaudio(3), Introduction to the SGI Digital Audio Tape Library

     dmIntro(3dm), Introduction to the IRIS Digital Media Libraries

     /usr/include/dmedia/midi.h, header file for the SGI MIDI Library

SEE ALSO
     IRIX Real-time Support: npri(1), select(2), sproc(2), setitimer(2),
     schedctl(2)

     Audio File Library demo programs: recordaifc(1), playaifc(1),
     aifcinfo(1), dmGetError(3dm), aiff2aifc(1), aifc2aiff(1),
     aifccompress(1), aifcdecompress(1)