sfkeywords(1)                                                  sfkeywords(1)

NAME
     sfkeywords - soundfile keywords used in sfinfo, sfplay, and sfconvert

SYNOPSIS
     Many of the sf programs require descriptions of soundfile formats.
     These descriptions are always specified using the same set of
     keywords, which are given one after the other on the command line,
     separated by spaces.

     byteorder e     endian (e is big or little)
     channels n      n-channel file (1 or 2)
     rate r          sampling rate r, in Hertz
     format f        file format f (see below)
     integer n s     n-bit integer file, where s is:
                       2scomp:   2's complement signed data
                       unsigned: unsigned data
     float m         floating point file, maxamp m (usually 1.0)
     mulaw           mulaw file (8-bit only)
     dataoff o       data starts at byte offset o (for raw data)

     The keywords do not need to be spelled out; only the first character,
     or the first 2 characters for 'float' and 'format', is required.

DESCRIPTION
     These keywords are used in situations where information about a
     soundfile format is needed, such as in sfconvert:

          sfconvert in.snd out.aif format aiff integer 16 2 chan 2

     Specifies a stereo, 16-bit (2's complement signed) integer aiff file.

     Note that some keywords, such as 'integer', require parameters.  These
     parameters can also be abbreviated, except for the parameter of the
     'format' keyword.

     The 'format' keyword specifies the file format.  Currently supported
     file formats are:

          aiff     Audio Interchange File Format
          aifc     AIFF-C File Format
          next     NeXT/Sun Format
          wave     MS RIFF WAVE Format

     The 'channels' and 'rate' keywords are fairly straightforward.  They
     simply specify how many interleaved channels of data the soundfile has
     and what sampling rate the data is meant to be played at (in Hertz).
     Here are some notes on sampling rates:

     Some files, particularly mulaw-encoded 8-bit NeXT soundfiles, have a
     sampling rate of 8012.8210513 Hz, which is often abbreviated to
     8012.82 Hz or 8.013 kHz.  When converting another file to a file with
     this sampling rate, you should be sure to specify the full-precision
     rate.
     Otherwise some programs may not recognize the file as playable.  WAVE
     files store sampling rate as an integral number of samples per second;
     therefore, they cannot support this sampling rate.

     The sfconvert and soundfiler utilities will perform high-quality
     linear phase sampling rate conversion between the standard rates 8000,
     11025, 16000, 22050, 32000, 44100, and 48000 Hz.  For conversions
     where the source or target rate is not one of these standard rates,
     sfconvert and soundfiler use a lower-quality algorithm, and issue a
     warning to this effect.  For these lower-quality conversions, some
     loss of quality is likely, and audible artifacts may occur in the
     output sound, especially on conversions from a higher to a lower
     sampling rate.  This lower-quality algorithm, which was present in
     earlier releases, uses third-order polynomial interpolation and does
     marginal anti-aliasing.  A high-quality algorithm capable of
     conversion between arbitrary pairs of sampling rates is under
     development.

     In order to allow high-quality rate conversion in fairly common cases,
     if you attempt to convert an 8012.8210513 Hz soundfile to a soundfile
     with any standard rate except 8000 Hz, sfconvert and soundfiler will
     assume the input rate is 8000 Hz and perform the conversion, again
     issuing a warning to this effect.  If the -0.16% shift in pitch (less
     than three hundredths of a semitone) is not acceptable, you can first
     convert the 8012.8210513 Hz soundfile into an 8000 Hz soundfile and
     then convert the 8000 Hz soundfile to another standard rate, as in the
     following:

          sfconvert in.aiff temp.aiff rate 8000
          sfconvert temp.aiff out.aiff rate 16000

     In this case sfconvert and soundfiler will use the older algorithm,
     which is of acceptable quality for small changes in sampling rate, to
     do the first conversion, and the new algorithm to do the second
     conversion with the best quality.

     The dual of the previous conversion is possible with a similar
     procedure.
     You may convert from any standard rate to 8012.8210513 Hz by first
     converting to 8000 Hz, and then to 8012.8210513 Hz:

          sfconvert in.aiff temp.aiff rate 8000
          sfconvert temp.aiff out.aiff rate 8012.8210513

     The 'integer', 'float', and 'mulaw' keywords are mutually exclusive
     (although no error will be reported if you use more than one).  Each
     specifies the encoding format of the actual samples themselves:

     - an 'integer' soundfile stores sound information as simple unsigned
       or 2's complement 1-32 bit integers.  In the signed case, 0 is the
       zero signal level.  In the unsigned case, (2^b)/2 is the zero signal
       level, where b is the number of bits per integer.

     - a 'mulaw' soundfile, which for these programs must be in 8-bit
       format, stores companded 13-bit sample values in an 8-bit,
       unsigned-like format.  If you play a mulaw file using sfplay, its
       samples are automatically converted to 16-bit samples which the
       audio hardware can output.

     - a 'float' soundfile consists of IEEE standard floating point
       numbers.  Generally, -1.0 represents full negative amplitude and 1.0
       represents full positive amplitude, but it is quite possible to
       generate a soundfile with sample values of magnitude greater than
       1.0.  For this reason, the 'float' keyword takes an argument giving
       the value that should be treated as full amplitude.  This is usually
       1.0.  If you play a floating point file using sfplay, its sample
       values are automatically scaled based on a 1.0 maxamp and converted
       to 24-bit integers which the audio hardware can output.

     When converting floating point data to integer data and vice versa,
     the sf programs always assume that the highest positive value
     ((2^b)/2-1 for b-bit 2's complement integers) maps to the floating
     point maximum amplitude, usually 1.0.  For example, when converting
     16-bit 2's complement integers to floats of maximum amplitude 1.0,
     32767 will map to +1.0, and -32767 will map to -1.0.
     This was done so that it is possible to convert a floating point file
     to an integer file without clipping a value off the positive end of
     the integral range.  This means that when converting ints to floats,
     it is possible that there will be one value in the output file that is
     less than -maxamp where maxamp is the maximum amplitude specified
     after the 'float' keyword.  If this is a problem, use a slightly
     different maximum amplitude which puts all output values inside the
     actual desired maximum amplitude.

     The 'byteorder' keyword specifies the byte ordering (endian) of the
     data.  This only applies to >8 bit data, and is currently only
     consulted for integer data.  Integer data can be big endian, meaning
     it conforms to SGI MIPS / Motorola byte ordering, or it can be little
     endian, meaning it conforms to Intel byte ordering.  All formats
     supported by the sf programs use big endian except WAVE.  Any >8 bit
     raw file transferred to/from a PC should be converted to/from little
     endian (respectively).  For UNIX and Macintosh (t.m.) files, big
     endian data is almost always desired, and it is the default.  Note
     that little endian floating point representations are currently not
     supported.  In the soundfiler program, big endian is always assumed
     for raw data, AIFF, and AIFF-C, and little endian is assumed for WAVE.

     The 'dataoff' keyword is used only when specifying the format of raw
     data.  This feature can be useful if you have a file which contains
     some sound data that starts somewhere in the middle of the file.  The
     offset is given in bytes from the beginning of the file.  The
     'dataoff' keyword can be used to convert or play a soundfile in a
     format that the sf programs do not recognize, if the offset of the
     sound data can be determined.  It would then be possible to convert
     the file to an aiff or other file which is more easily manipulated on
     Silicon Graphics machines.
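
     The interaction of 'dataoff', 'byteorder', and the integer-to-float
     mapping described above can be illustrated with a minimal Python
     sketch.  It is not the sf programs' implementation: the function
     decode_raw and its parameter names are hypothetical, and it assumes
     2's complement samples whose width is a whole number of bytes.

```python
def decode_raw(data, dataoff, nbytes=2, byteorder="big", maxamp=1.0):
    """Decode raw 2's complement samples into floats.

    Hypothetical helper for illustration only (not part of the sf programs).
    data      -- the whole file contents as bytes
    dataoff   -- byte offset at which the sound data starts
    nbytes    -- bytes per sample (2 for 16-bit data)
    byteorder -- "big" or "little"
    maxamp    -- float value that the highest positive integer maps to
    """
    bits = 8 * nbytes
    full_scale = 2 ** (bits - 1) - 1      # (2^b)/2 - 1, e.g. 32767 for b = 16
    payload = data[dataoff:]              # skip everything before the offset
    usable = len(payload) - len(payload) % nbytes   # ignore a trailing partial sample
    samples = []
    for i in range(0, usable, nbytes):
        value = int.from_bytes(payload[i:i + nbytes], byteorder, signed=True)
        samples.append(maxamp * value / full_scale)
    return samples


# Four bytes of made-up header followed by the big endian 16-bit
# samples 32767, -32767 and -32768.
raw = b"HDR!" + b"\x7f\xff" + b"\x80\x01" + b"\x80\x00"
decoded = decode_raw(raw, dataoff=4)
print(decoded[0])          # 1.0
print(decoded[1])          # -1.0
print(decoded[2] < -1.0)   # True: the one value that falls below -maxamp
```

     Decoding the same bytes with the wrong byteorder yields unrelated
     sample values, which is why >8 bit raw data moved between big endian
     and little endian machines needs the keyword set explicitly.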

CAVEATS
     Some keywords only make sense in certain contexts:

     - 'channels', 'rate', 'integer', 'float', 'mulaw' can be used
       anywhere.

     - 'format' does not make sense when describing the format of raw
       (headerless) data.  Its purpose is to specify which type of header
       (aiff, next, wave, etc.) to format the file with.

     - 'dataoff' only makes sense when describing raw data, since the
       offset of the sound data is known for soundfiles which have headers.

BUGS
     See the above discussion about rate conversion for an important note
     about conversion to/from a nonstandard rate (standard rates are those
     which appear on the Audio Control Panel).

     Note that no dithering is done on conversions from integers of higher
     resolution to lower resolution.  This will be amended in a future
     release.

     There should be a 'datasize' keyword to use with 'dataoff' when
     converting a soundfile of an unsupported format to a playable file.
     This is coming.  Currently sfconvert assumes that the sound data
     continues to the end of the file.

AUTHOR
     Silicon Graphics Inc.; Apple Computer, Inc. for AIFF code.

SEE ALSO
     intro(3a) for more about the audio library.
     sfplay(1), sfinfo(1), sfconvert(1), soundfiler(1).