FCNTL(2)                                                            FCNTL(2)

NAME
     fcntl - file and descriptor control

C SYNOPSIS
     #include <unistd.h>
     #include <fcntl.h>

     int fcntl (int fildes, int cmd, ... /* arg */);

DESCRIPTION
     fcntl provides for control over open descriptors. fildes is an open descriptor obtained from a creat, open, dup, fcntl, pipe, socket, or socketpair system call. The commands available are:

     F_DUPFD
          Return a new descriptor as follows:

               Lowest numbered available descriptor greater than or equal to the third argument, arg, taken as an object of type int.

               Refers to the same object as the original descriptor.

               Same file pointer as the original file (i.e., both file descriptors share one file pointer).

               Same access mode (read, write or read/write).

               Same descriptor status flags (i.e., both descriptors share the same status flags).

               Shares any locks associated with the original file descriptor.

               The close-on-exec flag, FD_CLOEXEC, associated with the new descriptor is cleared to keep the file open across calls to the exec(2) family of functions.

     F_GETFD
          Get the file descriptor flags associated with the descriptor fildes. If the FD_CLOEXEC flag is 0 the descriptor will remain open across exec, otherwise the descriptor will be closed upon execution of exec.

     F_SETFD
          Set the file descriptor flags for fildes. Currently the only flag implemented is FD_CLOEXEC. Note: this flag is a per-process and per-descriptor flag; setting or clearing it for a particular descriptor will not affect the flag on descriptors copied from it by a dup(2) or F_DUPFD operation, nor will it affect the flag on other processes' instances of that descriptor.

     F_GETFL
          Get file status flags and file access modes. The file access modes may be extracted from the return value using the mask O_ACCMODE.

     F_SETFL
          Set file status flags to the third argument, arg, taken as an object of type int. Only the following flags can be set [see fcntl(5)]: FAPPEND, FSYNC, FDSYNC, FRSYNC, FNDELAY, FNONBLK, FLCFLUSH, FLCINVAL, FDIRECT, and FASYNC.
          Since arg is used as a bit vector to set the flags, values for all the flags must be specified in arg. (Typically, arg may be constructed by obtaining existing values by F_GETFL and then changing the particular flags.)

          FAPPEND is equivalent to O_APPEND; FSYNC is equivalent to O_SYNC; FDSYNC is equivalent to O_DSYNC; FRSYNC is equivalent to O_RSYNC; FNDELAY is equivalent to O_NDELAY; FNONBLK is equivalent to O_NONBLOCK; FLCFLUSH is equivalent to O_LCFLUSH; FLCINVAL is equivalent to O_LCINVAL; and FDIRECT is equivalent to O_DIRECT. FASYNC is equivalent to calling ioctl with the FIOASYNC command (except that with ioctl all flags need not be specified). This enables the SIGIO facilities and is currently supported only on sockets.

          Since the descriptor status flags are shared with descriptors copied from a given descriptor by a dup(2) or F_DUPFD operation, and by other processes' instances of that descriptor, an F_SETFL operation will affect those other descriptors and other instances of the given descriptors as well. For example, setting or clearing the FNDELAY flag will logically cause an FIONBIO ioctl(2) to be performed on the object referred to by that descriptor. Thus all descriptors referring to that object will be affected.

          Flags not understood for a particular descriptor are silently ignored except for FDIRECT. FDIRECT will return EINVAL if used on other than an EFS, XFS or BDS file system file. In IRIX 6.5.24 and beyond, remote NFS Version 3 file systems also support FDIRECT operation.

     F_FREESP
          Alter storage space associated with a section of the ordinary file fildes. The section is specified by a variable of data type struct flock pointed to by the third argument arg. The data type struct flock is defined in the <fcntl.h> header file [see fcntl(5)] and contains the following members:

               l_whence is 0, 1, or 2 to indicate that the relative offset l_start will be measured from the start of the file, the current position, or the end of the file, respectively.
               l_start is the offset from the position specified in l_whence.

               l_len is the size of the section. An l_len of 0 frees up to the end of the file; in this case, the end of file (i.e., file size) is set to the beginning of the section freed. Any data previously written into this section is no longer accessible.

          If the section specified is beyond the current end of file, the file is grown and filled with zeroes. The l_len field is currently ignored, and should be set to 0.

     F_ALLOCSP
          This command is identical to F_FREESP.

     F_FREESP64
          This command is identical to F_FREESP except that the type of the data referred to by the third argument arg is a struct flock64. In this version of the structure, l_start and l_len are of type off64_t instead of off_t (64 bits instead of 32 bits).

     F_ALLOCSP64
          This command is identical to F_FREESP64.

     F_FSSETDM
          Set the di_dmevmask and di_dmstate fields in an XFS on-disk inode. The only legitimate values for these fields are those previously returned in the bs_dmevmask and bs_dmstate fields of the bulkstat structure. The data referred to by the third argument arg is a struct fsdmidata. This structure's members are fsd_dmevmask and fsd_dmstate. The di_dmevmask field is set to the value in fsd_dmevmask. The di_dmstate field is set to the value in fsd_dmstate. This command is restricted to root or to processes with device management capabilities. Its sole purpose is to allow backup and restore programs to restore the aforementioned critical on-disk inode fields.

     F_DIOINFO
          Get information required to perform direct I/O on the specified fildes. Direct I/O is performed directly to and from a user's data buffer. Since the kernel's buffer cache is no longer between the two, the user's data buffer must conform to the same type of constraints as required for accessing a raw disk partition.
          The third argument, arg, points to a data type struct dioattr which is defined in the <fcntl.h> header file and contains the following members:

               d_mem is the memory alignment requirement of the user's data buffer.

               d_miniosz specifies block size, minimum I/O request size, and I/O alignment. The size of all I/O requests must be a multiple of this amount and the value of the seek pointer at the time of the I/O request must also be an integer multiple of this amount.

               d_maxiosz is the maximum I/O request size which can be performed on the fildes.

          If an I/O request does not meet these constraints, the read(2) or write(2) will return with EINVAL. In IRIX 6.5.19 and beyond, the alignment requirement has been relaxed to allow for 512 byte sizes. It is, however, still strongly recommended that the alignment values returned by F_DIOINFO be used for performance and for portability to older IRIX systems. All I/O requests are kept consistent with any data brought into the cache with an access through a non-direct I/O file descriptor. See also F_SETFL above and open(2).

     F_GETOWN
          Used by sockets: get the process ID or process group currently receiving SIGIO and SIGURG signals; process groups are returned as negative values.

     F_SETOWN
          Used by sockets: set the process or process group to receive SIGIO and SIGURG signals; process groups are specified by supplying arg as negative, otherwise arg is interpreted as a process ID.

     F_CLOSEM
          Close all file descriptors greater than or equal to fildes.

     F_FSGETXATTR
          Get additional attributes associated with files in XFS file systems. The arg points to a variable of type struct fsxattr. The structure fields include: fsx_xflags (extended flag bits), fsx_extsize (nominal extent size in file system blocks), fsx_nextents (number of data extents in the file). A nonzero fsx_extsize value indicates that a preferred extent size was previously set on the file; an fsx_extsize of 0 indicates that the defaults for that filesystem will be used.
          Currently the meaningful bits for the fsx_xflags field are:

               Bit 0 (0x1) - XFS_XFLAG_REALTIME
                    The file is a realtime file.

               Bit 1 (0x2) - XFS_XFLAG_PREALLOC
                    The file has preallocated space.

               Bit 7 (0x80) - XFS_XFLAG_NODUMP
                    The file should be skipped by backup utilities.

               Bit 8 (0x100) - XFS_XFLAG_RTINHERIT
                    Realtime inheritance bit - new files created in the directory will be automatically realtime, and new directories created in the directory will inherit the inheritance bit.

               Bit 9 (0x200) - XFS_XFLAG_PROJINHERIT
                    New files created in the directory will have the project ID of the directory, and new directories created in the directory will inherit the inheritance bit.

               Bit 10 (0x400) - XFS_XFLAG_NOSYMLINKS
                    Symbolic links cannot be created in a directory with this bit set.

               Bit 31 (0x80000000) - XFS_XFLAG_HASATTR
                    The file has extended attributes associated with it.

     F_FSGETXATTRA
          Identical to F_FSGETXATTR except that the fsx_nextents field contains the number of attribute extents in the file.

     F_FSSETXATTR
          Set additional attributes associated with files in XFS file systems. The arg points to a variable of type struct fsxattr, but only the following fields are used in this call: fsx_xflags and fsx_extsize. The fsx_xflags realtime file bit, and the file's extent size, may be changed only when the file is empty.

     F_GETBMAP
          Get the block map for a segment of a file in an XFS file system. The arg points to an array of variables of type struct getbmap. All sizes and offsets in the structure are in units of 512 bytes. The structure fields include: bmv_offset (file offset of segment), bmv_block (starting block of segment), bmv_length (length of segment), bmv_count (number of array entries, including the first), and bmv_entries (number of entries filled in). The first structure in the array is a header, and the remaining structures in the array contain block map information on return. The header controls iterative calls to the F_GETBMAP command.
          The caller fills in the bmv_offset and bmv_length fields of the header to indicate the area of interest in the file, and fills in the bmv_count field to indicate the length of the array. If the bmv_length value is set to -1 then the length of the interesting area is the rest of the file. On return from a call, the header is updated so that the command can be used again to obtain more information, without re-initializing the structures. Also on return, the bmv_entries field of the header is set to the number of array entries actually filled in. The non-header structures will be filled in with bmv_offset, bmv_block, and bmv_length. If a region of the file has no blocks (is a hole in the file) then the bmv_block field is set to -1.

     F_GETBMAPA
          Identical to F_GETBMAP except that information about the attribute fork of the file is returned.

     F_RESVSP
          This command is used to allocate space to a file. A range of bytes is specified with the struct flock. The blocks are allocated, but not zeroed, and the file size does not change. It is only supported on XFS and BDS filesystems. If the XFS filesystem is configured to flag unwritten file extents, performance will be negatively affected when writing to preallocated space, since extra filesystem transactions are required to convert extent flags on the range of the file written. If xfs_growfs(1M) with the -n option reports unwritten=1, then the filesystem was made to flag unwritten extents. Only the root user is permitted to execute xfs_growfs(1M).

     F_RESVSP64
          This command is identical to F_RESVSP except that the type of the data referred to by the third argument arg is a struct flock64. In this version of the structure, l_start and l_len are of type off64_t instead of off_t (64 bits instead of 32 bits).

     F_UNRESVSP
          This command is used to free space from a file. A range of bytes is specified with the struct flock. Partial filesystem blocks are zeroed, and whole filesystem blocks are removed from the file.
          The file size does not change. It is only supported on XFS and BDS filesystems.

     F_UNRESVSP64
          This command is identical to F_UNRESVSP except that the type of the data referred to by the third argument arg is a struct flock64. In this version of the structure, l_start and l_len are of type off64_t instead of off_t (64 bits instead of 32 bits).

     F_FSYNC
          fsync data in a range of an ordinary file fildes. The section is specified by a variable of data type struct flock pointed to by the third argument arg. The data type struct flock is defined in the <fcntl.h> header file [see fcntl(5)]. If field l_type is set to 1, the call behaves like fdatasync(2). If field l_type is set to 0, the call behaves like fsync(2). fdatasync(2) syncs only the inode state required to ensure that the data is permanently on the disk. fsync(2) syncs everything that fdatasync(2) flushes but also syncs out the other state associated with the file such as the current timestamps, permissions, owner, etc. l_start specifies the start of the range in the file to be sync'ed. l_len specifies the size of the range. An l_len of 0 flushes everything up to the end of the file. The remaining fields are ignored and should be set to 0.

     F_FSYNC64
          This command is identical to F_FSYNC except that the type of the data referred to by the third argument arg is a struct flock64. In this version of the structure, l_start and l_len are of type off64_t instead of off_t (64 bits instead of 32 bits).

     F_GETBIOSIZE
          This command gets information about the preferred buffered I/O size used by the system when performing buffered I/O (e.g. standard Unix non-direct I/O) to and from the file. The information is passed back in a structure of type struct biosize pointed to by the third argument arg. The data type struct biosize is defined in the <fcntl.h> header file [see fcntl(5)]. biosize lengths are expressed in log base 2. That is, if the value is 14, then the true size is 2^14 (2 raised to the 14th power).
          The biosz_read field will contain the current value used by the system when reading from the file. Except at the end-of-file, the system will read from the file in multiples of this length. The biosz_write field will contain the current value used by the system when writing to the file. Except at the end-of-file, the system will write to the file in multiples of this length. The dfl_biosz_read and dfl_biosz_write will be set to the system default values for the opened file. The biosz_flags field will be set to 1 if the current read or write value has been explicitly set. The F_GETBIOSIZE fcntl is supported only on XFS filesystems.

     F_SETBIOSIZE
          This command sets the preferred buffered I/O size used by the system when performing buffered I/O (e.g. standard Unix non-direct I/O) to and from the file. The information is passed in a structure of type struct biosize pointed to by the third argument arg. Using smaller preferred I/O sizes can result in performance improvements if the file is typically accessed using small synchronous I/Os or if only a small amount of the file is accessed using small random I/Os, resulting in little or no use of the additional data read in near the random I/Os.

          To explicitly set the preferred I/O sizes, the biosz_flags field should be set to 0 and the biosz_read and biosz_write fields should be set to the log base 2 of the desired read and write lengths, respectively (e.g. 13 for 8K bytes, 14 for 16K bytes, 15 for 32K bytes, etc.). Valid values are 13-16 inclusive for machines with a 4K byte pagesize and 14-16 for machines with a 16K byte pagesize. The specified read and write values must also result in lengths that are greater than or equal to the filesystem block size. The dfl_biosz_read and dfl_biosz_write fields are ignored.
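          The log base 2 encoding described above can be sketched as a small helper. This is only an illustration: biosize_exponent is a hypothetical name, not part of any system header, and it checks only the power-of-two and page size rules stated above. It does not check the filesystem block size requirement.

```c
/*
 * Hypothetical helper (an assumption for illustration, not a system API):
 * convert a buffered I/O length in bytes to the log base 2 value expected
 * in biosz_read or biosz_write. Returns -1 if the length is not a power
 * of two or is outside the range given above for the page size.
 */
int biosize_exponent(unsigned long bytes, unsigned long pagesize)
{
    int exp = 0;
    int min_exp;

    /* A biosize must be an exact power of two to have an integer log2. */
    if (bytes == 0 || (bytes & (bytes - 1)) != 0)
        return -1;
    while ((1UL << exp) < bytes)
        exp++;

    /* Lower bound depends on the page size: 13 for 4K, 14 for 16K. */
    if (pagesize == 4096UL)
        min_exp = 13;
    else if (pagesize == 16384UL)
        min_exp = 14;
    else
        return -1;      /* page size not covered by the text above */

    return (exp >= min_exp && exp <= 16) ? exp : -1;
}
```

          A caller would store the returned value in biosz_read or biosz_write, with biosz_flags set as described in this entry, before issuing F_SETBIOSIZE.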
          If biosizes have already been explicitly set due to a prior use of F_SETBIOSIZE, and the requested sizes are larger than the existing sizes, the fcntl call will return successfully and the system will use the smaller of the two sizes. However, if biosz_flags is set to 1, the system will use the new values regardless of whether the new sizes are larger or smaller than the old.

          To reset the biosize values to the defaults for the filesystem that the file resides in, the biosz_flags field should be set to 2. The remainder of the fields will be ignored in that case.

          Changes made by F_SETBIOSIZE are transient. The sizes are reset to the default values once the reference count on the file drops to zero (e.g. all open file descriptors to that file have been closed). See fstab(4) for details on how to set the default biosize values for a filesystem.

          The F_SETBIOSIZE fcntl is supported only on XFS filesystems.

     The following commands are used for record-locking. Locks may be placed on an entire file or on segments of a file.

     F_GETLK
          Get the first lock which blocks the lock description given by the variable of type struct flock pointed to by arg. The information retrieved overwrites the information passed to fcntl in the flock structure. If no lock is found that would prevent this lock from being created, then the structure is passed back unchanged except that the lock type will be set to F_UNLCK and the l_whence field will be set to SEEK_SET. If a lock is found that would prevent this lock from being created, then the structure is overwritten with a description of the first lock that is preventing such a lock from being created. The returned structure will also contain the process ID and the system ID of the process holding the lock. This command never creates a lock; it tests whether a particular lock could be created.

     F_SETLK
          Set or clear a file segment lock according to the variable of type struct flock pointed to by arg [see fcntl(5)].
          The cmd F_SETLK is used to establish read (F_RDLCK) and write (F_WRLCK) locks, as well as remove either type of lock (F_UNLCK). If a read or write lock cannot be set, fcntl will return immediately with an error value of -1.

     F_SETLKW
          This cmd is the same as F_SETLK except that if a read or write lock is blocked by other locks, the process will sleep until the segment is free to be locked.

     F_GETLK64
          This cmd is identical to F_GETLK but uses a struct flock64 instead of a struct flock (see F_FREESP64 above).

     F_SETLK64
          This cmd is identical to F_SETLK but uses a struct flock64 instead of a struct flock.

     F_SETLKW64
          This cmd is identical to F_SETLKW but uses a struct flock64 instead of a struct flock.

     F_SETBSDLK
          This cmd is identical to F_SETLK and is provided for backward compatibility only. Newer applications should use F_SETLK instead.

     F_SETBSDLKW
          This cmd is identical to F_SETLKW and is provided for backward compatibility only. Newer applications should use F_SETLKW instead.

     F_RSETLK
          Used by the network lock daemon, lockd(3N), to communicate with the NFS server kernel to handle locks on NFS files.

     F_RSETLKW
          Used by the network lock daemon, lockd(3N), to communicate with the NFS server kernel to handle locks on NFS files.

     F_RGETLK
          Used by the network lock daemon, lockd(3N), to communicate with the NFS server kernel to handle locks on NFS files.

     F_CHKFL
          This flag is used internally by F_SETFL to check the legality of file flag changes.

     A read lock prevents any process from write locking the protected area. More than one read lock may exist for a given segment of a file at a given time. The file descriptor on which a read lock is being placed must have been opened with read access.

     A write lock prevents any process from read locking or write locking the protected area. Only one write lock and no read locks may exist for a given segment of a file at a given time. The file descriptor on which a write lock is being placed must have been opened with write access.
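     The record-locking commands above might be used along the following lines. This is a minimal sketch, not a definitive implementation: the function names lock_whole_file, unlock_whole_file, and write_lock_would_block are hypothetical, and only the portable F_SETLK and F_GETLK commands with struct flock are used.

```c
#include <stddef.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to place a lock of the given type (F_RDLCK or
 * F_WRLCK) on the whole file without sleeping. Returns 0 on success, or
 * -1 with errno set if the lock cannot be set.
 */
int lock_whole_file(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* measure l_start from start of file */
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Hypothetical helper: remove any lock this process holds on the file. */
int unlock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: ask, with F_GETLK, whether a write lock on the
 * whole file would be blocked. Returns 1 if a blocking lock exists (and
 * stores its owner in *holder when holder is not NULL), 0 if the lock
 * could be created, or -1 on error. No lock is ever created here.
 */
int write_lock_would_block(int fd, pid_t *holder)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    if (fl.l_type == F_UNLCK)
        return 0;               /* nothing blocks the requested lock */
    if (holder != NULL)
        *holder = fl.l_pid;
    return 1;
}
```

     To sleep until the segment is free instead of failing immediately, F_SETLKW could be substituted for F_SETLK in lock_whole_file. Note that F_GETLK reports only locks that would block the caller, so a lock held by the calling process itself is not returned.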
     The structure flock describes the type (l_type), starting offset (l_whence), relative offset (l_start), size (l_len), process id (l_pid), and system id (l_sysid) of the segment of the file to be affected. The process id and system id fields are used only with the F_GETLK cmd to return the values for a blocking lock. Locks may start and extend beyond the current end of a file, but may not be negative relative to the beginning of the file. A lock may be set to always extend to the end of file by setting l_len to zero (0). If such a lock also has l_whence and l_start set to zero (0), the whole file will be locked. Changing or unlocking a segment from the middle of a larger locked segment leaves two smaller segments at either end. Locking a segment that is already locked by the calling process causes the old lock type to be removed and the new lock type to take effect. All locks associated with a file for a given process are removed when a file descriptor for that file is closed by that process or the process holding that file descriptor terminates. Locks are not inherited by a child process in a fork(2) system call.

     When file locking is used in conjunction with memory-mapped files over NFS, the smallest locking granularity which will work properly with multiple clients is the page size of the system. All clients must use the same granularity.

     When mandatory file and record locking is active on a file [see chmod(2)], read(2), creat(2), open(2), and write(2) system calls issued on the file will be affected by the record locks in effect.

     The following commands are used for SMB opportunistic locks. An SMB server application will register oplocks on files and grant them to SMB clients. When external references are made to oplocked files, the SMB server is notified to revoke the oplocks granted to clients before operations from the external references are allowed to continue.
     F_OPLKREG
          The oplock registration command identifies the file to oplock and, via arg, the write side of the pipe (e.g. p[1] from the pipe(int *p) call) to use as the signaling mechanism. The same write side pipe can be used for any number of oplocked files. If any external references to the file already exist or the caller already has an oplock on the file, the F_OPLKREG command fails with EAGAIN. If successful, the value of OP_EXCLUSIVE is returned.

     F_OPLKSTAT
          The oplock state change command is used to get state change information on any recently externally referenced files registered with the given write side pipe (e.g. p[1] from a pipe(int *p) call). The returned oplock_stat_t structure pointed at by arg contains the current state (os_state) and the dev/ino information (os_dev/os_ino) to identify the file. This is only done on the write side of a pipe for which select() indicates there is a byte of data to read() on the read side. A byte of data must then be read() from the read side of the pipe for each successful F_OPLKSTAT run on the write side for select() to again give proper notification. External references that cause state change notification will block until the SMB server acknowledges the revocation (typically after revoking the oplock it granted to the SMB client) or until the systunable oplock_timeout expires.

     F_OPLKACK
          The oplock acknowledgement command is primarily used to respond to oplock state changes due to external references on the given file. The value given by arg can be OP_REVOKE to revoke the oplock either voluntarily or as an acknowledgement of a state change reported in an F_OPLKSTAT command, or it can be -1 to request the current state of the given file. If F_OPLKACK is not used to voluntarily revoke the oplock, the oplock is automatically revoked on the SMB server's last close() of the file.
          If F_OPLKACK is not used to revoke the oplock in response to a state change indicated in an F_OPLKSTAT command, the oplock is automatically revoked when the oplock_timeout expires.

     fcntl will fail if one or more of the following are true:

     [EACCES]
          cmd is F_SETLK, the type of lock (l_type) is a read lock (F_RDLCK), and the segment of a file to be locked is already write locked by another process, or the type is a write lock (F_WRLCK) and the segment of a file to be locked is already read or write locked by another process.

     [EBADF]
          fildes is not a valid open file descriptor.

     [EBADF]
          cmd is F_SETLK or F_SETLKW, the type of lock (l_type) is a read lock (F_RDLCK), and fildes is not a valid file descriptor open for reading.

     [EBADF]
          cmd is F_SETLK or F_SETLKW, the type of lock (l_type) is a write lock (F_WRLCK), and fildes is not a valid file descriptor open for writing.

     [EBADF]
          cmd is F_FREESP and fildes is not a valid file descriptor open for writing.

     [EBADF]
          cmd is F_OPLKREG and the file is not a regular file or the arg is not the write side of a pipe.

     [EMFILE]
          cmd is F_DUPFD and {OPEN_MAX} file descriptors are currently in use by this process, or no file descriptors greater than or equal to arg are available.

     [EINVAL]
          cmd is F_DUPFD and arg is either negative, or greater than or equal to the maximum number of open file descriptors allowed each user [see getdtablesize(2)].

     [EINVAL]
          cmd is F_GETLK, F_SETLK, or F_SETLKW and arg or the data it points to is not valid.

     [EINVAL]
          cmd is F_SETFL, arg includes FDIRECT and is being performed on other than an EFS, XFS, BDS or NFS Version 3 file system file.

     [EINVAL]
          cmd is F_SETBIOSIZE and arg is invalid.

     [EINVAL]
          cmd is F_OPLKREG and fildes is a file in a filesystem other than XFS. Kernel level oplocks are only supported for XFS.

     [EINVAL]
          cmd is F_OPLKACK and the arg is not OP_REVOKE or -1.

     [EAGAIN]
          cmd is F_FREESP, the file exists, mandatory file/record locking is set, and there are outstanding record locks on the file.
          This restriction is not currently enforced.

     [EAGAIN]
          cmd is F_SETLK or F_SETLKW, mandatory file locking bit is set for the file, and the file is currently being mapped to virtual memory via mmap [see mmap(2)]. This restriction is not currently enforced.

     [EAGAIN]
          cmd is F_OPLKREG and there is more than one reference on the file. Oplocks thus cannot be used to guarantee exclusive access to the file.

     [EAGAIN]
          cmd is F_OPLKSTAT and there are no state change messages for the specified write side pipe.

     [EPERM]
          cmd is F_OPLKREG or F_OPLKSTAT or F_OPLKACK and the user is not superuser.

     [ENOLCK]
          cmd is F_SETLK or F_SETLKW, the type of lock is a read or write lock, and there are no more record locks available (too many file segments locked) because the system maximum {FLOCK_MAX} [see intro(2)] has been exceeded. This can also occur if the object of the lock resides on a remote system and the requisite locking daemons are not configured in both the local and the remote systems. In particular, if lockd(1M) is running but statd(1M) is not, this error will be returned. An additional source for this error is when statd(1M) is running but cannot be contacted. This can occur when the address for the local host cannot be determined. [See lockd(1M) and statd(1M).]

     [EINTR]
          cmd is F_SETLKW and a signal interrupted the process while it was waiting for the lock to be granted.

     [EDEADLK]
          cmd is F_SETLKW, the lock is blocked by some lock from another process, and putting the calling process to sleep, waiting for that lock to become free, would cause a deadlock.

     [EDEADLK]
          cmd is F_FREESP, mandatory record locking is enabled, O_NDELAY and O_NONBLOCK are clear, and a deadlock condition was detected.

     [EFAULT]
          cmd is F_FREESP, and the value pointed to by the third argument arg resulted in an address outside the process's allocated address space.

     [EFAULT]
          cmd is F_GETLK, F_SETLK or F_SETLKW, and arg points outside the program address space.
     [ESRCH]
          cmd is F_SETOWN and no process can be found corresponding to that specified by arg.

     [EIO]
          An I/O error occurred while reading from or writing to the file system.

     [EOVERFLOW]
          cmd is F_GETLK and the process ID of the process holding the requested lock is too large to be stored in the l_pid field.

     [ETIMEDOUT]
          The object of the fcntl is located on a remote system which is not available [see intro(2)].

SEE ALSO
     lockd(1M), close(2), creat(2), dup(2), exec(2), fork(2), getdtablesize(2), intro(2), open(2), pipe(2), fcntl(5).

DIAGNOSTICS
     Upon successful completion, the value returned depends on cmd as follows:

          F_DUPFD         A new file descriptor.
          F_GETFD         Value of flag (only the low-order bit is defined).
          F_SETFD         Value other than -1.
          F_GETFL         Value of file flags.
          F_SETFL         Value other than -1.
          F_FREESP        Value of 0.
          F_ALLOCSP       Value of 0.
          F_FREESP64      Value of 0.
          F_ALLOCSP64     Value of 0.
          F_DIOINFO       Value of 0.
          F_GETOWN        pid of socket owner.
          F_SETOWN        Value other than -1.
          F_FSGETXATTR    Value of 0.
          F_FSSETXATTR    Value of 0.
          F_GETBMAP       Value of 0.
          F_RESVSP        Value of 0.
          F_RESVSP64      Value of 0.
          F_UNRESVSP      Value of 0.
          F_UNRESVSP64    Value of 0.
          F_GETLK         Value other than -1.
          F_SETLK         Value other than -1.
          F_SETLKW        Value other than -1.
          F_GETLK64       Value other than -1.
          F_SETLK64       Value other than -1.
          F_SETLKW64      Value other than -1.

     Otherwise, a value of -1 is returned and errno is set to indicate the error.
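
     As an illustration of the return values listed above and of the F_GETFD/F_SETFD, F_GETFL/F_SETFL, and F_DUPFD commands from the DESCRIPTION, the following is a minimal sketch. The helper names set_cloexec, set_nonblocking, and dup_at_least are hypothetical, and the example uses the O_NONBLOCK spelling of FNONBLK.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the close-on-exec flag on fd. The current
 * descriptor flags are fetched first so any other flag bits are kept.
 * Returns 0 on success, -1 on error.
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: turn O_NONBLOCK on or off. F_SETFL takes the
 * complete flag word, so the existing flags are read with F_GETFL and
 * only the one bit is changed. Returns 0 on success, -1 on error.
 */
int set_nonblocking(int fd, int on)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: duplicate fd onto the lowest free descriptor that
 * is greater than or equal to minfd. Returns the new descriptor or -1.
 */
int dup_at_least(int fd, int minfd)
{
    return fcntl(fd, F_DUPFD, minfd);
}
```

     Because status flags are shared between duplicated descriptors while FD_CLOEXEC is per-descriptor, a copy made with dup_at_least sees a later set_nonblocking on the original but not a later set_cloexec.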