SPROC(2)                                                              SPROC(2)

NAME
     sproc, sprocsp, nsproc - create a new share group process

C SYNOPSIS
     #include <sys/types.h>
     #include <sys/prctl.h>

     pid_t sproc (void (*entry) (void *), unsigned inh, ...);

     Type of optional third argument:

     void *arg;

     pid_t sprocsp (void (*entry) (void *, size_t), unsigned inh, void *arg, caddr_t sp, size_t len);

DESCRIPTION
     The sproc and sprocsp system calls are a variant of the standard fork(2) call. Like fork, the sproc calls create a new process that is a clone of the calling process. The difference is that after an sproc call, the new child process shares the virtual address space of the parent process (assuming that this sharing option is selected, as described below), rather than simply being a copy of the parent. The parent and the child each have their own program counter value and stack pointer, but all the text and data space is visible to both processes. This provides one of the basic mechanisms upon which parallel programs can be built.

     The system call nsproc is no longer supported as an external interface; any calls to it should be replaced with sprocsp.

     A group of processes created by sproc calls from a common ancestor is referred to as a share group or shared process group. A share group is initially formed when a process first executes an sproc or sprocsp call. All subsequent sproc calls by either the parent or other children in this share group will add another process to the share group. In addition to virtual address space, members of a share group can share other attributes such as file tables, current working directories, effective userids and others described below.

     The three calls differ in just two ways - how the stack for the new process is initialized and in the interpretation of the inh argument.

     If the argument sp is set to NULL then the system will create a stack region for the child. This stack region will not overlap with any other area of the share group's address space.
     These stack regions grow downward, and are automatically grown if the process accesses new areas of the stack. The len argument specifies how much margin (in bytes) the system should attempt to leave for the child's stack. This margin is used when the system attempts to place additional stacks or other virtual spaces (e.g. from mmap). The system will attempt to leave enough room such that the stack could grow to len bytes if it needs to. This margin in no way sets a limit on stack growth nor guarantees a particular stack size. The process can continue to grow its stack up to the maximum permissible size (specified via the resource limit RLIMIT_STACK) as long as it doesn't run into any other virtual space of the share group. Conversely, if the share group's virtual space gets crowded, parts of the stack that haven't yet been claimed could be used for additional stacks or other requested virtual spaces. A minimum of 16K for len is recommended. Note that there are no 'red' zones - a process growing its stack could easily start accessing the stack of another process in the share group.

     If len is set to be smaller than the stack size required by the sproc at creation time, an error message indicating that there is "not enough memory to lock stack" may be reported to the system log. This indicates that the system attempted to place the sproc's stack using the len value supplied in the sprocsp call, but that the initial size of the sproc's stack would overlap into other portions of the share group's virtual space. The offending sproc will be killed.

     If sp is set to a valid virtual address in the share group then the stack of the new process is set to this value. With this option, stack management becomes entirely the responsibility of the calling process. The system will no longer attempt to automatically grow the process's stack region. sp should point to the top (highest address) of the new stack. It will automatically be rounded down to provide the appropriate alignment.
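
When the caller supplies its own stack, the value passed as sp can be derived from the base address and length of the allocated block. The following is a minimal sketch and not part of the system: stack_top is a hypothetical helper, and the 16-byte alignment mentioned in the usage note is an assumption rather than a documented value. Because the system rounds sp down itself, the explicit rounding here only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an IRIX routine): given the lowest address
 * of a caller-allocated block and its length in bytes, return the
 * highest address of the block rounded down to a multiple of `align`.
 * `align` must be a non-zero power of two.
 */
void *stack_top(void *base, size_t len, size_t align)
{
    uintptr_t top;

    assert(align != 0 && (align & (align - 1)) == 0);

    /* Stacks grow downward, so the usable "top" is base + len. */
    top = (uintptr_t)base + len;

    /* Clear the low bits to round down to the requested alignment. */
    return (void *)(top & ~(uintptr_t)(align - 1));
}
```

For a block obtained with malloc(len), stack_top(block, len, 16) yields a candidate sp argument for sprocsp. The block has to remain allocated for as long as the child runs, since the system neither grows nor manages a caller-supplied stack.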
     No validity checks are made on sp.

     sproc is equivalent to calling sprocsp with the sp argument set to NULL and the len argument set to the rlim_cur value of the resource limit RLIMIT_STACK. This means that each time a process calls sproc, the total size of each member of the share group increases by the size of the new process's stack. Calling sproc or sprocsp too often when the stack size is set very large can easily cause the share group to grow larger than the per-process maximum allowable size {PROCSIZE_MAX} [see intro(2)]. In this case, the call will fail with errno set to ENOMEM. A process with lots of distinct virtual spaces (e.g. lots of files mapped via mmap(2)) can fragment the calling process's address space such that it is impossible to find a suitable place for the new child's stack. This case will also cause sproc or sprocsp to fail.

     The new child process resulting from sproc(2) differs from a normally forked process in the following ways:

          If the PR_SADDR bit is set in inh then the new process will share ALL the virtual space of the parent, except the PRDA (see below). During a normal fork(2) or if the PR_SADDR bit is not set, the writable portions of the process's address space are marked copy-on-write. If either process writes into a given page, then a copy is made of the page and given to the writing process. Thus writes by one process will not be visible to the other. With the PR_SADDR option of sproc(2), however, all the processes have read/write privileges to the entire virtual space. The new process can reference the parent's stack.

          The new process has its own process data area (PRDA) which contains, among other things, the process id. Part of the PRDA is used by the system, part by system libraries, and part is available to the application program [see <sys/prctl.h>]. The PRDA is at a fixed virtual address in each process which is given by the constant PRDA defined in prctl.h.
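
The copy-on-write behaviour described above for fork(2) can be observed with portable calls. The sketch below does not use sproc at all: it contrasts an ordinary variable, which becomes private to each process after fork, with a MAP_SHARED mapping of /dev/zero, which stands in for the kind of visibility that PR_SADDR gives to the whole address space. The function name fork_write_visibility is hypothetical, and the sketch assumes a system on which /dev/zero can be mapped shared (Linux, for example).

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demonstration routine. The child writes 42 to an
 * ordinary variable and to a shared mapping; the parent then reports
 * what it sees in each. Returns 0 on success, -1 on failure.
 */
int fork_write_visibility(int *private_seen, int *shared_seen)
{
    int private_val = 1;          /* copy-on-write after fork */
    int *shared_val;              /* lives in a MAP_SHARED page */
    int status;
    pid_t pid;
    int fd;

    fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return -1;
    shared_val = mmap(NULL, sizeof *shared_val, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd);
    if (shared_val == MAP_FAILED)
        return -1;
    *shared_val = 1;

    pid = fork();
    if (pid < 0) {
        munmap(shared_val, sizeof *shared_val);
        return -1;
    }
    if (pid == 0) {
        private_val = 42;         /* lands in the child's private copy */
        *shared_val = 42;         /* lands in the page both processes map */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        munmap(shared_val, sizeof *shared_val);
        return -1;
    }

    *private_seen = private_val;  /* parent's copy is untouched */
    *shared_seen = *shared_val;   /* child's write is visible */
    munmap(shared_val, sizeof *shared_val);
    return 0;
}
```

After a successful call the parent finds its ordinary variable unchanged while the shared mapping holds the child's value. With PR_SADDR, every writable page of the share group behaves like the shared mapping, which is why unsynchronized writes to ordinary globals become visible to all members.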
          The machine state (general/floating point registers) is not duplicated with the exception of the floating point control register. This means that if a process has enabled floating point traps, these will be enabled in the child process.

     If created via sproc the new process will be invoked as follows:

          entry(void *arg)

     If created via sprocsp the new process will be invoked as follows:

          entry(void *arg, size_t stksize)

     where stksize is the len argument the parent passed to sprocsp.

     In addition to the attributes inherited during the sproc call itself, the inh flag to sproc can request that future changes made by any member of the share group also be applied to the new process. A process can only request that a child process share attributes that it itself is sharing. The creator of a share group is effectively sharing everything. These persisting attributes are selectable via the inh flag:

     PR_SADDR    All virtual space attributes (shared memory, mapped files, data space) are shared. If one process in a share group attaches to a shared memory segment, all processes in the group can access that segment.
     PR_SFDS     The open file table is kept synchronized. If one member of the share group opens a file, the open file descriptor will appear in the file tables of all members of the share group. Note especially that the converse is also true: if one member closes a file, it is closed for all members of the group; this has been known to surprise applications programmers! Note also that there is only one file pointer for each file descriptor shared within a shared process group.
     PR_SDIR     The current and root directories are kept synchronized. If one member of the group issues a chdir(2) or chroot(2) call, the current working directory or root directory will be changed for all members of the share group.
     PR_SUMASK   The file creation mask, umask, is kept synchronized.
     PR_SULIMIT  The limit on maximum file size is kept synchronized.
     PR_SID      The real and effective user and group ids are kept synchronized.

     To take advantage of sharing all possible attributes, the constant PR_SALL may be used.

     In addition to specifying shared attributes, the inh flag can be used to pass flags that govern certain operations within the sproc call itself. Currently two flags are supported:

     PR_BLOCK    causes the calling process to be blocked [see blockproc(2)] before returning from a successful call. This can be used to allow the child process access to the parent's stack without the possibility of collision.
     PR_NOLIBC   causes the child to not join the C library (libc) arena (see below). If all sproc calls that a process makes specify this flag then the C library arena will never be created. The creation of the C library arena includes the initialization of the per-thread system error value errno.

     No scheduling synchronization is implied between shared processes: they are free to run on any processor in any sequence. Any required synchronization must be provided by the application using locks and semaphores [see usinit(3P)] or other mechanisms.

     If one member of a share group exits or otherwise dies, its stack is removed from the virtual space of the share group. If the process which first created the share group exits, its stack is not removed. This ensures continued access by other share group members to the environment and starting argument vectors. In addition, if the PR_SETEXITSIG option [see prctl(2)] has been enabled then all remaining members of the share group will be signaled.

     By default, standard C library routines such as printf and malloc function properly even though two or more shared processes access them simultaneously. To accomplish this, a special arena is set up [see usinit(3P)] to hold the locks and semaphores required. Unless the PR_NOLIBC flag is present, the parent will initialize and each child will join the C library arena.
     Arenas have a configurable maximum number of processes that can join, which is set when the arena is first created. This maximum (default 8) can be configured using usconfig(3P). Each process in the share group needs access to this arena and requires a single file lock [see fcntl(2)]. This may require more file locks to be configured into the system than the default system configuration provides.

     Programs using share groups that invoke system services (either system calls or libc routines) should be compiled with the feature test macro _SGI_MP_SOURCE set in any file containing functions that share group members might access (see CAVEATS section below). Currently, this is only required for correct treatment of the system error value errno (see discussion below) but in the future may be required for the correct functioning of other services.

     sproc will fail and no new process will be created if one or more of the following are true:

     [ENOMEM]    If there is not enough virtual space to allocate a new stack. The default stack size is settable via prctl(2) or setrlimit(2).
     [EAGAIN]    The system-imposed limit on the total number of processes under execution, {NPROC} [see intro(2)], would be exceeded.
     [EAGAIN]    The system-imposed limit on the total number of processes under execution by a single user, {CHILD_MAX} [see intro(2)], would be exceeded.
     [EAGAIN]    Amount of system memory required is temporarily unavailable.
     [EINVAL]    sp was null and len was less than 8192.
     [EPERM]     The system call is not permitted from a pthreaded program (see CAVEATS section below).

     When called with the PR_NOLIBC flag not set, in addition to the above errors sproc will fail and no new process will be created if one or more of the following are true:

     [ENOSPC]    If the size of the share group exceeds the number of users specified via usconfig(3P) (8 by default). Any changes via usconfig(3P) must be done BEFORE the first sproc is performed.
     [ENOLCK]    There are not enough file locks in the system.
     [EACCES]    The shared arena file (located in /usr/tmp) used in conjunction with the C library could not be opened or created for read/write.

     New process pid # could not join I/O arena:<..>
                 if the new share group member could not properly join the C library arena. The new process exits with a -1.

     See also the possible errors from usinit(3P).

NOTES
     IrisGL processes that share virtual address space will share access to the graphics hardware and associated data structures. IrisGL calls made by such processes must be single threaded to avoid simultaneous access to these resources. Furthermore, gflush(3G) must be called prior to leaving the critical section represented by the set of graphics calls.

     This manual entry has described ways in which processes created by sproc differ from those created by fork. Attributes and behavior not mentioned as different should be assumed to work the same way for sproc processes as for processes created by fork. Here are some respects in which the two types of processes are the same:

          The parent and child after an sproc each have a unique process id (pid), but are in the same process group.

          A signal sent to a specific pid in a share group [see kill(2)] will be received by only the process to which it was sent. Other members of the share group will not be affected. A signal sent to an entire process group will be received by all the members of the process group, regardless of share group affiliations [see killpg(3B)]. See prctl(2) for ways to alter this behavior.

          If the child process resulting from an sproc dies or calls exit(2), the parent process receives the SIGCLD signal [see sigset(2), sigaction(2), and sigvec(3B)].

CAVEATS
     Removing virtual space (e.g. unmapping a file) is an expensive operation and forces all processes in the share group to single thread their memory management operations for the duration of the unmap system call.
     The reason for this is that the system must ensure that no other processes in the share group can reference the virtual space that is being removed or the underlying physical pages during or after the removal. To accomplish this, the system memory management code does the following:

     1) Takes a lock on the share group that prevents any other process in the group from doing any memory management operations (page faults, protection faults, second level TLB misses, mmap(2), munmap(2), sbrk(2)).
     2) Sends TLB shootdown interrupts to all other cpus in the system that cause them to remove any entries from the processor's Translation Lookaside Buffer (TLB) for the share group for the address range being deleted.
     3) Removes the virtual mapping from the share group's memory management data structures and frees any underlying physical pages.
     4) Releases the lock to allow parallel operations to continue.

     pixie(1) and prof(1) do not work on processes that call sproc and do not share address space (i.e. PR_SADDR is not set).

     Note that the global variable errno is normally a single location shared by all processes in a share group in which address space is a shared attribute. This means that if multiple processes in the group make system calls or call other library functions which set errno, the value of errno is no longer useful, since it may be overwritten at any time by a call in another process in the share group. To give each thread its own private version of errno, programs should be compiled with the feature test macro _SGI_MP_SOURCE defined before including the header file errno.h. Note however that some system supplied libraries have not been converted to set the per-thread error value - they will only set the global error value. This will be corrected in future releases. This means an application compiled with _SGI_MP_SOURCE and directly referencing errno will reference the per-thread error value and not get the global error value that a non-converted library might have set.
     There are two workarounds to this problem: 1) define the feature test macro _SGI_MP_SOURCE only in files that test errno as the result of an error from a function defined in libc, libw, libm, libadm, libgen, or libmalloc; or 2) for accesses of errno in response to errors from functions not in one of the above mentioned libraries, call goserror(3C) (which always returns the global error value).

     perror(3C) always reads the 'appropriate' error value so for a threaded application it will read the per-thread value. This means that threaded programs that call errno setting functions in non-converted libraries and attempt to have perror print out the error will not get the correct error value. In this case strerror(goserror()) should be used instead.

     rld(1) does not support execution of sproc during shared object initialization, such as that described under the -init flag to ld(1). In particular, C++ users must take care that their code does not contain global objects which have constructors which call sproc(2). The results of doing so are non-deterministic and unpredictable.

     The sproc model of threading is incompatible with POSIX threads. Attempts to create an sproc process from a pthreaded program will be rejected [see pthreads(5)].

SEE ALSO
     blockproc(2), fcntl(2), fork(2), intro(2), prctl(2), setrlimit(2), goserror(3C), oserror(3C), pcreate(3C), pthreads(5), usconfig(3P), usinit(3P), rld(1), ld(1).

DIAGNOSTICS
     Upon successful completion, sproc returns the process id of the new process. Otherwise, a value of -1 is returned to the calling process, and errno is set to indicate the error.
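
The error conditions listed in this entry can be folded into a single lookup for reporting purposes. The sketch below is illustrative only: sproc_failure_hint is a hypothetical helper, not a system routine, and the message strings paraphrase this manual entry. It assumes the errno constants involved have distinct values, as they do on common systems.

```c
#include <errno.h>

/*
 * Hypothetical helper: map the errno value left by a failed sproc or
 * sprocsp call to a short hint based on the conditions documented in
 * this entry. ENOSPC, ENOLCK and EACCES apply only when PR_NOLIBC is
 * not set.
 */
const char *sproc_failure_hint(int err)
{
    switch (err) {
    case ENOMEM:
        return "not enough virtual space for the new stack";
    case EAGAIN:
        return "process limit reached or memory temporarily unavailable";
    case EINVAL:
        return "sp was NULL and len was less than 8192";
    case EPERM:
        return "sproc is not permitted from a pthreaded program";
    case ENOSPC:
        return "share group exceeds the arena user limit set via usconfig(3P)";
    case ENOLCK:
        return "not enough file locks in the system";
    case EACCES:
        return "shared arena file could not be opened or created";
    default:
        return "unexpected error; see usinit(3P)";
    }
}
```

A caller that receives -1 from sproc could pass errno to this routine and print the result alongside the message from strerror(3C). In a program built with _SGI_MP_SOURCE the value read from errno is the per-thread one, which is the value sproc itself sets.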