summaryrefslogtreecommitdiff
path: root/share/doc/psd/20.ipctut/tutor.me
diff options
context:
space:
mode:
Diffstat (limited to 'share/doc/psd/20.ipctut/tutor.me')
-rw-r--r--share/doc/psd/20.ipctut/tutor.me939
1 files changed, 939 insertions, 0 deletions
diff --git a/share/doc/psd/20.ipctut/tutor.me b/share/doc/psd/20.ipctut/tutor.me
new file mode 100644
index 00000000000..fba4583f139
--- /dev/null
+++ b/share/doc/psd/20.ipctut/tutor.me
@@ -0,0 +1,939 @@
+.\" Copyright (c) 1986, 1993
+.\" The Regents of the University of California. All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\" 3. All advertising materials mentioning features or use of this software
+.\" must display the following acknowledgement:
+.\" This product includes software developed by the University of
+.\" California, Berkeley and its contributors.
+.\" 4. Neither the name of the University nor the names of its contributors
+.\" may be used to endorse or promote products derived from this software
+.\" without specific prior written permission.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" @(#)tutor.me 8.1 (Berkeley) 8/14/93
+.\"
+.oh 'Introductory 4.4BSD IPC''PSD:20-%'
+.eh 'PSD:20-%''Introductory 4.4BSD IPC'
+.rs
+.sp 2
+.sz 14
+.ft B
+.ce 2
+An Introductory 4.4BSD
+Interprocess Communication Tutorial
+.sz 10
+.sp 2
+.ce
+.i "Stuart Sechrest"
+.ft
+.sp
+.ce 4
+Computer Science Research Group
+Computer Science Division
+Department of Electrical Engineering and Computer Science
+University of California, Berkeley
+.sp 2
+.ce
+.i ABSTRACT
+.sp
+.(c
+.pp
+Berkeley UNIX\(dg 4.4BSD offers several choices for interprocess communication.
+To aid the programmer in developing programs which are comprised of
+cooperating
+processes, the different choices are discussed and a series of example
+programs are presented. These programs
+demonstrate in a simple way the use of pipes, socketpairs, sockets
+and the use of datagram and stream communication. The intent of this
+document is to present a few simple example programs, not to describe the
+networking system in full.
+.)c
+.sp 2
+.(f
+\(dg\|UNIX is a trademark of AT&T Bell Laboratories.
+.)f
+.b
+.sh 1 "Goals"
+.r
+.pp
+Facilities for interprocess communication (IPC) and networking
+were a major addition to UNIX in the Berkeley UNIX 4.2BSD release.
+These facilities required major additions and some changes
+to the system interface.
+The basic idea of this interface is to make IPC similar to file I/O.
+In UNIX a process has a set of I/O descriptors, from which one reads
+and to which one writes.
+Descriptors may refer to normal files, to devices (including terminals),
+or to communication channels.
+The use of a descriptor has three phases: its creation,
+its use for reading and writing, and its destruction. By using descriptors
+to write files, rather than simply naming the target file in the write
+call, one gains a surprising amount of flexibility. Often, the program that
+creates a descriptor will be different from the program that uses the
+descriptor. For example the shell can create a descriptor for the output
+of the `ls'
+command that will cause the listing to appear in a file rather than
+on a terminal.
+Pipes are another form of descriptor that have been used in UNIX
+for some time.
+Pipes allow one-way data transmission from one process
+to another; the two processes and the pipe must be set up by a common
+ancestor.
+.pp
+The use of descriptors is not the only communication interface
+provided by UNIX.
+The signal mechanism sends a tiny amount of information from one
+process to another.
+The signaled process receives only the signal type,
+not the identity of the sender,
+and the number of possible signals is small.
+The signal semantics limit the flexibility of the signaling mechanism
+as a means of interprocess communication.
+.pp
+The identification of IPC with I/O is quite longstanding in UNIX and
+has proved quite successful. At first, however, IPC was limited to
+processes communicating within a single machine. With Berkeley UNIX
+4.2BSD this expanded to include IPC between machines. This expansion
+has necessitated some change in the way that descriptors are created.
+Additionally, new possibilities for the meaning of read and write have
+been admitted. Originally the meanings, or semantics, of these terms
+were fairly simple. When you wrote something it was delivered. When
+you read something, you were blocked until the data arrived.
+Other possibilities exist,
+however. One can write without full assurance of delivery if one can
+check later to catch occasional failures. Messages can be kept as
+discrete units or merged into a stream.
+One can ask to read, but insist on not waiting if nothing is immediately
+available. These new possibilities are allowed in the Berkeley UNIX IPC
+interface.
+.pp
+Thus Berkeley UNIX 4.4BSD offers several choices for IPC.
+This paper presents simple examples that illustrate some of
+the choices.
+The reader is presumed to be familiar with the C programming language
+[Kernighan & Ritchie 1978],
+but not necessarily with the system calls of the UNIX system or with
+processes and interprocess communication.
+The paper reviews the notion of a process and the types of
+communication that are supported by Berkeley UNIX 4.4BSD.
+A series of examples are presented that create processes that communicate
+with one another. The programs show different ways of establishing
+channels of communication.
+Finally, the calls that actually transfer data are reviewed.
+To clearly present how communication can take place,
+the example programs have been cleared of anything that
+might be construed as useful work.
+They can, therefore, serve as models
+for the programmer trying to construct programs which are comprised of
+cooperating processes.
+.b
+.sh 1 "Processes"
+.pp
+A \fIprogram\fP is both a sequence of statements and a rough way of referring
+to the computation that occurs when the compiled statements are run.
+A \fIprocess\fP can be thought of as a single line of control in a program.
+Most programs execute some statements, go through a few loops, branch in
+various directions and then end. These are single process programs.
+Programs can also have a point where control splits into two independent lines,
+an action called \fIforking.\fP
+In UNIX these lines can never join again. A call to the system routine
+\fIfork()\fP, causes a process to split in this way.
+The result of this call is that two independent processes will be
+running, executing exactly the same code.
+Memory values will be the same for all values set before the fork, but,
+subsequently, each version will be able to change only the
+value of its own copy of each variable.
+Initially, the only difference between the two will be the value returned by
+\fIfork().\fP The parent will receive a process id for the child,
+the child will receive a zero.
+Calls to \fIfork(),\fP
+therefore, typically precede, or are included in, an if-statement.
+.pp
+A process views the rest of the system through a private table of descriptors.
+The descriptors can represent open files or sockets (sockets are communication
+objects that will be discussed below). Descriptors are referred to
+by their index numbers in the table. The first three descriptors are often
+known by special names, \fI stdin, stdout\fP and \fIstderr\fP.
+These are the standard input, output and error.
+When a process forks, its descriptor table is copied to the child.
+Thus, if the parent's standard input is being taken from a terminal
+(devices are also treated as files in UNIX), the child's input will
+be taken from the
+same terminal. Whoever reads first will get the input. If, before forking,
+the parent changes its standard input so that it is reading from a
+new file, the child will take its input from the new file. It is
+also possible to take input from a socket, rather than from a file.
+.b
+.sh 1 "Pipes"
+.r
+.pp
+Most users of UNIX know that they can pipe the output of a
+program ``prog1'' to the input of another, ``prog2,'' by typing the command
+\fI``prog1 | prog2.''\fP
+This is called ``piping'' the output of one program
+to another because the mechanism used to transfer the output is called a
+pipe.
+When the user types a command, the command is read by the shell, which
+decides how to execute it. If the command is simple, for example,
+.i "``prog1,''"
+the shell forks a process, which executes the program, prog1, and then dies.
+The shell waits for this termination and then prompts for the next
+command.
+If the command is a compound command,
+.i "``prog1 | prog2,''"
+the shell creates two processes connected by a pipe. One process
+runs the program, prog1, the other runs prog2. The pipe is an I/O
+mechanism with two ends, or sockets. Data that is written into one socket
+can be read from the other.
+.(z
+.ft CW
+.so pipe.c
+.ft
+.ce 1
+Figure 1\ \ Use of a pipe
+.)z
+.pp
+Since a program specifies its input and output only by the descriptor table
+indices, which appear as variables or constants,
+the input source and output destination can be changed without
+changing the text of the program.
+It is in this way that the shell is able to set up pipes. Before executing
+prog1, the process can close whatever is at \fIstdout\fP
+and replace it with one
+end of a pipe. Similarly, the process that will execute prog2 can substitute
+the opposite end of the pipe for
+\fIstdin.\fP
+.pp
+Let us now examine a program that creates a pipe for communication between
+its child and itself (Figure 1).
+A pipe is created by a parent process, which then forks.
+When a process forks, the parent's descriptor table is copied into
+the child's.
+.pp
+In Figure 1, the parent process makes a call to the system routine
+\fIpipe().\fP
+This routine creates a pipe and places descriptors for the sockets
+for the two ends of the pipe in the process's descriptor table.
+\fIPipe()\fP
+is passed an array into which it places the index numbers of the
+sockets it created.
+The two ends are not equivalent. The socket whose index is
+returned in the low word of the array is opened for reading only,
+while the socket in the high end is opened only for writing.
+This corresponds to the fact that the standard input is the first
+descriptor of a process's descriptor table and the standard output
+is the second. After creating the pipe, the parent creates the child
+with which it will share the pipe by calling \fIfork().\fP
+Figure 2 illustrates the effect of a fork.
+The parent process's descriptor table points to both ends of the pipe.
+After the fork, both parent's and child's descriptor tables point to
+the pipe.
+The child can then use the pipe to send a message to the parent.
+.(z
+.so fig2.pic
+.ce 2
+Figure 2\ \ Sharing a pipe between parent and child
+.ce 0
+.)z
+.pp
+Just what is a pipe?
+It is a one-way communication mechanism, with one end opened
+for reading and the other end for writing.
+Therefore, parent and child need to agree on which way to turn
+the pipe, from parent to child or the other way around.
+Using the same pipe for communication both from parent to child and
+from child to parent would be possible (since both processes have
+references to both ends), but very complicated.
+If the parent and child are to have a two-way conversation,
+the parent creates two pipes, one for use in each direction.
+(In accordance with their plans, both parent and child in the example above
+close the socket that they will not use. It is not required that unused
+descriptors be closed, but it is good practice.)
+A pipe is also a \fIstream\fP communication mechanism; that
+is, all messages sent through the pipe are placed in order
+and reliably delivered. When the reader asks for a certain
+number of bytes from this
+stream, he is given as many bytes as are available, up
+to the amount of the request. Note that these bytes may have come from
+the same call to \fIwrite()\fR or from several calls to \fIwrite()\fR
+which were concatenated.
+.b
+.sh 1 "Socketpairs"
+.r
+.pp
+Berkeley UNIX 4.4BSD provides a slight generalization of pipes. A pipe is a
+pair of connected sockets for one-way stream communication. One may
+obtain a pair of connected sockets for two-way stream communication
+by calling the routine \fIsocketpair().\fP
+The program in Figure 3 calls \fIsocketpair()\fP
+to create such a connection. The program uses the link for
+communication in both directions. Since socketpairs are
+an extension of pipes, their use resembles that of pipes.
+Figure 4 illustrates the result of a fork following a call to
+\fIsocketpair().\fP
+.pp
+\fISocketpair()\fP
+takes as
+arguments a specification of a domain, a style of communication, and a
+protocol.
+These are the parameters shown in the example.
+Domains and protocols will be discussed in the next section.
+Briefly,
+a domain is a space of names that may be bound
+to sockets and implies certain other conventions.
+Currently, socketpairs have only been implemented for one
+domain, called the UNIX domain.
+The UNIX domain uses UNIX path names for naming sockets.
+It only allows communication
+between sockets on the same machine.
+.pp
+Note that the header files
+.i "<sys/socket.h>"
+and
+.i "<sys/types.h>."
+are required in this program.
+The constants AF_UNIX and SOCK_STREAM are defined in
+.i "<sys/socket.h>,"
+which in turn requires the file
+.i "<sys/types.h>"
+for some of its definitions.
+.(z
+.ft CW
+.so socketpair.c
+.ft
+.ce 1
+Figure 3\ \ Use of a socketpair
+.)z
+.(z
+.so fig3.pic
+.ce 1
+Figure 4\ \ Sharing a socketpair between parent and child
+.)z
+.b
+.sh 1 "Domains and Protocols"
+.r
+.pp
+Pipes and socketpairs are a simple solution for communicating between
+a parent and child or between child processes.
+What if we wanted to have processes that have no common ancestor
+with whom to set up communication?
+Neither standard UNIX pipes nor socketpairs are
+the answer here, since both mechanisms require a common ancestor to
+set up the communication.
+We would like to have two processes separately create sockets
+and then have messages sent between them. This is often the
+case when providing or using a service in the system. This is
+also the case when the communicating processes are on separate machines.
+In Berkeley UNIX 4.4BSD one can create individual sockets, give them names and
+send messages between them.
+.pp
+Sockets created by different programs use names to refer to one another;
+names generally must be translated into addresses for use.
+The space from which an address is drawn is referred to as a
+.i domain.
+There are several domains for sockets.
+Two that will be used in the examples here are the UNIX domain (or AF_UNIX,
+for Address Format UNIX) and the Internet domain (or AF_INET).
+UNIX domain IPC is an experimental facility in 4.2BSD and 4.3BSD.
+In the UNIX domain, a socket is given a path name within the file system
+name space.
+A file system node is created for the socket and other processes may
+then refer to the socket by giving the proper pathname.
+UNIX domain names, therefore, allow communication between any two processes
+that work in the same file system.
+The Internet domain is the UNIX implementation of the DARPA Internet
+standard protocols IP/TCP/UDP.
+Addresses in the Internet domain consist of a machine network address
+and an identifying number, called a port.
+Internet domain names allow communication between machines.
+.pp
+Communication follows some particular ``style.''
+Currently, communication is either through a \fIstream\fP
+or by \fIdatagram.\fP
+Stream communication implies several things. Communication takes
+place across a connection between two sockets. The communication
+is reliable, error-free, and, as in pipes, no message boundaries are
+kept. Reading from a stream may result in reading the data sent from
+one or several calls to \fIwrite()\fP
+or only part of the data from a single call, if there is not enough room
+for the entire message, or if not all the data from a large message
+has been transferred.
+The protocol implementing such a style will retransmit messages
+received with errors. It will also return error messages if one tries to
+send a message after the connection has been broken.
+Datagram communication does not use connections. Each message is
+addressed individually. If the address is correct, it will generally
+be received, although this is not guaranteed. Often datagrams are
+used for requests that require a response from the
+recipient. If no response
+arrives in a reasonable amount of time, the request is repeated.
+The individual datagrams will be kept separate when they are read, that
+is, message boundaries are preserved.
+.pp
+The difference in performance between the two styles of communication is
+generally less important than the difference in semantics. The
+performance gain that one might find in using datagrams must be weighed
+against the increased complexity of the program, which must now concern
+itself with lost or out of order messages. If lost messages may simply be
+ignored, the quantity of traffic may be a consideration. The expense
+of setting up a connection is best justified by frequent use of the connection.
+Since the performance of a protocol changes as it is tuned for different
+situations, it is best to seek the most up-to-date information when
+making choices for a program in which performance is crucial.
+.pp
+A protocol is a set of rules, data formats and conventions that regulate the
+transfer of data between participants in the communication.
+In general, there is one protocol for each socket type (stream,
+datagram, etc.) within each domain.
+The code that implements a protocol
+keeps track of the names that are bound to sockets,
+sets up connections and transfers data between sockets,
+perhaps sending the data across a network.
+This code also keeps track of the names that are bound to sockets.
+It is possible for several protocols, differing only in low level
+details, to implement the same style of communication within
+a particular domain. Although it is possible to select
+which protocol should be used, for nearly all uses it is sufficient to
+request the default protocol. This has been done in all of the example
+programs.
+.pp
+One specifies the domain, style and protocol of a socket when
+it is created. For example, in Figure 5a the call to \fIsocket()\fP
+causes the creation of a datagram socket with the default protocol
+in the UNIX domain.
+.b
+.sh 1 "Datagrams in the UNIX Domain"
+.r
+.(z
+.ft CW
+.so udgramread.c
+.ft
+.ce 1
+Figure 5a\ \ Reading UNIX domain datagrams
+.)z
+.pp
+Let us now look at two programs that create sockets separately.
+The programs in Figures 5a and 5b use datagram communication
+rather than a stream.
+The structure used to name UNIX domain sockets is defined
+in the file \fI<sys/un.h>.\fP
+The definition has also been included in the example for clarity.
+.pp
+Each program creates a socket with a call to \fIsocket().\fP
+These sockets are in the UNIX domain.
+Once a name has been decided upon it is attached to a socket by the
+system call \fIbind().\fP
+The program in Figure 5a uses the name ``socket'',
+which it binds to its socket.
+This name will appear in the working directory of the program.
+The routines in Figure 5b use its
+socket only for sending messages. It does not create a name for
+the socket because no other process has to refer to it.
+.(z
+.ft CW
+.so udgramsend.c
+.ft
+.ce 1
+Figure 5b\ \ Sending a UNIX domain datagrams
+.)z
+.pp
+Names in the UNIX domain are path names. Like file path names they may
+be either absolute (e.g. ``/dev/imaginary'') or relative (e.g. ``socket'').
+Because these names are used to allow processes to rendezvous, relative
+path names can pose difficulties and should be used with care.
+When a name is bound into the name space, a file (inode) is allocated in the
+file system. If
+the inode is not deallocated, the name will continue to exist even after
+the bound socket is closed. This can cause subsequent runs of a program
+to find that a name is unavailable, and can cause
+directories to fill up with these
+objects. The names are removed by calling \fIunlink()\fP or using
+the \fIrm\fP\|(1) command.
+Names in the UNIX domain are only used for rendezvous. They are not used
+for message delivery once a connection is established. Therefore, in
+contrast with the Internet domain, unbound sockets need not be (and are
+not) automatically given addresses when they are connected.
+.pp
+There is no established means of communicating names to interested
+parties. In the example, the program in Figure 5b gets the
+name of the socket to which it will send its message through its
+command line arguments. Once a line of communication has been created,
+one can send the names of additional, perhaps new, sockets over the link.
+Facilities will have to be built that will make the distribution of
+names less of a problem than it now is.
+.b
+.sh 1 "Datagrams in the Internet Domain"
+.r
+.(z
+.ft CW
+.so dgramread.c
+.ft
+.ce 1
+Figure 6a\ \ Reading Internet domain datagrams
+.)z
+.pp
+The examples in Figure 6a and 6b are very close to the previous example
+except that the socket is in the Internet domain.
+The structure of Internet domain addresses is defined in the file
+\fI<netinet/in.h>\fP.
+Internet addresses specify a host address (a 32-bit number)
+and a delivery slot, or port, on that
+machine. These ports are managed by the system routines that implement
+a particular protocol.
+Unlike UNIX domain names, Internet socket names are not entered into
+the file system and, therefore,
+they do not have to be unlinked after the socket has been closed.
+When a message must be sent between machines it is sent to
+the protocol routine on the destination machine, which interprets the
+address to determine to which socket the message should be delivered.
+Several different protocols may be active on
+the same machine, but, in general, they will not communicate with one another.
+As a result, different protocols are allowed to use the same port numbers.
+Thus, implicitly, an Internet address is a triple including a protocol as
+well as the port and machine address.
+An \fIassociation\fP is a temporary or permanent specification
+of a pair of communicating sockets.
+An association is thus identified by the tuple
+<\fIprotocol, local machine address, local port,
+remote machine address, remote port\fP>.
+An association may be transient when using datagram sockets;
+the association actually exists during a \fIsend\fP operation.
+.(z
+.ft CW
+.so dgramsend.c
+.ft
+.ce 1
+Figure 6b\ \ Sending an Internet domain datagram
+.)z
+.pp
+The protocol for a socket is chosen when the socket is created. The
+local machine address for a socket can be any valid network address of the
+machine, if it has more than one, or it can be the wildcard value
+INADDR_ANY.
+The wildcard value is used in the program in Figure 6a.
+If a machine has several network addresses, it is likely
+that messages sent to any of the addresses should be deliverable to
+a socket. This will be the case if the wildcard value has been chosen.
+Note that even if the wildcard value is chosen, a program sending messages
+to the named socket must specify a valid network address. One can be willing
+to receive from ``anywhere,'' but one cannot send a message ``anywhere.''
+The program in Figure 6b is given the destination host name as a command
+line argument.
+To determine a network address to which it can send the message, it looks
+up
+the host address by the call to \fIgethostbyname()\fP.
+The returned structure includes the host's network address,
+which is copied into the structure specifying the
+destination of the message.
+.pp
+The port number can be thought of as the number of a mailbox, into
+which the protocol places one's messages. Certain daemons, offering
+certain advertised services, have reserved
+or ``well-known'' port numbers. These fall in the range
+from 1 to 1023. Higher numbers are available to general users.
+Only servers need to ask for a particular number.
+The system will assign an unused port number when an address
+is bound to a socket.
+This may happen when an explicit \fIbind\fP
+call is made with a port number of 0, or
+when a \fIconnect\fP or \fIsend\fP
+is performed on an unbound socket.
+Note that port numbers are not automatically reported back to the user.
+After calling \fIbind(),\fP asking for port 0, one may call
+\fIgetsockname()\fP to discover what port was actually assigned.
+The routine \fIgetsockname()\fP
+will not work for names in the UNIX domain.
+.pp
+The format of the socket address is specified in part by standards within the
+Internet domain. The specification includes the order of the bytes in
+the address. Because machines differ in the internal representation
+they ordinarily use
+to represent integers, printing out the port number as returned by
+\fIgetsockname()\fP may result in a misinterpretation. To
+print out the number, it is necessary to use the routine \fIntohs()\fP
+(for \fInetwork to host: short\fP) to convert the number from the
+network representation to the host's representation. On some machines,
+such as 68000-based machines, this is a null operation. On others,
+such as VAXes, this results in a swapping of bytes. Another routine
+exists to convert a short integer from the host format to the network format,
+called \fIhtons()\fP; similar routines exist for long integers.
+For further information, refer to the
+entry for \fIbyteorder\fP in section 3 of the manual.
+.b
+.sh 1 "Connections"
+.r
+.pp
+To send data between stream sockets (having communication style SOCK_STREAM),
+the sockets must be connected.
+Figures 7a and 7b show two programs that create such a connection.
+The program in 7a is relatively simple.
+To initiate a connection, this program simply creates
+a stream socket, then calls \fIconnect()\fP,
+specifying the address of the socket to which
+it wishes its socket connected. Provided that the target socket exists and
+is prepared to handle a connection, connection will be complete,
+and the program can begin to send
+messages. Messages will be delivered in order without message
+boundaries, as with pipes. The connection is destroyed when either
+socket is closed (or soon thereafter). If a process persists
+in sending messages after the connection is closed, a SIGPIPE signal
+is sent to the process by the operating system. Unless explicit action
+is taken to handle the signal (see the manual page for \fIsignal\fP
+or \fIsigvec\fP),
+the process will terminate and the shell
+will print the message ``broken pipe.''
+.(z
+.ft CW
+.so streamwrite.c
+.ft
+.ce 1
+Figure 7a\ \ Initiating an Internet domain stream connection
+.)z
+.(z
+.ft CW
+.so streamread.c
+.ft
+.ce 1
+Figure 7b\ \ Accepting an Internet domain stream connection
+.sp 2
+.ft CW
+.so strchkread.c
+.ft
+.ce 1
+Figure 7c\ \ Using select() to check for pending connections
+.)z
+.(z
+.so fig8.pic
+.sp
+.ce 1
+Figure 8\ \ Establishing a stream connection
+.)z
+.pp
+Forming a connection is asymmetrical; one process, such as the
+program in Figure 7a, requests a connection with a particular socket,
+the other process accepts connection requests.
+Before a connection can be accepted a socket must be created and an address
+bound to it. This
+situation is illustrated in the top half of Figure 8. Process 2
+has created a socket and bound a port number to it. Process 1 has created an
+unnamed socket.
+The address bound to process 2's socket is then made known to process 1 and,
+perhaps to several other potential communicants as well.
+If there are several possible communicants,
+this one socket might receive several requests for connections.
+As a result, a new socket is created for each connection. This new socket
+is the endpoint for communication within this process for this connection.
+A connection may be destroyed by closing the corresponding socket.
+.pp
+The program in Figure 7b is a rather trivial example of a server. It
+creates a socket to which it binds a name, which it then advertises.
+(In this case it prints out the socket number.) The program then calls
+\fIlisten()\fP for this socket.
+Since several clients may attempt to connect more or less
+simultaneously, a queue of pending connections is maintained in the system
+address space. \fIListen()\fP
+marks the socket as willing to accept connections and initializes the queue.
+When a connection is requested, it is listed in the queue. If the
+queue is full, an error status may be returned to the requester.
+The maximum length of this queue is specified by the second argument of
+\fIlisten()\fP; the maximum length is limited by the system.
+Once the listen call has been completed, the program enters
+an infinite loop. On each pass through the loop, a new connection is
+accepted and removed from the queue, and, hence, a new socket for the
+connection is created. The bottom half of Figure 8 shows the result of
+Process 1 connecting with the named socket of Process 2, and Process 2
+accepting the connection. After the connection is created, the
+service, in this case printing out the messages, is performed and the
+connection socket closed. The \fIaccept()\fP
+call will take a pending connection
+request from the queue if one is available, or block waiting for a request.
+Messages are read from the connection socket.
+Reads from an active connection will normally block until data is available.
+The number of bytes read is returned. When a connection is destroyed,
+the read call returns immediately. The number of bytes returned will
+be zero.
+.pp
+The program in Figure 7c is a slight variation on the server in Figure 7b.
+It avoids blocking when there are no pending connection requests by
+calling \fIselect()\fP
+to check for pending requests before calling \fIaccept().\fP
+This strategy is useful when connections may be received
+on more than one socket, or when data may arrive on other connected
+sockets before another connection request.
+.pp
+The programs in Figures 9a and 9b show a program using stream communication
+in the UNIX domain. Streams in the UNIX domain can be used for this sort
+of program in exactly the same way as Internet domain streams, except for
+the form of the names and the restriction of the connections to a single
+file system. There are some differences, however, in the functionality of
+streams in the two domains, notably in the handling of
+\fIout-of-band\fP data (discussed briefly below). These differences
+are beyond the scope of this paper.
+.(z
+.ft CW
+.so ustreamwrite.c
+.ft
+.ce 1
+Figure 9a\ \ Initiating a UNIX domain stream connection
+.sp 2
+.ft CW
+.so ustreamread.c
+.ft
+.ce 1
+Figure 9b\ \ Accepting a UNIX domain stream connection
+.)z
+.b
+.sh 1 "Reads, Writes, Recvs, etc."
+.r
+.pp
+UNIX 4.4BSD has several system calls for reading and writing information.
+The simplest calls are \fIread() \fP and \fIwrite().\fP \fIWrite()\fP
+takes as arguments the index of a descriptor, a pointer to a buffer
+containing the data and the size of the data.
+The descriptor may indicate either a file or a connected socket.
+``Connected'' can mean either a connected stream socket (as described
+in Section 8) or a datagram socket for which a \fIconnect()\fP
+call has provided a default destination (see the \fIconnect()\fP manual page).
+\fIRead()\fP also takes a descriptor that indicates either a file or a socket.
+\fIWrite()\fP requires a connected socket since no destination is
+specified in the parameters of the system call.
+\fIRead()\fP can be used for either a connected or an unconnected socket.
+These calls are, therefore, quite flexible and may be used to
+write applications that require no assumptions about the source of
+their input or the destination of their output.
+There are variations on \fIread() \fP and \fIwrite()\fP
+that allow the source and destination of the input and output to use
+several separate buffers, while retaining the flexibility to handle
+both files and sockets. These are \fIreadv()\fP and \fI writev(),\fP
+for read and write \fIvector.\fP
+.pp
+It is sometimes necessary to send high priority data over a
+connection that may have unread low priority data at the
+other end. For example, a user interface process may be interpreting
+commands and sending them on to another process through a stream connection.
+The user interface may have filled the stream with as yet unprocessed
+requests when the user types
+a command to cancel all outstanding requests.
+Rather than have the high priority data wait
+to be processed after the low priority data, it is possible to
+send it as \fIout-of-band\fP
+(OOB) data. The notification of pending OOB data results in the generation of
+a SIGURG signal, if this signal has been enabled (see the manual
+page for \fIsignal\fP or \fIsigvec\fP).
+See [Leffler 1986] for a more complete description of the OOB mechanism.
+There are a pair of calls similar to \fIread\fP and \fIwrite\fP
+that allow options, including sending
+and receiving OOB information; these are \fI send()\fP
+and \fIrecv().\fP
+These calls are used only with sockets; specifying a descriptor for a file will
+result in the return of an error status. These calls also allow
+\fIpeeking\fP at data in a stream.
+That is, they allow a process to read data without removing the data from
+the stream. One use of this facility is to read ahead in a stream
+to determine the size of the next item to be read.
+When not using these options, these calls have the same functions as
+\fIread()\fP and \fIwrite().\fP
+.pp
+To send datagrams, one must be allowed to specify the destination.
+The call \fIsendto()\fP
+takes a destination address as an argument and is therefore used for
+sending datagrams. The call \fIrecvfrom()\fP
+is often used to read datagrams, since this call returns the address
+of the sender, if it is available, along with the data.
+If the identity of the sender does not matter, one may use \fIread()\fP
+or \fIrecv().\fP
+.pp
+Finally, there are a pair of calls that allow the sending and
+receiving of messages from multiple buffers, when the address of the
+recipient must be specified. These are \fIsendmsg()\fP and
+\fIrecvmsg().\fP
+These calls are actually quite general and have other uses,
+including, in the UNIX domain, the transmission of a file descriptor from one
+process to another.
+.pp
+The various options for reading and writing are shown in Figure 10,
+together with their parameters. The parameters for each system call
+reflect the differences in function of the different calls.
+In the examples given in this paper, the calls \fIread()\fP and
+\fIwrite()\fP have been used whenever possible.
+.(z
+.ft CW
+ /*
+ * The variable descriptor may be the descriptor of either a file
+ * or of a socket.
+ */
+ cc = read(descriptor, buf, nbytes)
+ int cc, descriptor; char *buf; int nbytes;
+
+ /*
+ * An iovec can include several source buffers.
+ */
+ cc = readv(descriptor, iov, iovcnt)
+ int cc, descriptor; struct iovec *iov; int iovcnt;
+
+ cc = write(descriptor, buf, nbytes)
+ int cc, descriptor; char *buf; int nbytes;
+
+ cc = writev(descriptor, iovec, ioveclen)
+ int cc, descriptor; struct iovec *iovec; int ioveclen;
+
+ /*
+ * The variable ``sock'' must be the descriptor of a socket.
+ * Flags may include MSG_OOB and MSG_PEEK.
+ */
+ cc = send(sock, msg, len, flags)
+ int cc, sock; char *msg; int len, flags;
+
+ cc = sendto(sock, msg, len, flags, to, tolen)
+ int cc, sock; char *msg; int len, flags;
+ struct sockaddr *to; int tolen;
+
+ cc = sendmsg(sock, msg, flags)
+ int cc, sock; struct msghdr msg[]; int flags;
+
+ cc = recv(sock, buf, len, flags)
+ int cc, sock; char *buf; int len, flags;
+
+ cc = recvfrom(sock, buf, len, flags, from, fromlen)
+ int cc, sock; char *buf; int len, flags;
+ struct sockaddr *from; int *fromlen;
+
+ cc = recvmsg(sock, msg, flags)
+ int cc, socket; struct msghdr msg[]; int flags;
+.ft
+.sp 1
+.ce 1
+Figure 10\ \ Varieties of read and write commands
+.)z
+.b
+.sh 1 "Choices"
+.r
+.pp
+This paper has presented examples of some of the forms
+of communication supported by
+Berkeley UNIX 4.4BSD. These have been presented in an order chosen for
+ease of presentation. It is useful to review these options emphasizing the
+factors that make each attractive.
+.pp
+Pipes have the advantage of portability, in that they are supported in all
+UNIX systems. They also are relatively
+simple to use. Socketpairs share this simplicity and have the additional
+advantage of allowing bidirectional communication. The major shortcoming
+of these mechanisms is that they require communicating processes to be
+descendants of a common process. They do not allow intermachine communication.
+.pp
+The two communication domains, UNIX and Internet, allow processes with no common
+ancestor to communicate.
+Of the two, only the Internet domain allows
+communication between machines.
+This makes the Internet domain a necessary
+choice for processes running on separate machines.
+.pp
+The choice between datagrams and stream communication is best made by
+carefully considering the semantic and performance
+requirements of the application.
+Streams can be both advantageous and disadvantageous. One disadvantage
+is that a process is only allowed a limited number of open streams,
+as there are usually only 64 entries available in the open descriptor
+table. This can cause problems if a single server must talk with a large
+number of clients.
+Another is that for delivering a short message the stream setup and
+teardown time can be unnecessarily long. Weighed against this are
+the reliability built into the streams. This will often be the
+deciding factor in favor of streams.
+.b
+.sh 1 "What to do Next"
+.r
+.pp
+Many of the examples presented here can serve as models for multiprocess
+programs and for programs distributed across several machines.
+In developing a new multiprocess program, it is often easiest to
+first write the code to create the processes and communication paths.
+After this code is debugged, the code specific to the application can
+be added.
+.pp
+An introduction to the UNIX system and programming using UNIX system calls
+can be found in [Kernighan and Pike 1984].
+Further documentation of the Berkeley UNIX 4.4BSD IPC mechanisms can be
+found in [Leffler et al. 1986].
+More detailed information about particular calls and protocols
+is provided in sections
+2, 3 and 4 of the
+UNIX Programmer's Manual [CSRG 1986].
+In particular the following manual pages are relevant:
+.(b
+.TS
+l l.
+creating and naming sockets socket(2), bind(2)
+establishing connections listen(2), accept(2), connect(2)
+transferring data read(2), write(2), send(2), recv(2)
+addresses inet(4F)
+protocols tcp(4P), udp(4P).
+.TE
+.)b
+.(b
+.sp
+.b
+Acknowledgements
+.pp
+I would like to thank Sam Leffler and Mike Karels for their help in
+understanding the IPC mechanisms and all the people whose comments
+have helped in writing and improving this report.
+.pp
+This work was sponsored by the Defense Advanced Research Projects Agency
+(DoD), ARPA Order No. 4031, monitored by the Naval Electronics Systems
+Command under contract No. N00039-C-0235.
+The views and conclusions contained in this document are those of the
+author and should not be interpreted as representing official policies,
+either expressed or implied, of the Defense Research Projects Agency
+or of the US Government.
+.)b
+.(b
+.sp
+.b
+References
+.r
+.sp
+.ls 1
+B.W. Kernighan & R. Pike, 1984,
+.i "The UNIX Programming Environment."
+Englewood Cliffs, N.J.: Prentice-Hall.
+.sp
+.ls 1
+B.W. Kernighan & D.M. Ritchie, 1978,
+.i "The C Programming Language,"
+Englewood Cliffs, N.J.: Prentice-Hall.
+.sp
+.ls 1
+S.J. Leffler, R.S. Fabry, W.N. Joy, P. Lapsley, S. Miller & C. Torek, 1986,
+.i "An Advanced 4.4BSD Interprocess Communication Tutorial."
+Computer Systems Research Group,
+Department of Electrical Engineering and Computer Science,
+University of California, Berkeley.
+.sp
+.ls 1
+Computer Systems Research Group, 1986,
+.i "UNIX Programmer's Manual, 4.4 Berkeley Software Distribution."
+Computer Systems Research Group,
+Department of Electrical Engineering and Computer Science,
+University of California, Berkeley.
+.)b