author     Theo de Raadt <deraadt@cvs.openbsd.org>   1995-10-18 08:53:40 +0000
committer  Theo de Raadt <deraadt@cvs.openbsd.org>   1995-10-18 08:53:40 +0000
commit     d6583bb2a13f329cf0332ef2570eb8bb8fc0e39c (patch)
tree       ece253b876159b39c620e62b6c9b1174642e070e  /share/doc/smm/06.nfs
initial import of NetBSD tree
Diffstat (limited to 'share/doc/smm/06.nfs')
-rw-r--r--  share/doc/smm/06.nfs/0.t       |  75
-rw-r--r--  share/doc/smm/06.nfs/1.t       | 588
-rw-r--r--  share/doc/smm/06.nfs/2.t       | 530
-rw-r--r--  share/doc/smm/06.nfs/Makefile  |   7
-rw-r--r--  share/doc/smm/06.nfs/ref.t     | 123
5 files changed, 1323 insertions, 0 deletions
diff --git a/share/doc/smm/06.nfs/0.t b/share/doc/smm/06.nfs/0.t new file mode 100644 index 00000000000..4d77f560e2a --- /dev/null +++ b/share/doc/smm/06.nfs/0.t @@ -0,0 +1,75 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This document is derived from software contributed to Berkeley by +.\" Rick Macklem at The University of Guelph. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. All advertising materials mentioning features or use of this software +.\" must display the following acknowledgement: +.\" This product includes software developed by the University of +.\" California, Berkeley and its contributors. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)0.t 8.1 (Berkeley) 6/8/93 +.\" +.(l C +.sz 14 +.b "The 4.4BSD NFS Implementation" +.sp +.sz 10 +Rick Macklem +.i "University of Guelph" +.)l +.sp 2 +.ce 1 +.sz 12 +.b "ABSTRACT" +.eh 'SMM:06-%''The 4.4BSD NFS Implementation' +.oh 'The 4.4BSD NFS Implementation''SMM:06-%' +.pp +The 4.4BSD implementation of the Network File System (NFS)\** is +intended to interoperate with +.(f +\**Network File System (NFS) is believed to be a registered trademark of +Sun Microsystems Inc. +.)f +other NFS Version 2 Protocol (RFC1094) implementations but also +allows use of an alternate protocol that is hoped to provide better +performance in certain environments. +This paper will informally discuss these various protocol features and +their use. +There is a brief overview of the implementation followed +by several sections on various problem areas related to NFS +and some hints on how to deal with them. +.pp +Not Quite NFS (NQNFS) is an NFS like protocol designed to maintain full cache +consistency between clients in a crash tolerant manner. It is an adaptation +of the NFS protocol such that the server supports both NFS +and NQNFS clients while maintaining full consistency between the server and +NQNFS clients. +It borrows heavily from work done on Spritely-NFS [Srinivasan89], but uses +Leases [Gray89] to avoid the need to recover server state information +after a crash. 
+.sp diff --git a/share/doc/smm/06.nfs/1.t b/share/doc/smm/06.nfs/1.t new file mode 100644 index 00000000000..6804ed1fb4a --- /dev/null +++ b/share/doc/smm/06.nfs/1.t @@ -0,0 +1,588 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This document is derived from software contributed to Berkeley by +.\" Rick Macklem at The University of Guelph. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. All advertising materials mentioning features or use of this software +.\" must display the following acknowledgement: +.\" This product includes software developed by the University of +.\" California, Berkeley and its contributors. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)1.t 8.1 (Berkeley) 6/8/93 +.\" +.sh 1 "NFS Implementation" +.pp +The 4.4BSD implementation of NFS and the alternate protocol nicknamed +Not Quite NFS (NQNFS) are kernel resident, but make use of a few system +daemons. +The kernel implementation does not use an RPC library, handling the RPC +request and reply messages directly in \fImbuf\fR data areas. NFS +interfaces to the network using +sockets via. the kernel interface available in +\fIsys/kern/uipc_syscalls.c\fR as \fIsosend(), soreceive(),\fR... +There are connection management routines for support of sockets for connection +oriented protocols and timeout/retransmit support for datagram sockets on +the client side. +For connection oriented transport protocols, +such as TCP/IP, there is one connection +for each client to server mount point that is maintained until an umount. +If the connection breaks, the client will attempt a reconnect with a new +socket. +The client side can operate without any daemons running, but performance +will be improved by running nfsiod daemons that perform read-aheads +and write-behinds. +For the server side to function, the daemons portmap, mountd and +nfsd must be running. +The mountd daemon performs two important functions. 
+.ip 1) +Upon startup and after a hangup signal, mountd reads the exports +file and pushes the export information for each local file system down +into the kernel via. the mount system call. +.ip 2) +Mountd handles remote mount protocol (RFC1094, Appendix A) requests. +.lp +The nfsd master daemon forks off children that enter the kernel +via. the nfssvc system call. The children normally remain kernel +resident, providing a process context for the NFS RPC servers. The only +exception to this is when a Kerberos [Steiner88] +ticket is received and at that time +the nfsd exits the kernel temporarily to verify the ticket via. the +Kerberos libraries and then returns to the kernel with the results. +(This only happens for Kerberos mount points as described further under +Security.) +Meanwhile, the master nfsd waits to accept new connections from clients +using connection oriented transport protocols and passes the new sockets down +into the kernel. +The client side mount_nfs along with portmap and +mountd are the only parts of the NFS subsystem that make any +use of the Sun RPC library. +.sh 1 "Mount Problems" +.pp +There are several problems that can be encountered at the time of an NFS +mount, ranging from an unresponsive NFS server (crashed, network partitioned +from client, etc.) to various interoperability problems between different +NFS implementations. +.pp +On the server side, +if the 4.4BSD NFS server will be handling any PC clients, mountd will +require the \fB-n\fR option to enable non-root mount request servicing. +Running a pcnfsd\** daemon will also be necessary. +.(f +\** Pcnfsd is available in source form from Sun Microsystems and many +anonymous ftp sites. +.)f +The server side requires that the daemons +mountd and nfsd be running and that +they be registered with portmap properly. +If problems are encountered, +the safest fix is to kill all the daemons and then restart them in +the order portmap, mountd and nfsd. +Other server side problems are normally caused by problems with the format +of the exports file, which is covered under +Security and in the exports man page. +.pp +On the client side, there are several mount options useful for dealing +with server problems. +In cases where a file system is not critical for system operation, the +\fB-b\fR +mount option may be specified so that mount_nfs will go into the +background for a mount attempt on an unresponsive server. +This is useful for mounts specified in +\fIfstab(5)\fR, +so that the system will not hang while doing +\fBmount -a\fR +during boot because a file server is not responsive. +On the other hand, if the file system is critical to system operation, this +option should not be used so that the client will wait for the server to +come up before completing bootstrapping. +There are also three mount options to help deal with interoperability issues +with various non-BSD NFS servers. The +\fB-P\fR +option specifies that the NFS +client use a reserved IP port number to satisfy some servers' security +requirements.\** +.(f +\**Any security benefit of this is highly questionable and as +such the BSD server does not require a client to use a reserved port number. +.)f +The +\fB-c\fR +option stops the NFS client from doing a \fIconnect\fR on the UDP +socket, so that the mount works with servers that send NFS replies from +port numbers other than the standard 2049.\** +.(f +\**The Encore Multimax is known +to require this.
+.)f +Finally, the +\fB-g=\fInum\fR +option sets the maximum size of the group list in the credentials passed +to an NFS server in every RPC request. Although RFC1057 specifies a maximum +size of 16 for the group list, some servers can't handle that many. +If a user, particularly root doing a mount, +keeps getting access denied from a file server, try temporarily +reducing the number of groups that user is in to less than 5 +by editing /etc/group. If the user can then access the file system, slowly +increase the number of groups for that user until the limit is found and +then peg the limit there with the +\fB-g=\fInum\fR +option. +This implies that the server will only see the first \fInum\fR +groups that the user is in, which can cause some accessibility problems. +.pp +For sites that have many NFS servers, amd [Pendry93] +is a useful administration tool. +It also reduces the number of actual NFS mount points, alleviating problems +with commands such as df(1) that hang when any of the NFS servers is +unreachable. +.sh 1 "Dealing with Hung Servers" +.pp +There are several mount options available to help a client deal with +being hung waiting for response from a crashed or unreachable\** server. +.(f +\**Due to a network partitioning or similar. +.)f +By default, a hard mount will continue to try to contact the server +``forever'' to complete the system call. This type of mount is appropriate +when processes on the client that access files in the file system do not +tolerate file I/O system calls that return -1 with \fIerrno == EINTR\fR +and/or access to the file system is critical for normal system operation. +.lp +There are two other alternatives: +.ip 1) +A soft mount (\fB-s\fR option) retries an RPC \fIn\fR +times and then the corresponding +system call returns -1 with errno set to EINTR. +For TCP transport, the actual RPC request is not retransmitted, but the +timeout intervals waiting for a reply from the server are done +in the same manner as UDP for this purpose. +The problem with this type of mount is that most applications do not +expect an EINTR error return from file I/O system calls (since it never +occurs for a local file system) and get confused by the error return +from the I/O system call. +The option +\fB-x=\fInum\fR +is used to set the RPC retry limit and if set too low, the error returns +will start occurring whenever the NFS server is slow due to heavy load. +Alternately, a large retry limit can result in a process hung for a long +time, due to a crashed server or network partitioning. +.ip 2) +An interruptible mount (\fB-i\fR option) checks to see if a termination signal +is pending for the process when waiting for server response and if it is, +the I/O system call posts an EINTR. Normally this results in the process +being terminated by the signal when returning from the system call. +This feature allows you to ``^C'' out of processes that are hung +due to unresponsive servers. +The problem with this approach is that signals that are caught by +a process are not recognized as termination signals +and the process will remain hung.\** +.(f +\**Unfortunately, there are also some resource allocation situations in the +BSD kernel where the termination signal will be ignored and the process +will not terminate. +.)f +.sh 1 "RPC Transport Issues" +.pp +The NFS Version 2 protocol runs over UDP/IP transport by +sending each Sun Remote Procedure Call (RFC1057) +request/reply message in a single UDP +datagram.
Since UDP does not guarantee datagram delivery, the +Remote Procedure Call (RPC) layer +times out and retransmits an RPC request if +no RPC reply has been received. Since this round trip timeout (RTO) value +is for the entire RPC operation, including RPC message transmission to the +server, queuing at the server for an nfsd, performing the RPC and +sending the RPC reply message back to the client, it can be highly variable +for even a moderately loaded NFS server. +As a result, the RTO interval must be a conservative (large) estimate, in +order to avoid extraneous RPC request retransmits.\** +.(f +\**At best, an extraneous RPC request retransmit increases +the load on the server and at worst can result in damaged files +on the server when non-idempotent RPCs are redone [Juszczak89]. +.)f +Also, with an 8Kbyte read/write data size +(the default), the read/write reply/request will be an 8+Kbyte UDP datagram +that must normally be fragmented at the IP layer for transmission.\** +.(f +\**6 IP fragments for an Ethernet, +which has a maximum transmission unit of 1500 bytes. +.)f +For IP fragments to be successfully reassembled into +the IP datagram at the receive end, all +fragments must be received within a fairly short ``time to live''. +If one fragment is lost/damaged in transit, +the entire RPC must be retransmitted and redone. +This problem can be exacerbated by a network interface on the receiver that +cannot handle the reception of back to back network packets. [Kent87a] +.pp +There are several tuning mount +options on the client side that can prove useful when trying to +alleviate performance problems related to UDP RPC transport. +The options +\fB-r=\fInum\fR +and +\fB-w=\fInum\fR +specify the maximum read or write data size respectively. +The size \fInum\fR +should be a power of 2 (4K, 2K, 1K) and adjusted downward from the +maximum of 8Kbytes +whenever IP fragmentation is causing problems. The best indicator of +IP fragmentation problems is a significant number of +\fIfragments dropped after timeout\fR +reported by the \fIip:\fR section of a \fBnetstat -s\fR +command on either the client or server. +Of course, if the fragments are being dropped at the server, it can be +fun figuring out which client(s) are involved. +The most likely candidates are clients that are not +on the same local area network as the +server or have network interfaces that do not receive several +back to back network packets properly. +.pp +By default, the 4.4BSD NFS client dynamically estimates the retransmit +timeout interval for the RPC and this appears to work reasonably well for +many environments. However, the +\fB-d\fR +flag can be specified to turn off +the dynamic estimation of retransmit timeout, so that the client will +use a static initial timeout interval.\** +.(f +\**After the first retransmit timeout, the initial interval is backed off +exponentially. +.)f +The +\fB-t=\fInum\fR +option can be used with +\fB-d\fR +to set the initial timeout interval to other than the default of 2 seconds. +The best indicator that dynamic estimation should be turned off would +be a significant number\** in the \fIX Replies\fR field and a +.(f +\**Even 0.1% of the total RPCs is probably significant. +.)f +large number in the \fIRetries\fR field +in the \fIRpc Info:\fR section as reported +by the \fBnfsstat\fR command. +On the server, there would be significant numbers of \fIInprog\fR recent +request cache hits in the \fIServer Cache Stats:\fR section as reported +by the \fBnfsstat\fR command.
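The fragmentation arithmetic above is easy to make concrete. The following C sketch is illustrative only; it assumes a 20 byte IP header, the 8 byte UDP header and a roughly 100 byte RPC/NFS reply header (the real header size varies with the RPC):

    /*
     * Rough estimate of how many Ethernet-size IP fragments an
     * NFS over UDP read reply needs for a given read data size.
     * The 100 byte RPC/NFS header is an assumed round figure.
     */
    #include <stdio.h>

    int
    main(void)
    {
            int mtu = 1500;             /* Ethernet MTU */
            int fragdata = mtu - 20;    /* 1480 data bytes per fragment */
            int rpchdr = 100;           /* assumed RPC + NFS reply header */
            int rsize;

            for (rsize = 1024; rsize <= 8192; rsize *= 2) {
                    int dgram = rsize + rpchdr + 8;     /* + UDP header */
                    int frags = (dgram + fragdata - 1) / fragdata;
                    printf("rsize %5d -> %d fragment(s)\n", rsize, frags);
            }
            return (0);
    }

With the default 8Kbyte read size this works out to 6 fragments, matching the footnote above, while a 1Kbyte read size keeps each datagram within a single Ethernet packet.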
+.pp +The tradeoff is that a smaller timeout interval results in a better +average RPC response time, but increases the risk of extraneous retries +that in turn increase server load and the possibility of damaged files +on the server. It is probably best to err on the safe side and use a large +(>= 2sec) fixed timeout if the dynamic retransmit timeout estimation +seems to be causing problems. +.pp +An alternative to all this fiddling is to run NFS over TCP transport instead +of UDP. +Since the 4.4BSD TCP implementation provides reliable +delivery with congestion control, it avoids all of the above problems. +It also permits the use of read and write data sizes greater than the 8Kbyte +limit for UDP transport.\** +.(f +\**Read/write data sizes greater than 8Kbytes will not normally improve +performance unless the kernel constant MAXBSIZE is increased and the +file system on the server has a block size greater than 8Kbytes. +.)f +NFS over TCP usually delivers performance comparable to, or significantly +better than, NFS over UDP +unless the client or server processor runs at less than 5-10MIPS. For a +slow processor, the extra CPU overhead of using TCP transport will become +significant and TCP transport may only be useful when the client +to server interconnect traverses congested gateways. +The main problem with using TCP transport is that it is only supported +between BSD clients and servers.\** +.(f +\**There are rumors of commercial NFS over TCP implementations on the horizon +and these may well be worth exploring. +.)f +.sh 1 "Other Tuning Tricks" +.pp +Another mount option that may improve performance over +certain network interconnects is \fB-a=\fInum\fR +which sets the number of blocks that the system will +attempt to read-ahead during sequential reading of a file. The default value +of 1 seems to be appropriate for most situations, but a larger value might +achieve better performance for some environments, such as a mount to a server +across a ``high bandwidth * round trip delay'' interconnect. +.pp +For the adventurous, playing with the size of the buffer cache +can also improve performance for some environments that use NFS heavily. +Under some workloads, a buffer cache of 4-6Mbytes can result in significant +performance improvements over 1-2Mbytes, both in client side system call +response time and reduced server RPC load. +The buffer cache size defaults to 10% of physical memory, +but this can be overridden by specifying the BUFPAGES option +in the machine's config file.\** +.(f +BUFPAGES is the number of physical machine pages allocated to the buffer cache. +ie. BUFPAGES * NBPG = buffer cache size in bytes +.)f +When increasing the size of BUFPAGES, it is also advisable to increase the +number of buffers NBUF by a corresponding amount. +Note that there is a tradeoff of memory allocated to the buffer cache versus +available for paging, which implies that making the buffer cache larger +will increase paging rate, with possibly disastrous results. +.sh 1 "Security Issues" +.pp +When a machine is running an NFS server it opens up a great big security hole. +For ordinary NFS, the server receives client credentials +in the RPC request as a user id +and a list of group ids and trusts them to be authentic! +The only tool available for restricting remote access to +file systems is the exports(5) file, +so file systems should be exported with great care.
+The exports file is read by mountd upon startup and after a hangup signal +is posted for it and then as much of the access specifications as possible are +pushed down into the kernel for use by the nfsd(s). +The trick here is that the kernel information is stored on a per +local file system mount point and client host address basis and cannot refer to +individual directories within the local server file system. +It is best to think of the exports file as referring to the various local +file systems and not just directory paths as mount points. +A local file system may be exported to a specific host, all hosts that +match a subnet mask or all other hosts (the world). The latter is very +dangerous and should only be used for public information. It is also +strongly recommended that file systems exported to ``the world'' be exported +read-only. +For each host or group of hosts, the file system can be exported read-only or +read/write. +You can also define one of three client user id to server credential +mappings to help control access. +Root (user id == 0) can be mapped to some default credentials while all other +user ids are accepted as given. +If the default credentials for user id equal zero +are root, then there is essentially no remapping. +Most NFS file systems are exported this way, most commonly mapping +user id == 0 to the credentials for the user nobody. +Since the client user id and group id list is used unchanged on the server +(except for root), this also implies that +the user id and group id space must be common between the client and server. +(ie. user id N on the client must refer to the same user on the server) +All user ids can be mapped to a default set of credentials, typically that of +the user nobody. This essentially gives world access to all +users on the corresponding hosts. +.pp +There is also a non-standard BSD +\fB-kerb\fR export option that requires the client provide +a KerberosIV rcmd service ticket to authenticate the user on the server. +If successful, the Kerberos principal is looked up in the server's password +and group databases to get a set of credentials and a map of client userid to +these credentials is then cached. +The use of TCP transport is strongly recommended, +since the scheme depends on the TCP connection to avert replay attempts. +Unfortunately, this option is only usable +between BSD clients and servers since it is +not compatible with other known ``kerberized'' NFS systems. +To enable use of this Kerberos option, both mount_nfs on the client and +nfsd on the server must be rebuilt with the -DKERBEROS option and +linked to KerberosIV libraries. +The file system is then exported to the client(s) with the \fB-kerb\fR option +in the exports file on the server +and the client mount specifies the +\fB-K\fR +and +\fB-T\fR +options. +The +\fB-m=\fIrealm\fR +mount option may be used to specify a Kerberos Realm for the ticket +(it must be the Kerberos Realm of the server) that is other than +the client's local Realm. +To access files in a \fB-kerb\fR mount point, the user must have a valid +TGT for the server's Realm, as provided by kinit or similar. +.pp +As well as the standard NFS Version 2 protocol (RFC1094) implementation, BSD +systems can use a variant of the protocol called Not Quite NFS (NQNFS) that +supports a variety of protocol extensions. +This protocol uses 64bit file offsets +and sizes, an \fIaccess rpc\fR, an \fIappend\fR option on the write rpc +and extended file attributes to support 4.4BSD file system functionality +more fully. 
+It also makes use of a variant of short term +\fIleases\fR [Gray89] with delayed write client caching, +in an effort to provide full cache consistency and better performance. +This protocol is available between 4.4BSD systems only and is used when +the \fB-q\fR mount option is specified. +It can be used with any of the aforementioned options for NFS, such as TCP +transport (\fB-T\fR) and KerberosIV authentication (\fB-K\fR). +Although this protocol is experimental, it is recommended over NFS for +mounts between 4.4BSD systems.\** +.(f +\**I would appreciate email from anyone who can provide +NFS vs. NQNFS performance measurements, +particularly fast clients, many clients or over an internetwork +connection with a large ``bandwidth * RTT'' product. +.)f +.sh 1 "Monitoring NFS Activity" +.pp +The basic command for monitoring NFS activity on clients and servers is +nfsstat. It reports cumulative statistics of various NFS activities, +such as counts of the various different RPCs and cache hit rates on the client +and server. Of particular interest on the server are the fields in the +\fIServer Cache Stats:\fR section, which gives numbers for RPC retries received +in the first three fields and total RPCs in the fourth. The first three fields +should remain a very small percentage of the total. If not, it +would indicate one or more clients doing retries too aggressively and the fix +would be to isolate these clients, +disable the dynamic RTO estimation on them and +make their initial timeout interval a conservative (ie. large) value. +.pp +On the client side, the fields in the \fIRpc Info:\fR section are of particular +interest, as they give an overall picture of NFS activity. +The \fITimedOut\fR field is the number of I/O system calls that returned -1 +for ``soft'' mounts and can be reduced +by increasing the retry limit or changing +the mount type to ``intr'' or ``hard''. +The \fIInvalid\fR field is a count of trashed RPC replies that are received +and should remain zero.\** +.(f +\**Some NFS implementations run with UDP checksums disabled, so garbage RPC +messages can be received. +.)f +The \fIX Replies\fR field counts the number of repeated RPC replies received +from the server and is a clear indication of a too aggressive RTO estimate. +Unfortunately, a good NFS server implementation will use a ``recent request +cache'' [Juszczak89] that will suppress the extraneous replies. +A large value for \fIRetries\fR indicates a problem, but +it could be any of: +.ip \(bu +a too aggressive RTO estimate +.ip \(bu +an overloaded NFS server +.ip \(bu +IP fragments being dropped (gateway, client or server) +.lp +and requires further investigation. +The \fIRequests\fR field is the total count of RPCs done on all servers. +.pp +The \fBnetstat -s\fR command comes in useful during investigation of RPC transport +problems. +The field \fIfragments dropped after timeout\fR in +the \fIip:\fR section indicates IP fragments are +being lost and a significant number of these occurring indicates that the +use of TCP transport or a smaller read/write data size is in order. +A significant number of \fIbad checksums\fR reported in the \fIudp:\fR +section would suggest network problems of a more generic sort. +(cabling, transceiver or network hardware interface problems or similar) +.pp +There is an RPC activity logging facility for both the client and +server side in the kernel.
+When logging is enabled by setting the kernel variable nfsrtton to +one, the logs in the kernel structures nfsrtt (for the client side) +and nfsdrt (for the server side) are updated upon the completion +of each RPC in a circular manner. +The pos element of the structure is the index of the next element +of the log array to be updated. +In other words, elements of the log array from \fIlog\fR[pos] to +\fIlog\fR[pos - 1] are in chronological order. +The include file <sys/nfsrtt.h> should be consulted for details on the +fields in the two log structures.\** +.(f +\**Unfortunately, a monitoring tool that uses these logs is still in the +planning (dreaming) stage. +.)f +.sh 1 "Diskless Client Support" +.pp +The NFS client does include kernel support for diskless/dataless operation +where the root file system and optionally the swap area is remote NFS mounted. +A diskless/dataless client is configured using a version of the +``swapvmunix.c'' file as provided in the directory \fIcontrib/diskless.nfs\fR. +If the swap device == NODEV, it specifies an NFS mounted swap area and should +be configured the same size as set up by diskless_setup when run on the server. +This file must be put in the \fIsys/compile/<machine_name>\fR kernel build +directory after the config command has been run, since config does +not know about specifying NFS root and swap areas. +The kernel variable mountroot must be set to nfs_mountroot instead of +ffs_mountroot and the kernel structure nfs_diskless must be filled in +properly. +There are some primitive system administration tools in the \fIcontrib/diskless.nfs\fR directory to assist in filling in +the nfs_diskless structure and in setting up an NFS server for +diskless/dataless clients. +The tools were designed to provide a bare bones capability, to allow maximum +flexibility when setting up different servers. +.lp +The tools are as follows: +.ip \(bu +diskless_offset.c - This little program reads a ``vmunix'' object file and +writes the file byte offset of the nfs_diskless structure in it to +standard out. It was kept separate because it sometimes has to +be compiled/linked in funny ways depending on the client architecture. +(See the comment at the beginning of it.) +.ip \(bu +diskless_setup.c - This program is run on the server and sets up files for a +given client. It mostly just fills in an nfs_diskless structure and +writes it out to either the "vmunix" file or a separate file called +/var/diskless/setup.<official-hostname> +.ip \(bu +diskless_boot.c - There are two functions in here that may be used +by a bootstrap server such as tftpd to permit sharing of the ``vmunix'' +object file for similar clients. They save disk space on the bootstrap +server and simplify organization, but are not critical for correct operation. +They read the ``vmunix'' +file, but optionally fill in the nfs_diskless structure from a +separate "setup.<official-hostname>" file so that there is only +one copy of "vmunix" for all similar (same arch etc.) clients. +These functions use a text file called +/var/diskless/boot.<official-hostname> to control the netboot.
+.lp +The basic setup steps are: +.ip \(bu +make a "vmunix" for the client(s) with mountroot() == nfs_mountroot() +and swdevt[0].sw_dev == NODEV if it is to do nfs swapping as well +(See the same swapvmunix.c file) +.ip \(bu +run diskless_offset on the vmunix file to find out the byte offset +of the nfs_diskless structure +.ip \(bu +Run diskless_setup on the server to set up the server and fill in the +nfs_diskless structure for that client. +The nfs_diskless structure can either be written into the +vmunix file (the -x option) or +saved in /var/diskless/setup.<official-hostname>. +.ip \(bu +Set up the bootstrap server. If the nfs_diskless structure was written into +the ``vmunix'' file, any vanilla bootstrap protocol such as bootp/tftp can +be used. If the bootstrap server has been modified to use the functions in +diskless_boot.c, then a +file called /var/diskless/boot.<official-hostname> +must be created. +It is simply a two line text file, where the first line is the pathname +of the correct ``vmunix'' file and the second line has the pathname of +the nfs_diskless structure file and its byte offset in it. +For example: +.br + /var/diskless/vmunix.pmax +.br + /var/diskless/setup.rickers.cis.uoguelph.ca 642308 +.br +.ip \(bu +Create a /var subtree for each client in an appropriate place on the server, +such as /var/diskless/var/<client-hostname>/... +By using the <client-hostname> to differentiate /var for each host, +/etc/rc can be modified to mount the correct /var from the server. diff --git a/share/doc/smm/06.nfs/2.t b/share/doc/smm/06.nfs/2.t new file mode 100644 index 00000000000..841dd5f9147 --- /dev/null +++ b/share/doc/smm/06.nfs/2.t @@ -0,0 +1,530 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This document is derived from software contributed to Berkeley by +.\" Rick Macklem at The University of Guelph. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. All advertising materials mentioning features or use of this software +.\" must display the following acknowledgement: +.\" This product includes software developed by the University of +.\" California, Berkeley and its contributors. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)2.t 8.1 (Berkeley) 6/8/93 +.\" +.sh 1 "Not Quite NFS, Crash Tolerant Cache Consistency for NFS" +.pp +Not Quite NFS (NQNFS) is an NFS like protocol designed to maintain full cache +consistency between clients in a crash tolerant manner. +It is an adaptation of the NFS protocol such that the server supports both NFS +and NQNFS clients while maintaining full consistency between the server and +NQNFS clients. +This section borrows heavily from work done on Spritely-NFS [Srinivasan89], +but uses Leases [Gray89] to avoid the need to recover server state information +after a crash. +The reader is strongly encouraged to read these references before +trying to grasp the material presented here. +.sh 2 "Overview" +.pp +The protocol maintains cache consistency by using a somewhat +Sprite [Nelson88] like protocol, +but is based on short term leases\** instead of hard state information +about open files. +.(f +\** A lease is a ticket permitting an activity that is +valid until some expiry time. +.)f +The basic principle is that the protocol will disable client caching of a +file whenever that file is write shared\**. +.(f +\** Write sharing occurs when at least one client is modifying a file while +other client(s) are reading the file. +.)f +Whenever a client wishes to cache data for a file it must hold a valid lease. +There are three types of leases: read caching, write caching and non-caching. +The latter type requires that all file operations be done synchronously with +the server via. RPCs. +A read caching lease allows for client data caching, but no file modifications +may be done. +A write caching lease allows for client caching of writes, +but requires that all writes be pushed to the server when the lease expires. +If a client has dirty buffers\** +.(f +\** Cached write data is not yet pushed (written) to the server. +.)f +when a write cache lease has almost expired, it will attempt to +extend the lease but is required to push the dirty buffers if extension fails. +A client gets leases by either doing a \fBGetLease RPC\fR or by piggybacking +a \fBGetLease Request\fR onto another RPC. Piggybacking is supported for the +frequent RPCs Getattr, Setattr, Lookup, Readlink, Read, Write and Readdir +in an effort to minimize the number of \fBGetLease RPCs\fR required. +All leases are at the granularity of a file, since all NFS RPCs operate on +individual files and NFS has no intrinsic notion of a file hierarchy. +Directories, symbolic links and file attributes may be read cached but +are not write cached. +The exception here is the attribute file_size, which is updated during cached +writing on the client to reflect a growing file. +.pp +It is the server's responsibility to ensure that consistency is maintained +among the NQNFS clients by disabling client caching whenever a server file +operation would cause inconsistencies.
+The possibility of inconsistencies occurs whenever a client has +a write caching lease and any other client, +or a local operation on the server, +tries to access the file or when +a modify operation is attempted on a file being read cached by client(s). +At this time, the server sends an \fBeviction notice\fR to all clients holding +the lease and then waits for lease termination. +Lease termination occurs when a \fBvacated the premises\fR message has been +received from all the clients that have signed the lease or when the lease +expires via. timeout. +The message pair \fBeviction notice\fR and \fBvacated the premises\fR roughly +correspond to a Sprite server\(->client callback, but are not implemented as an +actual RPC, to avoid the server waiting indefinitely for a reply from a dead +client. +.pp +Server consistency checking can be viewed as issuing intrinsic leases for a +file operation for the duration of the operation only. For example, the +\fBCreate RPC\fR will get an intrinsic write lease on the directory in which +the file is being created, disabling client read caches for that directory. +.pp +By relegating this responsibility to the server, consistency between the +server and NQNFS clients is maintained when NFS clients are modifying the +file system as well.\** +.(f +\** The NFS clients will continue to be \fIapproximately\fR consistent with +the server. +.)f +.pp +The leases are issued as time intervals to avoid the requirement of time of day +clock synchronization. There are three important time constants known to +the server. The \fBmaximum_lease_term\fR sets an upper bound on lease duration. +The \fBclock_skew\fR is added to all lease terms on the server to correct for +differing clock speeds between the client and server and \fBwrite_slack\fR is +the number of seconds the server is willing to wait for a client with +an expired write caching lease to push dirty writes. +.pp +The server maintains a \fBmodify_revision\fR number for each file. It is +defined as an unsigned quadword integer that is never zero and that must +increase whenever the corresponding file is modified on the server. +It is used +by the client to determine whether or not cached data for the file is +stale. +Generating this value is easier said than done. The current implementation +uses the following technique, which is believed to be adequate. +The high order longword is stored in the ufs inode and is initialized to one +when an inode is first allocated. +The low order longword is stored in main memory only and is initialized to +zero when an inode is read in from disk. +When the file is modified for the first time within a given second of +wall clock time, the high order longword is incremented by one and +the low order longword reset to zero. +For subsequent modifications within the same second of wall clock +time, the low order longword is incremented. If the low order longword wraps +around to zero, the high order longword is incremented again. +Since the high order longword only increments once per second and the inode +is pushed to disk frequently during file modification, this implies +0 \(<= Current\(miDisk \(<= 5. +When the inode is read in from disk, 10 +is added to the high order longword, which ensures that the quadword +is greater than any value it could have had before a crash. +This introduces apparent modifications every time the inode falls out of +the LRU inode cache, but this should only reduce the client caching performance +by a (hopefully) small margin.
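The modify_revision discipline described above is compact enough to sketch in C. The structure and function names below are invented for illustration and are not the 4.4BSD kernel's own:

    /*
     * Sketch of the modify_revision rules: hi lives in the ufs inode,
     * lo lives only in main memory.
     */
    #include <stdint.h>
    #include <time.h>

    struct mrev {
            uint32_t hi;    /* stored in the inode; set to 1 at allocation */
            uint32_t lo;    /* main memory only */
            time_t   last;  /* second of the most recent modification */
    };

    /* Call on every modification of the file. */
    void
    mrev_modify(struct mrev *m, time_t now)
    {
            if (now != m->last) {           /* first modification this second */
                    m->hi++;
                    m->lo = 0;
                    m->last = now;
            } else if (++m->lo == 0)        /* low order longword wrapped */
                    m->hi++;
    }

    /* Call when the inode is read in from disk. */
    void
    mrev_load(struct mrev *m)
    {
            m->hi += 10;    /* exceed any value possible before a crash */
            m->lo = 0;
    }

    /* The 64 bit revision value the client compares against. */
    uint64_t
    mrev_value(const struct mrev *m)
    {
            return ((uint64_t)m->hi << 32 | m->lo);
    }

The add of 10 in mrev_load() is what makes the scheme crash tolerant: since the on-disk high order longword can lag the in-core value by at most 5, the reloaded revision is guaranteed to exceed anything a client saw before the crash.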
+.sh 2 "Crash Recovery and other Failure Scenarios" +.pp +The server must maintain the state of all the current leases held by clients. +The nice thing about short term leases is that maximum_lease_term seconds +after the server stops issuing leases, there are no current leases left. +As such, server crash recovery does not require any state recovery. After +rebooting, the server refuses to service any RPCs except for writes until +write_slack seconds after the last lease would have expired\**. +.(f +\** The last lease expiry time may be safely estimated as +"boottime+maximum_lease_term+clock_skew" for machines that cannot store +it in nonvolatile RAM. +.)f +By then, the server would not have any outstanding leases to recover the +state of and the clients have had at least write_slack seconds to push dirty +writes to the server and get the server sync'd up to date. After this, the +server simply services requests in a manner similar to NFS. +In an effort to minimize the effect of "recovery storms" [Baker91], +the server replies \fBtry_again_later\fR to the RPCs it is not +yet ready to service. +.pp +After a client crashes, the server may have to wait for a lease to time out +before servicing a request if write sharing of a file with a cachable lease +on the client is about to occur. +As for the client, it simply starts up getting any leases it now needs. Any +outstanding leases for that client on the server prior to the crash will either be renewed or expire +via timeout. +.pp +Certain network partitioning failures are more problematic. If a client to +server network connection is severed just before a write caching lease expires, +the client cannot push the dirty writes to the server. After the lease expires +on the server, the server permits other clients to access the file with the +potential of getting stale data. Unfortunately I believe this failure scenario +is intrinsic in any delayed write caching scheme unless the server is required to +wait \fBforever\fR for a client to regain contact\**. +.(f +\** Gray and Cheriton avoid this problem by using a \fBwrite through\fR policy. +.)f +Since the write caching lease has expired on the client, +it will sync up with the +server as soon as the network connection has been re-established. +.pp +There is another failure condition that can occur when the server is congested. +The worst case scenario would have the client pushing dirty writes to the server +but a large request queue on the server delays these writes for more than +\fBwrite_slack\fR seconds. It is hoped that a congestion control scheme using +the \fBtry_again_later\fR RPC reply after booting combined with +the following lease termination rule for write caching leases +can minimize the risk of this occurrence. +A write caching lease is only terminated on the server when there have +been no writes to the file and the server has not been overloaded during +the previous write_slack seconds. ``The server has not been overloaded'' +is approximated by a test for sleeping nfsd(s) at the end of the write_slack +period. +.sh 2 "Server Disk Full" +.pp +There is a serious unresolved problem for delayed write caching with respect to +server disk space allocation. +When the disk on the file server is full, delayed write RPCs can fail +due to "out of space". +For NFS, this occurrence results in an error return from the close system +call on the file, since the dirty blocks are pushed on close.
+Processes writing important files can check for this error return +to ensure that the file was written successfully. +For NQNFS, the dirty blocks are not pushed on close and as such the client +may not attempt the write RPC until after the process has done the close, +which implies no error return from the close. +For the current prototype, +the only solution is to modify programs writing important +file(s) to call fsync and check for an error return from it instead of close. +.sh 2 "Protocol Details" +.pp +The protocol specification is identical to that of NFS [Sun89] except for +the following changes. +.ip \(bu +RPC Information +.(l + Program Number 300105 + Version Number 1 +.)l +.ip \(bu +Readdir_and_Lookup RPC +.(l + struct readdirlookargs { + fhandle file; + nfscookie cookie; + unsigned count; + unsigned duration; + }; + + struct entry { + unsigned cachable; + unsigned duration; + modifyrev rev; + fhandle entry_fh; + nqnfs_fattr entry_attrib; + unsigned fileid; + filename name; + nfscookie cookie; + entry *nextentry; + }; + + union readdirlookres switch (stat status) { + case NFS_OK: + struct { + entry *entries; + bool eof; + } readdirlookok; + default: + void; + }; + + readdirlookres + NQNFSPROC_READDIRLOOK(readdirlookargs) = 18; +.)l +Reads entries in a directory in a manner analogous to the NFSPROC_READDIR RPC +in NFS, but returns the file handle and attributes of each entry as well. +This allows the attribute and lookup caches to be primed. +.ip \(bu +Get Lease RPC +.(l + struct getleaseargs { + fhandle file; + cachetype readwrite; + unsigned duration; + }; + + union getleaseres switch (stat status) { + case NFS_OK: + bool cachable; + unsigned duration; + modifyrev rev; + nqnfs_fattr attributes; + default: + void; + }; + + getleaseres + NQNFSPROC_GETLEASE(getleaseargs) = 19; +.)l +Gets a lease for "file" valid for "duration" seconds from when the lease +was issued on the server\**. +.(f +\** To be safe, the client may only assume that the lease is valid +for ``duration'' seconds from when the RPC request was sent to the server. +.)f +The lease permits client caching if "cachable" is true. +The modify revision level and attributes for the file are also returned. +.ip \(bu +Eviction Message +.(l + void + NQNFSPROC_EVICTED (fhandle) = 21; +.)l +This message is sent from the server to the client. When the client receives +the message, it should flush data associated with the file represented by +"fhandle" from its caches and then send the \fBVacated Message\fR back to +the server. Flushing includes pushing any dirty writes via. write RPCs. +.ip \(bu +Vacated Message +.(l + void + NQNFSPROC_VACATED (fhandle) = 20; +.)l +This message is sent from the client to the server in response to the +\fBEviction Message\fR. See above. +.ip \(bu +Access RPC +.(l + struct accessargs { + fhandle file; + bool read_access; + bool write_access; + bool exec_access; + }; + + stat + NQNFSPROC_ACCESS(accessargs) = 22; +.)l +The access RPC does permission checking on the server for the given type +of access required by the client for the file. +Use of this RPC avoids accessibility problems caused by client->server uid +mapping. +.ip \(bu +Piggybacked Get Lease Request +.pp +The piggybacked get lease request is functionally equivalent to the Get Lease +RPC except that it is attached to one of the other NQNFS RPC requests as follows.
+A getleaserequest is prepended to all of the request arguments for NQNFS +and a getleaserequestres is inserted in all NFS result structures just after +the "stat" field only if "stat == NFS_OK". +.(l + union getleaserequest switch (cachetype type) { + case NQLREAD: + case NQLWRITE: + unsigned duration; + default: + void; + }; + + union getleaserequestres switch (cachetype type) { + case NQLREAD: + case NQLWRITE: + bool cachable; + unsigned duration; + modifyrev rev; + default: + void; + }; +.)l +The get lease request applies to the file that the attached RPC operates on +and the file attributes remain in the same location as for the NFS RPC reply +structure. +.ip \(bu +Three additional "stat" values +.pp +Three additional values have been added to the enumerated type "stat". +.(l + NQNFS_EXPIRED=500 + NQNFS_TRYLATER=501 + NQNFS_AUTHERR=502 +.)l +The "expired" value indicates that a lease has expired. +The "try later" +value is returned by the server when it wishes the client to retry the +RPC request after a short delay. It is used during crash recovery (Section 2) +and may also be useful for server congestion control. +The "authentication error" value is returned for kerberized mount points to +indicate that there is no cached authentication mapping and a Kerberos ticket +for the principal is required. +.sh 2 "Data Types" +.ip \(bu +cachetype +.(l + enum cachetype { + NQLNONE = 0, + NQLREAD = 1, + NQLWRITE = 2 + }; +.)l +Type of lease requested. NQLNONE is used to indicate no piggybacked lease +request. +.ip \(bu +modifyrev +.(l + typedef unsigned hyper modifyrev; +.)l +The "modifyrev" is an unsigned quadword integer value that is never zero +and increases every time the corresponding file is modified on the server. +.ip \(bu +nqnfs_time +.(l + struct nqnfs_time { + unsigned seconds; + unsigned nano_seconds; + }; +.)l +For NQNFS, times are handled at nanosecond resolution instead of the +microsecond resolution used for NFS. +.ip \(bu +nqnfs_fattr +.(l + struct nqnfs_fattr { + ftype type; + unsigned mode; + unsigned nlink; + unsigned uid; + unsigned gid; + unsigned hyper size; + unsigned blocksize; + unsigned rdev; + unsigned hyper bytes; + unsigned fsid; + unsigned fileid; + nqnfs_time atime; + nqnfs_time mtime; + nqnfs_time ctime; + unsigned flags; + unsigned generation; + modifyrev rev; + }; +.)l +The nqnfs_fattr structure is modified from the NFS fattr so that it stores +the file size as a 64bit quantity and the storage occupied as a 64bit number +of bytes. It also has fields added for the 4.4BSD va_flags and va_gen fields +as well as the file's modify rev level. +.ip \(bu +nqnfs_sattr +.(l + struct nqnfs_sattr { + unsigned mode; + unsigned uid; + unsigned gid; + unsigned hyper size; + nqnfs_time atime; + nqnfs_time mtime; + unsigned flags; + unsigned rdev; + }; +.)l +The nqnfs_sattr structure is modified from the NFS sattr structure in the +same manner as fattr. +.lp +The arguments to several of the NFS RPCs have been modified as well. Mostly, +these are minor changes to use 64bit file offsets or similar. The modified +argument structures follow.
+.ip \(bu +Lookup RPC +.(l + struct lookup_diropargs { + unsigned duration; + fhandle dir; + filename name; + }; + + union lookup_diropres switch (stat status) { + case NFS_OK: + struct { + union getleaserequestres lookup_lease; + fhandle file; + nqnfs_fattr attributes; + } lookup_diropok; + default: + void; + }; + +.)l +The additional "duration" argument tells the server to get a lease for the +name being looked up if it is non-zero and the lease is specified +in "lookup_lease". +.ip \(bu +Read RPC +.(l + struct nqnfs_readargs { + fhandle file; + unsigned hyper offset; + unsigned count; + }; +.)l +.ip \(bu +Write RPC +.(l + struct nqnfs_writeargs { + fhandle file; + unsigned hyper offset; + bool append; + nfsdata data; + }; +.)l +The "append" argument is true for append-only write operations. +.ip \(bu +Get Filesystem Attributes RPC +.(l + union nqnfs_statfsres switch (stat status) { + case NFS_OK: + struct { + unsigned tsize; + unsigned bsize; + unsigned blocks; + unsigned bfree; + unsigned bavail; + unsigned files; + unsigned files_free; + } info; + default: + void; + }; +.)l +The "files" field is the number of files in the file system and the "files_free" +is the number of additional files that can be created. +.sh 1 "Summary" +.pp +The configuration and tuning of an NFS environment tends to be a bit of a +mystic art, but hopefully this paper along with the man pages and other +reading will be helpful. Good Luck. diff --git a/share/doc/smm/06.nfs/Makefile b/share/doc/smm/06.nfs/Makefile new file mode 100644 index 00000000000..e36a0a613c7 --- /dev/null +++ b/share/doc/smm/06.nfs/Makefile @@ -0,0 +1,7 @@ +# @(#)Makefile 8.1 (Berkeley) 6/8/93 + +DIR= smm/06.nfs +SRCS= 0.t 1.t 2.t ref.t +MACROS= -me + +.include <bsd.doc.mk> diff --git a/share/doc/smm/06.nfs/ref.t b/share/doc/smm/06.nfs/ref.t new file mode 100644 index 00000000000..039363bb0a4 --- /dev/null +++ b/share/doc/smm/06.nfs/ref.t @@ -0,0 +1,123 @@ +.\" Copyright (c) 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" This document is derived from software contributed to Berkeley by +.\" Rick Macklem at The University of Guelph. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. All advertising materials mentioning features or use of this software +.\" must display the following acknowledgement: +.\" This product includes software developed by the University of +.\" California, Berkeley and its contributors. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED.
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" @(#)ref.t 8.1 (Berkeley) 6/8/93 +.\" +.sh 1 "Bibliography" +.ip [Baker91] 16 +Mary Baker and John Ousterhout, Availability in the Sprite Distributed +File System, In \fIOperating System Review\fR, (25)2, pg. 95-98, +April 1991. +.ip [Baker91a] 16 +Mary Baker, Private Email Communication, May 1991. +.ip [Burrows88] 16 +Michael Burrows, Efficient Data Sharing, Technical Report #153, +Computer Laboratory, University of Cambridge, Dec. 1988. +.ip [Gray89] 16 +Cary G. Gray and David R. Cheriton, Leases: An Efficient Fault-Tolerant +Mechanism for Distributed File Cache Consistency, In \fIProc. of the +Twelfth ACM Symposium on Operating Systems Principles\fR, Litchfield Park, +AZ, Dec. 1989. +.ip [Howard88] 16 +John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, +M. Satyanarayanan, Robert N. Sidebotham and Michael J. West, +Scale and Performance in a Distributed File System, \fIACM Trans. on +Computer Systems\fR, (6)1, pg. 51-81, Feb. 1988. +.ip [Juszczak89] 16 +Chet Juszczak, Improving the Performance and Correctness of an NFS Server, +In \fIProc. Winter 1989 USENIX Conference,\fR pg. 53-63, San Diego, CA, January 1989. +.ip [Keith90] 16 +Bruce E. Keith, Perspectives on NFS File Server Performance Characterization, +In \fIProc. Summer 1990 USENIX Conference\fR, pg. 267-277, Anaheim, CA, +June 1990. +.ip [Kent87] 16 +Christopher A. Kent, \fICache Coherence in Distributed Systems\fR, +Research Report 87/4, +Digital Equipment Corporation Western Research Laboratory, April 1987. +.ip [Kent87a] 16 +Christopher A. Kent and Jeffrey C. Mogul, +\fIFragmentation Considered Harmful\fR, Research Report 87/3, +Digital Equipment Corporation Western Research Laboratory, Dec. 1987. +.ip [Macklem91] 16 +Rick Macklem, Lessons Learned Tuning the 4.3BSD Reno Implementation of the +NFS Protocol, In \fIProc. Winter 1991 USENIX Conference\fR, pg. 53-64, +Dallas, TX, January 1991. +.ip [Nelson88] 16 +Michael N. Nelson, Brent B. Welch, and John K. Ousterhout, Caching in the +Sprite Network File System, \fIACM Transactions on Computer Systems\fR, (6)1, +pg. 134-154, February 1988. +.ip [Nowicki89] 16 +Bill Nowicki, Transport Issues in the Network File System, In +\fIComputer Communication Review\fR, pg. 16-20, Vol. 19, Number 2, April 1989. +.ip [Ousterhout90] 16 +John K. Ousterhout, Why Aren't Operating Systems Getting Faster As Fast as +Hardware? In \fIProc. Summer 1990 USENIX Conference\fR, pg. 247-256, Anaheim, +CA, June 1990. +.ip [Pendry93] 16 +Jan-Simon Pendry, 4.4 BSD Automounter Reference Manual, In +\fIsrc/usr.sbin/amd/doc directory of 4.4 BSD distribution tape\fR. +.ip [Reid90] 16 +Jim Reid, N(e)FS: the Protocol is the Problem, In +\fIProc. Summer 1990 UKUUG Conference\fR, +London, England, July 1990. +.ip [Sandberg85] 16 +Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, and Bob Lyon, +Design and Implementation of the Sun Network Filesystem, In \fIProc. Summer +1985 USENIX Conference\fR, pg. 119-130, Portland, OR, June 1985.
+.ip [Schroeder85] 16 +Michael D. Schroeder, David K. Gifford and Roger M. Needham, A Caching +File System For A Programmer's Workstation, In \fIProc. of the Tenth +ACM Symposium on Operating Systems Principles\fR, pg. 25-34, Orcas Island, +WA, Dec. 1985. +.ip [Srinivasan89] 16 +V. Srinivasan and Jeffrey C. Mogul, \fISpritely NFS: Implementation and +Performance of Cache-Consistency Protocols\fR, Research Report 89/5, +Digital Equipment Corporation Western Research Laboratory, May 1989. +.ip [Steiner88] 16 +Jennifer G. Steiner, Clifford Neuman and Jeffrey I. Schiller, +Kerberos: An Authentication Service for Open Network Systems, In +\fIProc. Winter 1988 USENIX Conference\fR, Dallas, TX, February 1988. +.ip [Stern] 16 +Hal Stern, \fIManaging NFS and NIS\fR, O'Reilly and Associates, +ISBN 0-937175-75-7. +.ip [Sun87] 16 +Sun Microsystems Inc., \fIXDR: External Data Representation Standard\fR, +RFC1014, Network Information Center, SRI International, June 1987. +.ip [Sun88] 16 +Sun Microsystems Inc., \fIRPC: Remote Procedure Call Protocol Specification Version 2\fR, +RFC1057, Network Information Center, SRI International, June 1988. +.ip [Sun89] 16 +Sun Microsystems Inc., \fINFS: Network File System Protocol Specification\fR, +ARPANET Working Group Requests for Comment, DDN Network Information Center, +SRI International, Menlo Park, CA, March 1989, RFC-1094.