This is cvsclient.info, produced by Makeinfo version 3.12f from ./cvsclient.texi.  File: cvsclient.info, Node: Top, Next: Introduction, Up: (dir) CVS Client/Server ***************** This document describes the client/server protocol used by CVS. It does not describe how to use or administer client/server CVS; see the regular CVS manual for that. This is version 1.10.7 of the protocol specification--*Note Introduction::, for more on what this version number means. * Menu: * Introduction:: What is CVS and what is the client/server protocol for? * Goals:: Basic design decisions, requirements, scope, etc. * Connection and Authentication:: Various ways to connect to the server * Password scrambling:: Scrambling used by pserver * Protocol:: Complete description of the protocol * Protocol Notes:: Possible enhancements, limitations, etc. of the protocol  File: cvsclient.info, Node: Introduction, Next: Goals, Prev: Top, Up: Top Introduction ************ CVS is a version control system (with some additional configuration management functionality). It maintains a central "repository" which stores files (often source code), including past versions, information about who modified them and when, and so on. People who wish to look at or modify those files, known as "developers", use CVS to "check out" a "working directory" from the repository, to "check in" new versions of files to the repository, and other operations such as viewing the modification history of a file. If developers are connected to the repository by a network, particularly a slow or flaky one, the most efficient way to use the network is with the CVS-specific protocol described in this document. Developers, using the machine on which they store their working directory, run the CVS "client" program. To perform operations which cannot be done locally, it connects to the CVS "server" program, which maintains the repository. For more information on how to connect see *Note Connection and Authentication::. This document describes the CVS protocol. Unfortunately, it does not yet completely document one aspect of the protocol--the detailed operation of each CVS command and option--and one must look at the CVS user documentation, `cvs.texinfo', for that information. The protocol is non-proprietary (anyone who wants to is encouraged to implement it) and an implementation, known as CVS, is available under the GNU Public License. The CVS distribution, containing this implementation, `cvs.texinfo', and a copy (possibly more or less up to date than what you are reading now) of this document, `cvsclient.texi', can be found at the usual GNU FTP sites, with a filename such as `cvs-VERSION.tar.gz'. This is version 1.10.7 of the protocol specification. This version number is intended only to aid in distinguishing different versions of this specification. Although the specification is currently maintained in conjunction with the CVS implementation, and carries the same version number, it also intends to document what is involved with interoperating with other implementations (such as other versions of CVS); see *Note Requirements::. This version number should not be used by clients or servers to determine what variant of the protocol to speak; they should instead use the `valid-requests' and `Valid-responses' mechanism (*note Protocol::.), which is more flexible.  File: cvsclient.info, Node: Goals, Next: Connection and Authentication, Prev: Introduction, Up: Top Goals ***** * Do not assume any access to the repository other than via this protocol. It does not depend on NFS, rdist, etc. * Providing a reliable transport is outside this protocol. The protocol expects a reliable transport that is transparent (that is, there is no translation of characters, including characters such as such as linefeeds or carriage returns), and can transmit all 256 octets (for example for proper handling of binary files, compression, and encryption). The encoding of characters specified by the protocol (the names of requests and so on) is the invariant ISO 646 character set (a subset of most popular character sets including ASCII and others). For more details on running the protocol over the TCP reliable transport, see *Note Connection and Authentication::. * Security and authentication are handled outside this protocol (but see below about `cvs kserver' and `cvs pserver'). * The protocol makes it possible for updates to be atomic with respect to checkins; that is if someone commits changes to several files in one cvs command, then an update by someone else would either get all the changes, or none of them. The current CVS server can't do this, but that isn't the protocol's fault. * The protocol is, with a few exceptions, transaction-based. That is, the client sends all its requests (without waiting for server responses), and then waits for the server to send back all responses (without waiting for further client requests). This has the advantage of minimizing network turnarounds and the disadvantage of sometimes transferring more data than would be necessary if there were a richer interaction. Another, more subtle, advantage is that there is no need for the protocol to provide locking for features such as making checkins atomic with respect to updates. Any such locking can be handled entirely by the server. A good server implementation (such as the current CVS server) will make sure that it does not have any such locks in place whenever it is waiting for communication with the client; this prevents one client on a slow or flaky network from interfering with the work of others. * It is a general design goal to provide only one way to do a given operation (where possible). For example, implementations have no choice about whether to terminate lines with linefeeds or some other character(s), and request and response names are case-sensitive. This is to enhance interoperability. If a protocol allows more than one way to do something, it is all too easy for some implementations to support only some of them (perhaps accidentally).  File: cvsclient.info, Node: Connection and Authentication, Next: Password scrambling, Prev: Goals, Up: Top How to Connect to and Authenticate Oneself to the CVS server ************************************************************ Connection and authentication occurs before the CVS protocol itself is started. There are several ways to connect. server If the client has a way to execute commands on the server, and provide input to the commands and output from them, then it can connect that way. This could be the usual rsh (port 514) protocol, Kerberos rsh, SSH, or any similar mechanism. The client may allow the user to specify the name of the server program; the default is `cvs'. It is invoked with one argument, `server'. Once it invokes the server, the client proceeds to start the cvs protocol. kserver The kerberized server listens on a port (in the current implementation, by having inetd call "cvs kserver") which defaults to 1999. The client connects, sends the usual kerberos authentication information, and then starts the cvs protocol. Note: port 1999 is officially registered for another use, and in any event one cannot register more than one port for CVS, so GSS-API (see below) is recommended instead of kserver as a way to support kerberos. pserver The name "pserver" is somewhat confusing. It refers to both a generic framework which allows the CVS protocol to support several authentication mechanisms, and a name for a specific mechanism which transfers a username and a cleartext password. Servers need not support all mechanisms, and in fact servers will typically want to support only those mechanisms which meet the relevant security needs. The pserver server listens on a port (in the current implementation, by having inetd call "cvs pserver") which defaults to 2401 (this port is officially registered). The client connects, and sends the following: * the string `BEGIN AUTH REQUEST', a linefeed, * the cvs root, a linefeed, * the username, a linefeed, * the password trivially encoded (see *Note Password scrambling::), a linefeed, * the string `END AUTH REQUEST', and a linefeed. The client must send the identical string for cvs root both here and later in the `Root' request of the cvs protocol itself. Servers are encouraged to enforce this restriction. The possible server responses (each of which is followed by a linefeed) are the following. Note that although there is a small similarity between this authentication protocol and the cvs protocol, they are separate. `I LOVE YOU' The authentication is successful. The client proceeds with the cvs protocol itself. `I HATE YOU' The authentication fails. After sending this response, the server may close the connection. It is up to the server to decide whether to give this response, which is generic, or a more specific response using `E' and/or `error'. `E TEXT' Provide a message for the user. After this reponse, the authentication protocol continues with another response. Typically the server will provide a series of `E' responses followed by `error'. Compatibility note: CVS 1.9.10 and older clients will print `unrecognized auth response' and TEXT, and then exit, upon receiving this response. `error CODE TEXT' The authentication fails. After sending this response, the server may close the connection. The CODE is a code describing why it failed, intended for computer consumption. The only code currently defined is `0' which is nonspecific, but clients must silently treat any unrecognized codes as nonspecific. The TEXT should be supplied to the user. Compatibility note: CVS 1.9.10 and older clients will print `unrecognized auth response' and TEXT, and then exit, upon receiving this response. Note that TEXT for this response, or the TEXT in an `E' response, is not designed for machine parsing. More vigorous use of CODE, or future extensions, will be needed to prove a cleaner machine-parseable indication of what the error was. If the client wishes to merely authenticate without starting the cvs protocol, the procedure is the same, except BEGIN AUTH REQUEST is replaced with BEGIN VERIFICATION REQUEST, END AUTH REQUEST is replaced with END VERIFICATION REQUEST, and upon receipt of I LOVE YOU the connection is closed rather than continuing. Another mechanism is GSSAPI authentication. GSSAPI is a generic interface to security services such as kerberos. GSSAPI is specified in RFC2078 (GSSAPI version 2) and RFC1508 (GSSAPI version 1); we are not aware of differences between the two which affect the protocol in incompatible ways, so we make no attempt to specify one version or the other. The procedure here is to start with `BEGIN GSSAPI REQUEST'. GSSAPI authentication information is then exchanged between the client and the server. Each packet of information consists of a two byte big endian length, followed by that many bytes of data. After the GSSAPI authentication is complete, the server continues with the responses described above (`I LOVE YOU', etc.). future possibilities There are a nearly unlimited number of ways to connect and authenticate. One might want to allow access based on IP address (similar to the usual rsh protocol but with different/no restrictions on ports < 1024), to adopt mechanisms such as Pluggable Authentication Modules (PAM), to allow users to run their own servers under their own usernames without root access, or any number of other possibilities. The way to add future mechanisms, for the most part, should be to continue to use port 2401, but to use different strings in place of `BEGIN AUTH REQUEST'.  File: cvsclient.info, Node: Password scrambling, Next: Protocol, Prev: Connection and Authentication, Up: Top Password scrambling algorithm ***************************** The pserver authentication protocol, as described in *Note Connection and Authentication::, trivially encodes the passwords. This is only to prevent inadvertent compromise; it provides no protection against even a relatively unsophisticated attacker. For comparison, HTTP Basic Authentication (as described in RFC2068) uses BASE64 for a similar purpose. CVS uses its own algorithm, described here. The scrambled password starts with `A', which serves to identify the scrambling algorithm in use. After that follows a single octet for each character in the password, according to a fixed encoding. The values are shown here, with the encoded values in decimal. Control characters, space, and characters outside the invariant ISO 646 character set are not shown; such characters are not recommended for use in passwords. There is a long discussion of character set issues in *Note Protocol Notes::. 0 111 P 125 p 58 ! 120 1 52 A 57 Q 55 a 121 q 113 " 53 2 75 B 83 R 54 b 117 r 32 3 119 C 43 S 66 c 104 s 90 4 49 D 46 T 124 d 101 t 44 % 109 5 34 E 102 U 126 e 100 u 98 & 72 6 82 F 40 V 59 f 69 v 60 ' 108 7 81 G 89 W 47 g 73 w 51 ( 70 8 95 H 38 X 92 h 99 x 33 ) 64 9 65 I 103 Y 71 i 63 y 97 * 76 : 112 J 45 Z 115 j 94 z 62 + 67 ; 86 K 50 k 93 , 116 < 118 L 42 l 39 - 74 = 110 M 123 m 37 . 68 > 122 N 91 n 61 / 87 ? 105 O 35 _ 56 o 48  File: cvsclient.info, Node: Protocol, Next: Protocol Notes, Prev: Password scrambling, Up: Top The CVS client/server protocol ****************************** In the following, `\n' refers to a linefeed and `\t' refers to a horizontal tab; "requests" are what the client sends and "responses" are what the server sends. In general, the connection is governed by the client--the server does not send responses without first receiving requests to do so; see *Note Response intro:: for more details of this convention. It is typical, early in the connection, for the client to transmit a `Valid-responses' request, containing all the responses it supports, followed by a `valid-requests' request, which elicits from the server a `Valid-requests' response containing all the requests it understands. In this way, the client and server each find out what the other supports before exchanging large amounts of data (such as file contents). * Menu: General protocol conventions: * Entries Lines:: Transmitting RCS data * File Modes:: Read, write, execute, and possibly more... * Filenames:: Conventions regarding filenames * File transmissions:: How file contents are transmitted * Strings:: Strings in various requests and responses * Dates:: Times and dates The protocol itself: * Request intro:: General conventions relating to requests * Requests:: List of requests * Response intro:: General conventions relating to responses * Response pathnames:: The "pathname" in responses * Responses:: List of responses * Text tags:: More details about the MT response An example session, and some further observations: * Example:: A conversation between client and server * Requirements:: Things not to omit from an implementation * Obsolete:: Former protocol features  File: cvsclient.info, Node: Entries Lines, Next: File Modes, Up: Protocol Entries Lines ============= Entries lines are transmitted as: / NAME / VERSION / CONFLICT / OPTIONS / TAG_OR_DATE TAG_OR_DATE is either `T' TAG or `D' DATE or empty. If it is followed by a slash, anything after the slash shall be silently ignored. VERSION can be empty, or start with `0' or `-', for no user file, new user file, or user file to be removed, respectively. CONFLICT, if it starts with `+', indicates that the file had conflicts in it. The rest of CONFLICT is `=' if the timestamp matches the file, or anything else if it doesn't. If CONFLICT does not start with a `+', it is silently ignored. OPTIONS signifies the keyword expansion options (for example `-ko'). In an `Entry' request, this indicates the options that were specified with the file from the previous file updating response (*note Response intro::., for a list of file updating responses); if the client is specifying the `-k' or `-A' option to `update', then it is the server which figures out what overrides what.  File: cvsclient.info, Node: File Modes, Next: Filenames, Prev: Entries Lines, Up: Protocol File Modes ========== A mode is any number of repetitions of MODE-TYPE = DATA separated by `,'. MODE-TYPE is an identifier composed of alphanumeric characters. Currently specified: `u' for user, `g' for group, `o' for other (see below for discussion of whether these have their POSIX meaning or are more loose). Unrecognized values of MODE-TYPE are silently ignored. DATA consists of any data not containing `,', `\0' or `\n'. For `u', `g', and `o' mode types, data consists of alphanumeric characters, where `r' means read, `w' means write, `x' means execute, and unrecognized letters are silently ignored. The two most obvious ways in which the mode matters are: (1) is it writeable? This is used by the developer communication features, and is implemented even on OS/2 (and could be implemented on DOS), whose notion of mode is limited to a readonly bit. (2) is it executable? Unix CVS users need CVS to store this setting (for shell scripts and the like). The current CVS implementation on unix does a little bit more than just maintain these two settings, but it doesn't really have a nice general facility to store or version control the mode, even on unix, much less across operating systems with diverse protection features. So all the ins and outs of what the mode means across operating systems haven't really been worked out (e.g. should the VMS port use ACLs to get POSIX semantics for groups?).  File: cvsclient.info, Node: Filenames, Next: File transmissions, Prev: File Modes, Up: Protocol Conventions regarding transmission of file names ================================================ In most contexts, `/' is used to separate directory and file names in filenames, and any use of other conventions (for example, that the user might type on the command line) is converted to that form. The only exceptions might be a few cases in which the server provides a magic cookie which the client then repeats verbatim, but as the server has not yet been ported beyond unix, the two rules provide the same answer (and what to do if future server ports are operating on a repository like e:/foo or CVS_ROOT:[FOO.BAR] has not been carefully thought out). Characters outside the invariant ISO 646 character set should be avoided in filenames. This restriction may need to be relaxed to allow for characters such as `[' and `]' (see above about non-unix servers); this has not been carefully considered (and currently implementations probably use whatever character sets that the operating systems they are running on allow, and/or that users specify). Of course the most portable practice is to restrict oneself further, to the POSIX portable filename character set as specified in POSIX.1.  File: cvsclient.info, Node: File transmissions, Next: Strings, Prev: Filenames, Up: Protocol File transmissions ================== File contents (noted below as FILE TRANSMISSION) can be sent in one of two forms. The simpler form is a number of bytes, followed by a linefeed, followed by the specified number of bytes of file contents. These are the entire contents of the specified file. Second, if both client and server support `gzip-file-contents', a `z' may precede the length, and the `file contents' sent are actually compressed with `gzip' (RFC1952/1951) compression. The length specified is that of the compressed version of the file. In neither case are the file content followed by any additional data. The transmission of a file will end with a linefeed iff that file (or its compressed form) ends with a linefeed. The encoding of file contents depends on the value for the `-k' option. If the file is binary (as specified by the `-kb' option in the appropriate place), then it is just a certain number of octets, and the protocol contributes nothing towards determining the encoding (using the file name is one widespread, if not universally popular, mechanism). If the file is text (not binary), then the file is sent as a series of lines, separated by linefeeds. If the keyword expansion is set to something other than `-ko', then it is expected that the file conform to the RCS expectations regarding keyword expansion--in particular, that it is in a character set such as ASCII in which 0x24 is a dollar sign (`$').  File: cvsclient.info, Node: Strings, Next: Dates, Prev: File transmissions, Up: Protocol Strings ======= In various contexts, for example the `Argument' request and the `M' response, one transmits what is essentially an arbitrary string. Often this will have been supplied by the user (for example, the `-m' option to the `ci' request). The protocol has no mechanism to specify the character set of such strings; it would be fairly safe to stick to the invariant ISO 646 character set but the existing practice is probably to just transmit whatever the user specifies, and hope that everyone involved agrees which character set is in use, or sticks to a common subset.  File: cvsclient.info, Node: Dates, Next: Request intro, Prev: Strings, Up: Protocol Dates ===== The protocol contains times and dates in various places. For the `-D' option to the `annotate', `co', `diff', `export', `history', `rdiff', `rtag', `tag', and `update' requests, the server should support two formats: 26 May 1997 13:01:40 -0000 ; RFC 822 as modified by RFC 1123 5/26/1997 13:01:40 GMT ; traditional The former format is preferred; the latter however is sent by the CVS command line client (versions 1.5 through at least 1.9). For the `-d' option to the `log' request, servers should at least support RFC 822/1123 format. Clients are encouraged to use this format too (traditionally the command line CVS client has just passed along the date format specified by the user, however). The `Mod-time' response and `Checkin-time' request use RFC 822/1123 format (see the descriptions of that response and request for details). For `Notify', see the description of that request.  File: cvsclient.info, Node: Request intro, Next: Requests, Prev: Dates, Up: Protocol Request intro ============= By convention, requests which begin with a capital letter do not elicit a response from the server, while all others do - save one. The exception is `gzip-file-contents'. Unrecognized requests will always elicit a response from the server, even if that request begins with a capital letter. The term "command" means a request which expects a response (except `valid-requests'). The general model is that the client transmits a great number of requests, but nothing happens until the very end when the client transmits a command. Although the intention is that transmitting several commands in one connection should be legal, existing servers probably have some bugs with some combinations of more than one command, and so clients may find it necessary to make several connections in some cases. This should be thought of as a workaround rather than a desired attribute of the protocol.