author | Bob Beck <beck@cvs.openbsd.org> | 1999-09-29 06:30:11 +0000 |
---|---|---|
committer | Bob Beck <beck@cvs.openbsd.org> | 1999-09-29 06:30:11 +0000 |
commit | d7a28c8e58fea890c759cc33cd38ab83a7c526c6 (patch) | |
tree | f0f30a4771b74f546171ab069514b642ac12a521 /usr.sbin/httpd/htdocs/manual/misc | |
parent | 0ec93a585fb52894b76953291e90f5b41f3b543e (diff) |
Apache 1.3.9 + Mod_ssl 2.4.2 - now builds with apaci nastiness.
Diffstat (limited to 'usr.sbin/httpd/htdocs/manual/misc')
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/API.html | 962 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/FAQ.html | 2798 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/howto.html | 7 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/known_client_problems.html | 292 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/perf-dec.html | 31 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/perf-tuning.html | 12 |
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/misc/security_tips.html | 120 |
7 files changed, 2499 insertions, 1723 deletions
diff --git a/usr.sbin/httpd/htdocs/manual/misc/API.html b/usr.sbin/httpd/htdocs/manual/misc/API.html index 1ad15723ed3..8b46bcd390c 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/API.html +++ b/usr.sbin/httpd/htdocs/manual/misc/API.html @@ -1,7 +1,7 @@ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> -<html><head> -<title>Apache API notes</title> -</head> +<HTML><HEAD> +<TITLE>Apache API notes</TITLE> +</HEAD> <!-- Background white, links blue (unvisited), navy (visited), red (active) --> <BODY BGCOLOR="#FFFFFF" @@ -13,20 +13,20 @@ <DIV ALIGN="CENTER"> <IMG SRC="../images/sub.gif" ALT="[APACHE DOCUMENTATION]"> <H3> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> </DIV> -<h1 ALIGN="CENTER">Apache API notes</h1> +<H1 ALIGN="CENTER">Apache API notes</H1> These are some notes on the Apache API and the data structures you -have to deal with, etc. They are not yet nearly complete, but +have to deal with, <EM>etc.</EM> They are not yet nearly complete, but hopefully, they will help you get your bearings. Keep in mind that the API is still subject to change as we gain experience with it. -(See the TODO file for what <em>might</em> be coming). However, +(See the TODO file for what <EM>might</EM> be coming). However, it will be easy to adapt modules to any changes that are made. (We have more modules to adapt than you do). -<p> +<P> A few notes on general pedagogical style here. In the interest of conciseness, all structure declarations here are incomplete --- the @@ -34,77 +34,80 @@ real ones have more slots that I'm not telling you about. For the most part, these are reserved to one component of the server core or another, and should be altered by modules with caution. However, in some cases, they really are things I just haven't gotten around to -yet. Welcome to the bleeding edge.<p> +yet. 
Welcome to the bleeding edge.<P> Finally, here's an outline, to give you some bare idea of what's coming up, and in what order: -<ul> -<li> <a href="#basics">Basic concepts.</a> -<menu> - <li> <a href="#HMR">Handlers, Modules, and Requests</a> - <li> <a href="#moduletour">A brief tour of a module</a> -</menu> -<li> <a href="#handlers">How handlers work</a> -<menu> - <li> <a href="#req_tour">A brief tour of the <code>request_rec</code></a> - <li> <a href="#req_orig">Where request_rec structures come from</a> - <li> <a href="#req_return">Handling requests, declining, and returning error codes</a> - <li> <a href="#resp_handlers">Special considerations for response handlers</a> - <li> <a href="#auth_handlers">Special considerations for authentication handlers</a> - <li> <a href="#log_handlers">Special considerations for logging handlers</a> -</menu> -<li> <a href="#pools">Resource allocation and resource pools</a> -<li> <a href="#config">Configuration, commands and the like</a> -<menu> - <li> <a href="#per-dir">Per-directory configuration structures</a> - <li> <a href="#commands">Command handling</a> - <li> <a href="#servconf">Side notes --- per-server configuration, virtual servers, etc.</a> -</menu> -</ul> - -<h2><a name="basics">Basic concepts.</a></h2> - -We begin with an overview of the basic concepts behind the +<UL> +<LI> <A HREF="#basics">Basic concepts.</A> +<MENU> + <LI> <A HREF="#HMR">Handlers, Modules, and Requests</A> + <LI> <A HREF="#moduletour">A brief tour of a module</A> +</MENU> +<LI> <A HREF="#handlers">How handlers work</A> +<MENU> + <LI> <A HREF="#req_tour">A brief tour of the <CODE>request_rec</CODE></A> + <LI> <A HREF="#req_orig">Where request_rec structures come from</A> + <LI> <A HREF="#req_return">Handling requests, declining, and returning error + codes</A> + <LI> <A HREF="#resp_handlers">Special considerations for response handlers</A> + <LI> <A HREF="#auth_handlers">Special considerations for authentication + handlers</A> + <LI> <A 
HREF="#log_handlers">Special considerations for logging handlers</A> +</MENU> +<LI> <A HREF="#pools">Resource allocation and resource pools</A> +<LI> <A HREF="#config">Configuration, commands and the like</A> +<MENU> + <LI> <A HREF="#per-dir">Per-directory configuration structures</A> + <LI> <A HREF="#commands">Command handling</A> + <LI> <A HREF="#servconf">Side notes --- per-server configuration, + virtual servers, <EM>etc</EM>.</A> +</MENU> +</UL> + +<H2><A NAME="basics">Basic concepts.</A></H2> + +We begin with an overview of the basic concepts behind the API, and how they are manifested in the code. -<h3><a name="HMR">Handlers, Modules, and Requests</a></h3> +<H3><A NAME="HMR">Handlers, Modules, and Requests</A></H3> Apache breaks down request handling into a series of steps, more or less the same way the Netscape server API does (although this API has a few more stages than NetSite does, as hooks for stuff I thought might be useful in the future). These are: -<ul> - <li> URI -> Filename translation - <li> Auth ID checking [is the user who they say they are?] - <li> Auth access checking [is the user authorized <em>here</em>?] - <li> Access checking other than auth - <li> Determining MIME type of the object requested - <li> `Fixups' --- there aren't any of these yet, but the phase is +<UL> + <LI> URI -> Filename translation + <LI> Auth ID checking [is the user who they say they are?] + <LI> Auth access checking [is the user authorized <EM>here</EM>?] + <LI> Access checking other than auth + <LI> Determining MIME type of the object requested + <LI> `Fixups' --- there aren't any of these yet, but the phase is intended as a hook for possible extensions like - <code>SetEnv</code>, which don't really fit well elsewhere. - <li> Actually sending a response back to the client. - <li> Logging the request -</ul> + <CODE>SetEnv</CODE>, which don't really fit well elsewhere. + <LI> Actually sending a response back to the client. 
+ <LI> Logging the request +</UL> These phases are handled by looking at each of a succession of -<em>modules</em>, looking to see if each of them has a handler for the +<EM>modules</EM>, looking to see if each of them has a handler for the phase, and attempting invoking it if so. The handler can typically do one of three things: -<ul> - <li> <em>Handle</em> the request, and indicate that it has done so - by returning the magic constant <code>OK</code>. - <li> <em>Decline</em> to handle the request, by returning the magic - integer constant <code>DECLINED</code>. In this case, the +<UL> + <LI> <EM>Handle</EM> the request, and indicate that it has done so + by returning the magic constant <CODE>OK</CODE>. + <LI> <EM>Decline</EM> to handle the request, by returning the magic + integer constant <CODE>DECLINED</CODE>. In this case, the server behaves in all respects as if the handler simply hadn't been there. - <li> Signal an error, by returning one of the HTTP error codes. + <LI> Signal an error, by returning one of the HTTP error codes. This terminates normal handling of the request, although an ErrorDocument may be invoked to try to mop up, and it will be logged in any case. -</ul> +</UL> Most phases are terminated by the first module that handles them; however, for logging, `fixups', and non-access authentication @@ -112,62 +115,62 @@ checking, all handlers always run (barring an error). Also, the response phase is unique in that modules may declare multiple handlers for it, via a dispatch table keyed on the MIME type of the requested object. Modules may declare a response-phase handler which can handle -<em>any</em> request, by giving it the key <code>*/*</code> (i.e., a +<EM>any</EM> request, by giving it the key <CODE>*/*</CODE> (<EM>i.e.</EM>, a wildcard MIME type specification). 
However, wildcard handlers are only invoked if the server has already tried and failed to find a more specific response handler for the MIME type of the requested object -(either none existed, or they all declined).<p> +(either none existed, or they all declined).<P> The handlers themselves are functions of one argument (a -<code>request_rec</code> structure. vide infra), which returns an -integer, as above.<p> +<CODE>request_rec</CODE> structure. vide infra), which returns an +integer, as above.<P> -<h3><a name="moduletour">A brief tour of a module</a></h3> +<H3><A NAME="moduletour">A brief tour of a module</A></H3> At this point, we need to explain the structure of a module. Our candidate will be one of the messier ones, the CGI module --- this -handles both CGI scripts and the <code>ScriptAlias</code> config file +handles both CGI scripts and the <CODE>ScriptAlias</CODE> config file command. It's actually a great deal more complicated than most modules, but if we're going to have only one example, it might as well -be the one with its fingers in every place.<p> +be the one with its fingers in every place.<P> Let's begin with handlers. In order to handle the CGI scripts, the module declares a response handler for them. 
Because of -<code>ScriptAlias</code>, it also has handlers for the name -translation phase (to recognize <code>ScriptAlias</code>ed URIs), the -type-checking phase (any <code>ScriptAlias</code>ed request is typed -as a CGI script).<p> +<CODE>ScriptAlias</CODE>, it also has handlers for the name +translation phase (to recognize <CODE>ScriptAlias</CODE>ed URIs), the +type-checking phase (any <CODE>ScriptAlias</CODE>ed request is typed +as a CGI script).<P> The module needs to maintain some per (virtual) -server information, namely, the <code>ScriptAlias</code>es in effect; +server information, namely, the <CODE>ScriptAlias</CODE>es in effect; the module structure therefore contains pointers to a functions which builds these structures, and to another which combines two of them (in case the main server and a virtual server both have -<code>ScriptAlias</code>es declared).<p> +<CODE>ScriptAlias</CODE>es declared).<P> Finally, this module contains code to handle the -<code>ScriptAlias</code> command itself. This particular module only +<CODE>ScriptAlias</CODE> command itself. This particular module only declares one command, but there could be more, so modules have -<em>command tables</em> which declare their commands, and describe -where they are permitted, and how they are to be invoked. <p> +<EM>command tables</EM> which declare their commands, and describe +where they are permitted, and how they are to be invoked. <P> A final note on the declared types of the arguments of some of these -commands: a <code>pool</code> is a pointer to a <em>resource pool</em> +commands: a <CODE>pool</CODE> is a pointer to a <EM>resource pool</EM> structure; these are used by the server to keep track of the memory -which has been allocated, files opened, etc., either to service a +which has been allocated, files opened, <EM>etc.</EM>, either to service a particular request, or to handle the process of configuring itself. 
That way, when the request is over (or, for the configuration pool, when the server is restarting), the memory can be freed, and the files -closed, <i>en masse</i>, without anyone having to write explicit code to +closed, <EM>en masse</EM>, without anyone having to write explicit code to track them all down and dispose of them. Also, a -<code>cmd_parms</code> structure contains various information about +<CODE>cmd_parms</CODE> structure contains various information about the config file being read, and other status information, which is sometimes of use to the function which processes a config-file command -(such as <code>ScriptAlias</code>). +(such as <CODE>ScriptAlias</CODE>). With no further ado, the module itself: - -<pre> + +<PRE> /* Declarations of handlers. */ int translate_scriptalias (request_rec *); @@ -219,37 +222,37 @@ module cgi_module = { NULL, /* logger */ NULL /* header parser */ }; -</pre> +</PRE> -<h2><a name="handlers">How handlers work</a></h2> +<H2><A NAME="handlers">How handlers work</A></H2> -The sole argument to handlers is a <code>request_rec</code> structure. +The sole argument to handlers is a <CODE>request_rec</CODE> structure. This structure describes a particular request which has been made to the server, on behalf of a client. 
In most cases, each connection to -the client generates only one <code>request_rec</code> structure.<p> +the client generates only one <CODE>request_rec</CODE> structure.<P> -<h3><a name="req_tour">A brief tour of the <code>request_rec</code></a></h3> +<H3><A NAME="req_tour">A brief tour of the <CODE>request_rec</CODE></A></H3> -The <code>request_rec</code> contains pointers to a resource pool +The <CODE>request_rec</CODE> contains pointers to a resource pool which will be cleared when the server is finished handling the request; to structures containing per-server and per-connection -information, and most importantly, information on the request itself.<p> +information, and most importantly, information on the request itself.<P> The most important such information is a small set of character strings describing attributes of the object being requested, including its URI, filename, content-type and content-encoding (these being filled in by the translation and type-check handlers which handle the -request, respectively). <p> +request, respectively). <P> Other commonly used data items are tables giving the MIME headers on the client's original request, MIME headers to be sent back with the response (which modules can add to at will), and environment variables for any subprocesses which are spawned off in the course of servicing the request. These tables are manipulated using the -<code>table_get</code> and <code>table_set</code> routines. <p> +<CODE>ap_table_get</CODE> and <CODE>ap_table_set</CODE> routines. <P> <BLOCKQUOTE> Note that the <SAMP>Content-type</SAMP> header value <EM>cannot</EM> be - set by module content-handlers using the <SAMP>table_*()</SAMP> + set by module content-handlers using the <SAMP>ap_table_*()</SAMP> routines. Rather, it is set by pointing the <SAMP>content_type</SAMP> field in the <SAMP>request_rec</SAMP> structure to an appropriate string. 
<EM>E.g.</EM>, @@ -261,17 +264,17 @@ Finally, there are pointers to two data structures which, in turn, point to per-module configuration structures. Specifically, these hold pointers to the data structures which the module has built to describe the way it has been configured to operate in a given -directory (via <code>.htaccess</code> files or -<code><Directory></code> sections), for private data it has +directory (via <CODE>.htaccess</CODE> files or +<CODE><Directory></CODE> sections), for private data it has built in the course of servicing the request (so modules' handlers for one phase can pass `notes' to their handlers for other phases). There -is another such configuration vector in the <code>server_rec</code> -data structure pointed to by the <code>request_rec</code>, which -contains per (virtual) server configuration data.<p> +is another such configuration vector in the <CODE>server_rec</CODE> +data structure pointed to by the <CODE>request_rec</CODE>, which +contains per (virtual) server configuration data.<P> -Here is an abridged declaration, giving the fields most commonly used:<p> +Here is an abridged declaration, giving the fields most commonly used:<P> -<pre> +<PRE> struct request_rec { pool *pool; @@ -279,17 +282,17 @@ struct request_rec { server_rec *server; /* What object is being requested */ - + char *uri; char *filename; char *path_info; char *args; /* QUERY_ARGS, if any */ struct stat finfo; /* Set by server core; * st_mode set to zero if no such file */ - + char *content_type; char *content_encoding; - + /* MIME header environments, in and out. Also, an array containing * environment variables to be passed to subprocesses, so people can * write modules to add to that environment. @@ -299,18 +302,18 @@ struct request_rec { * redirects (so the headers printed for ErrorDocument handlers will * have them). */ - + table *headers_in; table *headers_out; table *err_headers_out; table *subprocess_env; /* Info about the request itself... 
*/ - + int header_only; /* HEAD request, as opposed to GET */ char *protocol; /* Protocol, as given to us, or HTTP/0.9 */ - char *method; /* GET, HEAD, POST, etc. */ - int method_number; /* M_GET, M_POST, etc. */ + char *method; /* GET, HEAD, POST, <EM>etc.</EM> */ + int method_number; /* M_GET, M_POST, <EM>etc.</EM> */ /* Info for logging */ @@ -327,119 +330,122 @@ struct request_rec { * These are config vectors, with one void* pointer for each module * (the thing pointed to being the module's business). */ - - void *per_dir_config; /* Options set in config files, etc. */ + + void *per_dir_config; /* Options set in config files, <EM>etc.</EM> */ void *request_config; /* Notes on *this* request */ - + }; -</pre> +</PRE> -<h3><a name="req_orig">Where request_rec structures come from</a></h3> +<H3><A NAME="req_orig">Where request_rec structures come from</A></H3> -Most <code>request_rec</code> structures are built by reading an HTTP +Most <CODE>request_rec</CODE> structures are built by reading an HTTP request from a client, and filling in the fields. However, there are a few exceptions: -<ul> - <li> If the request is to an imagemap, a type map (i.e., a - <code>*.var</code> file), or a CGI script which returned a +<UL> + <LI> If the request is to an imagemap, a type map (<EM>i.e.</EM>, a + <CODE>*.var</CODE> file), or a CGI script which returned a local `Location:', then the resource which the user requested is going to be ultimately located by some URI other than what the client originally supplied. In this case, the server does - an <em>internal redirect</em>, constructing a new - <code>request_rec</code> for the new URI, and processing it + an <EM>internal redirect</EM>, constructing a new + <CODE>request_rec</CODE> for the new URI, and processing it almost exactly as if the client had requested the new URI - directly. <p> + directly. 
<P> - <li> If some handler signaled an error, and an - <code>ErrorDocument</code> is in scope, the same internal - redirect machinery comes into play.<p> + <LI> If some handler signaled an error, and an + <CODE>ErrorDocument</CODE> is in scope, the same internal + redirect machinery comes into play.<P> - <li> Finally, a handler occasionally needs to investigate `what + <LI> Finally, a handler occasionally needs to investigate `what would happen if' some other request were run. For instance, the directory indexing module needs to know what MIME type would be assigned to a request for each directory entry, in - order to figure out what icon to use.<p> + order to figure out what icon to use.<P> - Such handlers can construct a <em>sub-request</em>, using the - functions <code>sub_req_lookup_file</code> and - <code>sub_req_lookup_uri</code>; this constructs a new - <code>request_rec</code> structure and processes it as you + Such handlers can construct a <EM>sub-request</EM>, using the + functions <CODE>ap_sub_req_lookup_file</CODE>, + <CODE>ap_sub_req_lookup_uri</CODE>, and + <CODE>ap_sub_req_method_uri</CODE>; these construct a new + <CODE>request_rec</CODE> structure and processes it as you would expect, up to but not including the point of actually sending a response. (These functions skip over the access checks if the sub-request is for a file in the same directory - as the original request).<p> + as the original request).<P> (Server-side includes work by building sub-requests and then actually invoking the response handler for them, via the - function <code>run_sub_request</code>). -</ul> + function <CODE>ap_run_sub_req</CODE>). 
+</UL> -<h3><a name="req_return">Handling requests, declining, and returning error codes</a></h3> +<H3><A NAME="req_return">Handling requests, declining, and returning error + codes</A></H3> As discussed above, each handler, when invoked to handle a particular -<code>request_rec</code>, has to return an <code>int</code> to +<CODE>request_rec</CODE>, has to return an <CODE>int</CODE> to indicate what happened. That can either be -<ul> - <li> OK --- the request was handled successfully. This may or may +<UL> + <LI> OK --- the request was handled successfully. This may or may not terminate the phase. - <li> DECLINED --- no erroneous condition exists, but the module + <LI> DECLINED --- no erroneous condition exists, but the module declines to handle the phase; the server tries to find another. - <li> an HTTP error code, which aborts handling of the request. -</ul> + <LI> an HTTP error code, which aborts handling of the request. +</UL> -Note that if the error code returned is <code>REDIRECT</code>, then -the module should put a <code>Location</code> in the request's -<code>headers_out</code>, to indicate where the client should be -redirected <em>to</em>. <p> +Note that if the error code returned is <CODE>REDIRECT</CODE>, then +the module should put a <CODE>Location</CODE> in the request's +<CODE>headers_out</CODE>, to indicate where the client should be +redirected <EM>to</EM>. <P> -<h3><a name="resp_handlers">Special considerations for response handlers</a></h3> +<H3><A NAME="resp_handlers">Special considerations for response + handlers</A></H3> Handlers for most phases do their work by simply setting a few fields -in the <code>request_rec</code> structure (or, in the case of access +in the <CODE>request_rec</CODE> structure (or, in the case of access checkers, simply by returning the correct error code). However, -response handlers have to actually send a request back to the client. <p> +response handlers have to actually send a request back to the client. 
<P> They should begin by sending an HTTP response header, using the -function <code>send_http_header</code>. (You don't have to do +function <CODE>ap_send_http_header</CODE>. (You don't have to do anything special to skip sending the header for HTTP/0.9 requests; the function figures out on its own that it shouldn't do anything). If -the request is marked <code>header_only</code>, that's all they should +the request is marked <CODE>header_only</CODE>, that's all they should do; they should return after that, without attempting any further -output. <p> +output. <P> Otherwise, they should produce a request body which responds to the -client as appropriate. The primitives for this are <code>rputc</code> -and <code>rprintf</code>, for internally generated output, and -<code>send_fd</code>, to copy the contents of some <code>FILE *</code> -straight to the client. <p> +client as appropriate. The primitives for this are <CODE>ap_rputc</CODE> +and <CODE>ap_rprintf</CODE>, for internally generated output, and +<CODE>ap_send_fd</CODE>, to copy the contents of some <CODE>FILE *</CODE> +straight to the client. <P> At this point, you should more or less understand the following piece -of code, which is the handler which handles <code>GET</code> requests +of code, which is the handler which handles <CODE>GET</CODE> requests which have no more specific handler; it also shows how conditional -<code>GET</code>s can be handled, if it's desirable to do so in a -particular response handler --- <code>set_last_modified</code> checks -against the <code>If-modified-since</code> value supplied by the +<CODE>GET</CODE>s can be handled, if it's desirable to do so in a +particular response handler --- <CODE>ap_set_last_modified</CODE> checks +against the <CODE>If-modified-since</CODE> value supplied by the client, if any, and returns an appropriate code (which will, if nonzero, be USE_LOCAL_COPY). 
No similar considerations apply for -<code>set_content_length</code>, but it returns an error code for -symmetry.<p> +<CODE>ap_set_content_length</CODE>, but it returns an error code for +symmetry.<P> -<pre> +<PRE> int default_handler (request_rec *r) { int errstatus; FILE *f; - + if (r->method_number != M_GET) return DECLINED; if (r->finfo.st_mode == 0) return NOT_FOUND; - if ((errstatus = set_content_length (r, r->finfo.st_size)) - || (errstatus = set_last_modified (r, r->finfo.st_mtime))) + if ((errstatus = ap_set_content_length (r, r->finfo.st_size)) + || (errstatus = ap_set_last_modified (r, r->finfo.st_mtime))) return errstatus; - + f = fopen (r->filename, "r"); if (f == NULL) { @@ -447,252 +453,394 @@ int default_handler (request_rec *r) r->filename, r); return FORBIDDEN; } - + register_timeout ("send", r); - send_http_header (r); + ap_send_http_header (r); if (!r->header_only) send_fd (f, r); - pfclose (r->pool, f); + ap_pfclose (r->pool, f); return OK; } -</pre> +</PRE> Finally, if all of this is too much of a challenge, there are a few ways out of it. First off, as shown above, a response handler which has not yet produced any output can simply return an error code, in which case the server will automatically produce an error response. Secondly, it can punt to some other handler by invoking -<code>internal_redirect</code>, which is how the internal redirection +<CODE>ap_internal_redirect</CODE>, which is how the internal redirection machinery discussed above is invoked. A response handler which has -internally redirected should always return <code>OK</code>. <p> +internally redirected should always return <CODE>OK</CODE>. <P> -(Invoking <code>internal_redirect</code> from handlers which are -<em>not</em> response handlers will lead to serious confusion). +(Invoking <CODE>ap_internal_redirect</CODE> from handlers which are +<EM>not</EM> response handlers will lead to serious confusion). 
-<h3><a name="auth_handlers">Special considerations for authentication handlers</a></h3> +<H3><A NAME="auth_handlers">Special considerations for authentication + handlers</A></H3> Stuff that should be discussed here in detail: -<ul> - <li> Authentication-phase handlers not invoked unless auth is +<UL> + <LI> Authentication-phase handlers not invoked unless auth is configured for the directory. - <li> Common auth configuration stored in the core per-dir - configuration; it has accessors <code>auth_type</code>, - <code>auth_name</code>, and <code>requires</code>. - <li> Common routines, to handle the protocol end of things, at least - for HTTP basic authentication (<code>get_basic_auth_pw</code>, - which sets the <code>connection->user</code> structure field - automatically, and <code>note_basic_auth_failure</code>, which - arranges for the proper <code>WWW-Authenticate:</code> header + <LI> Common auth configuration stored in the core per-dir + configuration; it has accessors <CODE>ap_auth_type</CODE>, + <CODE>ap_auth_name</CODE>, and <CODE>ap_requires</CODE>. + <LI> Common routines, to handle the protocol end of things, at least + for HTTP basic authentication (<CODE>ap_get_basic_auth_pw</CODE>, + which sets the <CODE>connection->user</CODE> structure field + automatically, and <CODE>ap_note_basic_auth_failure</CODE>, which + arranges for the proper <CODE>WWW-Authenticate:</CODE> header to be sent back). -</ul> +</UL> -<h3><a name="log_handlers">Special considerations for logging handlers</a></h3> +<H3><A NAME="log_handlers">Special considerations for logging handlers</A></H3> When a request has internally redirected, there is the question of what to log. Apache handles this by bundling the entire chain of -redirects into a list of <code>request_rec</code> structures which are -threaded through the <code>r->prev</code> and <code>r->next</code> -pointers. 
The <code>request_rec</code> which is passed to the logging +redirects into a list of <CODE>request_rec</CODE> structures which are +threaded through the <CODE>r->prev</CODE> and <CODE>r->next</CODE> +pointers. The <CODE>request_rec</CODE> which is passed to the logging handlers in such cases is the one which was originally built for the initial request from the client; note that the bytes_sent field will only be correct in the last request in the chain (the one for which a -response was actually sent). - -<h2><a name="pools">Resource allocation and resource pools</a></h2> +response was actually sent). +<H2><A NAME="pools">Resource allocation and resource pools</A></H2> +<P> One of the problems of writing and designing a server-pool server is that of preventing leakage, that is, allocating resources (memory, -open files, etc.), without subsequently releasing them. The resource +open files, <EM>etc.</EM>), without subsequently releasing them. The resource pool machinery is designed to make it easy to prevent this from happening, by allowing resource to be allocated in such a way that -they are <em>automatically</em> released when the server is done with -them. <p> - +they are <EM>automatically</EM> released when the server is done with +them. +</P> +<P> The way this works is as follows: the memory which is allocated, file -opened, etc., to deal with a particular request are tied to a -<em>resource pool</em> which is allocated for the request. The pool -is a data structure which itself tracks the resources in question. <p> - -When the request has been processed, the pool is <em>cleared</em>. At +opened, <EM>etc.</EM>, to deal with a particular request are tied to a +<EM>resource pool</EM> which is allocated for the request. The pool +is a data structure which itself tracks the resources in question. +</P> +<P> +When the request has been processed, the pool is <EM>cleared</EM>. 
At that point, all the memory associated with it is released for reuse, all files associated with it are closed, and any other clean-up functions which are associated with the pool are run. When this is over, we can be confident that all the resource tied to the pool have -been released, and that none of them have leaked. <p> - +been released, and that none of them have leaked. +</P> +<P> Server restarts, and allocation of memory and resources for per-server configuration, are handled in a similar way. There is a -<em>configuration pool</em>, which keeps track of resources which were +<EM>configuration pool</EM>, which keeps track of resources which were allocated while reading the server configuration files, and handling the commands therein (for instance, the memory that was allocated for per-server module configuration, log files and other files that were opened, and so forth). When the server restarts, and has to reread the configuration files, the configuration pool is cleared, and so the memory and file descriptors which were taken up by reading them the -last time are made available for reuse. <p> - +last time are made available for reuse. +</P> +<P> It should be noted that use of the pool machinery isn't generally obligatory, except for situations like logging handlers, where you really need to register cleanups to make sure that the log file gets closed when the server restarts (this is most easily done by using the -function <code><a href="#pool-files">pfopen</a></code>, which also +function <CODE><A HREF="#pool-files">ap_pfopen</A></CODE>, which also arranges for the underlying file descriptor to be closed before any -child processes, such as for CGI scripts, are <code>exec</code>ed), or +child processes, such as for CGI scripts, are <CODE>exec</CODE>ed), or in case you are using the timeout machinery (which isn't yet even documented here). 
However, there are two benefits to using it: resources allocated to a pool never leak (even if you allocate a scratch string, and just forget about it); also, for memory -allocation, <code>palloc</code> is generally faster than -<code>malloc</code>.<p> - +allocation, <CODE>ap_palloc</CODE> is generally faster than +<CODE>malloc</CODE>. +</P> +<P> We begin here by describing how memory is allocated to pools, and then discuss how other resources are tracked by the resource pool machinery. - -<h3>Allocation of memory in pools</h3> - +</P> +<H3>Allocation of memory in pools</H3> +<P> Memory is allocated to pools by calling the function -<code>palloc</code>, which takes two arguments, one being a pointer to +<CODE>ap_palloc</CODE>, which takes two arguments, one being a pointer to a resource pool structure, and the other being the amount of memory to -allocate (in <code>char</code>s). Within handlers for handling +allocate (in <CODE>char</CODE>s). Within handlers for handling requests, the most common way of getting a resource pool structure is -by looking at the <code>pool</code> slot of the relevant -<code>request_rec</code>; hence the repeated appearance of the +by looking at the <CODE>pool</CODE> slot of the relevant +<CODE>request_rec</CODE>; hence the repeated appearance of the following idiom in module code: - -<pre> +</P> +<PRE> int my_handler(request_rec *r) { struct my_structure *foo; ... - foo = (foo *)palloc (r->pool, sizeof(my_structure)); + foo = (foo *)ap_palloc (r->pool, sizeof(my_structure)); } -</pre> - -Note that <em>there is no <code>pfree</code></em> --- -<code>palloc</code>ed memory is freed only when the associated -resource pool is cleared. This means that <code>palloc</code> does not -have to do as much accounting as <code>malloc()</code>; all it does in +</PRE> +<P> +Note that <EM>there is no <CODE>ap_pfree</CODE></EM> --- +<CODE>ap_palloc</CODE>ed memory is freed only when the associated +resource pool is cleared. 
This means that <CODE>ap_palloc</CODE> does not +have to do as much accounting as <CODE>malloc()</CODE>; all it does in the typical case is to round up the size, bump a pointer, and do a -range check.<p> - -(It also raises the possibility that heavy use of <code>palloc</code> +range check. +</P> +<P> +(It also raises the possibility that heavy use of <CODE>ap_palloc</CODE> could cause a server process to grow excessively large. There are two ways to deal with this, which are dealt with below; briefly, you -can use <code>malloc</code>, and try to be sure that all of the memory -gets explicitly <code>free</code>d, or you can allocate a sub-pool of +can use <CODE>malloc</CODE>, and try to be sure that all of the memory +gets explicitly <CODE>free</CODE>d, or you can allocate a sub-pool of the main pool, allocate your memory in the sub-pool, and clear it out periodically. The latter technique is discussed in the section on sub-pools below, and is used in the directory-indexing code, in order to avoid excessive storage allocation when listing directories with thousands of files). - -<h3>Allocating initialized memory</h3> - +</P> +<H3>Allocating initialized memory</H3> +<P> There are functions which allocate initialized memory, and are -frequently useful. The function <code>pcalloc</code> has the same -interface as <code>palloc</code>, but clears out the memory it -allocates before it returns it. The function <code>pstrdup</code> -takes a resource pool and a <code>char *</code> as arguments, and +frequently useful. The function <CODE>ap_pcalloc</CODE> has the same +interface as <CODE>ap_palloc</CODE>, but clears out the memory it +allocates before it returns it. The function <CODE>ap_pstrdup</CODE> +takes a resource pool and a <CODE>char *</CODE> as arguments, and allocates memory for a copy of the string the pointer points to, -returning a pointer to the copy. Finally <code>pstrcat</code> is a +returning a pointer to the copy. 
Finally <CODE>ap_pstrcat</CODE> is a varargs-style function, which takes a pointer to a resource pool, and -at least two <code>char *</code> arguments, the last of which must be -<code>NULL</code>. It allocates enough memory to fit copies of each +at least two <CODE>char *</CODE> arguments, the last of which must be +<CODE>NULL</CODE>. It allocates enough memory to fit copies of each of the strings, as a unit; for instance: - -<pre> - pstrcat (r->pool, "foo", "/", "bar", NULL); -</pre> - +</P> +<PRE> + ap_pstrcat (r->pool, "foo", "/", "bar", NULL); +</PRE> +<P> returns a pointer to 8 bytes worth of memory, initialized to -<code>"foo/bar"</code>. - -<h3><a name="pool-files">Tracking open files, etc.</a></h3> - +<CODE>"foo/bar"</CODE>. +</P> +<H3><A NAME="pools-used">Commonly-used pools in the Apache Web server</A></H3> +<P> +A pool is really defined by its lifetime more than anything else. There +are some static pools in http_main which are passed to various +non-http_main functions as arguments at opportune times. Here they are: +</P> +<DL COMPACT> + <DT>permanent_pool + </DT> + <DD> + <UL> + <LI>never passed to anything else, this is the ancestor of all pools + </LI> + </UL> + </DD> + <DT>pconf + </DT> + <DD> + <UL> + <LI>subpool of permanent_pool + </LI> + <LI>created at the beginning of a config "cycle"; exists until the + server is terminated or restarts; passed to all config-time + routines, either via cmd->pool, or as the "pool *p" argument on + those which don't take pools + </LI> + <LI>passed to the module init() functions + </LI> + </UL> + </DD> + <DT>ptemp + </DT> + <DD> + <UL> + <LI>sorry I lie, this pool isn't called this currently in 1.3, I + renamed it this in my pthreads development. I'm referring to + the use of ptrans in the parent... contrast this with the later + definition of ptrans in the child. 
+ </LI> + <LI>subpool of permanent_pool + </LI> + <LI>created at the beginning of a config "cycle"; exists until the + end of config parsing; passed to config-time routines <EM>via</EM> + cmd->temp_pool. Somewhat of a "bastard child" because it isn't + available everywhere. Used for temporary scratch space which + may be needed by some config routines but which is deleted at + the end of config. + </LI> + </UL> + </DD> + <DT>pchild + </DT> + <DD> + <UL> + <LI>subpool of permanent_pool + </LI> + <LI>created when a child is spawned (or a thread is created); lives + until that child (thread) is destroyed + </LI> + <LI>passed to the module child_init functions + </LI> + <LI>destruction happens right after the child_exit functions are + called... (which may explain why I think child_exit is redundant + and unneeded) + </LI> + </UL> + </DD> + <DT>ptrans + <DT> + <DD> + <UL> + <LI>should be a subpool of pchild, but currently is a subpool of + permanent_pool, see above + </LI> + <LI>cleared by the child before going into the accept() loop to receive + a connection + </LI> + <LI>used as connection->pool + </LI> + </UL> + </DD> + <DT>r->pool + </DT> + <DD> + <UL> + <LI>for the main request this is a subpool of connection->pool; for + subrequests it is a subpool of the parent request's pool. + </LI> + <LI>exists until the end of the request (<EM>i.e.</EM>, + ap_destroy_sub_req, or + in child_main after process_request has finished) + </LI> + <LI>note that r itself is allocated from r->pool; <EM>i.e.</EM>, + r->pool is + first created and then r is the first thing palloc()d from it + </LI> + </UL> + </DD> +</DL> +<P> +For almost everything folks do, r->pool is the pool to use. But you +can see how other lifetimes, such as pchild, are useful to some +modules... such as modules that need to open a database connection once +per child, and wish to clean it up when the child dies. 
+</P> +<P> +You can also see how some bugs have manifested themself, such as setting +connection->user to a value from r->pool -- in this case +connection exists +for the lifetime of ptrans, which is longer than r->pool (especially if +r->pool is a subrequest!). So the correct thing to do is to allocate +from connection->pool. +</P> +<P> +And there was another interesting bug in mod_include/mod_cgi. You'll see +in those that they do this test to decide if they should use r->pool +or r->main->pool. In this case the resource that they are registering +for cleanup is a child process. If it were registered in r->pool, +then the code would wait() for the child when the subrequest finishes. +With mod_include this could be any old #include, and the delay can be up +to 3 seconds... and happened quite frequently. Instead the subprocess +is registered in r->main->pool which causes it to be cleaned up when +the entire request is done -- <EM>i.e.</EM>, after the output has been sent to +the client and logging has happened. +</P> +<H3><A NAME="pool-files">Tracking open files, etc.</A></H3> +<P> As indicated above, resource pools are also used to track other sorts of resources besides memory. The most common are open files. The -routine which is typically used for this is <code>pfopen</code>, which +routine which is typically used for this is <CODE>ap_pfopen</CODE>, which takes a resource pool and two strings as arguments; the strings are -the same as the typical arguments to <code>fopen</code>, e.g., - -<pre> +the same as the typical arguments to <CODE>fopen</CODE>, <EM>e.g.</EM>, +</P> +<PRE> ... - FILE *f = pfopen (r->pool, r->filename, "r"); + FILE *f = ap_pfopen (r->pool, r->filename, "r"); if (f == NULL) { ... } else { ... } -</pre> - -There is also a <code>popenf</code> routine, which parallels the -lower-level <code>open</code> system call. 
Both of these routines +</PRE> +<P> +There is also a <CODE>ap_popenf</CODE> routine, which parallels the +lower-level <CODE>open</CODE> system call. Both of these routines arrange for the file to be closed when the resource pool in question -is cleared. <p> - -Unlike the case for memory, there <em>are</em> functions to close -files allocated with <code>pfopen</code>, and <code>popenf</code>, -namely <code>pfclose</code> and <code>pclosef</code>. (This is +is cleared. +</P> +<P> +Unlike the case for memory, there <EM>are</EM> functions to close +files allocated with <CODE>ap_pfopen</CODE>, and <CODE>ap_popenf</CODE>, +namely <CODE>ap_pfclose</CODE> and <CODE>ap_pclosef</CODE>. (This is because, on many systems, the number of files which a single process can have open is quite limited). It is important to use these -functions to close files allocated with <code>pfopen</code> and -<code>popenf</code>, since to do otherwise could cause fatal errors on +functions to close files allocated with <CODE>ap_pfopen</CODE> and +<CODE>ap_popenf</CODE>, since to do otherwise could cause fatal errors on systems such as Linux, which react badly if the same -<code>FILE*</code> is closed more than once. <p> - -(Using the <code>close</code> functions is not mandatory, since the +<CODE>FILE*</CODE> is closed more than once. +</P> +<P> +(Using the <CODE>close</CODE> functions is not mandatory, since the file will eventually be closed regardless, but you should consider it in cases where your module is opening, or could open, a lot of files). - -<h3>Other sorts of resources --- cleanup functions</h3> - +</P> +<H3>Other sorts of resources --- cleanup functions</H3> +<BLOCKQUOTE> More text goes here. Describe the the cleanup primitives in terms of -which the file stuff is implemented; also, <code>spawn_process</code>. 
- -<h3>Fine control --- creating and dealing with sub-pools, with a note -on sub-requests</h3> - -On rare occasions, too-free use of <code>palloc()</code> and the +which the file stuff is implemented; also, <CODE>spawn_process</CODE>. +</BLOCKQUOTE> +<P> +Pool cleanups live until clear_pool() is called: clear_pool(a) recursively +calls destroy_pool() on all subpools of a; then calls all the cleanups for a; +then releases all the memory for a. destroy_pool(a) calls clear_pool(a) +and then releases the pool structure itself. <EM>i.e.</EM>, clear_pool(a) doesn't +delete a, it just frees up all the resources and you can start using it +again immediately. +</P> +<H3>Fine control --- creating and dealing with sub-pools, with a note +on sub-requests</H3> + +On rare occasions, too-free use of <CODE>ap_palloc()</CODE> and the associated primitives may result in undesirably profligate resource allocation. You can deal with such a case by creating a -<em>sub-pool</em>, allocating within the sub-pool rather than the main +<EM>sub-pool</EM>, allocating within the sub-pool rather than the main pool, and clearing or destroying the sub-pool, which releases the -resources which were associated with it. (This really <em>is</em> a +resources which were associated with it. (This really <EM>is</EM> a rare situation; the only case in which it comes up in the standard module set is in case of listing directories, and then only with -<em>very</em> large directories. Unnecessary use of the primitives +<EM>very</EM> large directories. Unnecessary use of the primitives discussed here can hair up your code quite a bit, with very little -gain). <p> +gain). <P> -The primitive for creating a sub-pool is <code>make_sub_pool</code>, +The primitive for creating a sub-pool is <CODE>ap_make_sub_pool</CODE>, which takes another pool (the parent pool) as an argument. When the main pool is cleared, the sub-pool will be destroyed. 
The sub-pool may also be cleared or destroyed at any time, by calling the functions -<code>clear_pool</code> and <code>destroy_pool</code>, respectively. -(The difference is that <code>clear_pool</code> frees resources -associated with the pool, while <code>destroy_pool</code> also +<CODE>ap_clear_pool</CODE> and <CODE>ap_destroy_pool</CODE>, respectively. +(The difference is that <CODE>ap_clear_pool</CODE> frees resources +associated with the pool, while <CODE>ap_destroy_pool</CODE> also deallocates the pool itself. In the former case, you can allocate new resources within the pool, and clear it again, and so forth; in the -latter case, it is simply gone). <p> +latter case, it is simply gone). <P> One final note --- sub-requests have their own resource pools, which are sub-pools of the resource pool for the main request. The polite way to reclaim the resources associated with a sub request which you -have allocated (using the <code>sub_req_lookup_...</code> functions) -is <code>destroy_sub_request</code>, which frees the resource pool. +have allocated (using the <CODE>ap_sub_req_...</CODE> functions) +is <CODE>ap_destroy_sub_req</CODE>, which frees the resource pool. Before calling this function, be sure to copy anything that you care about which might be allocated in the sub-request's resource pool into someplace a little less volatile (for instance, the filename in its -<code>request_rec</code> structure). <p> +<CODE>request_rec</CODE> structure). <P> (Again, under most circumstances, you shouldn't feel obliged to call this function; only 2K of memory or so are allocated for a typical sub request, and it will be freed anyway when the main request pool is cleared. It is only when you are allocating many, many sub-requests for a single main request that you should seriously consider the -<code>destroy...</code> functions). +<CODE>ap_destroy_...</CODE> functions). 
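The pool behaviour described above (bump allocation, clearing versus destroying, and sub-pools that go away with their parent) can be sketched with a toy model. This is not Apache's implementation: every `toy_*` name, the fixed 4096-byte arena, and the fixed-size cleanup and sub-pool arrays are assumptions made for illustration. They only loosely mirror the roles of `ap_palloc`, `ap_make_sub_pool`, `ap_clear_pool` and `ap_destroy_pool`.

```c
/* Toy model of a resource pool. NOT Apache's code; all toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ARENA_BYTES  4096
#define TOY_MAX_CLEANUPS 8
#define TOY_MAX_SUBPOOLS 8

typedef void (*toy_cleanup_fn)(void *data);

typedef struct toy_pool {
    unsigned char arena[TOY_ARENA_BYTES];   /* first member, so it is malloc-aligned */
    size_t used;                            /* the "bump pointer" */
    struct { toy_cleanup_fn fn; void *data; } cleanups[TOY_MAX_CLEANUPS];
    int n_cleanups;
    struct toy_pool *parent;
    struct toy_pool *subpools[TOY_MAX_SUBPOOLS];
    int n_subpools;
} toy_pool;

void toy_destroy_pool(toy_pool *p);

/* Create a pool; pass NULL for a top-level pool. */
toy_pool *toy_make_sub_pool(toy_pool *parent)
{
    toy_pool *p;
    if (parent != NULL && parent->n_subpools == TOY_MAX_SUBPOOLS) return NULL;
    p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->parent = parent;
    if (parent != NULL) parent->subpools[parent->n_subpools++] = p;
    return p;
}

/* Round up the size, range-check, bump the pointer. There is no per-block free. */
void *toy_palloc(toy_pool *p, size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;
    void *mem;
    if (rounded < size || rounded > TOY_ARENA_BYTES - p->used) return NULL;
    mem = p->arena + p->used;
    p->used += rounded;
    return mem;
}

int toy_register_cleanup(toy_pool *p, toy_cleanup_fn fn, void *data)
{
    if (p->n_cleanups == TOY_MAX_CLEANUPS) return -1;
    p->cleanups[p->n_cleanups].fn = fn;
    p->cleanups[p->n_cleanups].data = data;
    p->n_cleanups++;
    return 0;
}

/* Clear: destroy sub-pools, run cleanups, release memory. The pool stays usable. */
void toy_clear_pool(toy_pool *p)
{
    int i;
    while (p->n_subpools > 0) toy_destroy_pool(p->subpools[p->n_subpools - 1]);
    for (i = 0; i < p->n_cleanups; i++) p->cleanups[i].fn(p->cleanups[i].data);
    p->n_cleanups = 0;
    p->used = 0;
}

/* Destroy: clear, unlink from the parent, then free the pool structure itself. */
void toy_destroy_pool(toy_pool *p)
{
    int i;
    toy_clear_pool(p);
    if (p->parent != NULL) {
        toy_pool *parent = p->parent;
        for (i = 0; i < parent->n_subpools; i++) {
            if (parent->subpools[i] == p) {
                parent->subpools[i] = parent->subpools[--parent->n_subpools];
                break;
            }
        }
    }
    free(p);
}

/* Cleanup used by the demo: counts how many times it has run. */
static void toy_count_cleanup(void *data) { ++*(int *)data; }

/* Scratch sub-pool cleared periodically, as in the directory-listing case.
 * Returns the number of cleanups that ran, or -1 on allocation failure. */
int toy_demo(void)
{
    int i, ran = 0;
    toy_pool *request = toy_make_sub_pool(NULL);
    toy_pool *scratch;
    if (request == NULL) return -1;
    scratch = toy_make_sub_pool(request);
    if (scratch == NULL) { toy_destroy_pool(request); return -1; }
    for (i = 0; i < 1000; i++) {
        char *buf = toy_palloc(scratch, 256);
        if (buf == NULL) {              /* arena full: recycle the scratch pool */
            toy_clear_pool(scratch);
            buf = toy_palloc(scratch, 256);
            assert(buf != NULL);
        }
        buf[0] = 'x';
    }
    toy_register_cleanup(request, toy_count_cleanup, &ran);
    toy_register_cleanup(scratch, toy_count_cleanup, &ran);
    toy_destroy_pool(request);          /* also destroys scratch */
    return ran;
}
```

In this sketch, destroying the request pool is enough to run the scratch pool's cleanup as well, which is the property the text relies on when it says sub-request pools are freed along with the main request pool.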
-<h2><a name="config">Configuration, commands and the like</a></h2> +<H2><A NAME="config">Configuration, commands and the like</A></H2> One of the design goals for this server was to maintain external compatibility with the NCSA 1.3 server --- that is, to read the same @@ -702,7 +850,7 @@ hand, another design goal was to move as much of the server's functionality into modules which have as little as possible to do with the monolithic server core. The only way to reconcile these goals is to move the handling of most commands from the central server into the -modules. <p> +modules. <P> However, just giving the modules command tables is not enough to divorce them completely from the server core. The server has to @@ -711,96 +859,96 @@ maintaining data which is private to the modules, and which can be either per-server, or per-directory. Most things are per-directory, including in particular access control and authorization information, but also information on how to determine file types from suffixes, -which can be modified by <code>AddType</code> and -<code>DefaultType</code> directives, and so forth. In general, the -governing philosophy is that anything which <em>can</em> be made +which can be modified by <CODE>AddType</CODE> and +<CODE>DefaultType</CODE> directives, and so forth. In general, the +governing philosophy is that anything which <EM>can</EM> be made configurable by directory should be; per-server information is generally used in the standard set of modules for information like -<code>Alias</code>es and <code>Redirect</code>s which come into play +<CODE>Alias</CODE>es and <CODE>Redirect</CODE>s which come into play before the request is tied to a particular place in the underlying -file system. <p> +file system. 
<P> Another requirement for emulating the NCSA server is being able to handle the per-directory configuration files, generally called -<code>.htaccess</code> files, though even in the NCSA server they can +<CODE>.htaccess</CODE> files, though even in the NCSA server they can contain directives which have nothing at all to do with access control. Accordingly, after URI -> filename translation, but before performing any other phase, the server walks down the directory hierarchy of the underlying filesystem, following the translated -pathname, to read any <code>.htaccess</code> files which might be +pathname, to read any <CODE>.htaccess</CODE> files which might be present. The information which is read in then has to be -<em>merged</em> with the applicable information from the server's own -config files (either from the <code><Directory></code> sections -in <code>access.conf</code>, or from defaults in -<code>srm.conf</code>, which actually behaves for most purposes almost -exactly like <code><Directory /></code>).<p> +<EM>merged</EM> with the applicable information from the server's own +config files (either from the <CODE><Directory></CODE> sections +in <CODE>access.conf</CODE>, or from defaults in +<CODE>srm.conf</CODE>, which actually behaves for most purposes almost +exactly like <CODE><Directory /></CODE>).<P> Finally, after having served a request which involved reading -<code>.htaccess</code> files, we need to discard the storage allocated +<CODE>.htaccess</CODE> files, we need to discard the storage allocated for handling them. That is solved the same way it is solved wherever else similar problems come up, by tying those structures to the -per-transaction resource pool. <p> +per-transaction resource pool. 
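The walk-and-merge step described above can be illustrated with a small sketch. It is a simplification under stated assumptions: the `toy_*` names, the single `default_type` setting, and the rule "a child's value overrides the parent's when set" are all invented here, and the table must be ordered shallowest directory first. The real server merges opaque per-module structures through each module's own merge function.

```c
/* Toy per-directory merge. NOT Apache's code; all toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *dir;           /* directory this configuration applies to */
    const char *default_type;  /* NULL means "not set in this directory" */
} toy_dir_conf;

/* Merge rule assumed for this sketch: the child's value wins when it is set. */
static const char *toy_merge(const char *parent, const char *child)
{
    return child != NULL ? child : parent;
}

/* True if path lies in dir, matching only at a path-component boundary. */
static int toy_dir_contains(const char *dir, const char *path)
{
    size_t n = strlen(dir);
    if (strncmp(dir, path, n) != 0) return 0;
    if (n > 0 && dir[n - 1] == '/') return 1;
    return path[n] == '/' || path[n] == '\0';
}

/* Walk the table (shallowest directory first), merging as we descend. */
const char *toy_effective_type(const toy_dir_conf *confs, size_t n,
                               const char *path, const char *server_default)
{
    const char *result = server_default;
    size_t i;
    for (i = 0; i < n; i++) {
        if (toy_dir_contains(confs[i].dir, path))
            result = toy_merge(result, confs[i].default_type);
    }
    return result;
}

/* Sample data: one entry per directory that carries configuration. */
static const toy_dir_conf toy_sample[] = {
    { "/",         NULL         },
    { "/www",      "text/html"  },
    { "/www/docs", NULL         },
    { "/www/data", "text/plain" },
};
```

With the sample table, a file under `/www/docs` inherits `text/html` from `/www`, while a file under `/www/data` gets the more specific `text/plain`.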
<P> -<h3><a name="per-dir">Per-directory configuration structures</a></h3> +<H3><A NAME="per-dir">Per-directory configuration structures</A></H3> -Let's look out how all of this plays out in <code>mod_mime.c</code>, +Let's look out how all of this plays out in <CODE>mod_mime.c</CODE>, which defines the file typing handler which emulates the NCSA server's behavior of determining file types from suffixes. What we'll be looking at, here, is the code which implements the -<code>AddType</code> and <code>AddEncoding</code> commands. These -commands can appear in <code>.htaccess</code> files, so they must be +<CODE>AddType</CODE> and <CODE>AddEncoding</CODE> commands. These +commands can appear in <CODE>.htaccess</CODE> files, so they must be handled in the module's private per-directory data, which in fact, -consists of two separate <code>table</code>s for MIME types and +consists of two separate <CODE>table</CODE>s for MIME types and encoding information, and is declared as follows: -<pre> +<PRE> typedef struct { table *forced_types; /* Additional AddTyped stuff */ table *encoding_types; /* Added with AddEncoding... */ } mime_dir_config; -</pre> +</PRE> When the server is reading a configuration file, or -<code><Directory></code> section, which includes one of the MIME -module's commands, it needs to create a <code>mime_dir_config</code> +<CODE><Directory></CODE> section, which includes one of the MIME +module's commands, it needs to create a <CODE>mime_dir_config</CODE> structure, so those commands have something to act on. It does this by invoking the function it finds in the module's `create per-dir config slot', with two arguments: the name of the directory to which -this configuration information applies (or <code>NULL</code> for -<code>srm.conf</code>), and a pointer to a resource pool in which the -allocation should happen. 
<p> +this configuration information applies (or <CODE>NULL</CODE> for +<CODE>srm.conf</CODE>), and a pointer to a resource pool in which the +allocation should happen. <P> -(If we are reading a <code>.htaccess</code> file, that resource pool +(If we are reading a <CODE>.htaccess</CODE> file, that resource pool is the per-request resource pool for the request; otherwise it is a resource pool which is used for configuration data, and cleared on restarts. Either way, it is important for the structure being created to vanish when the pool is cleared, by registering a cleanup on the -pool if necessary). <p> +pool if necessary). <P> For the MIME module, the per-dir config creation function just -<code>palloc</code>s the structure above, and a creates a couple of -<code>table</code>s to fill it. That looks like this: +<CODE>ap_palloc</CODE>s the structure above, and a creates a couple of +<CODE>table</CODE>s to fill it. That looks like this: -<pre> +<PRE> void *create_mime_dir_config (pool *p, char *dummy) { mime_dir_config *new = - (mime_dir_config *) palloc (p, sizeof(mime_dir_config)); + (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config)); + + new->forced_types = ap_make_table (p, 4); + new->encoding_types = ap_make_table (p, 4); - new->forced_types = make_table (p, 4); - new->encoding_types = make_table (p, 4); - return new; } -</pre> +</PRE> -Now, suppose we've just read in a <code>.htaccess</code> file. We +Now, suppose we've just read in a <CODE>.htaccess</CODE> file. We already have the per-directory configuration structure for the next -directory up in the hierarchy. If the <code>.htaccess</code> file we -just read in didn't have any <code>AddType</code> or -<code>AddEncoding</code> commands, its per-directory config structure +directory up in the hierarchy. 
If the <CODE>.htaccess</CODE> file we +just read in didn't have any <CODE>AddType</CODE> or +<CODE>AddEncoding</CODE> commands, its per-directory config structure for the MIME module is still valid, and we can just use it. -Otherwise, we need to merge the two structures somehow. <p> +Otherwise, we need to merge the two structures somehow. <P> To do that, the server invokes the module's per-directory config merge function, if one is present. That function takes three arguments: @@ -809,171 +957,172 @@ allocate the result. For the MIME module, all that needs to be done is overlay the tables from the new per-directory config structure with those from the parent: -<pre> +<PRE> void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv) { mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv; mime_dir_config *subdir = (mime_dir_config *)subdirv; mime_dir_config *new = - (mime_dir_config *)palloc (p, sizeof(mime_dir_config)); + (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config)); - new->forced_types = overlay_tables (p, subdir->forced_types, + new->forced_types = ap_overlay_tables (p, subdir->forced_types, parent_dir->forced_types); - new->encoding_types = overlay_tables (p, subdir->encoding_types, + new->encoding_types = ap_overlay_tables (p, subdir->encoding_types, parent_dir->encoding_types); return new; } -</pre> +</PRE> As a note --- if there is no per-directory merge function present, the server will just use the subdirectory's configuration info, and ignore -the parent's. For some modules, that works just fine (e.g., for the +the parent's. 
For some modules, that works just fine (<EM>e.g.</EM>, for the includes module, whose per-directory configuration information -consists solely of the state of the <code>XBITHACK</code>), and for +consists solely of the state of the <CODE>XBITHACK</CODE>), and for those modules, you can just not declare one, and leave the -corresponding structure slot in the module itself <code>NULL</code>.<p> +corresponding structure slot in the module itself <CODE>NULL</CODE>.<P> -<h3><a name="commands">Command handling</a></h3> +<H3><A NAME="commands">Command handling</A></H3> Now that we have these structures, we need to be able to figure out how to fill them. That involves processing the actual -<code>AddType</code> and <code>AddEncoding</code> commands. To find -commands, the server looks in the module's <code>command table</code>. +<CODE>AddType</CODE> and <CODE>AddEncoding</CODE> commands. To find +commands, the server looks in the module's <CODE>command table</CODE>. That table contains information on how many arguments the commands take, and in what formats, where it is permitted, and so forth. That information is sufficient to allow the server to invoke most command-handling functions with pre-parsed arguments. Without further -ado, let's look at the <code>AddType</code> command handler, which -looks like this (the <code>AddEncoding</code> command looks basically +ado, let's look at the <CODE>AddType</CODE> command handler, which +looks like this (the <CODE>AddEncoding</CODE> command looks basically the same, and won't be shown here): -<pre> +<PRE> char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext) { if (*ext == '.') ++ext; - table_set (m->forced_types, ext, ct); + ap_table_set (m->forced_types, ext, ct); return NULL; } -</pre> +</PRE> This command handler is unusually simple. 
As you can see, it takes four arguments, two of which are pre-parsed arguments, the third being the per-directory configuration structure for the module in question, -and the fourth being a pointer to a <code>cmd_parms</code> structure. +and the fourth being a pointer to a <CODE>cmd_parms</CODE> structure. That structure contains a bunch of arguments which are frequently of use to some, but not all, commands, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module's -per-server configuration data can be obtained if required.<p> +per-server configuration data can be obtained if required.<P> Another way in which this particular command handler is unusually simple is that there are no error conditions which it can encounter. If there were, it could return an error message instead of -<code>NULL</code>; this causes an error to be printed out on the -server's <code>stderr</code>, followed by a quick exit, if it is in -the main config files; for a <code>.htaccess</code> file, the syntax +<CODE>NULL</CODE>; this causes an error to be printed out on the +server's <CODE>stderr</CODE>, followed by a quick exit, if it is in +the main config files; for a <CODE>.htaccess</CODE> file, the syntax error is logged in the server error log (along with an indication of where it came from), and the request is bounced with a server error -response (HTTP error status, code 500). <p> +response (HTTP error status, code 500). 
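The convention described above, returning NULL on success or an error string on failure, can be sketched as follows. This is a toy model, not the server's code: the `toy_*` names and types are hypothetical. The validation in `toy_add_type` is also invented for illustration, since the real `add_type` shown in this document has no error cases. Only the `.htaccess` path is modelled, where the message is logged and the request gets a 500 response.

```c
/* Toy command handler with error reporting. NOT Apache's code; toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TYPES 8

typedef struct { const char *ext; const char *type; } toy_mapping;
typedef struct { toy_mapping map[TOY_MAX_TYPES]; int n; } toy_mime_conf;

/* Handlers return NULL on success, or a static error message on failure. */
typedef const char *(*toy_cmd_handler)(void *conf, const char *a1, const char *a2);

static const char *toy_add_type(void *confv, const char *ct, const char *ext)
{
    toy_mime_conf *conf = confv;
    if (*ext == '.') ++ext;                       /* accept ".html" or "html" */
    if (strchr(ct, '/') == NULL)                  /* invented check, see lead-in */
        return "AddType needs a type of the form major/minor";
    if (conf->n == TOY_MAX_TYPES)
        return "too many AddType entries";
    conf->map[conf->n].ext = ext;
    conf->map[conf->n].type = ct;
    conf->n++;
    return NULL;
}

enum { TOY_OK = 0, TOY_SERVER_ERROR = 500 };

/* .htaccess case: record the message for the error log and fail the request. */
int toy_run_htaccess_cmd(toy_cmd_handler handler, void *conf,
                         const char *a1, const char *a2, const char **logged)
{
    const char *err = handler(conf, a1, a2);
    if (err != NULL) {
        *logged = err;
        return TOY_SERVER_ERROR;
    }
    return TOY_OK;
}
```

A caller handling the main config files would instead print the message on stderr and exit, as the text describes. That branch is left out to keep the sketch short.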
<P> The MIME module's command table has entries for these commands, which look like this: -<pre> +<PRE> command_rec mime_cmds[] = { -{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2, +{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2, "a mime type followed by a file extension" }, -{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2, - "an encoding (e.g., gzip), followed by a file extension" }, +{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2, + "an encoding (<EM>e.g.</EM>, gzip), followed by a file extension" }, { NULL } }; -</pre> +</PRE> The entries in these tables are: -<ul> - <li> The name of the command - <li> The function which handles it - <li> a <code>(void *)</code> pointer, which is passed in the - <code>cmd_parms</code> structure to the command handler --- +<UL> + <LI> The name of the command + <LI> The function which handles it + <LI> a <CODE>(void *)</CODE> pointer, which is passed in the + <CODE>cmd_parms</CODE> structure to the command handler --- this is useful in case many similar commands are handled by the same function. - <li> A bit mask indicating where the command may appear. There are - mask bits corresponding to each <code>AllowOverride</code> - option, and an additional mask bit, <code>RSRC_CONF</code>, + <LI> A bit mask indicating where the command may appear. There are + mask bits corresponding to each <CODE>AllowOverride</CODE> + option, and an additional mask bit, <CODE>RSRC_CONF</CODE>, indicating that the command may appear in the server's own - config files, but <em>not</em> in any <code>.htaccess</code> + config files, but <EM>not</EM> in any <CODE>.htaccess</CODE> file. - <li> A flag indicating how many arguments the command handler wants + <LI> A flag indicating how many arguments the command handler wants pre-parsed, and how they should be passed in. - <code>TAKE2</code> indicates two pre-parsed arguments. 
Other - options are <code>TAKE1</code>, which indicates one pre-parsed - argument, <code>FLAG</code>, which indicates that the argument - should be <code>On</code> or <code>Off</code>, and is passed in - as a boolean flag, <code>RAW_ARGS</code>, which causes the + <CODE>TAKE2</CODE> indicates two pre-parsed arguments. Other + options are <CODE>TAKE1</CODE>, which indicates one pre-parsed + argument, <CODE>FLAG</CODE>, which indicates that the argument + should be <CODE>On</CODE> or <CODE>Off</CODE>, and is passed in + as a boolean flag, <CODE>RAW_ARGS</CODE>, which causes the server to give the command the raw, unparsed arguments (everything but the command name itself). There is also - <code>ITERATE</code>, which means that the handler looks the - same as <code>TAKE1</code>, but that if multiple arguments are + <CODE>ITERATE</CODE>, which means that the handler looks the + same as <CODE>TAKE1</CODE>, but that if multiple arguments are present, it should be called multiple times, and finally - <code>ITERATE2</code>, which indicates that the command handler - looks like a <code>TAKE2</code>, but if more arguments are + <CODE>ITERATE2</CODE>, which indicates that the command handler + looks like a <CODE>TAKE2</CODE>, but if more arguments are present, then it should be called multiple times, holding the first argument constant. - <li> Finally, we have a string which describes the arguments that + <LI> Finally, we have a string which describes the arguments that should be present. If the arguments in the actual config file are not as required, this string will be used to help give a more specific error message. (You can safely leave this - <code>NULL</code>). -</ul> + <CODE>NULL</CODE>). +</UL> Finally, having set this all up, we have to use it. 
This is ultimately done in the module's handlers, specifically for its file-typing handler, which looks more or less like this; note that the per-directory configuration structure is extracted from the -<code>request_rec</code>'s per-directory configuration vector by using -the <code>get_module_config</code> function. +<CODE>request_rec</CODE>'s per-directory configuration vector by using +the <CODE>ap_get_module_config</CODE> function. -<pre> +<PRE> int find_ct(request_rec *r) { int i; - char *fn = pstrdup (r->pool, r->filename); + char *fn = ap_pstrdup (r->pool, r->filename); mime_dir_config *conf = (mime_dir_config *) - get_module_config(r->per_dir_config, &mime_module); + ap_get_module_config(r->per_dir_config, &mime_module); char *type; - if (S_ISDIR(r->finfo.st_mode)) { - r->content_type = DIR_MAGIC_TYPE; + if (S_ISDIR(r->finfo.st_mode)) { + r->content_type = DIR_MAGIC_TYPE; return OK; } - - if((i=rind(fn,'.')) < 0) return DECLINED; + + if((i=ap_rind(fn,'.')) < 0) return DECLINED; ++i; - if ((type = table_get (conf->encoding_types, &fn[i]))) + if ((type = ap_table_get (conf->encoding_types, &fn[i]))) { - r->content_encoding = type; + r->content_encoding = type; /* go back to previous extension to try to use it as a type */ fn[i-1] = '\0'; - if((i=rind(fn,'.')) < 0) return OK; + if((i=ap_rind(fn,'.')) < 0) return OK; ++i; } - if ((type = table_get (conf->forced_types, &fn[i]))) + if ((type = ap_table_get (conf->forced_types, &fn[i]))) { - r->content_type = type; + r->content_type = type; } - + return OK; } -</pre> +</PRE> -<h3><a name="servconf">Side notes --- per-server configuration, virtual servers, etc.</a></h3> +<H3><A NAME="servconf">Side notes --- per-server configuration, virtual + servers, <EM>etc</EM>.</A></H3> The basic ideas behind per-server module configuration are basically the same as those for per-directory configuration; there is a creation @@ -982,36 +1131,37 @@ virtual server has partially overridden the base server configuration, and a 
combined structure must be computed. (As with per-directory configuration, the default if no merge function is specified, and a module is configured in some virtual server, is that the base -configuration is simply ignored). <p> +configuration is simply ignored). <P> The only substantial difference is that when a command needs to configure the per-server private module data, it needs to go to the -<code>cmd_parms</code> data to get at it. Here's an example, from the +<CODE>cmd_parms</CODE> data to get at it. Here's an example, from the alias module, which also indicates how a syntax error can be returned (note that the per-directory configuration argument to the command handler is declared as a dummy, since the module doesn't actually have per-directory config data): -<pre> +<PRE> char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url) { - server_rec *s = cmd->server; + server_rec *s = cmd->server; alias_server_conf *conf = (alias_server_conf *) - get_module_config(s->module_config,&alias_module); - alias_entry *new = push_array (conf->redirects); + ap_get_module_config(s->module_config,&alias_module); + alias_entry *new = ap_push_array (conf->redirects); + + if (!ap_is_url (url)) return "Redirect to non-URL"; - if (!is_url (url)) return "Redirect to non-URL"; - new->fake = f; new->real = url; return NULL; } -</pre> +</PRE> <HR> + <H3 ALIGN="CENTER"> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> <A HREF="./"><IMG SRC="../images/index.gif" ALT="Index"></A> <A HREF="../"><IMG SRC="../images/home.gif" ALT="Home"></A> -</body></html> +</BODY></HTML> diff --git a/usr.sbin/httpd/htdocs/manual/misc/FAQ.html b/usr.sbin/httpd/htdocs/manual/misc/FAQ.html index d7d335f7810..04475577e93 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/FAQ.html +++ b/usr.sbin/httpd/htdocs/manual/misc/FAQ.html @@ -2,6 +2,7 @@ <HTML> <HEAD> <TITLE>Apache Server Frequently Asked Questions</TITLE> + </HEAD> <!-- Background white, links blue (unvisited), navy 
(visited), red (active) --> <BODY @@ -20,7 +21,7 @@ <H1 ALIGN="CENTER">Apache Server Frequently Asked Questions</H1> <P> - $Revision: 1.3 $ ($Date: 1999/03/01 01:05:09 $) + $Revision: 1.4 $ ($Date: 1999/09/29 06:29:00 $) </P> <P> The latest version of this FAQ is always available from the main @@ -85,11 +86,20 @@ <!-- (A: you can't but "satisfy any; allow from all" can be close --> <!-- - '400 malformed request' on Win32 might mean stale proxy; see --> <!-- PR #2300. --> -<!-- - "expected </Directory> saw </Directory" due to buggy AIX --> -<!-- compiler. --> -<UL> - <LI><STRONG>Background</STRONG> - <OL START=1> +<!-- - how do I tell what version of Apache I am running? --> +<OL TYPE="A"> + + + + + + + + + + + <LI VALUE="1"><STRONG>Background</STRONG> + <OL> <LI><A HREF="#what">What is Apache?</A> </LI> <LI><A HREF="#why">Why was Apache created?</A> @@ -113,55 +123,163 @@ </LI> </OL> </LI> - <LI><STRONG>Technical Questions</STRONG> - <OL START=11> + + + + + + + + + + + + + + + + <LI value="2"><STRONG>General Technical Questions</STRONG> + <OL> <LI><A HREF="#what2do">"Why can't I ...? Why won't ... work?" 
What to do in case of problems</A> </LI> <LI><A HREF="#compatible">How compatible is Apache with my existing NCSA 1.3 setup?</A> </LI> - <LI><A HREF="#CGIoutsideScriptAlias">How do I enable CGI execution - in directories other than the ScriptAlias?</A> + <LI><A HREF="#year2000">Is Apache Year 2000 compliant?</A> </LI> - <LI><A HREF="#premature-script-headers">What does it mean when my - CGIs fail with "<SAMP>Premature end of script - headers</SAMP>"?</A> + <LI><A HREF="#submit_patch">How do I submit a patch to the Apache Group?</A> </LI> - <LI><A HREF="#ssi-part-i">How do I enable SSI (parsed HTML)?</A> + <LI><A HREF="#domination">Why has Apache stolen my favourite site's + Internet address?</A> </LI> - <LI><A HREF="#ssi-part-ii">Why don't my parsed files get cached?</A> + <LI><A HREF="#apspam">Why am I getting spam mail from the Apache site?</A> </LI> - <LI><A HREF="#ssi-part-iii">How can I have my script output parsed?</A> + <LI><A HREF="#redist">May I include the Apache software on a CD or other + package I'm distributing?</A> </LI> - <LI><A HREF="#ssi-part-iv">SSIs don't work for VirtualHosts and/or - user home directories</A> + <LI><A HREF="#zoom">What's the best hardware/operating system/... 
How do + I get the most out of my Apache Web server?</A> </LI> - <LI><A HREF="#proxy">Does or will Apache act as a Proxy server?</A> + <LI><A HREF="#regex">What are "regular expressions"?</A> </LI> - <LI><A HREF="#multiviews">What are "multiviews"?</A> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="3"><STRONG>Building Apache</STRONG> + <OL> + <LI><A HREF="#bind8.1">Why do I get an error about an undefined + reference to "<SAMP>__inet_ntoa</SAMP>" or other + <SAMP>__inet_*</SAMP> symbols?</A> + </LI> + <LI><A HREF="#cantbuild">Why won't Apache compile with my + system's <SAMP>cc</SAMP>?</A> + </LI> + <LI><A HREF="#linuxiovec">Why do I get complaints about redefinition + of "<CODE>struct iovec</CODE>" when compiling under Linux?</A> </LI> + <LI><A HREF="#broken-gcc">I'm using gcc and I get some compilation errors, + what is wrong?</A> + </LI> + <LI><A HREF="#glibc-crypt">I'm using RedHat Linux 5.0, or some other + <SAMP>glibc</SAMP>-based Linux system, and I get errors with the + <CODE>crypt</CODE> function when I attempt to build Apache 1.2.</A> + </LI> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="4"><STRONG>Error Log Messages and Problems Starting Apache</STRONG> + <OL> + <LI><A HREF="#setgid">Why do I get "<SAMP>setgid: Invalid + argument</SAMP>" at startup?</A> + </LI> + <LI><A HREF="#nodelay">Why am I getting "<SAMP>httpd: could not + set socket option TCP_NODELAY</SAMP>" in my error log?</A> + </LI> + <LI><A HREF="#peerreset">Why am I getting "<SAMP>connection + reset by peer</SAMP>" in my error log?</A> + </LI> + <LI><A HREF="#wheres-the-dump">The errorlog says Apache dumped core, + but where's the dump file?</A> + </LI> + <LI><A HREF="#linux-shmget">When I run it under Linux I get "shmget: + function not found", what should I do?</A> + </LI> + <LI><A HREF="#nfslocking">Server hangs, or fails to start, and/or error log + fills with "<SAMP>fcntl: F_SETLKW: No record locks + available</SAMP>" or similar messages</A> + </LI> + <LI><A 
HREF="#aixccbug">Why am I getting "<SAMP>Expected </Directory> + but saw </Directory></SAMP>" when I try to start Apache?</A> + </LI> + <LI><A HREF="#redhat">I'm using RedHat Linux and I have problems with httpd + dying randomly or not restarting properly</A> + </LI> + <LI><A HREF="#stopping">I upgraded from an Apache version earlier + than 1.2.0 and suddenly I have problems with Apache dying randomly + or not restarting properly</A> + </LI> + <LI><A HREF="#setservername">When I try to start Apache from a DOS + window, I get a message like "<samp>Cannot determine host name. + Use ServerName directive to set it manually.</samp>" What does + this mean?</A> + </LI> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="5"><STRONG>Configuration Questions</STRONG> + <OL> <LI><A HREF="#fdlim">Why can't I run more than <<EM>n</EM>> virtual hosts?</A> </LI> <LI><A HREF="#freebsd-setsize">Can I increase <SAMP>FD_SETSIZE</SAMP> on FreeBSD?</A> </LI> - <LI><A HREF="#POSTnotallowed">Why do I keep getting "Method Not - Allowed" for form POST requests?</A> - </LI> - <LI><A HREF="#passwdauth">Can I use my <SAMP>/etc/passwd</SAMP> file - for Web page authentication?</A> - </LI> <LI><A HREF="#errordoc401">Why doesn't my <CODE>ErrorDocument 401</CODE> work?</A> </LI> - <LI><A HREF="#errordocssi">How can I use <CODE>ErrorDocument</CODE> - and SSI to simplify customized error messages?</A> - </LI> - <LI><A HREF="#setgid">Why do I get "<SAMP>setgid: Invalid - argument</SAMP>" at startup?</A> - </LI> <LI><A HREF="#cookies1">Why does Apache send a cookie on every response?</A> </LI> <LI><A HREF="#cookies2">Why don't my cookies work, I even compiled in @@ -170,66 +288,114 @@ <LI><A HREF="#jdk1-and-http1.1">Why do my Java app[let]s give me plain text when I request an URL from an Apache server?</A> </LI> - <LI><A HREF="#putsupport">Why can't I publish to my Apache server - using PUT on Netscape Gold and other programs?</A> + <LI><A HREF="#midi">How do I get Apache to send a MIDI file 
so the + browser can play it?</A> </LI> - <LI><A HREF="#fastcgi">Why isn't FastCGI included with Apache any - more?</A> + <LI><A HREF="#addlog">How do I add browsers and referrers to my logs?</A> </LI> - <LI><A HREF="#nodelay">Why am I getting "<SAMP>httpd: could not - set socket option TCP_NODELAY</SAMP>" in my error log?</A> + <LI><A HREF="#set-servername">Why does accessing directories only work + when I include the trailing "/" + (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user/</SAMP>) but + not when I omit it + (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user</SAMP>)?</A> </LI> - <LI><A HREF="#peerreset">Why am I getting "<SAMP>connection - reset by peer</SAMP>" in my error log?</A> + <LI><A HREF="#no-info-directives">Why doesn't mod_info list any + directives?</A> + </LI> + <LI><A HREF="#namevhost">I upgraded to Apache 1.3 and now my + virtual hosts don't work!</A> + </LI> + <LI><A HREF="#redhat-htm">I'm using RedHat Linux and my .htm files are + showing up as HTML source rather than being formatted!</A> + </LI> + <LI><A HREF="#htaccess-work">My <CODE>.htaccess</CODE> files are being + ignored.</A> + </LI> + <LI><A HREF="#forbidden">Why do I get a + "<SAMP>Forbidden</SAMP>" message whenever I try to + access a particular directory?</A> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="6"><STRONG>Dynamic Content (CGI and SSI)</STRONG> + <OL> + <LI><A HREF="#CGIoutsideScriptAlias">How do I enable CGI execution + in directories other than the ScriptAlias?</A> + </LI> + <LI><A HREF="#premature-script-headers">What does it mean when my + CGIs fail with "<SAMP>Premature end of script + headers</SAMP>"?</A> + </LI> + <LI><A HREF="#POSTnotallowed">Why do I keep getting "Method Not + Allowed" for form POST requests?</A> </LI> <LI><A HREF="#nph-scripts">How can I get my script's output without Apache buffering it? 
Why doesn't my server push work?</A> </LI> - <LI><A HREF="#linuxiovec">Why do I get complaints about redefinition - of "<CODE>struct iovec</CODE>" when compiling under Linux?</A> + <LI><A HREF="#cgi-spec">Where can I find the "CGI + specification"?</A> </LI> - <LI><A HREF="#wheres-the-dump">The errorlog says Apache dumped core, - but where's the dump file?</A> + <LI><A HREF="#fastcgi">Why isn't FastCGI included with Apache any + more?</A> </LI> - <LI><A HREF="#dnsauth">Why isn't restricting access by host or domain name - working correctly?</A> + <LI><A HREF="#ssi-part-i">How do I enable SSI (parsed HTML)?</A> </LI> - <LI><A HREF="#SSL-i">Why doesn't Apache include SSL?</A> + <LI><A HREF="#ssi-part-ii">Why don't my parsed files get cached?</A> </LI> - <LI><A HREF="#midi">How do I get Apache to send a MIDI file so the - browser can play it?</A> + <LI><A HREF="#ssi-part-iii">How can I have my script output parsed?</A> </LI> - <LI><A HREF="#cantbuild">Why won't Apache compile with my - system's <SAMP>cc</SAMP>?</A> + <LI><A HREF="#ssi-part-iv">SSIs don't work for VirtualHosts and/or + user home directories</A> </LI> - <LI><A HREF="#addlog">How do I add browsers and referrers to my logs?</A> + <LI><A HREF="#errordocssi">How can I use <CODE>ErrorDocument</CODE> + and SSI to simplify customized error messages?</A> </LI> - <LI><A HREF="#bind8.1">Why do I get an error about an undefined - reference to "<SAMP>__inet_ntoa</SAMP>" or other - <SAMP>__inet_*</SAMP> symbols?</A> + <LI><A HREF="#remote-user-var">Why is the environment variable + <SAMP>REMOTE_USER</SAMP> not set?</A> </LI> - <LI><A HREF="#set-servername">Why does accessing directories only work - when I include the trailing "/" - (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user/</SAMP>) but - not when I omit it - (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user</SAMP>)?</A> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="7"><STRONG>Authentication and Access Restrictions</STRONG> + <OL> + <LI><A 
HREF="#dnsauth">Why isn't restricting access by host or domain name + working correctly?</A> </LI> <LI><A HREF="#user-authentication">How do I set up Apache to require a username and password to access certain documents?</A> </LI> - <LI><A HREF="#remote-user-var">Why is the environment variable - <SAMP>REMOTE_USER</SAMP> not set?</A> - </LI> <LI><A HREF="#remote-auth-only">How do I set up Apache to allow access to certain documents only if a site is either a local site <EM>or</EM> the user supplies a password and username?</A> </LI> - <LI><A HREF="#no-info-directives">Why doesn't mod_info list any - directives?</A> - </LI> - <LI><A HREF="#linux-shmget">When I run it under Linux I get "shmget: - function not found", what should I do?</A> - </LI> <LI><A HREF="#authauthoritative">Why does my authentication give me a server error?</A> </LI> @@ -238,6 +404,28 @@ </LI> <LI><A HREF="#msql-slow">Why is my mSQL authentication terribly slow?</A> </LI> + <LI><A HREF="#passwdauth">Can I use my <SAMP>/etc/passwd</SAMP> file + for Web page authentication?</A> + </LI> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="8"><STRONG>URL Rewriting</STRONG> + <OL> <LI><A HREF="#rewrite-more-config">Where can I find mod_rewrite rulesets which already solve particular URL-related problems?</A> </LI> @@ -262,64 +450,65 @@ <LI><A HREF="#rewrite-envwhitespace">How can I use strings with whitespaces in RewriteRule's ENV flag?</A> </LI> - <LI><A HREF="#cgi-spec">Where can I find the "CGI - specification"?</A> - </LI> - <LI><A HREF="#year2000">Is Apache Year 2000 compliant?</A> - </LI> - <LI><A HREF="#namevhost">I upgraded to Apache 1.3 and now my - virtual hosts don't work!</A> - </LI> - <LI><A HREF="#redhat">I'm using RedHat Linux and I have problems with httpd - dying randomly or not restarting properly</A> - </LI> - <LI><A HREF="#stopping">I upgraded from an Apache version earlier - than 1.2.0 and suddenly I have problems with Apache dying randomly - or not restarting properly</A> 
- </LI> - <LI><A HREF="#redhat-htm">I'm using RedHat Linux and my .htm files are - showing up as HTML source rather than being formatted!</A> - </LI> - <LI><A HREF="#glibc-crypt">I'm using RedHat Linux 5.0, or some other - <SAMP>glibc</SAMP>-based Linux system, and I get errors with the - <CODE>crypt</CODE> function when I attempt to build Apache 1.2.</A> - </LI> - <LI><A HREF="#nfslocking">Server hangs, or fails to start, and/or error log - fills with "<SAMP>fcntl: F_SETLKW: No record locks - available</SAMP>" or similar messages</A> - </LI> - <LI><A HREF="#zoom">What's the best hardware/operating system/... How do - I get the most out of my Apache Web server?</A> - </LI> - <LI><A HREF="#regex">What are "regular expressions"?</A> - </LI> - <LI><A HREF="#broken-gcc">I'm using gcc and I get some compilation errors, - what is wrong?</A> + </OL> + </LI> + + + + + + + + + + + + + + + + <LI VALUE="9"><STRONG>Features</STRONG> + <OL> + <LI><A HREF="#proxy">Does or will Apache act as a Proxy server?</A> </LI> - <LI><A HREF="#htaccess-work">My <CODE>.htaccess</CODE> files are being - ignored.</A> + <LI><A HREF="#multiviews">What are "multiviews"?</A> </LI> - <LI><A HREF="#submit_patch">How do I submit a patch to the Apache Group?</A> + <LI><A HREF="#putsupport">Why can't I publish to my Apache server + using PUT on Netscape Gold and other programs?</A> </LI> - <LI><A HREF="#aixccbug">Why am I getting "<SAMP>Expected </Directory> - but saw </Directory></SAMP>" when I try to start Apache?</A> + <LI><A HREF="#SSL-i">Why doesn't Apache include SSL?</A> </LI> - <LI><A HREF="#domination">Why has Apache stolen my favourite site's - Internet address?</A> + <LI><A HREF="#footer">How can I attach a footer to my documents + without using SSI?</A> </LI> - <LI><A HREF="#apspam">Why am I getting spam mail from the Apache site?</A> + <LI><A HREF="#search">Does Apache include a search engine?</A> </LI> </OL> </LI> -</UL> + + + + + +</OL> <HR> <H2>The Answers</H2> - <H3> - Background - 
</H3> -<OL START=1> + + + + + + + + + + + + + <H3>A. Background</H3> +<OL> <LI><A NAME="what"> <STRONG>What is Apache?</STRONG> </A> @@ -400,7 +589,7 @@ <STRONG>How thoroughly tested is Apache?</STRONG> </A> <P> - Apache is run on over 1.2 million Internet servers (as of July 1998). It has + Apache is run on over 3 million Internet servers (as of June 1999). It has been tested thoroughly by both developers and users. The Apache Group maintains rigorous standards before releasing new versions of their server, and our server runs without a hitch on over one half of all @@ -487,8 +676,25 @@ <HR> </LI> </OL> - <H3>Technical Questions</H3> -<OL START=11> + + + + + + + + + + + + + + + + + <H3>B. General Technical Questions</H3> +<OL> + <LI><A NAME="what2do"> <STRONG>"Why can't I ...? Why won't ... work?" What to do in case of problems</STRONG> @@ -533,7 +739,8 @@ </P> </LI> <LI><STRONG>Ask in the <SAMP>comp.infosystems.www.servers.unix</SAMP> - USENET newsgroup</STRONG> + or <SAMP>comp.infosystems.www.servers.ms-windows</SAMP> USENET + newsgroup (as appropriate for the platform you use).</STRONG> <P> A lot of common problems never make it to the bug database because there's already high Q&A traffic about them in the @@ -600,298 +807,619 @@ <HR> </LI> - <LI><A NAME="CGIoutsideScriptAlias"> - <STRONG>How do I enable CGI execution in directories other than - the ScriptAlias?</STRONG> + <LI><A NAME="year2000"> + <STRONG>Is Apache Year 2000 compliant?</STRONG> </A> <P> - Apache recognizes all files in a directory named as a - <A HREF="../mod/mod_alias.html#scriptalias"><SAMP>ScriptAlias</SAMP></A> - as being eligible for execution rather than processing as normal - documents. This applies regardless of the file name, so scripts in a - ScriptAlias directory don't need to be named - "<SAMP>*.cgi</SAMP>" or "<SAMP>*.pl</SAMP>" or - whatever. In other words, <EM>all</EM> files in a ScriptAlias - directory are scripts, as far as Apache is concerned. 
+ Yes, Apache is Year 2000 compliant. </P> <P> - To persuade Apache to execute scripts in other locations, such as in - directories where normal documents may also live, you must tell it how - to recognize them - and also that it's okay to execute them. For - this, you need to use something like the - <A HREF="../mod/mod_mime.html#addhandler"><SAMP>AddHandler</SAMP></A> - directive. + Apache internally never stores years as two digits. + On the HTTP protocol level RFC1123-style dates are generated + which is the only format a HTTP/1.1-compliant server should + generate. To be compatible with older applications Apache + recognizes ANSI C's <CODE>asctime()</CODE> and + RFC850-/RFC1036-style date formats, too. + The <CODE>asctime()</CODE> format uses four-digit years, + but the RFC850 and RFC1036 date formats only define a two-digit year. + If Apache sees such a date with a value less than 70 it assumes that + the century is <SAMP>20</SAMP> rather than <SAMP>19</SAMP>. </P> <P> - <OL> - <LI>In an appropriate section of your server configuration files, add - a line such as - <P> - <DL> - <DD><CODE>AddHandler cgi-script .cgi</CODE> - </DD> - </DL> - <P></P> - <P> - The server will then recognize that all files in that location (and - its logical descendants) that end in "<SAMP>.cgi</SAMP>" - are script files, not documents. - </P> - </LI> - <LI>Make sure that the directory location is covered by an - <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> - declaration that includes the <SAMP>ExecCGI</SAMP> option. - </LI> - </OL> - <P></P> + Although Apache is Year 2000 compliant, you may still get problems + if the underlying OS has problems with dates past year 2000 + (<EM>e.g.</EM>, OS calls which accept or return year numbers). + Most (UNIX) systems store dates internally as signed 32-bit integers + which contain the number of seconds since 1<SUP>st</SUP> January 1970, so + the magic boundary to worry about is the year 2038 and not 2000.
+ But modern operating systems shouldn't cause any trouble + at all. + </P> <P> - In some situations, you might not want to actually - allow all files named "<SAMP>*.cgi</SAMP>" to be executable. - Perhaps all you want is to enable a particular file in a normal directory to - be executable. This can be alternatively accomplished - <EM>via</EM> <A HREF="../mod/mod_rewrite.html"><SAMP>mod_rewrite</SAMP></A> - and the following steps: + Users of Apache 1.2.x should upgrade to a current version of Apache 1.3 + (see <A HREF="../new_features_1_3.html#misc">year-2000 improvements in + Apache 1.3</A> for details). </P> + <HR> + </LI> + + <LI><A NAME="submit_patch"> + <STRONG>How do I submit a patch to the Apache Group?</STRONG></A> + <P> + The Apache Group encourages patches from outside developers. There + are 2 main "types" of patches: small bugfixes and general + improvements. Bugfixes should be submitted using the Apache <A + HREF="http://www.apache.org/bug_report.html">bug report page</A>. + Improvements, modifications, and additions should follow the + instructions below. + </P> + <P> + In general, the first course of action is to be a member of the + <SAMP>new-httpd@apache.org</SAMP> mailing list. This indicates to + the Group that you are closely following the latest Apache + developments. Your patch file should be generated using either + '<CODE>diff -c</CODE>' or '<CODE>diff -u</CODE>' against + the latest CVS tree. To submit your patch, send email to + <SAMP>new-httpd@apache.org</SAMP> with a <SAMP>Subject:</SAMP> line + that starts with <SAMP>[PATCH]</SAMP> and includes a general + description of the patch. In the body of the message, the patch + should be clearly described and then included at the end of the + message. If the patch-file is long, you can note a URL to the file + instead of the file itself. Use of MIME enclosures/attachments + should be avoided.
+ </P> + <P> + Be prepared to respond to any questions about your patches and + possibly defend your code. If your patch results in a lot of + discussion, you may be asked to submit an updated patch that + incorporates all changes and suggestions. + </P> + <HR> + </LI> + + <LI><A NAME="domination"><STRONG>Why has Apache stolen my favourite site's + Internet address?</STRONG></A> + <P> + The simple answer is: "It hasn't." This misconception is usually + caused by the site in question having migrated to the Apache Web + server software, but not having migrated the site's content yet. When + Apache is installed, the default page that gets installed tells the + Webmaster the installation was successful. The expectation is that + this default page will be replaced with the site's real content. + If it doesn't, complain to the Webmaster, not to the Apache project -- + we just make the software and aren't responsible for what people + do (or don't do) with it. + </P> + <HR> + </LI> + + <LI><A NAME="apspam"><STRONG>Why am I getting spam mail from the + Apache site?</STRONG></A> + <P> + The short answer is: "You aren't." Usually when someone thinks the + Apache site is originating spam, it's because they've traced the + spam to a Web site, and the Web site says it's using Apache. See the + <A HREF="#domination">previous FAQ entry</A> for more details on this + phenomenon. + </P> + <P> + No marketing spam originates from the Apache site. The only mail + that comes from the site goes only to addresses that have been + <EM>requested</EM> to receive the mail. + </P> + <HR> + </LI> + + <LI><A NAME="redist"><STRONG>May I include the Apache software on a + CD or other package I'm distributing?</STRONG></A> + <P> + The detailed answer to this question can be found in the + Apache license, which is included in the Apache distribution in + the file <CODE>LICENSE</CODE>.
You can also find it on the Web at + <SAMP><<A HREF="http://www.apache.org/LICENSE.txt" + >http://www.apache.org/LICENSE.txt</A>></SAMP>. + </P> + <HR> + </LI> + + <LI><A NAME="zoom"> + <STRONG>What's the best hardware/operating system/... How do + I get the most out of my Apache Web server?</STRONG> + </A> <P> - <OL> - <LI>Locally add to the corresponding <SAMP>.htaccess</SAMP> file a ruleset - similar to this one: - <P> - <DL> - <DD><CODE>RewriteEngine on - <BR> - RewriteBase /~foo/bar/ - <BR> - RewriteRule ^quux\.cgi$ - [T=application/x-httpd-cgi]</CODE> - </DD> - </DL> - <P></P> - </LI> - <LI>Make sure that the directory location is covered by an - <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> - declaration that includes the <SAMP>ExecCGI</SAMP> and - <SAMP>FollowSymLinks</SAMP> option. - </LI> - </OL> - <P></P> + Check out Dean Gaudet's + <A HREF="http://www.apache.org/docs/misc/perf-tuning.html" + >performance tuning page</A>. + </P> <HR> </LI> - <LI><A NAME="premature-script-headers"> - <STRONG>What does it mean when my CGIs fail with - "<SAMP>Premature end of script headers</SAMP>"?</STRONG> + <LI><A NAME="regex"> + <STRONG>What are "regular expressions"?</STRONG></A> + <P> + Regular expressions are a way of describing a pattern - for example, "all + the words that begin with the letter A" or "every 10-digit phone number" + or even "Every sentence with two commas in it, and no capital letter Q". + Regular expressions (aka "regexp"s) are useful in Apache because they + let you apply certain attributes against collections of files or resources + in very flexible ways - for example, all .gif and .jpg files under + any "images" directory could be written as /.*\/images\/.*\.(jpg|gif)$/. + </P> + <P> + The best overview around is probably the one which comes with Perl. + We implement a simple subset of Perl's regexp support, but it's + still a good way to learn what they mean.
You can start by going + to the <A + HREF="http://www.perl.com/CPAN-local/doc/manual/html/pod/perlre.html#Version_8_Regular_Expresions" + >CPAN page on regular expressions</A>, and branching out from + there. + </P> + <HR> + </LI> +</OL> + + + + + + + + + + + + + + + + + <H3>C. Building Apache</H3> +<OL> + + <LI><A NAME="bind8.1"> + <STRONG>Why do I get an error about an undefined reference to + "<SAMP>__inet_ntoa</SAMP>" or other + <SAMP>__inet_*</SAMP> symbols?</STRONG> </A> <P> - It means just what it says: the server was expecting a complete set of - HTTP headers (one or more followed by a blank line), and didn't get - them. + If you have installed <A HREF="http://www.isc.org/bind.html">BIND-8</A> + then this is normally due to a conflict between your include files + and your libraries. BIND-8 installs its include files and libraries in + <CODE>/usr/local/include/</CODE> and <CODE>/usr/local/lib/</CODE>, while + the resolver that comes with your system is probably installed in + <CODE>/usr/include/</CODE> and <CODE>/usr/lib/</CODE>. If + your system uses the header files in <CODE>/usr/local/include/</CODE> + before those in <CODE>/usr/include/</CODE> but you do not use the new + resolver library, then the two versions will conflict. </P> <P> - The most common cause of this problem is the script dying before - sending the complete set of headers, or possibly any at all, to the - server. To see if this is the case, try running the script standalone - from an interactive session, rather than as a script under the server. - If you get error messages, this is almost certainly the cause of the - "premature end of script headers" message. + To resolve this, you can either make sure you use the include files + and libraries that came with your system or make sure to use the + new include files and libraries.
Adding <CODE>-lbind</CODE> to the + <CODE>EXTRA_LDFLAGS</CODE> line in your <SAMP>Configuration</SAMP> + file, then re-running <SAMP>Configure</SAMP>, should resolve the + problem. (Apache versions 1.2.* and earlier use + <CODE>EXTRA_LFLAGS</CODE> instead.) </P> <P> - The second most common cause of this (aside from people not - outputting the required headers at all) is a result of an interaction - with Perl's output buffering. To make Perl flush its buffers - after each output statement, insert the following statements around - the <CODE>print</CODE> or <CODE>write</CODE> statements that send your - HTTP headers: + <STRONG>Note:</STRONG> As of BIND 8.1.1, the bind libraries and files are + installed under <SAMP>/usr/local/bind</SAMP> by default, so you + should not run into this problem. Should you want to use the bind + resolvers you'll have to add the following to the respective lines: </P> <P> <DL> - <DD><CODE>{<BR> - local ($oldbar) = $|;<BR> - $cfh = select (STDOUT);<BR> - $| = 1;<BR> - #<BR> - # print your HTTP headers here<BR> - #<BR> - $| = $oldbar;<BR> - select ($cfh);<BR> - }</CODE> + <DD><CODE>EXTRA_CFLAGS=-I/usr/local/bind/include + <BR> + EXTRA_LDFLAGS=-L/usr/local/bind/lib + <BR> + EXTRA_LIBS=-lbind</CODE> </DD> </DL> <P></P> + <HR> + </LI> + + <LI><A NAME="cantbuild"> + <STRONG>Why won't Apache compile with my system's + <SAMP>cc</SAMP>?</STRONG> + </A> <P> - This is generally only necessary when you are calling external - programs from your script that send output to stdout, or if there will - be a long delay between the time the headers are sent and the actual - content starts being emitted. To maximize performance, you should - turn buffer-flushing back <EM>off</EM> (with <CODE>$| = 0</CODE> or the - equivalent) after the statements that send the headers, as displayed - above.
- </P> - <P> - If your script isn't written in Perl, do the equivalent thing for - whatever language you <EM>are</EM> using (<EM>e.g.</EM>, for C, call - <CODE>fflush()</CODE> after writing the headers). + If the server won't compile on your system, it is probably due to one + of the following causes: </P> + <UL> + <LI><STRONG>The <SAMP>Configure</SAMP> script doesn't recognize your system + environment.</STRONG> + <BR> + This might be either because it's completely unknown or because + the specific environment (include files, OS version, <EM>et + cetera</EM>) isn't explicitly handled. If this happens, you may + need to port the server to your OS yourself. + </LI> + <LI><STRONG>Your system's C compiler is garbage.</STRONG> + <BR> + Some operating systems include a default C compiler that is either + not ANSI C-compliant or suffers from other deficiencies. The usual + recommendation in cases like this is to acquire, install, and use + <SAMP>gcc</SAMP>. + </LI> + <LI><STRONG>Your <SAMP>include</SAMP> files may be confused.</STRONG> + <BR> + In some cases, we have found that a compiler installation or system + upgrade has left the C header files in an inconsistent state. Make + sure that your include directory tree is in sync with the compiler and + the operating system. + </LI> + <LI><STRONG>Your operating system or compiler may be out of + revision.</STRONG> + <BR> + Software vendors (including those that develop operating systems) + issue new releases for a reason; sometimes to add functionality, but + more often to fix bugs that have been discovered. Try upgrading + your compiler and/or your operating system. + </LI> + </UL> <P> - Another cause for the "premature end of script headers" - message are the RLimitCPU and RLimitMEM directives. You may - get the message if the CGI script was killed due to a - resource limit. + The Apache Group tests the ability to build the server on many + different platforms. 
Unfortunately, we can't test all of the OS + platforms there are. If you have verified that none of the above + issues is the cause of your problem, and it hasn't been reported + before, please submit a + <A HREF="http://www.apache.org/bug_report.html">problem report</A>. + Be sure to include <EM>complete</EM> details, such as the compiler + & OS versions and exact error messages. </P> <HR> </LI> - <LI><A NAME="ssi-part-i"> - <STRONG>How do I enable SSI (parsed HTML)?</STRONG> + <LI><A NAME="linuxiovec"> + <STRONG>Why do I get complaints about redefinition + of "<CODE>struct iovec</CODE>" when + compiling under Linux?</STRONG> </A> <P> - SSI (an acronym for Server-Side Include) directives allow static HTML - documents to be enhanced at run-time (<EM>e.g.</EM>, when delivered to - a client by Apache). The format of SSI directives is covered - in the <A HREF="../mod/mod_include.html">mod_include manual</A>; - suffice it to say that Apache supports not only SSI but - xSSI (eXtended SSI) directives. - </P> - <P> - Processing a document at run-time is called <EM>parsing</EM> it; hence - the term "parsed HTML" sometimes used for documents that - contain SSI instructions. Parsing tends to be <EM>extremely</EM> - resource-consumptive, and is not enabled by default. It can also - interfere with the cachability of your documents, which can put a - further load on your server. (see the - <A HREF="#ssi-part-ii">next question</A> for more information about this.) + This is a conflict between your C library includes and your kernel + includes. You need to make sure that the versions of both are matched + properly. There are two workarounds, either one will solve the problem: </P> <P> - To enable SSI processing, you need to - </P> <UL> - <LI>Build your server with the - <A HREF="../mod/mod_include.html"><SAMP>mod_include</SAMP></A> - module. This is normally compiled in by default. + <LI>Remove the definition of <CODE>struct iovec</CODE> from your C + library includes. 
It is located in <CODE>/usr/include/sys/uio.h</CODE>. + <STRONG>Or,</STRONG> </LI> - <LI>Make sure your server configuration files have an - <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> - directive which permits <SAMP>Includes</SAMP>. + <LI>Add <CODE>-DNO_WRITEV</CODE> to the <CODE>EXTRA_CFLAGS</CODE> + line in your <SAMP>Configuration</SAMP> and reconfigure/rebuild. + This hurts performance and should only be used as a last resort. </LI> - <LI>Make sure that the directory where you want the SSI documents to - live is covered by the "server-parsed" content handler, - either explicitly or in some ancestral location. That can be done - with the following - <A HREF="../mod/mod_mime.html#addhandler"><SAMP>AddHandler</SAMP></A> - directive: + </UL> + <P></P> + <HR> + </LI> + + <LI><A NAME="broken-gcc"><STRONG>I'm using gcc and I get some + compilation errors, what is wrong?</STRONG></A> <P> - <DL> - <DD><CODE>AddHandler server-parsed .shtml</CODE> - </DD> - </DL> - <P></P> + GCC parses your system header files and produces a modified subset which + it uses for compiling. This behaviour ties GCC tightly to the version + of your operating system. So, for example, if you were running IRIX 5.3 + when you built GCC and then upgrade to IRIX 6.2 later, you will have to + rebuild GCC. Similarly for Solaris 2.4, 2.5, or 2.5.1 when you upgrade + to 2.6. Sometimes you can type "gcc -v" and it will tell you the version + of the operating system it was built against. + </P> <P> - This indicates that all files ending in ".shtml" in that - location (or its descendants) should be parsed. Note that using - ".html" will cause all normal HTML files to be parsed, - which may put an inordinate load on your server. + If you fail to do this, then it is very likely that Apache will fail + to build. One of the most common errors is with <CODE>readv</CODE>, + <CODE>writev</CODE>, or <CODE>uio.h</CODE>. This is <STRONG>not</STRONG> a + bug with Apache. 
You will need to re-install GCC. </P> - </LI> - </UL> + <HR> + </LI> + + <LI><A NAME="glibc-crypt"> + <STRONG>I'm using RedHat Linux 5.0, or some other + <SAMP>glibc</SAMP>-based Linux system, and I get errors with the + <CODE>crypt</CODE> function when I attempt to build Apache 1.2.</STRONG> + </A> + <P> - For additional information, see the <CITE>Apache Week</CITE> article on - <A HREF="http://www.apacheweek.com/features/ssi" REL="Help" - ><CITE>Using Server Side Includes</CITE></A>. + <SAMP>glibc</SAMP> puts the <CODE>crypt</CODE> function into a separate + library. Edit your <CODE>src/Configuration</CODE> file and set this: + </P> + <DL> + <DD><CODE>EXTRA_LIBS=-lcrypt</CODE> + </DD> + </DL> + <P> + Then re-run <SAMP>src/Configure</SAMP> and re-execute the make. </P> <HR> </LI> - <LI><A NAME="ssi-part-ii"> - <STRONG>Why don't my parsed files get cached?</STRONG> +</OL> + + + + + + + + + + + + + + + + + <H3>D. Error Log Messages and Problems Starting Apache</H3> +<OL> + + <LI><A NAME="setgid"> + <STRONG>Why do I get "<SAMP>setgid: Invalid + argument</SAMP>" at startup?</STRONG> </A> <P> - Since the server is performing run-time processing of your SSI - directives, which may change the content shipped to the client, it - can't know at the time it starts parsing what the final size of the - result will be, or whether the parsed result will always be the same. - This means that it can't generate <SAMP>Content-Length</SAMP> or - <SAMP>Last-Modified</SAMP> headers. Caches commonly work by comparing - the <SAMP>Last-Modified</SAMP> of what's in the cache with that being - delivered by the server. Since the server isn't sending that header - for a parsed document, whatever's doing the caching can't tell whether - the document has changed or not - and so fetches it again to be on the - safe side. 
+ Your + <A HREF="../mod/core.html#group"><SAMP>Group</SAMP></A> + directive (probably in <SAMP>conf/httpd.conf</SAMP>) needs to name a + group that actually exists in the <SAMP>/etc/group</SAMP> file (or + your system's equivalent). This problem is also frequently seen when + a negative number is used in the <CODE>Group</CODE> directive + (<EM>e.g.</EM>, "<CODE>Group #-1</CODE>"). Using a group name + -- not group number -- found in your system's group database should + solve this problem in all cases. </P> + <HR> + </LI> + + <LI><A NAME="nodelay"> + <STRONG>Why am I getting "<SAMP>httpd: could not set socket + option TCP_NODELAY</SAMP>" in my error log?</STRONG> + </A> <P> - You can work around this in some cases by causing an - <SAMP>Expires</SAMP> header to be generated. (See the - <A HREF="../mod/mod_expires.html" REL="Help"><SAMP>mod_expires</SAMP></A> - documentation for more details.) Another possibility is to use the - <A HREF="../mod/mod_include.html#xbithack" REL="Help" - ><SAMP>XBitHack Full</SAMP></A> - mechanism, which tells Apache to send (under certain circumstances - detailed in the XBitHack directive description) a - <SAMP>Last-Modified</SAMP> header based upon the last modification - time of the file being parsed. Note that this may actually be lying - to the client if the parsed file doesn't change but the SSI-inserted - content does; if the included content changes often, this can result - in stale copies being cached. + This message almost always indicates that the client disconnected + before Apache reached the point of calling <CODE>setsockopt()</CODE> + for the connection. It shouldn't occur for more than about 1% of the + requests your server handles, and it's advisory only in any case. 
</P> <HR> </LI> - <LI><A NAME="ssi-part-iii"> - <STRONG>How can I have my script output parsed?</STRONG> + <LI><A NAME="peerreset"> + <STRONG>Why am I getting "<SAMP>connection reset by + peer</SAMP>" in my error log?</STRONG> </A> <P> - So you want to include SSI directives in the output from your CGI - script, but can't figure out how to do it? - The short answer is "you can't." This is potentially - a security liability and, more importantly, it can not be cleanly - implemented under the current server API. The best workaround - is for your script itself to do what the SSIs would be doing. - After all, it's generating the rest of the content. + This is a normal message and nothing about which to be alarmed. It simply + means that the client canceled the connection before it had been + completely set up - such as by the end-user pressing the "Stop" + button. People's patience being what it is, sites with response-time + problems or slow network links may experience this more than + high-capacity ones or those with large pipes to the network. </P> + <HR> + </LI> + + <LI><A NAME="wheres-the-dump"> + <STRONG>The errorlog says Apache dumped core, but where's the dump + file?</STRONG> + </A> <P> - This is a feature The Apache Group hopes to add in the next major - release after 1.3. + In Apache version 1.2, the error log message + about dumped core includes the directory where the dump file should be + located. However, many Unixes do not allow a process that has + called <CODE>setuid()</CODE> to dump core for security reasons; + the typical Apache setup has the server started as root to bind to + port 80, after which it changes UIDs to a non-privileged user to + serve requests. + </P> + <P> + Dealing with this is extremely operating system-specific, and may + require rebuilding your system kernel. Consult your operating system + documentation or vendor for more information about whether your system + does this and how to bypass it. 
If there <EM>is</EM> a documented way + of bypassing it, it is recommended that you bypass it only for the + <SAMP>httpd</SAMP> server process if possible. + </P> + <P> + The canonical location for Apache's core-dump files is the + <A HREF="../mod/core.html#serverroot">ServerRoot</A> + directory. As of Apache version 1.3, the location can be set <EM>via</EM> + the + <A HREF="../mod/core.html#coredumpdirectory" + ><SAMP>CoreDumpDirectory</SAMP></A> + directive to a different directory. Make sure that this directory is + writable by the user the server runs as (as opposed to the user the server + is <EM>started</EM> as). </P> <HR> </LI> - <LI><A NAME="ssi-part-iv"> - <STRONG>SSIs don't work for VirtualHosts and/or - user home directories.</STRONG> + <LI><A NAME="linux-shmget"> + <STRONG>When I run it under Linux I get "shmget: + function not found", what should I do?</STRONG> </A> <P> - This is almost always due to having some setting in your config file that - sets "Options Includes" or some other setting for your DocumentRoot - but not for other directories. If you set it inside a Directory - section, then that setting will only apply to that directory. + Your kernel has been built without SysV IPC support. You will have + to rebuild the kernel with that support enabled (it's under the + "General Setup" submenu). Documentation for kernel + building is beyond the scope of this FAQ; you should consult the <A + HREF="http://www.linuxhq.com/HOWTO/Kernel-HOWTO.html" >Kernel + HOWTO</A>, or the documentation provided with your distribution, or + a <A HREF="http://www.linuxhq.com/HOWTO/META-FAQ.html" >Linux + newsgroup/mailing list</A>. As a last-resort workaround, you can + comment out the <CODE>#define USE_SHMGET_SCOREBOARD</CODE> + definition in the <SAMP>LINUX</SAMP> section of + <SAMP>src/conf.h</SAMP> and rebuild the server (prior to 1.3b4, + simply removing <CODE>#define HAVE_SHMGET</CODE> would have + sufficed). 
This will produce a server which is slower and less + reliable. </P> + <HR> </LI> - <LI><A NAME="proxy"> - <STRONG>Does or will Apache act as a Proxy server?</STRONG> + <LI><A NAME="nfslocking"> + <STRONG>Server hangs, or fails to start, and/or error log + fills with "<SAMP>fcntl: F_SETLKW: No record locks + available</SAMP>" or similar messages</STRONG> </A> + <P> - Apache version 1.1 and above comes with a - <A HREF="../mod/mod_proxy.html">proxy module</A>. - If compiled in, this will make Apache act as a caching-proxy server. + These are symptoms of a file locking problem, which usually means that + the server is trying to use a synchronization file on an NFS filesystem. + </P> + <P> + Because of its parallel-operation model, the Apache Web server needs to + provide some form of synchronization when accessing certain resources. + One of these synchronization methods involves taking out locks on a file, + which means that the filesystem whereon the lockfile resides must support + locking. In many cases this means it <EM>can't</EM> be kept on an + NFS-mounted filesystem. + </P> + <P> + To cause the Web server to work around the NFS locking limitations, include + a line such as the following in your server configuration files: + </P> + <DL> + <DD><CODE>LockFile /var/run/apache-lock</CODE> + </DD> + </DL> + <P> + The directory should not be generally writable (<EM>e.g.</EM>, don't use + <SAMP>/var/tmp</SAMP>). + See the <A HREF="../mod/core.html#lockfile"><SAMP>LockFile</SAMP></A> + documentation for more information. </P> <HR> </LI> - <LI><A NAME="multiviews"> - <STRONG>What are "multiviews"?</STRONG> + <LI><A NAME="aixccbug"><STRONG>Why am I getting "<SAMP>Expected + &lt;/Directory&gt; but saw &lt;/Directory&gt;</SAMP>" when + I try to start Apache?</STRONG></A> + <P> + This is a known problem with certain versions of the AIX C compiler. 
+ IBM are working on a solution, and the issue is being tracked by + <A HREF="http://bugs.apache.org/index/full/2312">problem report #2312</A>. + </P> + <HR> + </LI> + + <LI><A NAME="redhat"> + <STRONG>I'm using RedHat Linux and I have problems with httpd + dying randomly or not restarting properly</STRONG> </A> + <P> - "Multiviews" is the general name given to the Apache - server's ability to provide language-specific document variants in - response to a request. This is documented quite thoroughly in the - <A HREF="../content-negotiation.html" REL="Help">content negotiation</A> - description page. In addition, <CITE>Apache Week</CITE> carried an - article on this subject entitled - "<A HREF="http://www.apacheweek.com/features/negotiation" REL="Help" - ><CITE>Content Negotiation Explained</CITE></A>". + RedHat Linux versions 4.x (and possibly earlier) RPMs contain + various nasty scripts which do not stop or restart Apache properly. + These can affect you even if you're not running the RedHat supplied + RPMs. + </P> + <P> + If you're using the default install then you're probably running + Apache 1.1.3, which is outdated. From RedHat's ftp site you can + pick up a more recent RPM for Apache 1.2.x. This will solve one of + the problems. + </P> + <P> + If you're using a custom built Apache rather than the RedHat RPMs + then you should <CODE>rpm -e apache</CODE>. In particular you want + the mildly broken <CODE>/etc/logrotate.d/apache</CODE> script to be + removed, and you want the broken <CODE>/etc/rc.d/init.d/httpd</CODE> + (or <CODE>httpd.init</CODE>) script to be removed. The latter is + actually fixed by the apache-1.2.5 RPMs but if you're building your + own Apache then you probably don't want the RedHat files. + </P> + <P> + We can't stress enough how important it is for folks, <EM>especially + vendors</EM> to follow the <A HREF="../stopping.html">stopping Apache + directions</A> given in our documentation. 
In RedHat's defense, + the broken scripts were necessary with Apache 1.1.x because the + Linux support in 1.1.x was very poor, and there were various race + conditions on all platforms. None of this should be necessary with + Apache 1.2 and later. </P> <HR> </LI> + <LI><A NAME="stopping"> + <STRONG>I upgraded from an Apache version earlier + than 1.2.0 and suddenly I have problems with Apache dying randomly + or not restarting properly</STRONG> + </A> + + <P> + You should read <A HREF="#redhat">the previous note</A> about + problems with RedHat installations. It is entirely likely that your + installation has start/stop/restart scripts which were built for + an earlier version of Apache. Versions earlier than 1.2.0 had + various race conditions that made it necessary to use + <CODE>kill -9</CODE> at times to take out all the httpd servers. + But that should not be necessary any longer. You should follow + the <A HREF="../stopping.html">directions on how to stop + and restart Apache</A>. + </P> + <P>As of Apache 1.3 there is a script + <CODE>src/support/apachectl</CODE> which, after a bit of + customization, is suitable for starting, stopping, and restarting + your server. + </P> + <HR> + </LI> + + <LI><A name="setservername"> + <b>When I try to start Apache from a DOS + window, I get a message like "<samp>Cannot determine host name. + Use ServerName directive to set it manually.</samp>" What does + this mean?</b></A> + + <p> + It means what it says; the Apache software can't determine the + hostname of your system. Edit your <samp>conf\httpd.conf</samp> + file, look for the string "ServerName", and make sure there's an + uncommented directive such as + </p> + <dl> + <dd><code>ServerName localhost</code></dd> + </dl> + <p> + or + </p> + <dl> + <dd><code>ServerName www.foo.com</code></dd> + </dl> + <p> + in the file. Correct it if there is one there with wrong information, or + add one if you don't already have one. Then try to start the server + again. 
+ </p> + <hr> + </LI> + +</OL> + + + + + + + + + + + + + + + + + <H3>E. Configuration Questions</H3> +<OL> + + <LI><A NAME="fdlim"> <STRONG>Why can't I run more than &lt;<EM>n</EM>&gt; virtual hosts?</STRONG> @@ -1024,76 +1552,6 @@ <HR> </LI> - <LI><A NAME="POSTnotallowed"> - <STRONG>Why do I keep getting "Method Not Allowed" for - form POST requests?</STRONG> - </A> - <P> - This is almost always due to Apache not being configured to treat the - file you are trying to POST to as a CGI script. You can not POST - to a normal HTML file; the operation has no meaning. See the FAQ - entry on <A HREF="#CGIoutsideScriptAlias">CGIs outside ScriptAliased - directories</A> for details on how to configure Apache to treat the - file in question as a CGI. - </P> - <HR> - </LI> - - <LI><A NAME="passwdauth"> - <STRONG>Can I use my <SAMP>/etc/passwd</SAMP> file - for Web page authentication?</STRONG> - </A> - <P> - Yes, you can - but it's a <STRONG>very bad idea</STRONG>. Here are - some of the reasons: - </P> - <UL> - <LI>The Web technology provides no governors on how often or how - rapidly password (authentication failure) retries can be made. That - means that someone can hammer away at your system's - <SAMP>root</SAMP> password using the Web, using a dictionary or - similar mass attack, just as fast as the wire and your server can - handle the requests. Most operating systems these days include - attack detection (such as <EM>n</EM> failed passwords for the same - account within <EM>m</EM> seconds) and evasion (breaking the - connection, disabling the account under attack, disabling - <EM>all</EM> logins from that source, <EM>et cetera</EM>), but the - Web does not. - </LI> - <LI>An account under attack isn't notified (unless the server is - heavily modified); there's no "You have 19483 login - failures" message when the legitimate owner logs in. 
- </LI> - <LI>Without an exhaustive and error-prone examination of the server - logs, you can't tell whether an account has been compromised. - Detecting that an attack has occurred, or is in progress, is fairly - obvious, though - <EM>if</EM> you look at the logs. - </LI> - <LI>Web authentication passwords (at least for Basic authentication) - generally fly across the wire, and through intermediate proxy - systems, in what amounts to plain text. "O'er the net we - go/Caching all the way;/O what fun it is to surf/Giving my password - away!" - </LI> - <LI>Since HTTP is stateless, information about the authentication is - transmitted <EM>each and every time</EM> a request is made to the - server. Essentially, the client caches it after the first - successful access, and transmits it without asking for all - subsequent requests to the same server. - </LI> - <LI>It's relatively trivial for someone on your system to put up a - page that will steal the cached password from a client's cache - without them knowing. Can you say "password grabber"? - </LI> - </UL> - <P> - If you still want to do this in light of the above disadvantages, the - method is left as an exercise for the reader. It'll void your Apache - warranty, though, and you'll lose all accumulated UNIX guru points. - </P> - <HR> - </LI> - <LI><A NAME="errordoc401"> <STRONG>Why doesn't my <CODE>ErrorDocument 401</CODE> work?</STRONG> </A> @@ -1107,39 +1565,11 @@ <HR> </LI> - <LI><A NAME="errordocssi"> - <STRONG>How can I use <CODE>ErrorDocument</CODE> - and SSI to simplify customized error messages?</STRONG> - </A> - <P> - Have a look at <A HREF="custom_errordocs.html">this document</A>. - It shows in example form how you can a combination of XSSI and - negotiation to tailor a set of <CODE>ErrorDocument</CODE>s to your - personal taste, and returning different internationalized error - responses based on the client's native language. 
- </P> - <HR> - </LI> - - <LI><A NAME="setgid"> - <STRONG>Why do I get "<SAMP>setgid: Invalid - argument</SAMP>" at startup?</STRONG> - </A> - <P> - Your - <A HREF="../mod/core.html#group"><SAMP>Group</SAMP></A> - directive (probably in <SAMP>conf/httpd.conf</SAMP>) needs to name a - group that actually exists in the <SAMP>/etc/group</SAMP> file (or - your system's equivalent). - </P> - <HR> - </LI> - <LI><A NAME="cookies1"> <STRONG>Why does Apache send a cookie on every response?</STRONG> </A> <P> - Apache does <EM>not</EM> send automatically send a cookie on every + Apache does <EM>not</EM> automatically send a cookie on every response, unless you have re-compiled it with the <A HREF="../mod/mod_usertrack.html"><SAMP>mod_usertrack</SAMP></A> module, and specifically enabled it with the @@ -1224,437 +1654,618 @@ <HR> </LI> - <LI><A NAME="putsupport"> - <STRONG>Why can't I publish to my Apache server using PUT on - Netscape Gold and other programs?</STRONG> + <LI><A NAME="midi"> + <STRONG>How do I get Apache to send a MIDI file so the browser can + play it?</STRONG> </A> <P> - Because you need to install and configure a script to handle - the uploaded files. This script is often called a "PUT" handler. - There are several available, but they may have security problems. - Using FTP uploads may be easier and more secure, at least for now. - For more information, see the <CITE>Apache Week</CITE> article - <A HREF="http://www.apacheweek.com/features/put" - ><CITE>Publishing Pages with PUT</CITE></A>. + Even though the registered MIME type for MIDI files is + <SAMP>audio/midi</SAMP>, some browsers are not set up to recognize it + as such; instead, they look for <SAMP>audio/x-midi</SAMP>. There are + two things you can do to address this: </P> + <OL> + <LI>Configure your browser to treat documents of type + <SAMP>audio/midi</SAMP> correctly. This is the type that Apache + sends by default. 
This may not be workable, however, if you have + many client installations to change, or if some or many of the + clients are not under your control. + </LI> + <LI>Instruct Apache to send a different <SAMP>Content-type</SAMP> + header for these files by adding the following line to your server's + configuration files: + <P> + <DL> + <DD><CODE>AddType audio/x-midi .mid .midi .kar</CODE> + </DD> + </DL> + <P></P> + <P> + Note that this may break browsers that <EM>do</EM> recognize the + <SAMP>audio/midi</SAMP> MIME type unless they're prepared to also + handle <SAMP>audio/x-midi</SAMP> the same way. + </P> + </LI> + </OL> <HR> </LI> - <LI><A NAME="fastcgi"> - <STRONG>Why isn't FastCGI included with Apache any more?</STRONG> + <LI><A NAME="addlog"> + <STRONG>How do I add browsers and referrers to my logs?</STRONG> </A> <P> - The simple answer is that it was becoming too difficult to keep the - version being included with Apache synchronized with the master copy - at the - <A HREF="http://www.fastcgi.com/" - >FastCGI web site</A>. When a new version of Apache was released, the - version of the FastCGI module included with it would soon be out of date. + Apache provides a couple of different ways of doing this. The + recommended method is to compile the + <A HREF="../mod/mod_log_config.html"><SAMP>mod_log_config</SAMP></A> + module into your configuration and use the + <A HREF="../mod/mod_log_config.html#customlog"><SAMP>CustomLog</SAMP></A> + directive. </P> <P> - You can still obtain the FastCGI module for Apache from the master - FastCGI web site. + You can either log the additional information in files other than your + normal transfer log, or you can add them to the records already being + written. 
For example: + </P> + <P> + <CODE> + CustomLog logs/access_log "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" + </CODE> + </P> + <P> + This will add the values of the <SAMP>User-agent:</SAMP> and + <SAMP>Referer:</SAMP> headers, which indicate the client and the + referring page, respectively, to the end of each line in the access + log. + </P> + <P> + You may want to check out the <CITE>Apache Week</CITE> article + entitled: + "<A HREF="http://www.apacheweek.com/features/logfiles" REL="Help" + ><CITE>Gathering Visitor Information: Customizing Your + Logfiles</CITE></A>". </P> <HR> </LI> - <LI><A NAME="nodelay"> - <STRONG>Why am I getting "<SAMP>httpd: could not set socket - option TCP_NODELAY</SAMP>" in my error log?</STRONG> + <LI><A NAME="set-servername"> + <STRONG>Why does accessing directories only work when I include + the trailing "/" + (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user/</SAMP>) + but not when I omit it + (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user</SAMP>)?</STRONG> </A> <P> - This message almost always indicates that the client disconnected - before Apache reached the point of calling <CODE>setsockopt()</CODE> - for the connection. It shouldn't occur for more than about 1% of the - requests your server handles, and it's advisory only in any case. + When you access a directory without a trailing "/", Apache needs + to send what is called a redirect to the client to tell it to + add the trailing slash. If it did not do so, relative URLs would + not work properly. When it sends the redirect, it needs to know + the name of the server so that it can include it in the redirect. + There are two ways for Apache to find this out; either it can guess, + or you can tell it. If your DNS is configured correctly, it can + normally guess without any problems. If it is not, however, then + you need to tell it. 
+ </P> + <P> + Add a <A HREF="../mod/core.html#servername">ServerName</A> directive + to the config file to tell it what the domain name of the server is. </P> <HR> </LI> - <LI><A NAME="peerreset"> - <STRONG>Why am I getting "<SAMP>connection reset by - peer</SAMP>" in my error log?</STRONG> + <LI><A NAME="no-info-directives"> + <STRONG>Why doesn't mod_info list any directives?</STRONG> </A> <P> - This is a normal message and nothing about which to be alarmed. It simply - means that the client canceled the connection before it had been - completely set up - such as by the end-user pressing the "Stop" - button. People's patience being what it is, sites with response-time - problems or slow network links may experiences this more than - high-capacity ones or those with large pipes to the network. + The <A HREF="../mod/mod_info.html"><SAMP>mod_info</SAMP></A> + module allows you to use a Web browser to see how your server is + configured. Among the information it displays is the list of modules and + their configuration directives. The "current" values for + the directives are not necessarily those of the running server; they + are extracted from the configuration files themselves at the time of + the request. If the files have been changed since the server was last + reloaded, the display will not match the values actively in use. + If the files and the path to the files are not readable by the user as + which the server is running (see the + <A HREF="../mod/core.html#user"><SAMP>User</SAMP></A> + directive), then <SAMP>mod_info</SAMP> cannot read them in order to + list their values. An entry <EM>will</EM> be made in the error log in + this event, however. </P> <HR> </LI> - <LI><A NAME="nph-scripts"> - <STRONG>How can I get my script's output without Apache buffering - it? 
Why doesn't my server push work?</STRONG> + <LI><A NAME="namevhost"> + <STRONG>I upgraded to Apache 1.3 and now my virtual hosts don't + work!</STRONG> </A> <P> - As of Apache 1.3, CGI scripts are essentially not buffered. Every time - your script does a "flush" to output data, that data gets relayed on to - the client. Some scripting languages, for example Perl, have their own - buffering for output - this can be disabled by setting the <CODE>$|</CODE> - special variable to 1. Of course this does increase the overall number - of packets being transmitted, which can result in a sense of slowness for - the end user. + In versions of Apache prior to 1.3b2, there was a lot of confusion + regarding address-based virtual hosts and (HTTP/1.1) name-based + virtual hosts, and the rules concerning how the server processed + <SAMP>&lt;VirtualHost&gt;</SAMP> definitions were very complex and not + well documented. </P> - <P>Prior to 1.3, you needed to use "nph-" scripts to accomplish non-buffering. - Today, the only difference between nph scripts and normal scripts is - that nph scripts require the full HTTP headers to be sent. + <P> + Apache 1.3b2 introduced a new directive, + <A HREF="http://www.apache.org/docs/mod/core.html#namevirtualhost" + ><SAMP>NameVirtualHost</SAMP></A>, + which simplifies the rules quite a bit. However, changing the rules + like this means that your existing name-based + <SAMP>&lt;VirtualHost&gt;</SAMP> containers probably won't work + correctly immediately following the upgrade. </P> - <HR> - </LI> - - <LI><A NAME="linuxiovec"> - <STRONG>Why do I get complaints about redefinition - of "<CODE>struct iovec</CODE>" when - compiling under Linux?</STRONG> - </A> <P> - This is a conflict between your C library includes and your kernel - includes. You need to make sure that the versions of both are matched - properly. 
There are two workarounds, either one will solve the problem: + To correct this problem, add the following line to the beginning of + your server configuration file, before defining any virtual hosts: </P> + <DL> + <DD><CODE>NameVirtualHost <EM>n.n.n.n</EM></CODE> + </DD> + </DL> <P> - <UL> - <LI>Remove the definition of <CODE>struct iovec</CODE> from your C - library includes. It is located in <CODE>/usr/include/sys/uio.h</CODE>. - <STRONG>Or,</STRONG> - </LI> - <LI>Add <CODE>-DNO_WRITEV</CODE> to the <CODE>EXTRA_CFLAGS</CODE> - line in your <SAMP>Configuration</SAMP> and reconfigure/rebuild. - This hurts performance and should only be used as a last resort. - </LI> - </UL> - <P></P> + Replace the "<SAMP>n.n.n.n</SAMP>" with the IP address to + which the name-based virtual host names resolve; if you have multiple + name-based hosts on multiple addresses, repeat the directive for each + address. + </P> + <P> + Make sure that your name-based <SAMP>&lt;VirtualHost&gt;</SAMP> blocks + contain <SAMP>ServerName</SAMP> and possibly <SAMP>ServerAlias</SAMP> + directives so Apache can be sure to tell them apart correctly. + </P> + <P> + Please see the + <A HREF="http://www.apache.org/docs/vhosts/">Apache + Virtual Host documentation</A> for further details about configuration. + </P> <HR> </LI> - <LI><A NAME="wheres-the-dump"> - <STRONG>The errorlog says Apache dumped core, but where's the dump - file?</STRONG> + <LI><A NAME="redhat-htm"> + <STRONG>I'm using RedHat Linux and my .htm files are showing + up as HTML source rather than being formatted!</STRONG> </A> + <P> - In Apache version 1.2, the error log message - about dumped core includes the directory where the dump file should be - located. However, many Unixes do not allow a process that has - called <CODE>setuid()</CODE> to dump core for security reasons; - the typical Apache setup has the server started as root to bind to - port 80, after which it changes UIDs to a non-privileged user to - serve requests. 
- </P> - <P> - Dealing with this is extremely operating system-specific, and may - require rebuilding your system kernel. Consult your operating system - documentation or vendor for more information about whether your system - does this and how to bypass it. If there <EM>is</EM> a documented way - of bypassing it, it is recommended that you bypass it only for the - <SAMP>httpd</SAMP> server process if possible. + RedHat messed up and forgot to put a content type for <CODE>.htm</CODE> + files into <CODE>/etc/mime.types</CODE>. Edit <CODE>/etc/mime.types</CODE>, + find the line containing <CODE>html</CODE> and add <CODE>htm</CODE> to it. + Then restart your httpd server: </P> + <DL> + <DD><CODE>kill -HUP `cat /var/run/httpd.pid`</CODE> + </DD> + </DL> <P> - The canonical location for Apache's core-dump files is the - <A HREF="../mod/core.html#serverroot">ServerRoot</A> - directory. As of Apache version 1.3, the location can be set <EM>via</EM> - the - <A HREF="../mod/core.html#coredumpdirectory" - ><SAMP>CoreDumpDirectory</SAMP></A> - directive to a different directory. Make sure that this directory is - writable by the user the server runs as (as opposed to the user the server - is <EM>started</EM> as). + Then <STRONG>clear your browsers' caches</STRONG>. (Many browsers won't + re-examine the content type after they've reloaded a page.) </P> <HR> </LI> - <LI><A NAME="dnsauth"> - <STRONG>Why isn't restricting access by host or domain name - working correctly?</STRONG> + <LI><A NAME="htaccess-work"> + <STRONG>My <CODE>.htaccess</CODE> files are being ignored.</STRONG></A> + <P> + This is almost always due to your <A HREF="../mod/core.html#allowoverride"> + AllowOverride</A> directive being set incorrectly for the directory in + question. If it is set to <CODE>None</CODE> then .htaccess files will + not even be looked for. If you do have one that is set, then be certain + it covers the directory you are trying to use the .htaccess file in. 
+ This is normally accomplished by ensuring it is inside the proper + <A HREF="../mod/core.html#directory">Directory</A> container. + </P> + <HR> + </LI> + <LI><A NAME="forbidden"> + <STRONG>Why do I get a "<SAMP>Forbidden</SAMP>" message + whenever I try to access a particular directory?</STRONG></A> + <P> + This message is generally caused because either + </P> + <UL> + <LI>The underlying file system permissions do not allow the + User/Group under which Apache is running to access the necessary + files; or + <LI>The Apache configuration has some access restrictions in + place which forbid access to the files. + </UL> + <P> + You can determine which case applies to your situation by checking the + error log. + </P> + <P> + In the case where file system permission are at fault, remember + that not only must the directory and files in question be readable, + but also all parent directories must be at least searchable by the + web server in order for the content to be accessible. + </P> + <HR> + </LI> +</OL> + + + + + + + + + + + + + + + + + <H3>F. Dynamic Content (CGI and SSI)</H3> +<OL> + + <LI><A NAME="CGIoutsideScriptAlias"> + <STRONG>How do I enable CGI execution in directories other than + the ScriptAlias?</STRONG> </A> <P> - Two of the most common causes of this are: + Apache recognizes all files in a directory named as a + <A HREF="../mod/mod_alias.html#scriptalias"><SAMP>ScriptAlias</SAMP></A> + as being eligible for execution rather than processing as normal + documents. This applies regardless of the file name, so scripts in a + ScriptAlias directory don't need to be named + "<SAMP>*.cgi</SAMP>" or "<SAMP>*.pl</SAMP>" or + whatever. In other words, <EM>all</EM> files in a ScriptAlias + directory are scripts, as far as Apache is concerned. </P> + <P> + To persuade Apache to execute scripts in other locations, such as in + directories where normal documents may also live, you must tell it how + to recognize them - and also that it's okay to execute them. 
For + this, you need to use something like the + <A HREF="../mod/mod_mime.html#addhandler"><SAMP>AddHandler</SAMP></A> + directive. + </P> + <P> <OL> - <LI><STRONG>An error, inconsistency, or unexpected mapping in the DNS - registration</STRONG> - <BR> - This happens frequently: your configuration restricts access to - <SAMP>Host.FooBar.Com</SAMP>, but you can't get in from that host. - The usual reason for this is that <SAMP>Host.FooBar.Com</SAMP> is - actually an alias for another name, and when Apache performs the - address-to-name lookup it's getting the <EM>real</EM> name, not - <SAMP>Host.FooBar.Com</SAMP>. You can verify this by checking the - reverse lookup yourself. The easiest way to work around it is to - specify the correct host name in your configuration. - </LI> - <LI><STRONG>Inadequate checking and verification in your - configuration of Apache</STRONG> - <BR> - If you intend to perform access checking and restriction based upon - the client's host or domain name, you really need to configure - Apache to double-check the origin information it's supplied. You do - this by adding the <SAMP>-DMAXIMUM_DNS</SAMP> clause to the - <SAMP>EXTRA_CFLAGS</SAMP> definition in your - <SAMP>Configuration</SAMP> file. For example: + <LI>In an appropriate section of your server configuration files, add + a line such as <P> <DL> - <DD><CODE>EXTRA_CFLAGS=-DMAXIMUM_DNS</CODE> + <DD><CODE>AddHandler cgi-script .cgi</CODE> </DD> </DL> <P></P> <P> - This will cause Apache to be very paranoid about making sure a - particular host address is <EM>really</EM> assigned to the name it - claims to be. Note that this <EM>can</EM> incur a significant - performance penalty, however, because of all the name resolution - requests being sent to a nameserver. + The server will then recognize that all files in that location (and + its logical descendants) that end in "<SAMP>.cgi</SAMP>" + are script files, not documents. 
</P> </LI> + <LI>Make sure that the directory location is covered by an + <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> + declaration that includes the <SAMP>ExecCGI</SAMP> option. + </LI> + </OL> + <P></P> + <P> + In some situations, you might not want to actually + allow all files named "<SAMP>*.cgi</SAMP>" to be executable. + Perhaps all you want is to enable a particular file in a normal directory to + be executable. This can be alternatively accomplished + <EM>via</EM> <A HREF="../mod/mod_rewrite.html"><SAMP>mod_rewrite</SAMP></A> + and the following steps: + </P> + <P> + <OL> + <LI>Locally add to the corresponding <SAMP>.htaccess</SAMP> file a ruleset + similar to this one: + <P> + <DL> + <DD><CODE>RewriteEngine on + <BR> + RewriteBase /~foo/bar/ + <BR> + RewriteRule ^quux\.cgi$ - [T=application/x-httpd-cgi]</CODE> + </DD> + </DL> + <P></P> + </LI> + <LI>Make sure that the directory location is covered by an + <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> + declaration that includes the <SAMP>ExecCGI</SAMP> and + <SAMP>FollowSymLinks</SAMP> option. + </LI> </OL> + <P></P> <HR> </LI> - <LI><A NAME="SSL-i"> - <STRONG>Why doesn't Apache include SSL?</STRONG> + <LI><A NAME="premature-script-headers"> + <STRONG>What does it mean when my CGIs fail with + "<SAMP>Premature end of script headers</SAMP>"?</STRONG> </A> <P> - SSL (Secure Socket Layer) data transport requires encryption, and many - governments have restrictions upon the import, export, and use of - encryption technology. If Apache included SSL in the base package, - its distribution would involve all sorts of legal and bureaucratic - issues, and it would no longer be freely available. Also, some of - the technology required to talk to current clients using SSL is - patented by <A HREF="http://www.rsa.com/">RSA Data Security</A>, - who restricts its use without a license. 
+ It means just what it says: the server was expecting a complete set of + HTTP headers (one or more followed by a blank line), and didn't get + them. </P> <P> - Some SSL implementations of Apache are available, however; see the - "<A HREF="http://www.apache.org/related_projects.html" - >related projects</A>" - page at the main Apache web site. + The most common cause of this problem is the script dying before + sending the complete set of headers, or possibly any at all, to the + server. To see if this is the case, try running the script standalone + from an interactive session, rather than as a script under the server. + If you get error messages, this is almost certainly the cause of the + "premature end of script headers" message. Even if the CGI + runs fine from the command line, remember that the environment and + permissions may be different when running under the web server. The + CGI can only access resources allowed for the <A + HREF="../mod/core.html#user"><CODE>User</CODE></A> and + <A HREF="../mod/core.html#group"><CODE>Group</CODE></A> specified in + your Apache configuration. In addition, the environment will not be + the same as the one provided on the command line, but it can be + adjusted using the directives provided by <A + HREF="../mod/mod_env.html">mod_env</A>. </P> <P> - You can find out more about this topic in the <CITE>Apache Week</CITE> - article about - <A HREF="http://www.apacheweek.com/features/ssl" REL="Help" - ><CITE>Apache and Secure Transactions</CITE></A>. + The second most common cause of this (aside from people not + outputting the required headers at all) is a result of an interaction + with Perl's output buffering. 
To make Perl flush its buffers + after each output statement, insert the following statements around + the <CODE>print</CODE> or <CODE>write</CODE> statements that send your + HTTP headers: + </P> + <P> + <DL> + <DD><CODE>{<BR> + local ($oldbar) = $|;<BR> + $cfh = select (STDOUT);<BR> + $| = 1;<BR> + #<BR> + # print your HTTP headers here<BR> + #<BR> + $| = $oldbar;<BR> + select ($cfh);<BR> + }</CODE> + </DD> + </DL> + <P></P> + <P> + This is generally only necessary when you are calling external + programs from your script that send output to stdout, or if there will + be a long delay between the time the headers are sent and the actual + content starts being emitted. To maximize performance, you should + turn buffer-flushing back <EM>off</EM> (with <CODE>$| = 0</CODE> or the + equivalent) after the statements that send the headers, as displayed + above. + </P> + <P> + If your script isn't written in Perl, do the equivalent thing for + whatever language you <EM>are</EM> using (<EM>e.g.</EM>, for C, call + <CODE>fflush()</CODE> after writing the headers). + </P> + <P> + Another cause for the "premature end of script headers" + message are the RLimitCPU and RLimitMEM directives. You may + get the message if the CGI script was killed due to a + resource limit. + </P> + <P> + In addition, a configuration problem in <A + HREF="../suexec.html">suEXEC</A>, mod_perl, or another third party + module can often interfere with the execution of your CGI and cause + the "premature end of script headers" message. </P> <HR> </LI> - <LI><A NAME="midi"> - <STRONG>How do I get Apache to send a MIDI file so the browser can - play it?</STRONG> + <LI><A NAME="POSTnotallowed"> + <STRONG>Why do I keep getting "Method Not Allowed" for + form POST requests?</STRONG> </A> <P> - Even though the registered MIME type for MIDI files is - <SAMP>audio/midi</SAMP>, some browsers are not set up to recognize it - as such; instead, they look for <SAMP>audio/x-midi</SAMP>. 
There are - two things you can do to address this: + This is almost always due to Apache not being configured to treat the + file you are trying to POST to as a CGI script. You can not POST + to a normal HTML file; the operation has no meaning. See the FAQ + entry on <A HREF="#CGIoutsideScriptAlias">CGIs outside ScriptAliased + directories</A> for details on how to configure Apache to treat the + file in question as a CGI. </P> - <OL> - <LI>Configure your browser to treat documents of type - <SAMP>audio/midi</SAMP> correctly. This is the type that Apache - sends by default. This may not be workable, however, if you have - many client installations to change, or if some or many of the - clients are not under your control. - </LI> - <LI>Instruct Apache to send a different <SAMP>Content-type</SAMP> - header for these files by adding the following line to your server's - configuration files: - <P> - <DL> - <DD><CODE>AddType audio/x-midi .mid .midi .kar</CODE> - </DD> - </DL> - <P></P> - <P> - Note that this may break browsers that <EM>do</EM> recognize the - <SAMP>audio/midi</SAMP> MIME type unless they're prepared to also - handle <SAMP>audio/x-midi</SAMP> the same way. - </P> - </LI> - </OL> <HR> </LI> - <LI><A NAME="cantbuild"> - <STRONG>Why won't Apache compile with my system's - <SAMP>cc</SAMP>?</STRONG> + <LI><A NAME="nph-scripts"> + <STRONG>How can I get my script's output without Apache buffering + it? Why doesn't my server push work?</STRONG> </A> <P> - If the server won't compile on your system, it is probably due to one - of the following causes: + As of Apache 1.3, CGI scripts are essentially not buffered. Every time + your script does a "flush" to output data, that data gets relayed on to + the client. Some scripting languages, for example Perl, have their own + buffering for output - this can be disabled by setting the <CODE>$|</CODE> + special variable to 1. 
Of course this does increase the overall number + of packets being transmitted, which can result in a sense of slowness for + the end user. </P> - <UL> - <LI><STRONG>The <SAMP>Configure</SAMP> script doesn't recognize your system - environment.</STRONG> - <BR> - This might be either because it's completely unknown or because - the specific environment (include files, OS version, <EM>et - cetera</EM>) isn't explicitly handled. If this happens, you may - need to port the server to your OS yourself. - </LI> - <LI><STRONG>Your system's C compiler is garbage.</STRONG> - <BR> - Some operating systems include a default C compiler that is either - not ANSI C-compliant or suffers from other deficiencies. The usual - recommendation in cases like this is to acquire, install, and use - <SAMP>gcc</SAMP>. - </LI> - <LI><STRONG>Your <SAMP>include</SAMP> files may be confused.</STRONG> - <BR> - In some cases, we have found that a compiler installation or system - upgrade has left the C header files in an inconsistent state. Make - sure that your include directory tree is in sync with the compiler and - the operating system. - </LI> - <LI><STRONG>Your operating system or compiler may be out of - revision.</STRONG> - <BR> - Software vendors (including those that develop operating systems) - issue new releases for a reason; sometimes to add functionality, but - more often to fix bugs that have been discovered. Try upgrading - your compiler and/or your operating system. - </LI> - </UL> - <P> - The Apache Group tests the ability to build the server on many - different platforms. Unfortunately, we can't test all of the OS - platforms there are. If you have verified that none of the above - issues is the cause of your problem, and it hasn't been reported - before, please submit a - <A HREF="http://www.apache.org/bug_report.html">problem report</A>. - Be sure to include <EM>complete</EM> details, such as the compiler - & OS versions and exact error messages. 
+ <P>Prior to 1.3, you needed to use "nph-" scripts to accomplish + non-buffering. Today, the only difference between nph scripts and + normal scripts is that nph scripts require the full HTTP headers to + be sent. </P> <HR> </LI> - <LI><A NAME="addlog"> - <STRONG>How do I add browsers and referrers to my logs?</STRONG> + <LI><A NAME="cgi-spec"> + <STRONG>Where can I find the "CGI specification"?</STRONG> </A> <P> - Apache provides a couple of different ways of doing this. The - recommended method is to compile the - <A HREF="../mod/mod_log_config.html"><SAMP>mod_log_config</SAMP></A> - module into your configuration and use the - <A HREF="../mod/mod_log_config.html#customlog"><SAMP>CustomLog</SAMP></A> - directive. - </P> - <P> - You can either log the additional information in files other than your - normal transfer log, or you can add them to the records already being - written. For example: + The Common Gateway Interface (CGI) specification can be found at + the original NCSA site + <<A HREF="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"> + <SAMP>http://hoohoo.ncsa.uiuc.edu/cgi/interface.html</SAMP></A>>. + This version hasn't been updated since 1995, and there have been + some efforts to update it. </P> <P> - <CODE> - CustomLog logs/access_log "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" - </CODE> + A new draft is being worked on with the intent of making it an informational + RFC; you can find out more about this project at + <<A HREF="http://web.golux.com/coar/cgi/" + ><SAMP>http://web.golux.com/coar/cgi/</SAMP></A>>. </P> + <HR> + </LI> + + <LI><A NAME="fastcgi"> + <STRONG>Why isn't FastCGI included with Apache any more?</STRONG> + </A> <P> - This will add the values of the <SAMP>User-agent:</SAMP> and - <SAMP>Referer:</SAMP> headers, which indicate the client and the - referring page, respectively, to the end of each line in the access - log. 
+ The simple answer is that it was becoming too difficult to keep the + version being included with Apache synchronized with the master copy + at the + <A HREF="http://www.fastcgi.com/" + >FastCGI web site</A>. When a new version of Apache was released, the + version of the FastCGI module included with it would soon be out of date. </P> <P> - You may want to check out the <CITE>Apache Week</CITE> article - entitled: - "<A HREF="http://www.apacheweek.com/features/logfiles" REL="Help" - ><CITE>Gathering Visitor Information: Customising Your - Logfiles</CITE></A>". + You can still obtain the FastCGI module for Apache from the master + FastCGI web site. </P> <HR> </LI> - <LI><A NAME="bind8.1"> - <STRONG>Why do I get an error about an undefined reference to - "<SAMP>__inet_ntoa</SAMP>" or other - <SAMP>__inet_*</SAMP> symbols?</STRONG> + <LI><A NAME="ssi-part-i"> + <STRONG>How do I enable SSI (parsed HTML)?</STRONG> </A> <P> - If you have installed <A HREF="http://www.isc.org/bind.html">BIND-8</A> - then this is normally due to a conflict between your include files - and your libraries. BIND-8 installs its include files and libraries - <CODE>/usr/local/include/</CODE> and <CODE>/usr/local/lib/</CODE>, while - the resolver that comes with your system is probably installed in - <CODE>/usr/include/</CODE> and <CODE>/usr/lib/</CODE>. If - your system uses the header files in <CODE>/usr/local/include/</CODE> - before those in <CODE>/usr/include/</CODE> but you do not use the new - resolver library, then the two versions will conflict. + SSI (an acronym for Server-Side Include) directives allow static HTML + documents to be enhanced at run-time (<EM>e.g.</EM>, when delivered to + a client by Apache). The format of SSI directives is covered + in the <A HREF="../mod/mod_include.html">mod_include manual</A>; + suffice it to say that Apache supports not only SSI but + xSSI (eXtended SSI) directives. 
</P> <P> - To resolve this, you can either make sure you use the include files - and libraries that came with your system or make sure to use the - new include files and libraries. Adding <CODE>-lbind</CODE> to the - <CODE>EXTRA_LDFLAGS</CODE> line in your <SAMP>Configuration</SAMP> - file, then re-running <SAMP>Configure</SAMP>, should resolve the - problem. (Apache versions 1.2.* and earlier use - <CODE>EXTRA_LFLAGS</CODE> instead.) + Processing a document at run-time is called <EM>parsing</EM> it; hence + the term "parsed HTML" sometimes used for documents that + contain SSI instructions. Parsing tends to be <EM>extremely</EM> + resource-consumptive, and is not enabled by default. It can also + interfere with the cachability of your documents, which can put a + further load on your server. (see the + <A HREF="#ssi-part-ii">next question</A> for more information about this.) </P> <P> - <STRONG>Note:</STRONG>As of BIND 8.1.1, the bind libraries and files are - installed under <SAMP>/usr/local/bind</SAMP> by default, so you - should not run into this problem. Should you want to use the bind - resolvers you'll have to add the following to the respective lines: + To enable SSI processing, you need to </P> + <UL> + <LI>Build your server with the + <A HREF="../mod/mod_include.html"><SAMP>mod_include</SAMP></A> + module. This is normally compiled in by default. + </LI> + <LI>Make sure your server configuration files have an + <A HREF="../mod/core.html#options"><SAMP>Options</SAMP></A> + directive which permits <SAMP>Includes</SAMP>. + </LI> + <LI>Make sure that the directory where you want the SSI documents to + live is covered by the "server-parsed" content handler, + either explicitly or in some ancestral location. 
That can be done + with the following + <A HREF="../mod/mod_mime.html#addhandler"><SAMP>AddHandler</SAMP></A> + directive: + <P> + <DL> + <DD><CODE>AddHandler server-parsed .shtml</CODE> + </DD> + </DL> + <P></P> + <P> + This indicates that all files ending in ".shtml" in that + location (or its descendants) should be parsed. Note that using + ".html" will cause all normal HTML files to be parsed, + which may put an inordinate load on your server. + </P> + </LI> + </UL> <P> - <DL> - <DD><CODE>EXTRA_CFLAGS=-I/usr/local/bind/include - <BR> - EXTRA_LDFLAGS=-L/usr/local/bind/lib - <BR> - EXTRA_LIBS=-lbind</CODE> - </DD> - </DL> - <P></P> + For additional information, see the <CITE>Apache Week</CITE> article on + <A HREF="http://www.apacheweek.com/features/ssi" REL="Help" + ><CITE>Using Server Side Includes</CITE></A>. + </P> <HR> </LI> - <LI><A NAME="set-servername"> - <STRONG>Why does accessing directories only work when I include - the trailing "/" - (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user/</SAMP>) - but not when I omit it - (<EM>e.g.</EM>, <SAMP>http://foo.domain.com/~user</SAMP>)?</STRONG> + <LI><A NAME="ssi-part-ii"> + <STRONG>Why don't my parsed files get cached?</STRONG> </A> <P> - When you access a directory without a trailing "/", Apache needs - to send what is called a redirect to the client to tell it to - add the trailing slash. If it did not do so, relative URLs would - not work properly. When it sends the redirect, it needs to know - the name of the server so that it can include it in the redirect. - There are two ways for Apache to find this out; either it can guess, - or you can tell it. If your DNS is configured correctly, it can - normally guess without any problems. If it is not, however, then - you need to tell it. 
+ Since the server is performing run-time processing of your SSI + directives, which may change the content shipped to the client, it + can't know at the time it starts parsing what the final size of the + result will be, or whether the parsed result will always be the same. + This means that it can't generate <SAMP>Content-Length</SAMP> or + <SAMP>Last-Modified</SAMP> headers. Caches commonly work by comparing + the <SAMP>Last-Modified</SAMP> of what's in the cache with that being + delivered by the server. Since the server isn't sending that header + for a parsed document, whatever's doing the caching can't tell whether + the document has changed or not - and so fetches it again to be on the + safe side. </P> <P> - Add a <A HREF="../mod/core.html#servername">ServerName</A> directive - to the config file to tell it what the domain name of the server is. + You can work around this in some cases by causing an + <SAMP>Expires</SAMP> header to be generated. (See the + <A HREF="../mod/mod_expires.html" REL="Help"><SAMP>mod_expires</SAMP></A> + documentation for more details.) Another possibility is to use the + <A HREF="../mod/mod_include.html#xbithack" REL="Help" + ><SAMP>XBitHack Full</SAMP></A> + mechanism, which tells Apache to send (under certain circumstances + detailed in the XBitHack directive description) a + <SAMP>Last-Modified</SAMP> header based upon the last modification + time of the file being parsed. Note that this may actually be lying + to the client if the parsed file doesn't change but the SSI-inserted + content does; if the included content changes often, this can result + in stale copies being cached. 
</P> <HR> </LI> - <LI><A NAME="user-authentication"> - <STRONG>How do I set up Apache to require a username and - password to access certain documents?</STRONG> + <LI><A NAME="ssi-part-iii"> + <STRONG>How can I have my script output parsed?</STRONG> </A> <P> - There are several ways to do this; some of the more popular - ones are to use the <A HREF="../mod/mod_auth.html">mod_auth</A>, - <A HREF="../mod/mod_auth_db.html">mod_auth_db</A>, or - <A HREF="../mod/mod_auth_dbm.html">mod_auth_dbm</A> modules. + So you want to include SSI directives in the output from your CGI + script, but can't figure out how to do it? + The short answer is "you can't." This is potentially + a security liability and, more importantly, it can not be cleanly + implemented under the current server API. The best workaround + is for your script itself to do what the SSIs would be doing. + After all, it's generating the rest of the content. </P> <P> - For an explanation on how to implement these restrictions, see - <A HREF="http://www.apacheweek.com/"><CITE>Apache Week</CITE></A>'s - articles on - <A HREF="http://www.apacheweek.com/features/userauth" - ><CITE>Using User Authentication</CITE></A> - or - <A HREF="http://www.apacheweek.com/features/dbmauth" - ><CITE>DBM User Authentication</CITE></A>. + This is a feature The Apache Group hopes to add in the next major + release after 1.3. + </P> + <HR> + </LI> + + <LI><A NAME="ssi-part-iv"> + <STRONG>SSIs don't work for VirtualHosts and/or + user home directories.</STRONG> + </A> + <P> + This is almost always due to having some setting in your config file that + sets "Options Includes" or some other setting for your DocumentRoot + but not for other directories. If you set it inside a Directory + section, then that setting will only apply to that directory. 
+ </P> + <HR> + </LI> + + <LI><A NAME="errordocssi"> + <STRONG>How can I use <CODE>ErrorDocument</CODE> + and SSI to simplify customized error messages?</STRONG> + </A> + <P> + Have a look at <A HREF="custom_errordocs.html">this document</A>. + It shows in example form how you can a combination of XSSI and + negotiation to tailor a set of <CODE>ErrorDocument</CODE>s to your + personal taste, and returning different internationalized error + responses based on the client's native language. </P> <HR> </LI> @@ -1686,6 +2297,95 @@ <HR> </LI> +</OL> + + + + + + + + + + + + + + + + <H3>G. Authentication and Access Restrictions</H3> +<OL> + + <LI><A NAME="dnsauth"> + <STRONG>Why isn't restricting access by host or domain name + working correctly?</STRONG> + </A> + <P> + Two of the most common causes of this are: + </P> + <OL> + <LI><STRONG>An error, inconsistency, or unexpected mapping in the DNS + registration</STRONG> + <BR> + This happens frequently: your configuration restricts access to + <SAMP>Host.FooBar.Com</SAMP>, but you can't get in from that host. + The usual reason for this is that <SAMP>Host.FooBar.Com</SAMP> is + actually an alias for another name, and when Apache performs the + address-to-name lookup it's getting the <EM>real</EM> name, not + <SAMP>Host.FooBar.Com</SAMP>. You can verify this by checking the + reverse lookup yourself. The easiest way to work around it is to + specify the correct host name in your configuration. + </LI> + <LI><STRONG>Inadequate checking and verification in your + configuration of Apache</STRONG> + <BR> + If you intend to perform access checking and restriction based upon + the client's host or domain name, you really need to configure + Apache to double-check the origin information it's supplied. You do + this by adding the <SAMP>-DMAXIMUM_DNS</SAMP> clause to the + <SAMP>EXTRA_CFLAGS</SAMP> definition in your + <SAMP>Configuration</SAMP> file. 
For example: + <P> + <DL> + <DD><CODE>EXTRA_CFLAGS=-DMAXIMUM_DNS</CODE> + </DD> + </DL> + <P></P> + <P> + This will cause Apache to be very paranoid about making sure a + particular host address is <EM>really</EM> assigned to the name it + claims to be. Note that this <EM>can</EM> incur a significant + performance penalty, however, because of all the name resolution + requests being sent to a nameserver. + </P> + </LI> + </OL> + <HR> + </LI> + + <LI><A NAME="user-authentication"> + <STRONG>How do I set up Apache to require a username and + password to access certain documents?</STRONG> + </A> + <P> + There are several ways to do this; some of the more popular + ones are to use the <A HREF="../mod/mod_auth.html">mod_auth</A>, + <A HREF="../mod/mod_auth_db.html">mod_auth_db</A>, or + <A HREF="../mod/mod_auth_dbm.html">mod_auth_dbm</A> modules. + </P> + <P> + For an explanation on how to implement these restrictions, see + <A HREF="http://www.apacheweek.com/"><CITE>Apache Week</CITE></A>'s + articles on + <A HREF="http://www.apacheweek.com/features/userauth" + ><CITE>Using User Authentication</CITE></A> + or + <A HREF="http://www.apacheweek.com/features/dbmauth" + ><CITE>DBM User Authentication</CITE></A>. + </P> + <HR> + </LI> + <LI><A NAME="remote-auth-only"> <STRONG>How do I set up Apache to allow access to certain documents only if a site is either a local site <EM>or</EM> @@ -1726,54 +2426,6 @@ <HR> </LI> - <LI><A NAME="no-info-directives"> - <STRONG>Why doesn't mod_info list any directives?</STRONG> - </A> - <P> - The <A HREF="../mod/mod_info.html"><SAMP>mod_info</SAMP></A> - module allows you to use a Web browser to see how your server is - configured. Among the information it displays is the list modules and - their configuration directives. The "current" values for - the directives are not necessarily those of the running server; they - are extracted from the configuration files themselves at the time of - the request. 
If the files have been changed since the server was last - reloaded, the display will will not match the values actively in use. - If the files and the path to the files are not readable by the user as - which the server is running (see the - <A HREF="../mod/core.html#user"><SAMP>User</SAMP></A> - directive), then <SAMP>mod_info</SAMP> cannot read them in order to - list their values. An entry <EM>will</EM> be made in the error log in - this event, however. - </P> - <HR> - </LI> - - <LI><A NAME="linux-shmget"> - <STRONG>When I run it under Linux I get "shmget: - function not found", what should I do?</STRONG> - </A> - <P> - Your kernel has been built without SysV IPC support. You will have to - rebuild the kernel with that support enabled (it's under the - "General Setup" submenu). Documentation for - kernel building is beyond the scope of this FAQ; you should consult - the - <A HREF="http://www.linuxhq.com/HOWTO/Kernel-HOWTO.html" - >Kernel HOWTO</A>, - or the documentation provided with your distribution, or a - <A HREF="http://www.linuxhq.com/HOWTO/META-FAQ.html" - >Linux newsgroup/mailing list</A>. - As a last-resort workaround, you can - comment out the <CODE>#define USE_SHMGET_SCOREBOARD</CODE> - definition in the - <SAMP>LINUX</SAMP> section of - <SAMP>src/conf.h</SAMP> and rebuild the server (prior to 1.3b4, simply - removing <CODE>#define HAVE_SHMGET</CODE> would have sufficed). - This will produce a server which is slower and less reliable. - </P> - <HR> - </LI> - <LI><A NAME="authauthoritative"> <STRONG>Why does my authentication give me a server error?</STRONG> </A> @@ -1867,6 +2519,80 @@ <HR> </LI> + <LI><A NAME="passwdauth"> + <STRONG>Can I use my <SAMP>/etc/passwd</SAMP> file + for Web page authentication?</STRONG> + </A> + <P> + Yes, you can - but it's a <STRONG>very bad idea</STRONG>. 
Here are + some of the reasons: + </P> + <UL> + <LI>The Web technology provides no governors on how often or how + rapidly password (authentication failure) retries can be made. That + means that someone can hammer away at your system's + <SAMP>root</SAMP> password using the Web, using a dictionary or + similar mass attack, just as fast as the wire and your server can + handle the requests. Most operating systems these days include + attack detection (such as <EM>n</EM> failed passwords for the same + account within <EM>m</EM> seconds) and evasion (breaking the + connection, disabling the account under attack, disabling + <EM>all</EM> logins from that source, <EM>et cetera</EM>), but the + Web does not. + </LI> + <LI>An account under attack isn't notified (unless the server is + heavily modified); there's no "You have 19483 login + failures" message when the legitimate owner logs in. + </LI> + <LI>Without an exhaustive and error-prone examination of the server + logs, you can't tell whether an account has been compromised. + Detecting that an attack has occurred, or is in progress, is fairly + obvious, though - <EM>if</EM> you look at the logs. + </LI> + <LI>Web authentication passwords (at least for Basic authentication) + generally fly across the wire, and through intermediate proxy + systems, in what amounts to plain text. "O'er the net we + go/Caching all the way;/O what fun it is to surf/Giving my password + away!" + </LI> + <LI>Since HTTP is stateless, information about the authentication is + transmitted <EM>each and every time</EM> a request is made to the + server. Essentially, the client caches it after the first + successful access, and transmits it without asking for all + subsequent requests to the same server. + </LI> + <LI>It's relatively trivial for someone on your system to put up a + page that will steal the cached password from a client's cache + without them knowing. Can you say "password grabber"? 
+ </LI> + </UL> + <P> + If you still want to do this in light of the above disadvantages, the + method is left as an exercise for the reader. It'll void your Apache + warranty, though, and you'll lose all accumulated UNIX guru points. + </P> + <HR> + </LI> +</OL> + + + + + + + + + + + + + + + + + <H3>H. URL Rewriting</H3> +<OL> + <LI><A NAME="rewrite-more-config"> <STRONG>Where can I find mod_rewrite rulesets which already solve particular URL-related problems?</STRONG> @@ -1955,13 +2681,13 @@ get prefixed with DocumentRoot when using mod_rewrite?</STRONG> </A> <P> - If the rule starts with <SAMP>/somedir/...</SAMP> make sure that really no - <SAMP>/somedir</SAMP> exists on the filesystem if you don't want to lead the - URL to match this directory, <EM>i.e.</EM>, there must be no root directory named - <SAMP>somedir</SAMP> on the filesystem. Because if there is such a - directory, the URL will not get prefixed with DocumentRoot. This behaviour - looks ugly, but is really important for some other aspects of URL - rewriting. + If the rule starts with <SAMP>/somedir/...</SAMP> make sure that + really no <SAMP>/somedir</SAMP> exists on the filesystem if you + don't want to lead the URL to match this directory, <EM>i.e.</EM>, + there must be no root directory named <SAMP>somedir</SAMP> on the + filesystem. Because if there is such a directory, the URL will not + get prefixed with DocumentRoot. This behaviour looks ugly, but is + really important for some other aspects of URL rewriting. </P> <HR> </LI> @@ -1971,16 +2697,15 @@ </STRONG> </A> <P> - You can't! The reason is: First, case translations for arbitrary length URLs - cannot be done <EM>via</EM> regex patterns and corresponding substitutions. - One need - a per-character pattern like sed/Perl <SAMP>tr|..|..|</SAMP> feature. 
- Second, just - making URLs always upper or lower case will not resolve the complete problem - of case-INSENSITIVE URLs, because actually the URLs had to be rewritten to - the correct case-variant residing on the filesystem because in later - processing Apache needs to access the file. And Unix filesystem is always - case-SENSITIVE. + You can't! The reason is: First, case translations for arbitrary + length URLs cannot be done <EM>via</EM> regex patterns and + corresponding substitutions. One need a per-character pattern like + sed/Perl <SAMP>tr|..|..|</SAMP> feature. Second, just making URLs + always upper or lower case will not resolve the complete problem of + case-INSENSITIVE URLs, because actually the URLs had to be rewritten + to the correct case-variant residing on the filesystem because in + later processing Apache needs to access the file. And Unix + filesystem is always case-SENSITIVE. </P> <P> But there is a module named <CODE>mod_speling.c</CODE> (yes, it is named @@ -2005,409 +2730,148 @@ flag?</STRONG> </A> <P> - There is only one ugly solution: You have to surround the complete flag - argument by quotation marks (<SAMP>"[E=...]"</SAMP>). Notice: The argument - to quote here is not the argument to the E-flag, it is the argument of the - Apache config file parser, <EM>i.e.</EM>, the third argument of the RewriteRule here. - So you have to write <SAMP>"[E=any text with whitespaces]"</SAMP>. + There is only one ugly solution: You have to surround the complete + flag argument by quotation marks (<SAMP>"[E=...]"</SAMP>). Notice: + The argument to quote here is not the argument to the E-flag, it is + the argument of the Apache config file parser, <EM>i.e.</EM>, the + third argument of the RewriteRule here. So you have to write + <SAMP>"[E=any text with whitespaces]"</SAMP>. 
</P> <HR> </LI> - <LI><A NAME="cgi-spec"> - <STRONG>Where can I find the "CGI specification"?</STRONG> - </A> - <P> - The Common Gateway Interface (CGI) specification can be found at - the original NCSA site - <<A HREF="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"> - <SAMP>http://hoohoo.ncsa.uiuc.edu/cgi/interface.html</SAMP></A>>. - This version hasn't been updated since 1995, and there have been - some efforts to update it. - </P> - <P> - A new draft is being worked on with the intent of making it an informational - RFC; you can find out more about this project at - <<A HREF="http://web.golux.com/coar/cgi/" - ><SAMP>http://web.golux.com/coar/cgi/</SAMP></A>>. - </P> - <HR> - </LI> +</OL> - <LI><A NAME="year2000"> - <STRONG>Is Apache Year 2000 compliant?</STRONG> - </A> - <P> - Yes, Apache is Year 2000 compliant. - </P> - <P> - Apache internally never stores years as two digits. - On the HTTP protocol level RFC1123-style addresses are generated - which is the only format a HTTP/1.1-compliant server should - generate. To be compatible with older applications Apache - recognizes ANSI C's <CODE>asctime()</CODE> and - RFC850-/RFC1036-style date formats, too. - The <CODE>asctime()</CODE> format uses four-digit years, - but the RFC850 and RFC1036 date formats only define a two-digit year. - If Apache sees such a date with a value less than 70 it assumes that - the century is <SAMP>20</SAMP> rather than <SAMP>19</SAMP>. - </P> - <P> - Some aspects of Apache's output may use two-digit years, such as the - automatic listing of directory contents provided by - <A HREF="../mod/mod_autoindex.html"><SAMP>mod_autoindex</SAMP></A> - with the - <A HREF="../mod/mod_autoindex.html#indexoptions" - ><SAMP>FancyIndexing</SAMP></A> - option enabled, but it is improper to depend upon such displays for - specific syntax. And even that issue is being addressed by the - developers; a future version of Apache should allow you to format that - display as you like. 
- </P> - <P> - Although Apache is Year 2000 compliant, you may still get problems - if the underlying OS has problems with dates past year 2000 - (<EM>e.g.</EM>, OS calls which accept or return year numbers). - Most (UNIX) systems store dates internally as signed 32-bit integers - which contain the number of seconds since 1<SUP>st</SUP> January 1970, so - the magic boundary to worry about is the year 2038 and not 2000. - But modern operating systems shouldn't cause any trouble - at all. - </P> - <HR> - </LI> - <LI><A NAME="namevhost"> - <STRONG>I upgraded to Apache 1.3 and now my virtual hosts don't - work!</STRONG> + + + + + + + + + + + + + + + <H3>I. Features</H3> +<OL> + + <LI><A NAME="proxy"> + <STRONG>Does or will Apache act as a Proxy server?</STRONG> </A> <P> - In versions of Apache prior to 1.3b2, there was a lot of confusion - regarding address-based virtual hosts and (HTTP/1.1) name-based - virtual hosts, and the rules concerning how the server processed - <SAMP><VirtualHost></SAMP> definitions were very complex and not - well documented. - </P> - <P> - Apache 1.3b2 introduced a new directive, - <A HREF="http://www.apache.org/docs/mod/core.html#namevirtualhost" - ><SAMP>NameVirtualHost</SAMP></A>, - which simplifies the rules quite a bit. However, changing the rules - like this means that your existing name-based - <SAMP><VirtualHost></SAMP> containers probably won't work - correctly immediately following the upgrade. - </P> - <P> - To correct this problem, add the following line to the beginning of - your server configuration file, before defining any virtual hosts: - </P> - <DL> - <DD><CODE>NameVirtualHost <EM>n.n.n.n</EM></CODE> - </DD> - </DL> - <P> - Replace the "<SAMP>n.n.n.n</SAMP>" with the IP address to - which the name-based virtual host names resolve; if you have multiple - name-based hosts on multiple addresses, repeat the directive for each - address. 
- </P> - <P> - Make sure that your name-based <SAMP><VirtualHost></SAMP> blocks - contain <SAMP>ServerName</SAMP> and possibly <SAMP>ServerAlias</SAMP> - directives so Apache can be sure to tell them apart correctly. - </P> - <P> - Please see the - <A HREF="http://www.apache.org/docs/vhosts/">Apache - Virtual Host documentation</A> for further details about configuration. + Apache version 1.1 and above comes with a + <A HREF="../mod/mod_proxy.html">proxy module</A>. + If compiled in, this will make Apache act as a caching-proxy server. </P> <HR> </LI> - <LI><A NAME="redhat"> - <STRONG>I'm using RedHat Linux and I have problems with httpd - dying randomly or not restarting properly</STRONG> + <LI><A NAME="multiviews"> + <STRONG>What are "multiviews"?</STRONG> </A> - - <P> - RedHat Linux versions 4.x (and possibly earlier) RPMs contain - various nasty scripts which do not stop or restart Apache properly. - These can affect you even if you're not running the RedHat supplied - RPMs. - </P> - <P> - If you're using the default install then you're probably running - Apache 1.1.3, which is outdated. From RedHat's ftp site you can - pick up a more recent RPM for Apache 1.2.x. This will solve one of - the problems. - </P> <P> - If you're using a custom built Apache rather than the RedHat RPMs - then you should <CODE>rpm -e apache</CODE>. In particular you want - the mildly broken <CODE>/etc/logrotate.d/apache</CODE> script to be - removed, and you want the broken <CODE>/etc/rc.d/init.d/httpd</CODE> - (or <CODE>httpd.init</CODE>) script to be removed. The latter is - actually fixed by the apache-1.2.5 RPMs but if you're building your - own Apache then you probably don't want the RedHat files. - </P> - <P> - We can't stress enough how important it is for folks, <EM>especially - vendors</EM> to follow the <A HREF="../stopping.html">stopping Apache - directions</A> given in our documentation. 
In RedHat's defense, - the broken scripts were necessary with Apache 1.1.x because the - Linux support in 1.1.x was very poor, and there were various race - conditions on all platforms. None of this should be necessary with - Apache 1.2 and later. + "Multiviews" is the general name given to the Apache + server's ability to provide language-specific document variants in + response to a request. This is documented quite thoroughly in the + <A HREF="../content-negotiation.html" REL="Help">content negotiation</A> + description page. In addition, <CITE>Apache Week</CITE> carried an + article on this subject entitled + "<A HREF="http://www.apacheweek.com/features/negotiation" REL="Help" + ><CITE>Content Negotiation Explained</CITE></A>". </P> <HR> </LI> - <LI><A NAME="stopping"> - <STRONG>I upgraded from an Apache version earlier - than 1.2.0 and suddenly I have problems with Apache dying randomly - or not restarting properly</STRONG> + <LI><A NAME="putsupport"> + <STRONG>Why can't I publish to my Apache server using PUT on + Netscape Gold and other programs?</STRONG> </A> - <P> - You should read <A HREF="#redhat">the previous note</A> about - problems with RedHat installations. It is entirely likely that your - installation has start/stop/restart scripts which were built for - an earlier version of Apache. Versions earlier than 1.2.0 had - various race conditions that made it necessary to use - <CODE>kill -9</CODE> at times to take out all the httpd servers. - But that should not be necessary any longer. You should follow - the <A HREF="../stopping.html">directions on how to stop - and restart Apache</A>. - </P> - <P>As of Apache 1.3 there is a script - <CODE>src/support/apachectl</CODE> which, after a bit of - customization, is suitable for starting, stopping, and restarting - your server. + Because you need to install and configure a script to handle + the uploaded files. This script is often called a "PUT" handler. 
+ There are several available, but they may have security problems. + Using FTP uploads may be easier and more secure, at least for now. + For more information, see the <CITE>Apache Week</CITE> article + <A HREF="http://www.apacheweek.com/features/put" + ><CITE>Publishing Pages with PUT</CITE></A>. </P> <HR> </LI> - <LI><A NAME="redhat-htm"> - <STRONG>I'm using RedHat Linux and my .htm files are showing - up as HTML source rather than being formatted!</STRONG> + <LI><A NAME="SSL-i"> + <STRONG>Why doesn't Apache include SSL?</STRONG> </A> - <P> - RedHat messed up and forgot to put a content type for <CODE>.htm</CODE> - files into <CODE>/etc/mime.types</CODE>. Edit <CODE>/etc/mime.types</CODE>, - find the line containing <CODE>html</CODE> and add <CODE>htm</CODE> to it. - Then restart your httpd server: - </P> - <DL> - <DD><CODE>kill -HUP `cat /var/run/httpd.pid`</CODE> - </DD> - </DL> - <P> - Then <STRONG>clear your browsers' caches</STRONG>. (Many browsers won't - re-examine the content type after they've reloaded a page.) + SSL (Secure Socket Layer) data transport requires encryption, and many + governments have restrictions upon the import, export, and use of + encryption technology. If Apache included SSL in the base package, + its distribution would involve all sorts of legal and bureaucratic + issues, and it would no longer be freely available. Also, some of + the technology required to talk to current clients using SSL is + patented by <A HREF="http://www.rsa.com/">RSA Data Security</A>, + who restricts its use without a license. </P> - <HR> - </LI> - - <LI><A NAME="glibc-crypt"> - <STRONG>I'm using RedHat Linux 5.0, or some other - <SAMP>glibc</SAMP>-based Linux system, and I get errors with the - <CODE>crypt</CODE> function when I attempt to build Apache 1.2.</STRONG> - </A> - <P> - <SAMP>glibc</SAMP> puts the <CODE>crypt</CODE> function into a separate - library. 
Edit your <CODE>src/Configuration</CODE> file and set this: + Some SSL implementations of Apache are available, however; see the + "<A HREF="http://www.apache.org/related_projects.html" + >related projects</A>" + page at the main Apache web site. </P> - <DL> - <DD><CODE>EXTRA_LIBS=-lcrypt</CODE> - </DD> - </DL> <P> - Then re-run <SAMP>src/Configure</SAMP> and re-execute the make. + You can find out more about this topic in the <CITE>Apache Week</CITE> + article about + <A HREF="http://www.apacheweek.com/features/ssl" REL="Help" + ><CITE>Apache and Secure Transactions</CITE></A>. </P> <HR> </LI> - - <LI><A NAME="nfslocking"> - <STRONG>Server hangs, or fails to start, and/or error log - fills with "<SAMP>fcntl: F_SETLKW: No record locks - available</SAMP>" or similar messages</STRONG> + <LI><A NAME="footer"> + <STRONG>How can I attach a footer to my documents + without using SSI?</STRONG> </A> - <P> - These are symptoms of a fine locking problem, which usually means that - the server is trying to use a synchronization file on an NFS filesystem. + You can make arbitrary changes to static documents by configuring an + <A HREF="http://www.apache.org/docs/mod/mod_actions.html#action"> + Action</A> which launches a CGI script. The CGI is then + responsible for setting a content-type and delivering the requested + document (the location of which is passed in the + <SAMP>PATH_TRANSLATED</SAMP> environment variable), along with + whatever footer is needed. </P> <P> - Because of its parallel-operation model, the Apache Web server needs to - provide some form of synchronization when accessing certain resources. - One of these synchronization methods involves taking out locks on a file, - which means that the filesystem whereon the lockfile resides must support - locking. In many cases this means it <EM>can't</EM> be kept on an - NFS-mounted filesystem. 
- </P> - <P> - To cause the Web server to work around the NFS locking limitations, include - a line such as the following in your server configuration files: - </P> - <DL> - <DD><CODE>LockFile /var/run/apache-lock</CODE> - </DD> - </DL> - <P> - The directory should not be generally writable (<EM>e.g.</EM>, don't use - <SAMP>/var/tmp</SAMP>). - See the <A HREF="../mod/core.html#lockfile"><SAMP>LockFile</SAMP></A> - documentation for more information. + Busy sites may not want to run a CGI script on every request, and + should consider using an Apache module to add the footer. There are + several third party modules available through the <A + HREF="http://modules.apache.org/">Apache Module Registry</A> which + will add footers to documents. These include mod_trailer, PHP + (<SAMP>php3_auto_append_file</SAMP>), and mod_perl + (<SAMP>Apache::Sandwich</SAMP>). </P> <HR> </LI> - <LI><A NAME="zoom"> - <STRONG>What's the best hardware/operating system/... How do - I get the most out of my Apache Web server?</STRONG> + <LI><A NAME="search"> + <STRONG>Does Apache include a search engine?</STRONG> </A> - <P> - Check out Dean Gaudet's - <A HREF="http://www.apache.org/docs/misc/perf-tuning.html" - >performance tuning page</A>. + <P>Apache does not include a search engine, but there are many good + commercial and free search engines which can be used easily with + Apache. Some of them are listed on the <A + HREF="http://www.searchtools.com/tools/tools.html">Web Site Search + Tools</A> page. Open source search engines that are often used with + Apache include <A HREF="http://www.htdig.org/">ht://Dig</A> and <A + HREF="http://sunsite.berkeley.edu/SWISH-E/">SWISH-E</A>. 
</P> <HR> </LI> - <LI><A NAME="regex"> - <STRONG>What are "regular expressions"?</STRONG></A> - <P> - Regular expressions are a way of describing a pattern - for example, "all - the words that begin with the letter A" or "every 10-digit phone number" - or even "Every sentence with two commas in it, and no capital letter Q". - Regular expressions (aka "regexp"s) are useful in Apache because they - let you apply certain attributes against collections of files or resources - in very flexible ways - for example, all .gif and .jpg files under - any "images" directory could be written as /.*\/images\/.*[jpg|gif]/. - </P> - <P> - The best overview around is probably the one which comes with - Perl. We implement a simple subset of Perl's regexp support, but - it's still a good way to learn what they mean. You can start by - going to the - <A - HREF="http://www.perl.com/CPAN-local/doc/manual/html/pod/perlre.html#Version_8_Regular_Expresions" - >CPAN page on regular expressions</A>, and branching out from there. - </P> - <HR> - </LI> - <LI><A NAME="broken-gcc"><STRONG>I'm using gcc and I get some - compilation errors, what is wrong?</STRONG></A> - <P> - GCC parses your system header files and produces a modified subset which - it uses for compiling. This behaviour ties GCC tightly to the version - of your operating system. So, for example, if you were running IRIX 5.3 - when you built GCC and then upgrade to IRIX 6.2 later, you will have to - rebuild GCC. Similarly for Solaris 2.4, 2.5, or 2.5.1 when you upgrade - to 2.6. Sometimes you can type "gcc -v" and it will tell you the version - of the operating system it was built against. - </P> - <P> - If you fail to do this, then it is very likely that Apache will fail - to build. One of the most common errors is with <CODE>readv</CODE>, - <CODE>writev</CODE>, or <CODE>uio.h</CODE>. This is <STRONG>not</STRONG> a - bug with Apache. You will need to re-install GCC. 
- </P> - <HR> - </LI> - <LI><A NAME="htaccess-work"> - <STRONG>My <CODE>.htaccess</CODE> files are being ignored.</STRONG></A> - <P> - This is almost always due to your <A HREF="../mod/core.html#allowoverride"> - AllowOverride</A> directive being set incorrectly for the directory in - question. If it is set to <CODE>None</CODE> then .htaccess files will - not even be looked for. If you do have one that is set, then be certain - it covers the directory you are trying to use the .htaccess file in. - This is normally accomplished by ensuring it is inside the proper - <A HREF="../mod/core.html#directory">Directory</A> container. - </P> - <HR> - </LI> - <LI><A NAME="submit_patch"> - <STRONG>How do I submit a patch to the Apache Group?</STRONG></A> - <P> - The Apache Group encourages patches from outside developers. There are 2 - main "types" - of patches: small bugfixes and general improvements. Bugfixes should be - submitting using the - Apache <A HREF="http://www.apache.org/bug_report.html">bug report page</A>. - Improvements, modifications, and additions should follow the instructions - below. - </P> - <P> - In general, the first course of action is to be a member of the - <SAMP>new-httpd@apache.org</SAMP> mailing list. This indicates to the Group - that - you are closely following the latest Apache developments. Your patch file - should be - generated using either '<CODE>diff -c</CODE>' or - '<CODE>diff -u</CODE>' against the - latest CVS tree. To submit your patch, send email to - <SAMP>new-httpd@apache.org</SAMP> - with a <SAMP>Subject:</SAMP> line that starts with <SAMP>[PATCH]</SAMP> and - includes a general description of the patch. In the body of the message, the - patch should be clearly described and then included at the end of the - message. - If the patch-file is long, you can note a URL to the file instead of the - file itself. Use of MIME enclosures/attachments should be avoided. 
- </P> - <P> - Be prepared to respond to any questions about your patches and possibly - defend - your code. If your patch results in a lot of discussion, you may be asked to - submit an updated patch that incorporate all changes and suggestions. - </P> - <HR> - </LI> - <LI><A NAME="aixccbug"><STRONG>Why am I getting "<SAMP>Expected - </Directory> but saw </Directory></SAMP>" when - I try to start Apache?</STRONG></A> - <P> - This is a known problem with certain versions of the AIX C compiler. - IBM are working on a solution, and the issue is being tracked by - <A HREF="http://bugs.apache.org/index/full/2312">problem report #2312</A>. - </P> - <HR> - </LI> - <LI><A NAME="domination"><STRONG>Why has Apache stolen my favourite site's - Internet address?</STRONG></A> - <P> - The simple answer is: "It hasn't." This misconception is usually - caused by the site in question having migrated to the Apache Web - server software, but not having migrated the site's content yet. When - Apache is installed, the default page that gets installed tells the - Webmaster the installation was successful. The expectation is that - this default page will be replaced with the site's real content. - If it doesn't, complain to the Webmaster, not to the Apache project -- - we just make the software and aren't responsible for what people - do (or don't do) with it. - </P> - <HR> - </LI> - <LI><A NAME="apspam"><STRONG>Why am I getting spam mail from the - Apache site?</STRONG></A> - <P> - The short answer is: "You aren't." Usually when someone thinks the - Apache site is originating spam, it's because they've traced the - spam to a Web site, and the Web site says it's using Apache. See the - <A HREF="#domination">previous FAQ entry</A> for more details on this - phenomenon. - </P> - <P> - No marketing spam originates from the Apache site. The only mail - that comes from the site goes only to addresses that have been - <EM>requested</EM> to receive the mail. 
- </P> - <HR> - </LI> - <!-- Don't forget to add HR tags at the end of each list item.. --> - </OL> + + + + <HR> <H3 ALIGN="CENTER"> diff --git a/usr.sbin/httpd/htdocs/manual/misc/howto.html b/usr.sbin/httpd/htdocs/manual/misc/howto.html index 62f1116656a..7cb757fc58d 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/howto.html +++ b/usr.sbin/httpd/htdocs/manual/misc/howto.html @@ -83,7 +83,7 @@ want to compile mod_rewrite into your server. <P>Here's how to redirect all requests to a script... In the server configuration file, -<BLOCKQUOTE><PRE>ScriptAlias / /usr/local/httpd/cgi-bin/redirect_script</PRE> +<BLOCKQUOTE><PRE>ScriptAlias / /usr/local/httpd/cgi-bin/redirect_script/</PRE> </BLOCKQUOTE> and here's a simple perl script to redirect requests: @@ -91,8 +91,9 @@ and here's a simple perl script to redirect requests: <BLOCKQUOTE><PRE> #!/usr/local/bin/perl -print "Status: 302 Moved Temporarily\r -Location: http://www.some.where.else.com/\r\n\r\n"; +print "Status: 302 Moved Temporarily\r\n" . + "Location: http://www.some.where.else.com/\r\n" . + "\r\n"; </PRE></BLOCKQUOTE></P> diff --git a/usr.sbin/httpd/htdocs/manual/misc/known_client_problems.html b/usr.sbin/httpd/htdocs/manual/misc/known_client_problems.html index 4f64e06a2f4..d432c44953b 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/known_client_problems.html +++ b/usr.sbin/httpd/htdocs/manual/misc/known_client_problems.html @@ -15,144 +15,150 @@ <DIV ALIGN="CENTER"> <IMG SRC="../images/sub.gif" ALT="[APACHE DOCUMENTATION]"> <H3> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> </DIV> <H1 ALIGN="CENTER">Known Problems in Clients</H1> -<p>Over time the Apache Group has discovered or been notified of problems -with various clients which we have had to work around. This document -describes these problems and the workarounds available. It's not arranged -in any particular order. Some familiarity with the standards is assumed, -but not necessary. 
- -<p>For brevity, <i>Navigator</i> will refer to Netscape's Navigator -product, and <i>MSIE</i> will refer to Microsoft's Internet Explorer -product. All trademarks and copyrights belong to their respective -companies. We welcome input from the various client authors to correct -inconsistencies in this paper, or to provide us with exact version -numbers where things are broken/fixed. - -<p>For reference, -<a href="ftp://ds.internic.net/rfc/rfc1945.txt">RFC1945</a> +<P>Over time the Apache Group has discovered or been notified of problems +with various clients which we have had to work around, or explain. +This document describes these problems and the workarounds available. +It's not arranged in any particular order. Some familiarity with the +standards is assumed, but not necessary. + +<P>For brevity, <EM>Navigator</EM> will refer to Netscape's Navigator +product (which in later versions was renamed "Communicator" and +various other names), and <EM>MSIE</EM> will refer to Microsoft's +Internet Explorer product. All trademarks and copyrights belong to +their respective companies. We welcome input from the various client +authors to correct inconsistencies in this paper, or to provide us with +exact version numbers where things are broken/fixed. + +<P>For reference, +<A HREF="ftp://ds.internic.net/rfc/rfc1945.txt">RFC1945</A> defines HTTP/1.0, and -<a href="ftp://ds.internic.net/rfc/rfc2068.txt">RFC2068</a> +<A HREF="ftp://ds.internic.net/rfc/rfc2068.txt">RFC2068</A> defines HTTP/1.1. Apache as of version 1.2 is an HTTP/1.1 server (with an optional HTTP/1.0 proxy). -<p>Various of these workarounds are triggered by environment variables. +<P>Various of these workarounds are triggered by environment variables. The admin typically controls which are set, and for which clients, by using -<a href="../mod/mod_browser.html">mod_browser</a>. Unless otherwise +<A HREF="../mod/mod_browser.html">mod_browser</A>. 
Unless otherwise noted all of these workarounds exist in versions 1.2 and later. -<a name="trailing-crlf"><H3>Trailing CRLF on POSTs</H3></a> +<H3><A NAME="trailing-crlf">Trailing CRLF on POSTs</A></H3> -<p>This is a legacy issue. The CERN webserver required <code>POST</code> -data to have an extra <code>CRLF</code> following it. Thus many -clients send an extra <code>CRLF</code> that -is not included in the <code>Content-Length</code> of the request. +<P>This is a legacy issue. The CERN webserver required <CODE>POST</CODE> +data to have an extra <CODE>CRLF</CODE> following it. Thus many +clients send an extra <CODE>CRLF</CODE> that +is not included in the <CODE>Content-Length</CODE> of the request. Apache works around this problem by eating any empty lines which appear before a request. -<a name="broken-keepalive"><h3>Broken keepalive</h3></a> +<H3><A NAME="broken-keepalive">Broken keepalive</A></H3> -<p>Various clients have had broken implementations of <i>keepalive</i> +<P>Various clients have had broken implementations of <EM>keepalive</EM> (persistent connections). In particular the Windows versions of Navigator 2.0 get very confused when the server times out an idle connection. The workaround is present in the default config files: -<blockquote><code> +<BLOCKQUOTE><CODE> BrowserMatch Mozilla/2 nokeepalive -</code></blockquote> +</CODE></BLOCKQUOTE> Note that this matches some earlier versions of MSIE, which began the -practice of calling themselves <i>Mozilla</i> in their user-agent +practice of calling themselves <EM>Mozilla</EM> in their user-agent strings just like Navigator. -<p>MSIE 4.0b2, which claims to support HTTP/1.1, does not properly +<P>MSIE 4.0b2, which claims to support HTTP/1.1, does not properly support keepalive when it is used on 301 or 302 (redirect) -responses. Unfortunately Apache's <code>nokeepalive</code> code +responses. Unfortunately Apache's <CODE>nokeepalive</CODE> code prior to 1.2.2 would not work with HTTP/1.1 clients. 
You must apply -<a href="http://www.apache.org/dist/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch">this -patch</a> to version 1.2.1. Then add this to your config: -<blockquote><code> +<A +HREF="http://www.apache.org/dist/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch" +>this patch</A> to version 1.2.1. Then add this to your config: +<BLOCKQUOTE><CODE> BrowserMatch "MSIE 4\.0b2;" nokeepalive -</code></blockquote> +</CODE></BLOCKQUOTE> -<a name="force-response-1.0"><h3>Incorrect interpretation of <code>HTTP/1.1</code> in response</h3></a> +<H3><A NAME="force-response-1.0">Incorrect interpretation of +<CODE>HTTP/1.1</CODE> in response</A></H3> -<p>To quote from section 3.1 of RFC1945: -<blockquote> -HTTP uses a "<major>.<minor>" numbering scheme to indicate versions +<P>To quote from section 3.1 of RFC1945: +<BLOCKQUOTE> +HTTP uses a "<MAJOR>.<MINOR>" numbering scheme to indicate versions of the protocol. The protocol versioning policy is intended to allow the sender to indicate the format of a message and its capacity for understanding further HTTP communication, rather than the features obtained via that communication. -</blockquote> +</BLOCKQUOTE> Since Apache is an HTTP/1.1 server, it indicates so as part of its response. Many client authors mistakenly treat this part of the response as an indication of the protocol that the response is in, and then refuse to accept the response. -<p>The first major indication of this problem was with AOL's proxy servers. +<P>The first major indication of this problem was with AOL's proxy servers. When Apache 1.2 went into beta it was the first wide-spread HTTP/1.1 server. After some discussion, AOL fixed their proxies. In -anticipation of similar problems, the <code>force-response-1.0</code> +anticipation of similar problems, the <CODE>force-response-1.0</CODE> environment variable was added to Apache. When present Apache will indicate "HTTP/1.0" in response to an HTTP/1.0 client, but will not in any other way change the response. 
-<p>The pre-1.1 Java Development Kit (JDK) that is used in many clients +<P>The pre-1.1 Java Development Kit (JDK) that is used in many clients (including Navigator 3.x and MSIE 3.x) exhibits this problem. As do some of the early pre-releases of the 1.1 JDK. We think it is fixed in the 1.1 JDK release. In any event the workaround: -<blockquote><code> -BrowserMatch Java1.0 force-response-1.0 <br> +<BLOCKQUOTE><CODE> +BrowserMatch Java/1.0 force-response-1.0 <BR> BrowserMatch JDK/1.0 force-response-1.0 -</code></blockquote> +</CODE></BLOCKQUOTE> -<p>RealPlayer 4.0 from Progressive Networks also exhibits this problem. +<P>RealPlayer 4.0 from Progressive Networks also exhibits this problem. However they have fixed it in version 4.01 of the player, but version -4.01 uses the same <code>User-Agent</code> as version 4.0. The +4.01 uses the same <CODE>User-Agent</CODE> as version 4.0. The workaround is still: -<blockquote><code> +<BLOCKQUOTE><CODE> BrowserMatch "RealPlayer 4.0" force-response-1.0 -</code></blockquote> +</CODE></BLOCKQUOTE> -<a name="msie4.0b2"><h3>Requests use HTTP/1.1 but responses must be in HTTP/1.0</h3></a> +<H3><A NAME="msie4.0b2">Requests use HTTP/1.1 but responses must be +in HTTP/1.0</A></H3> -<p>MSIE 4.0b2 has this problem. Its Java VM makes requests in HTTP/1.1 +<P>MSIE 4.0b2 has this problem. Its Java VM makes requests in HTTP/1.1 format but the responses must be in HTTP/1.0 format (in particular, it -does not understand <i>chunked</i> responses). The workaround +does not understand <EM>chunked</EM> responses). The workaround is to fool Apache into believing the request came in HTTP/1.0 format. -<blockquote><code> +<BLOCKQUOTE><CODE> BrowserMatch "MSIE 4\.0b2;" downgrade-1.0 force-response-1.0 -</code></blockquote> +</CODE></BLOCKQUOTE> This workaround is available in 1.2.2, and in a -<a href="http://www.apache.org/dist/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch">patch -</a> against 1.2.1. 
+<A +HREF="http://www.apache.org/dist/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch" +>patch</A> against 1.2.1. -<a name="257th-byte"><h3>Boundary problems with header parsing</h3></a> +<H3><A NAME="257th-byte">Boundary problems with header parsing</A></H3> -<p>All versions of Navigator from 2.0 through 4.0b2 (and possibly later) +<P>All versions of Navigator from 2.0 through 4.0b2 (and possibly later) have a problem if the trailing CRLF of the response header starts at -the 256th or 257th byte of the response. A BrowserMatch for this would +offset 256, 257 or 258 of the response. A BrowserMatch for this would match on nearly every hit, so the workaround is enabled automatically -on all responses. The workaround is to detect when this condition would -occur in a response and add extra padding to the header to push the -trailing CRLF past the 257th byte of the response. +on all responses. The workaround implemented detects when this condition would +occur in a response and adds extra padding to the header to push the +trailing CRLF past offset 258 of the response. -<a name="boundary-string"><h3>Multipart responses and Quoted Boundary Strings</h3></a> +<H3><A NAME="boundary-string">Multipart responses and Quoted Boundary +Strings</A></H3> -<p>On multipart responses some clients will not accept quotes (") +<P>On multipart responses some clients will not accept quotes (") around the boundary string. The MIME standard recommends that such quotes be used. But the clients were probably written based on one of the examples in RFC2068, which does not include quotes. Apache does not include quotes on its boundary strings to workaround this problem. -<a name="byterange-requests"><h3>Byterange requests</h3></a> +<H3><A NAME="byterange-requests">Byterange requests</A></H3> -<p>A byterange request is used when the client wishes to retrieve a +<P>A byterange request is used when the client wishes to retrieve a portion of an object, not necessarily the entire object. 
There was a very old draft which included these byteranges in the URL. Old clients such as Navigator 2.0b1 and MSIE 3.0 for the MAC @@ -161,48 +167,148 @@ it will appear in the servers' access logs as (failed) attempts to retrieve a URL with a trailing ";xxx-yyy". Apache does not attempt to implement this at all. -<p>A subsequent draft of this standard defines a header -<code>Request-Range</code>, and a response type -<code>multipart/x-byteranges</code>. The HTTP/1.1 standard includes +<P>A subsequent draft of this standard defines a header +<CODE>Request-Range</CODE>, and a response type +<CODE>multipart/x-byteranges</CODE>. The HTTP/1.1 standard includes this draft with a few fixes, and it defines the header -<code>Range</code> and type <code>multipart/byteranges</code>. - -<p>Navigator (versions 2 and 3) sends both <code>Range</code> and -<code>Request-Range</code> headers (with the same value), but does not -accept a <code>multipart/byteranges</code> response. The response must -be <code>multipart/x-byteranges</code>. As a workaround, if Apache -receives a <code>Request-Range</code> header it considers it "higher -priority" than a <code>Range</code> header and in response uses -<code>multipart/x-byteranges</code>. - -<p>The Adobe Acrobat Reader plugin makes extensive use of byteranges and -prior to version 3.01 supports only the <code>multipart/x-byterange</code> +<CODE>Range</CODE> and type <CODE>multipart/byteranges</CODE>. + +<P>Navigator (versions 2 and 3) sends both <CODE>Range</CODE> and +<CODE>Request-Range</CODE> headers (with the same value), but does not +accept a <CODE>multipart/byteranges</CODE> response. The response must +be <CODE>multipart/x-byteranges</CODE>. As a workaround, if Apache +receives a <CODE>Request-Range</CODE> header it considers it "higher +priority" than a <CODE>Range</CODE> header and in response uses +<CODE>multipart/x-byteranges</CODE>. 
+ +<P>The Adobe Acrobat Reader plugin makes extensive use of byteranges and +prior to version 3.01 supports only the <CODE>multipart/x-byterange</CODE> response. Unfortunately there is no clue that it is the plugin making the request. If the plugin is used with Navigator, the above workaround works fine. But if the plugin is used with MSIE 3 (on Windows) the workaround won't work because MSIE 3 doesn't give the -<code>Range-Request</code> clue that Navigator does. To workaround this, -Apache special cases "MSIE 3" in the <code>User-Agent</code> and serves -<code>multipart/x-byteranges</code>. Note that the necessity for this +<CODE>Range-Request</CODE> clue that Navigator does. To workaround this, +Apache special cases "MSIE 3" in the <CODE>User-Agent</CODE> and serves +<CODE>multipart/x-byteranges</CODE>. Note that the necessity for this with MSIE 3 is actually due to the Acrobat plugin, not due to the browser. -<p>Netscape Communicator appears to not issue the non-standard -<code>Request-Range</code> header. When an Acrobat plugin prior to +<P>Netscape Communicator appears to not issue the non-standard +<CODE>Request-Range</CODE> header. When an Acrobat plugin prior to version 3.01 is used with it, it will not properly understand byteranges. The user must upgrade their Acrobat reader to 3.01. -<a name="cookie-merge"><h3><code>Set-Cookie</code> header is unmergeable</h3></a> +<H3><A NAME="cookie-merge"><CODE>Set-Cookie</CODE> header is +unmergeable</A></H3> -<p>The HTTP specifications say that it is legal to merge headers with -duplicate names into one (separated by semicolon). Some browsers +<P>The HTTP specifications say that it is legal to merge headers with +duplicate names into one (separated by commas). Some browsers that support Cookies don't like merged headers and prefer that each -<code>Set-Cookie</code> header is sent separately. When parsing the +<CODE>Set-Cookie</CODE> header is sent separately. 
When parsing the headers returned by a CGI, Apache will explicitly avoid merging any -<code>Set-Cookie</code> headers. +<CODE>Set-Cookie</CODE> headers. + +<H3><A NAME="gif89-expires"><CODE>Expires</CODE> headers and GIF89A +animations</A></H3> + +<P>Navigator versions 2 through 4 will erroneously re-request +GIF89A animations on each loop of the animation if the first +response included an <CODE>Expires</CODE> header. This happens +regardless of how far in the future the expiry time is set. There +is no workaround supplied with Apache, however there are hacks for <A +HREF="http://www.arctic.org/~dgaudet/patches/apache-1.2-gif89-expires-hack.patch">1.2</A> +and for <A +HREF="http://www.arctic.org/~dgaudet/patches/apache-1.3-gif89-expires-hack.patch">1.3</A>. + +<H3><A NAME="no-content-length"><CODE>POST</CODE> without +<CODE>Content-Length</CODE></A></H3> + +<P>In certain situations Navigator 3.01 through 3.03 appear to incorrectly +issue a POST without the request body. There is no +known workaround. It has been fixed in Navigator 3.04, Netscapes +provides some +<A HREF="http://help.netscape.com/kb/client/971014-42.html">information</A>. +There's also +<A HREF="http://www.arctic.org/~dgaudet/apache/no-content-length/"> +some information</A> about the actual problem. + +<H3><A NAME="jdk-12-bugs">JDK 1.2 betas lose parts of responses.</A></H3> + +<P>The http client in the JDK1.2beta2 and beta3 will throw away the first part of +the response body when both the headers and the first part of the body are sent +in the same network packet AND keep-alive's are being used. If either condition +is not met then it works fine. + +<P>See also Bug-ID's 4124329 and 4125538 at the java developer connection. 
+ +<P>If you are seeing this bug yourself, you can add the following BrowserMatch +directive to work around it: + +<BLOCKQUOTE><CODE> +BrowserMatch "Java1\.2beta[23]" nokeepalive +</CODE></BLOCKQUOTE> + +<P>We don't advocate this though since bending over backwards for beta software +is usually not a good idea; ideally it gets fixed, new betas or a final release +comes out, and no one uses the broken old software anymore. In theory. + +<H3><A NAME="content-type-persistence"><CODE>Content-Type</CODE> change +is not noticed after reload</A></H3> + +<P>Navigator (all versions?) will cache the <CODE>content-type</CODE> +for an object "forever". Using reload or shift-reload will not cause +Navigator to notice a <CODE>content-type</CODE> change. The only +work-around is for the user to flush their caches (memory and disk). By +way of an example, some folks may be using an old <CODE>mime.types</CODE> +file which does not map <CODE>.htm</CODE> to <CODE>text/html</CODE>, +in this case Apache will default to sending <CODE>text/plain</CODE>. +If the user requests the page and it is served as <CODE>text/plain</CODE>. +After the admin fixes the server, the user will have to flush their caches +before the object will be shown with the correct <CODE>text/html</CODE> +type. + +<h3><a name="msie-cookie-y2k">MSIE Cookie problem with expiry date in +the year 2000</a></h3> + +<p>MSIE versions 3.00 and 3.02 (without the Y2K patch) do not handle +cookie expiry dates in the year 2000 properly. Years after 2000 and +before 2000 work fine. This is fixed in IE4.01 service pack 1, and in +the Y2K patch for IE3.02. Users should avoid using expiry dates in the +year 2000. + +<h3><a name="lynx-negotiate-trans">Lynx incorrectly asking for transparent +content negotiation</a></h3> + +<p>The Lynx browser versions 2.7 and 2.8 send a "negotiate: trans" header +in their requests, which is an indication the browser supports transparent +content negotiation (TCN). 
However the browser does not support TCN. +As of version 1.3.4, Apache supports TCN, and this causes problems with +these versions of Lynx. As a workaround future versions of Apache will +ignore this header when sent by the Lynx client. + +<h3><a name="ie40-vary">MSIE 4.0 mishandles Vary response header</a></h3> + +<p>MSIE 4.0 does not handle a Vary header properly. The Vary header is +generated by mod_rewrite in apache 1.3. The result is an error from MSIE +saying it cannot download the requested file. There are more details +in <a href="http://bugs.apache.org/index/full/4118">PR#4118</a>. +</P> +<P> +A workaround is to add the following to your server's configuration +files: +</P> +<PRE> + BrowserMatch "MSIE 4\.0" force-no-vary +</PRE> +<P> +(This workaround is only available with releases <STRONG>after</STRONG> +1.3.6 of the Apache Web server.) +</P> + <HR> + <H3 ALIGN="CENTER"> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> <A HREF="./"><IMG SRC="../images/index.gif" ALT="Index"></A> diff --git a/usr.sbin/httpd/htdocs/manual/misc/perf-dec.html b/usr.sbin/httpd/htdocs/manual/misc/perf-dec.html index eb0551a1ecd..21edc29af9b 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/perf-dec.html +++ b/usr.sbin/httpd/htdocs/manual/misc/perf-dec.html @@ -14,7 +14,7 @@ <DIV ALIGN="CENTER"> <IMG SRC="../images/sub.gif" ALT="[APACHE DOCUMENTATION]"> <H3> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> </DIV> @@ -22,7 +22,7 @@ Below is a set of newsgroup posts made by an engineer from DEC in response to queries about how to modify DEC's Digital Unix OS for more -heavily loaded web sites. Copied with permission. +heavily loaded web sites. Copied with permission. <HR> @@ -38,10 +38,10 @@ Date: Fri, 28 Jun 96 16:07:56 MDT<BR> mechanism. 
<LI>Patch ID OSF350-146 has been superseded by -<blockquote> +<BLOCKQUOTE> Patch ID OSF350-195 for V3.2C<BR> Patch ID OSF360-350195 for V3.2D -</blockquote> +</BLOCKQUOTE> Patch IDs for V3.2E and V3.2F should be available soon. There is no known reason why the Patch ID OSF360-350195 won't work on these releases, but such use is not officially @@ -56,7 +56,7 @@ From mogul@pa.dec.com (Jeffrey Mogul) Organization DEC Western Research Date 30 May 1996 00:50:25 GMT Newsgroups <A HREF="news:comp.unix.osf.osf1">comp.unix.osf.osf1</A> -Message-ID <A HREF="news:4oirch$bc8@usenet.pa.dec.com"><4oirch$bc8@usenet.pa.dec.com></A> +Message-ID <4oirch$bc8@usenet.pa.dec.com> Subject Re: Web Site Performance References 1 @@ -69,8 +69,10 @@ In article <skoogDs54BH.9pF@netcom.com> skoog@netcom.com (Jim Skoog) write >runing DEC UNIX 3.2C, which run DEC's seal firewall and behind >that Alpha 1000 and 2100 webservers. -Our experience (running such Web servers as <A HREF="http://altavista.digital.com">altavista.digital.com</A> -and <A HREF="http://www.digital.com">www.digital.com</A>) is that there is one important kernel tuning +Our experience (running such Web servers as <A + HREF="http://altavista.digital.com">altavista.digital.com</A> +and <A HREF="http://www.digital.com" + >www.digital.com</A>) is that there is one important kernel tuning knob to adjust in order to get good performance on V3.2C. You need to patch the kernel global variable "somaxconn" (use dbx -k to do this) from its default value of 8 to something much larger. @@ -104,7 +106,10 @@ with no obvious performance bottlenecks at the millions-of-hits-per-day level. 
We have some Webstone performance results available at - <A HREF="http://www.digital.com/info/alphaserver/news/webff.html">http://www.digital.com/info/alphaserver/news/webff.html</A> + http://www.digital.com/info/alphaserver/news/webff.html + +<EM>[The document referenced above is no longer at that URL -- Ed.]</EM> + I'm not sure if these were done using V4.0 or an earlier version of Digital UNIX, although I suspect they were done using a test version of V4.0. @@ -119,7 +124,7 @@ From mogul@pa.dec.com (Jeffrey Mogul) Organization DEC Western Research Date 31 May 1996 21:01:01 GMT Newsgroups <A HREF="news:comp.unix.osf.osf1">comp.unix.osf.osf1</A> -Message-ID <A HREF="news:4onmmd$mmd@usenet.pa.dec.com"><4onmmd$mmd@usenet.pa.dec.com></A> +Message-ID <4onmmd$mmd@usenet.pa.dec.com> Subject Digital UNIX V3.2C Internet tuning patch info ---------------------------------------------------------------------------- @@ -142,7 +147,8 @@ so the description of the various tuning parameters in this README file might be useful to people running V4.0 systems. This patch kit does not appear to be available (yet?) from - <A HREF="http://www.service.digital.com/html/patch_service.html">http://www.service.digital.com/html/patch_service.html</A> + <A HREF="http://www.service.digital.com/html/patch_service.html" + >http://www.service.digital.com/html/patch_service.html</A> so I guess you'll have to call Digital's Customer Support to get it. -Jeff @@ -213,7 +219,7 @@ TUNING tcp_keepinit This is the amount of time a partially established connection will sit on the listen - queue before timing out (e.g. if a client + queue before timing out (<EM>e.g.</EM>, if a client sends a SYN but never answers our SYN/ACK). Partially established connections tie up slots on the listen queue. 
If the queue starts to @@ -281,8 +287,9 @@ sysconfig -q socket - - X </PRE> <HR> + <H3 ALIGN="CENTER"> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> <A HREF="./"><IMG SRC="../images/index.gif" ALT="Index"></A> diff --git a/usr.sbin/httpd/htdocs/manual/misc/perf-tuning.html b/usr.sbin/httpd/htdocs/manual/misc/perf-tuning.html index 956a7febbc5..46995c9eccd 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/perf-tuning.html +++ b/usr.sbin/httpd/htdocs/manual/misc/perf-tuning.html @@ -209,12 +209,12 @@ consider tuning these settings. Use the <CODE>mod_status</CODE> output as a guide. <P>Related to process creation is process death induced by the -<CODE>MaxRequestsPerChild</CODE> setting. By default this is 30, which -is probably far too low unless your server is using a module such as -<CODE>mod_perl</CODE> which causes children to have bloated memory -images. If your server is serving mostly static pages then consider -raising this value to something like 10000. The code is robust enough -that this shouldn't be a problem. +<CODE>MaxRequestsPerChild</CODE> setting. By default this is 0, which +means that there is no limit to the number of requests handled +per child. If your configuration currently has this set to some +very low number, such as 30, you may want to bump this up significantly. +If you are running SunOS or an old version of Solaris, limit this +to 10000 or so because of memory leaks. 
<P>When keep-alives are in use, children will be kept busy doing nothing waiting for more requests on the already open diff --git a/usr.sbin/httpd/htdocs/manual/misc/security_tips.html b/usr.sbin/httpd/htdocs/manual/misc/security_tips.html index 20942181fa0..d1b186d3caa 100644 --- a/usr.sbin/httpd/htdocs/manual/misc/security_tips.html +++ b/usr.sbin/httpd/htdocs/manual/misc/security_tips.html @@ -15,44 +15,80 @@ <DIV ALIGN="CENTER"> <IMG SRC="../images/sub.gif" ALT="[APACHE DOCUMENTATION]"> <H3> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> </DIV> <H1 ALIGN="CENTER">Security Tips for Server Configuration</H1> -<hr> +<HR> <P>Some hints and tips on security issues in setting up a web server. Some of the suggestions will be general, others specific to Apache. <HR> -<H2>Permissions on Log File Directories</H2> -<P>When Apache starts, it opens the log files as the user who started the -server before switching to the user defined in the -<a href="../mod/core.html#user"><b>User</b></a> directive. Anyone who -has write permission for the directory where any log files are -being written to can append pseudo-arbitrary data to any file on the -system which is writable by the user who starts Apache. Since the -server is normally started by root, you should <EM>NOT</EM> give anyone -write permission to the directory where logs are stored unless you -want them to have root access. +<H2><A NAME="serverroot">Permissions on ServerRoot Directories</A></H2> +<P>In typical operation, Apache is started by the root +user, and it switches to the user defined by the <A +HREF="../mod/core.html#user"><STRONG>User</STRONG></A> directive to serve hits. +As is the case with any command that root executes, you must take care +that it is protected from modification by non-root users. Not only +must the files themselves be writeable only by root, but so must the +directories, and parents of all directories. 
For example, if you +choose to place ServerRoot in <CODE>/usr/local/apache</CODE> then it is +suggested that you create that directory as root, with commands +like these: + +<BLOCKQUOTE><PRE> + mkdir /usr/local/apache + cd /usr/local/apache + mkdir bin conf logs + chown 0 . bin conf logs + chgrp 0 . bin conf logs + chmod 755 . bin conf logs +</PRE></BLOCKQUOTE> + +It is assumed that /, /usr, and /usr/local are only modifiable by root. +When you install the httpd executable, you should ensure that it is +similarly protected: + +<BLOCKQUOTE><PRE> + cp httpd /usr/local/apache/bin + chown 0 /usr/local/apache/bin/httpd + chgrp 0 /usr/local/apache/bin/httpd + chmod 511 /usr/local/apache/bin/httpd +</PRE></BLOCKQUOTE> + +<P>You can create an htdocs subdirectory which is modifiable by other +users -- since root never executes any files out of there, and shouldn't +be creating files in there. + +<P>If you allow non-root users to modify any files that root either +executes or writes on then you open your system to root compromises. +For example, someone could replace the httpd binary so that the next +time you start it, it will execute some arbitrary code. If the logs +directory is writeable (by a non-root user), someone +could replace a log file with a symlink to some other system file, +and then root might overwrite that file with arbitrary data. If the +log files themselves are writeable (by a non-root user), then someone +may be able to overwrite the log itself with bogus data. <P> <HR> <H2>Server Side Includes</H2> <P>Server side includes (SSI) can be configured so that users can execute arbitrary programs on the server. That thought alone should send a shiver -down the spine of any sys-admin.<p> +down the spine of any sys-admin.<P> One solution is to disable that part of SSI. 
To do that you use the IncludesNOEXEC option to the <A HREF="../mod/core.html#options">Options</A> -directive.<p> +directive.<P> <HR> <H2>Non Script Aliased CGI</H2> -<P>Allowing users to execute <B>CGI</B> scripts in any directory should only +<P>Allowing users to execute <STRONG>CGI</STRONG> scripts in any directory +should only be considered if; <OL> <LI>You trust your users not to write scripts which will deliberately or @@ -60,26 +96,27 @@ accidentally expose your system to an attack. <LI>You consider security at your site to be so feeble in other areas, as to make one more potential hole irrelevant. <LI>You have no users, and nobody ever visits your server. -</OL><p> +</OL><P> <HR> <H2>Script Alias'ed CGI</H2> -<P>Limiting <B>CGI</B> to special directories gives the admin control over +<P>Limiting <STRONG>CGI</STRONG> to special directories gives the admin +control over what goes into those directories. This is inevitably more secure than -non script aliased CGI, but <strong>only if users with write access to the -directories are trusted</strong> or the admin is willing to test each new CGI +non script aliased CGI, but <STRONG>only if users with write access to the +directories are trusted</STRONG> or the admin is willing to test each new CGI script/program for potential security holes.<P> -Most sites choose this option over the non script aliased CGI approach.<p> +Most sites choose this option over the non script aliased CGI approach.<P> <HR> <H2>CGI in general</H2> <P>Always remember that you must trust the writers of the CGI script/programs or your ability to spot potential security holes in CGI, whether they were -deliberate or accidental.<p> +deliberate or accidental.<P> All the CGI scripts will run as the same user, so they have potential to -conflict (accidentally or deliberately) with other scripts e.g. +conflict (accidentally or deliberately) with other scripts <EM>e.g.</EM> User A hates User B, so he writes a script to trash User B's CGI database. 
One program which can be used to allow scripts to run as different users is <A HREF="../suexec.html">suEXEC</A> which is @@ -93,21 +130,21 @@ the Apache server code. Another popular way of doing this is with <H2>Stopping users overriding system wide settings...</H2> <P>To run a really tight ship, you'll want to stop users from setting up <CODE>.htaccess</CODE> files which can override security features -you've configured. Here's one way to do it...<p> +you've configured. Here's one way to do it...<P> In the server configuration file, put -<blockquote><code> -<Directory /> <br> -AllowOverride None <br> -Options None <br> -allow from all <br> -</Directory> <br> -</code></blockquote> +<BLOCKQUOTE><CODE> +<Directory /> <BR> +AllowOverride None <BR> +Options None <BR> +allow from all <BR> +</Directory> <BR> +</CODE></BLOCKQUOTE> Then setup for specific directories<P> This stops all overrides, Includes and accesses in all directories apart -from those named.<p> +from those named.<P> <HR> <H2> Protect server files by default @@ -176,19 +213,30 @@ Also be wary of playing games with the >UserDir</A> directive; setting it to something like <SAMP>"./"</SAMP> would have the same effect, for root, as the first example above. +If you are using Apache 1.3 or above, we strongly recommend that you +include the following line in your server configuration files: +</P> +<DL> + <DD><SAMP>UserDir disabled root</SAMP> + </DD> +</DL> <HR> <P>Please send any other useful security tips to The Apache Group by filling out a -<A HREF="http://www.apache.org/bugdb.cgi">problem report</A>, or by -sending mail to -<A HREF="mailto:apache-bugs@mail.apache.org">apache-bugs@mail.apache.org</A> -<p> +<A HREF="http://www.apache.org/bug_report.html">problem report</A>. +If you are confident you have found a security bug in the Apache +source code itself, <A +HREF="http://www.apache.org/security_report.html">please let us +know</A>. 
+ +<P> <HR> <HR> + <H3 ALIGN="CENTER"> - Apache HTTP Server Version 1.2 + Apache HTTP Server Version 1.3 </H3> <A HREF="./"><IMG SRC="../images/index.gif" ALT="Index"></A> |
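A note on the security_tips.html hunk above: the new ServerRoot text asks that the files, their directories, and all parent directories be writable only by root. A minimal sketch of such a check follows, written in Python rather than the shell used in the patch. The names `is_root_only` and `insecure_paths` are hypothetical and not part of Apache, and the sketch assumes a POSIX system where root is uid 0.

```python
import os
import stat


def is_root_only(mode, uid):
    """True if the owner is uid 0 and neither group nor others may write."""
    return uid == 0 and not (mode & (stat.S_IWGRP | stat.S_IWOTH))


def insecure_paths(path):
    """Return the path and every parent directory that fails is_root_only().

    Hypothetical helper for illustration only; it is not an Apache tool.
    """
    problems = []
    current = os.path.realpath(path)  # resolve symlinks before walking up
    while True:
        st = os.stat(current)
        if not is_root_only(st.st_mode, st.st_uid):
            problems.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return problems
```

With the layout from the patch, `insecure_paths("/usr/local/apache/bin/httpd")` should return an empty list once the chown, chgrp, and chmod commands shown there have been applied and /, /usr, and /usr/local are modifiable only by root. Any path it does return is a place where a non-root user could tamper with something root executes or writes.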