Diffstat (limited to 'usr.sbin/httpd/htdocs/manual/content-negotiation.html.en')
-rw-r--r-- | usr.sbin/httpd/htdocs/manual/content-negotiation.html.en | 674 |
1 files changed, 0 insertions, 674 deletions
diff --git a/usr.sbin/httpd/htdocs/manual/content-negotiation.html.en b/usr.sbin/httpd/htdocs/manual/content-negotiation.html.en deleted file mode 100644 index fdb271b7565..00000000000 --- a/usr.sbin/httpd/htdocs/manual/content-negotiation.html.en +++ /dev/null @@ -1,674 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" - "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> - -<html xmlns="http://www.w3.org/1999/xhtml"> - <head> - <meta name="generator" content="HTML Tidy, see www.w3.org" /> - - <title>Apache Content Negotiation</title> - </head> - <!-- Background white, links blue (unvisited), navy (visited), red (active) --> - - <body bgcolor="#FFFFFF" text="#000000" link="#0000FF" - vlink="#000080" alink="#FF0000"> - <div align="CENTER"> - <img src="images/sub.gif" alt="[APACHE DOCUMENTATION]" /> - - <h3>Apache HTTP Server</h3> - </div> - - - - <h1 align="CENTER">Content Negotiation</h1> - - <p>Apache's support for content negotiation has been updated to - meet the HTTP/1.1 specification. It can choose the best - representation of a resource based on the browser-supplied - preferences for media type, languages, character set and - encoding. It is also implements a couple of features to give - more intelligent handling of requests from browsers which send - incomplete negotiation information.</p> - - <p>Content negotiation is provided by the <a - href="mod/mod_negotiation.html">mod_negotiation</a> module, - which is compiled in by default.</p> - <hr /> - - <h2>About Content Negotiation</h2> - - <p>A resource may be available in several different - representations. For example, it might be available in - different languages or different media types, or a combination. - One way of selecting the most appropriate choice is to give the - user an index page, and let them select. However it is often - possible for the server to choose automatically. 
This works - because browsers can send as part of each request information - about what representations they prefer. For example, a browser - could indicate that it would like to see information in French, - if possible, else English will do. Browsers indicate their - preferences by headers in the request. To request only French - representations, the browser would send</p> -<pre> - Accept-Language: fr -</pre> - - <p>Note that this preference will only be applied when there is - a choice of representations and they vary by language.</p> - - <p>As an example of a more complex request, this browser has - been configured to accept French and English, but prefer - French, and to accept various media types, preferring HTML over - plain text or other text types, and preferring GIF or JPEG over - other media types, but also allowing any other media type as a - last resort:</p> -<pre> - Accept-Language: fr; q=1.0, en; q=0.5 - Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, - image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1 -</pre> - Apache 1.2 supports 'server driven' content negotiation, as - defined in the HTTP/1.1 specification. It fully supports the - Accept, Accept-Language, Accept-Charset and Accept-Encoding - request headers. Apache 1.3.4 also supports 'transparent' - content negotiation, which is an experimental negotiation - protocol defined in RFC 2295 and RFC 2296. It does not offer - support for 'feature negotiation' as defined in these RFCs. - - <p>A <strong>resource</strong> is a conceptual entity - identified by a URI (RFC 2396). An HTTP server like Apache - provides access to <strong>representations</strong> of the - resource(s) within its namespace, with each representation in - the form of a sequence of bytes with a defined media type, - character set, encoding, etc. Each resource may be associated - with zero, one, or more than one representation at any given - time. 
If multiple representations are available, the resource - is referred to as <strong>negotiable</strong> and each of its - representations is termed a <strong>variant</strong>. The ways - in which the variants for a negotiable resource vary are called - the <strong>dimensions</strong> of negotiation.</p> - - <h2>Negotiation in Apache</h2> - - <p>In order to negotiate a resource, the server needs to be - given information about each of the variants. This is done in - one of two ways:</p> - - <ul> - <li>Using a type map (<em>i.e.</em>, a <code>*.var</code> - file) which names the files containing the variants - explicitly, or</li> - - <li>Using a 'MultiViews' search, where the server does an - implicit filename pattern match and chooses from among the - results.</li> - </ul> - - <h3>Using a type-map file</h3> - - <p>A type map is a document which is associated with the - handler named <code>type-map</code> (or, for - backwards-compatibility with older Apache configurations, the - mime type <code>application/x-type-map</code>). Note that to - use this feature, you must have a handler set in the - configuration that defines a file suffix as - <code>type-map</code>; this is best done with a</p> -<pre> - AddHandler type-map .var -</pre> - in the server configuration file. See the comments in the - sample config file for more details. - - <p>Type map files have an entry for each available variant; - these entries consist of contiguous HTTP-format header lines. - Entries for different variants are separated by blank lines. - Blank lines are illegal within an entry. It is conventional to - begin a map file with an entry for the combined entity as a - whole (although this is not required, and if present will be - ignored). 
An example map file is:</p> -<pre> - URI: foo - - URI: foo.en.html - Content-type: text/html - Content-language: en - - URI: foo.fr.de.html - Content-type: text/html;charset=iso-8859-2 - Content-language: fr, de -</pre> - If the variants have different source qualities, that may be - indicated by the "qs" parameter to the media type, as in this - picture (available as jpeg, gif, or ASCII-art): -<pre> - URI: foo - - URI: foo.jpeg - Content-type: image/jpeg; qs=0.8 - - URI: foo.gif - Content-type: image/gif; qs=0.5 - - URI: foo.txt - Content-type: text/plain; qs=0.01 -</pre> - - <p>qs values can vary in the range 0.000 to 1.000. Note that - any variant with a qs value of 0.000 will never be chosen. - Variants with no 'qs' parameter value are given a qs factor of - 1.0. The qs parameter indicates the relative 'quality' of this - variant compared to the other available variants, independent - of the client's capabilities. For example, a jpeg file is - usually of higher source quality than an ascii file if it is - attempting to represent a photograph. However, if the resource - being represented is an original ascii art, then an ascii - representation would have a higher source quality than a jpeg - representation. A qs value is therefore specific to a given - variant depending on the nature of the resource it - represents.</p> - - <p>The full list of headers recognized is:</p> - - <dl> - <dt><code>URI:</code></dt> - - <dd>uri of the file containing the variant (of the given - media type, encoded with the given content encoding). These - are interpreted as URLs relative to the map file; they must - be on the same server (!), and they must refer to files to - which the client would be granted access if they were to be - requested directly.</dd> - - <dt><code>Content-Type:</code></dt> - - <dd>media type --- charset, level and "qs" parameters may be - given. 
These are often referred to as MIME types; typical - media types are <code>image/gif</code>, - <code>text/plain</code>, or - <code>text/html; level=3</code>.</dd> - - <dt><code>Content-Language:</code></dt> - - <dd>The languages of the variant, specified as an Internet - standard language tag from RFC 1766 (<em>e.g.</em>, - <code>en</code> for English, <code>ko</code> for Korean, - <em>etc.</em>).</dd> - - <dt><code>Content-Encoding:</code></dt> - - <dd>If the file is compressed, or otherwise encoded, rather - than containing the actual raw data, this says how that was - done. Apache only recognizes encodings that are defined by an - <a href="mod/mod_mime.html#addencoding">AddEncoding</a> - directive. This normally includes the encodings - <code>x-compress</code> for compress'd files, and - <code>x-gzip</code> for gzip'd files. The <code>x-</code> - prefix is ignored for encoding comparisons.</dd> - - <dt><code>Content-Length:</code></dt> - - <dd>The size of the file. Specifying content lengths in the - type-map allows the server to compare file sizes without - checking the actual files.</dd> - - <dt><code>Description:</code></dt> - - <dd>A human-readable textual description of the variant. If - Apache cannot find any appropriate variant to return, it will - return an error response which lists all available variants - instead. Such a variant list will include the human-readable - variant descriptions.</dd> - </dl> - - <h3>Multiviews</h3> - - <p><code>MultiViews</code> is a per-directory option, meaning - it can be set with an <code>Options</code> directive within a - <code>&lt;Directory&gt;</code>, <code>&lt;Location&gt;</code> - or <code>&lt;Files&gt;</code> section in - <code>access.conf</code>, or (if <code>AllowOverride</code> is - properly set) in <code>.htaccess</code> files. 
Note that - <code>Options All</code> does not set <code>MultiViews</code>; - you have to ask for it by name.</p> - - <p>The effect of <code>MultiViews</code> is as follows: if the - server receives a request for <code>/some/dir/foo</code>, - <code>/some/dir</code> has <code>MultiViews</code> enabled, and - <code>/some/dir/foo</code> does <em>not</em> exist, then the - server reads the directory looking for files named foo.*, and - effectively fakes up a type map which names all those files, - assigning them the same media types and content-encodings they - would have had if the client had asked for one of them by name. It - then chooses the best match to the client's requirements.</p> - - <p><code>MultiViews</code> may also apply to searches for the - file named by the <code>DirectoryIndex</code> directive, if the - server is trying to index a directory. If the configuration - files specify</p> -<pre> - DirectoryIndex index -</pre> - then the server will arbitrate between <code>index.html</code> - and <code>index.html3</code> if both are present. If neither - is present, and <code>index.cgi</code> is there, the server - will run it. - - <p>If one of the files found when reading the directory is a - CGI script, it's not obvious what should happen. The code gives - that case special treatment: if the request was a POST, or a - GET with a query string or PATH_INFO, the script is given an - extremely high quality rating, and generally invoked; otherwise - it is given an extremely low quality rating, which generally - causes one of the other views (if any) to be retrieved.</p> - - <h2>The Negotiation Methods</h2> - After Apache has obtained a list of the variants for a given - resource, either from a type-map file or from the filenames in - the directory, it invokes one of two methods to decide on the - 'best' variant to return, if any. 
It is not necessary to know - any of the details of how negotiation actually takes place in - order to use Apache's content negotiation features. However the - rest of this document explains the methods used for those - interested. - - <p>There are two negotiation methods:</p> - - <ol> - <li><strong>Server driven negotiation with the Apache - algorithm</strong> is used in the normal case. The Apache - algorithm is explained in more detail below. When this - algorithm is used, Apache can sometimes 'fiddle' the quality - factor of a particular dimension to achieve a better result. - The ways Apache can fiddle quality factors is explained in - more detail below.</li> - - <li><strong>Transparent content negotiation</strong> is used - when the browser specifically requests this through the - mechanism defined in RFC 2295. This negotiation method gives - the browser full control over deciding on the 'best' variant, - the result is therefore dependent on the specific algorithms - used by the browser. As part of the transparent negotiation - process, the browser can ask Apache to run the 'remote - variant selection algorithm' defined in RFC 2296.</li> - </ol> - - <h3>Dimensions of Negotiation</h3> - - <table> - <tr valign="top"> - <th>Dimension</th> - - <th>Notes</th> - </tr> - - <tr valign="top"> - <td>Media Type</td> - - <td>Browser indicates preferences with the Accept header - field. Each item can have an associated quality factor. - Variant description can also have a quality factor (the - "qs" parameter).</td> - </tr> - - <tr valign="top"> - <td>Language</td> - - <td>Browser indicates preferences with the Accept-Language - header field. Each item can have a quality factor. Variants - can be associated with none, one or more than one - language.</td> - </tr> - - <tr valign="top"> - <td>Encoding</td> - - <td>Browser indicates preference with the Accept-Encoding - header field. 
Each item can have a quality factor.</td> - </tr> - - <tr valign="top"> - <td>Charset</td> - - <td>Browser indicates preference with the Accept-Charset - header field. Each item can have a quality factor. Variants - can indicate a charset as a parameter of the media - type.</td> - </tr> - </table> - - <h3>Apache Negotiation Algorithm</h3> - - <p>Apache can use the following algorithm to select the 'best' - variant (if any) to return to the browser. This algorithm is - not further configurable. It operates as follows:</p> - - <ol> - <li>First, for each dimension of the negotiation, check the - appropriate <em>Accept*</em> header field and assign a - quality to each variant. If the <em>Accept*</em> header for - any dimension implies that this variant is not acceptable, - eliminate it. If no variants remain, go to step 4.</li> - - <li> - Select the 'best' variant by a process of elimination. Each - of the following tests is applied in order. Any variants - not selected at each test are eliminated. After each test, - if only one variant remains, select it as the best match - and proceed to step 3. If more than one variant remains, - move on to the next test. - - <ol> - <li>Multiply the quality factor from the Accept header - with the quality-of-source factor for this variant's - media type, and select the variants with the highest - value.</li> - - <li>Select the variants with the highest language quality - factor.</li> - - <li>Select the variants with the best language match, - using either the order of languages in the - Accept-Language header (if present), or else the order of - languages in the <code>LanguagePriority</code> directive - (if present).</li> - - <li>Select the variants with the highest 'level' media - parameter (used to give the version of text/html media - types).</li> - - <li>Select variants with the best charset media - parameters, as given on the Accept-Charset header line. - Charset ISO-8859-1 is acceptable unless explicitly - excluded. 
Variants with a <code>text/*</code> media type - but not explicitly associated with a particular charset - are assumed to be in ISO-8859-1.</li> - - <li>Select those variants which have associated charset - media parameters that are <em>not</em> ISO-8859-1. If - there are no such variants, select all variants - instead.</li> - - <li>Select the variants with the best encoding. If there - are variants with an encoding that is acceptable to the - user-agent, select only these variants. Otherwise if - there is a mix of encoded and non-encoded variants, - select only the unencoded variants. If either all - variants are encoded or all variants are not encoded, - select all variants.</li> - - <li>Select the variants with the smallest content - length.</li> - - <li>Select the first variant of those remaining. This - will be either the first listed in the type-map file, or - when variants are read from the directory, the one whose - file name comes first when sorted using ASCII code - order.</li> - </ol> - </li> - - <li>The algorithm has now selected one 'best' variant, so - return it as the response. The HTTP response header Vary is - set to indicate the dimensions of negotiation (browsers and - caches can use this information when caching the resource). - End.</li> - - <li><p>To get here means no variant was selected (because none - are acceptable to the browser). Return a 406 status (meaning - "No acceptable representation") with a response body - consisting of an HTML document listing the available - variants. Also set the HTTP Vary header to indicate the - dimensions of variance.</p> - - <p>You should be aware that the error message returned by Apache is - necessarily rather terse and might confuse some users (even though it - lists the available alternatives). If you want to avoid users seeing this - error page, you should organize your documents such that a document in a - default language (or with a default encoding etc.) 
is always returned if a - document is not available in any of the languages, encodings etc. the - browser asked for.</p> - - <p>In particular, if you want a document in a default language to - be returned if a document is not available in any of the languages - a browser asked for, you should create a document with no language - attribute set. See <a href="#nolanguage">Variants with no - Language</a> below for details.</p></li> - </ol> - - <h2><a id="better" name="better">Fiddling with Quality - Values</a></h2> - - <p>Apache sometimes changes the quality values from what would - be expected by a strict interpretation of the Apache - negotiation algorithm above. This is to get a better result - from the algorithm for browsers which do not send full or - accurate information. Some of the most popular browsers send - Accept header information which would otherwise result in the - selection of the wrong variant in many cases. If a browser - sends full and correct information these fiddles will not be - applied.</p> - - <h3>Media Types and Wildcards</h3> - - <p>The Accept: request header indicates preferences for media - types. It can also include 'wildcard' media types, such as - "image/*" or "*/*" where the * matches any string. So a request - including:</p> -<pre> - Accept: image/*, */* -</pre> - would indicate that any type starting "image/" is acceptable, - as is any other type (so the first "image/*" is redundant). - Some browsers routinely send wildcards in addition to explicit - types they can handle. For example: -<pre> - Accept: text/html, text/plain, image/gif, image/jpeg, */* -</pre> - The intention of this is to indicate that the explicitly listed - types are preferred, but if a different representation is - available, that is ok too. However under the basic algorithm, - as given above, the */* wildcard has exactly equal preference - to all the other types, so they are not being preferred. 
The - browser should really have sent a request with a lower quality - (preference) value for */*, such as: -<pre> - Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01 -</pre> - The explicit types have no quality factor, so they default to a - preference of 1.0 (the highest). The wildcard */* is given a - low preference of 0.01, so other types will only be returned if - no variant matches an explicitly listed type. - - <p>If the Accept: header contains <em>no</em> q factors at all, - Apache sets the q value of "*/*", if present, to 0.01 to - emulate the desired behaviour. It also sets the q value of - wildcards of the format "type/*" to 0.02 (so these are - preferred over matches against "*/*"). If any media type on the - Accept: header contains a q factor, these special values are - <em>not</em> applied, so requests from browsers which send the - correct information to start with work as expected.</p> - - <h3><a id="nolanguage" name="nolanguage">Variants with no Language</a></h3> - - <p>If some of the variants for a particular resource have a - language attribute, and some do not, those variants with no - language are given a very low language quality factor of - 0.001.</p> - - <p>The reason for setting the language quality factor for variants - with no language to a very low value is to allow for a default - variant which can be supplied if none of the other variants match - the browser's language preferences. This allows you to avoid users - seeing a "406" error page if their browser is set to only accept - languages which you do not offer for the resource that was - requested.</p> - - <p>For example, consider the situation with Multiviews enabled and - three variants:</p> - - <ul> - <li>foo.en.html, language en</li> - - <li>foo.fr.html, language fr</li> - - <li>foo.html, no language</li> - </ul> - - <p>The meaning of a variant with no language is that it is always - acceptable to the browser. 
If the request is for <code>foo</code> - and the Accept-Language header includes either en or fr (or both) - one of foo.en.html or foo.fr.html will be returned. If the browser - does not list either en or fr as acceptable, foo.html will be - returned instead. If the client requests <code>foo.html</code> - instead, then no negotiation will occur since the exact match - will be returned. To avoid this problem, it is sometimes helpful - to name the "no language" variant <code>foo.html.html</code> to assure - that Multiviews and language negotiation will come into play.</p> - - <h2>Extensions to Transparent Content Negotiation</h2> - Apache extends the transparent content negotiation protocol - (RFC 2295) as follows. A new <code>{encoding ..}</code> element - is used in variant lists to label variants which are available - with a specific content-encoding only. The implementation of - the RVSA/1.0 algorithm (RFC 2296) is extended to recognize - encoded variants in the list, and to use them as candidate - variants whenever their encodings are acceptable according to - the Accept-Encoding request header. The RVSA/1.0 implementation - does not round computed quality factors to 5 decimal places - before choosing the best variant. 
- - <h2>Note on hyperlinks and naming conventions</h2> - - <p>If you are using language negotiation you can choose between - different naming conventions, because files can have more than - one extension, and the order of the extensions is normally - irrelevant (see <a href="mod/mod_mime.html">mod_mime</a> - documentation for details).</p> - - <p>A typical file has a MIME-type extension (<em>e.g.</em>, - <samp>html</samp>), maybe an encoding extension (<em>e.g.</em>, - <samp>gz</samp>), and of course a language extension - (<em>e.g.</em>, <samp>en</samp>) when we have different - language variants of this file.</p> - - <p>Examples:</p> - - <ul> - <li>foo.en.html</li> - - <li>foo.html.en</li> - - <li>foo.en.html.gz</li> - </ul> - - <p>Here some more examples of filenames together with valid and - invalid hyperlinks:</p> - - <table border="1" cellpadding="8" cellspacing="0"> - <tr> - <th>Filename</th> - - <th>Valid hyperlink</th> - - <th>Invalid hyperlink</th> - </tr> - - <tr> - <td><em>foo.html.en</em></td> - - <td>foo<br /> - foo.html</td> - - <td>-</td> - </tr> - - <tr> - <td><em>foo.en.html</em></td> - - <td>foo</td> - - <td>foo.html</td> - </tr> - - <tr> - <td><em>foo.html.en.gz</em></td> - - <td>foo<br /> - foo.html</td> - - <td>foo.gz<br /> - foo.html.gz</td> - </tr> - - <tr> - <td><em>foo.en.html.gz</em></td> - - <td>foo</td> - - <td>foo.html<br /> - foo.html.gz<br /> - foo.gz</td> - </tr> - - <tr> - <td><em>foo.gz.html.en</em></td> - - <td>foo<br /> - foo.gz<br /> - foo.gz.html</td> - - <td>foo.html</td> - </tr> - - <tr> - <td><em>foo.html.gz.en</em></td> - - <td>foo<br /> - foo.html<br /> - foo.html.gz</td> - - <td>foo.gz</td> - </tr> - </table> - - <p>Looking at the table above you will notice that it is always - possible to use the name without any extensions in a hyperlink - (<em>e.g.</em>, <samp>foo</samp>). The advantage is that you - can hide the actual type of a document rsp. 
file and can change - it later, <em>e.g.</em>, from <samp>html</samp> to - <samp>shtml</samp> or <samp>cgi</samp> without changing any - hyperlink references.</p> - - <p>If you want to continue to use a MIME-type in your - hyperlinks (<em>e.g.</em> <samp>foo.html</samp>) the language - extension (including an encoding extension if there is one) - must be on the right hand side of the MIME-type extension - (<em>e.g.</em>, <samp>foo.html.en</samp>).</p> - - <h2>Note on Caching</h2> - - <p>When a cache stores a representation, it associates it with - the request URL. The next time that URL is requested, the cache - can use the stored representation. But, if the resource is - negotiable at the server, this might result in only the first - requested variant being cached and subsequent cache hits might - return the wrong response. To prevent this, Apache normally - marks all responses that are returned after content negotiation - as non-cacheable by HTTP/1.0 clients. Apache also supports the - HTTP/1.1 protocol features to allow caching of negotiated - responses.</p> - - <p>For requests which come from a HTTP/1.0 compliant client - (either a browser or a cache), the directive - <tt>CacheNegotiatedDocs</tt> can be used to allow caching of - responses which were subject to negotiation. This directive can - be given in the server config or virtual host, and takes no - arguments. It has no effect on requests from HTTP/1.1 clients. - <hr /> - - <h3 align="CENTER">Apache HTTP Server</h3> - <a href="./"><img src="images/index.gif" alt="Index" /></a> - - </p> - </body> -</html> - |
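
The wildcard adjustment and the first selection test described in the removed page (Accept quality multiplied by the variant's "qs" value) can be sketched in Python. This is only an illustration under stated assumptions, not the mod_negotiation source: the function names are hypothetical, media-type parameters other than q are ignored, and only the media-type dimension is modelled.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q, has_explicit_q) tuples."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q, explicit = 1.0, False  # a missing q defaults to 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q, explicit = float(value), True
        items.append((media_range, q, explicit))
    return items


def adjust_wildcards(items):
    """Lower wildcard qualities, but only if no item carried an explicit q."""
    if any(explicit for _, _, explicit in items):
        return {rng: q for rng, q, _ in items}
    adjusted = {}
    for rng, q, _ in items:
        if rng == "*/*":
            q = 0.01  # value quoted in the page for "*/*"
        elif rng.endswith("/*"):
            q = 0.02  # value quoted in the page for "type/*"
        adjusted[rng] = q
    return adjusted


def accept_quality(ranges, media_type):
    """Look up a media type: exact match first, then type/*, then */*."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0  # not acceptable at all


def best_by_media_type(accept_header, variants):
    """variants maps a file name to (media_type, qs); return the top names."""
    ranges = adjust_wildcards(parse_accept(accept_header))
    scores = {name: accept_quality(ranges, mtype) * qs
              for name, (mtype, qs) in variants.items()}
    top = max(scores.values(), default=0.0)
    if top <= 0.0:
        return []  # nothing acceptable; the page describes a 406 response here
    return sorted(name for name, score in scores.items() if score == top)


# Variants taken from the type-map example in the page.
variants = {
    "foo.jpeg": ("image/jpeg", 0.8),
    "foo.gif": ("image/gif", 0.5),
    "foo.txt": ("text/plain", 0.01),
}
print(best_by_media_type("text/html, text/plain, image/gif, image/jpeg, */*", variants))
print(best_by_media_type("text/*; q=0.8, image/gif; q=0.6, */*; q=0.1", variants))
```

If more than one name is returned, the later tests listed in the page (language, level, charset, encoding, content length, order) would be needed to break the tie; they are left out of this sketch.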