author | Todd C. Miller <millert@cvs.openbsd.org> | 2002-10-27 22:15:11 +0000 |
---|---|---|
committer | Todd C. Miller <millert@cvs.openbsd.org> | 2002-10-27 22:15:11 +0000 |
commit | a46685421d59e1cc6b69c65dca0dc74032488989 (patch) | |
tree | f41fc38f3ddd84019e6f622d78b4dfcd2a15e121 /gnu/usr.bin/perl/pod | |
parent | 1471789f8ef043699698d481c476036544ec944b (diff) |
stock perl 5.8.0 from CPAN
Diffstat (limited to 'gnu/usr.bin/perl/pod')
-rw-r--r-- | gnu/usr.bin/perl/pod/perlpodspec.pod | 219 |
1 files changed, 89 insertions, 130 deletions
diff --git a/gnu/usr.bin/perl/pod/perlpodspec.pod b/gnu/usr.bin/perl/pod/perlpodspec.pod index 67f74b629b9..73872586343 100644 --- a/gnu/usr.bin/perl/pod/perlpodspec.pod +++ b/gnu/usr.bin/perl/pod/perlpodspec.pod @@ -1,4 +1,3 @@ -=encoding utf8 =head1 NAME @@ -31,7 +30,7 @@ it implicates that such an option I<may> be provided. =head1 Pod Definitions -Pod is embedded in files, typically Perl source files, although you +Pod is embedded in files, typically Perl source files -- although you can write a file that's nothing but Pod. A B<line> in a file consists of zero or more non-newline characters, @@ -50,7 +49,7 @@ A B<non-blank line> is a line containing one or more characters other than space or tab (and terminated by a newline or end-of-file). (I<Note:> Many older Pod parsers did not accept a line consisting of -spaces/tabs and then a newline as a blank line. The only lines they +spaces/tabs and then a newline as a blank line -- the only lines they considered blank were lines consisting of I<no characters at all>, terminated by a newline.) @@ -66,12 +65,12 @@ directly formatting it). A B<Pod formatter> (or B<Pod translator>) is a module or program that converts Pod to some other format (HTML, plaintext, TeX, PostScript, RTF). A B<Pod processor> might be a formatter or translator, or might be a program that does something -else with the Pod (like counting words, scanning for index points, +else with the Pod (like wordcounting it, scanning for index points, etc.). Pod content is contained in B<Pod blocks>. A Pod block starts with a line that matches <m/\A=[a-zA-Z]/>, and continues up to the next line -that matches C<m/\A=cut/> or up to the end of the file if there is +that matches C<m/\A=cut/> -- or up to the end of the file, if there is no C<m/\A=cut/> line. 
=for comment @@ -133,7 +132,7 @@ I<Some> command paragraphs allow formatting codes in their content In other words, the Pod processing handler for "head1" will apply the same processing to "Did You Remember to CE<lt>use strict;>?" that it -would to an ordinary paragraph (i.e., formatting codes like +would to an ordinary paragraph -- i.e., formatting codes (like "CE<lt>...>") are parsed and presumably formatted appropriately, and whitespace in the form of literal spaces and/or tabs is not significant. @@ -190,7 +189,7 @@ is a verbatim paragraph, because its first line starts with a literal whitespace character (and there's no "=begin"..."=end" region around). The "=begin I<identifier>" ... "=end I<identifier>" commands stop -paragraphs that they surround from being parsed as ordinary or verbatim +paragraphs that they surround from being parsed as data or verbatim paragraphs, if I<identifier> doesn't begin with a colon. This is discussed in detail in the section L</About Data Paragraphs and "=beginE<sol>=end" Regions>. @@ -239,7 +238,7 @@ ignored. Examples: # This is the first line of program text. sub foo { # This is the second. -It is an error to try to I<start> a Pod block with a "=cut" command. In +It is an error to try to I<start> a Pod black with a "=cut" command. In that case, the Pod processor must halt parsing of the input file, and must by default emit a warning. @@ -294,8 +293,6 @@ by the most recent "=over" command. It permits no text after the =item "=begin formatname" -=item "=begin formatname parameter" - This marks the following paragraphs (until the matching "=end formatname") as being for some special kind of processing. Unless "formatname" begins with a colon, the contained non-command @@ -305,11 +302,9 @@ or data paragraphs. This is discussed in detail in the section L</About Data Paragraphs and "=beginE<sol>=end" Regions>. It is advised that formatnames match the regexp -C<m/\A:?[-a-zA-Z0-9_]+\z/>. 
Everything following whitespace after the -formatname is a parameter that may be used by the formatter when dealing -with this region. This parameter must not be repeated in the "=end" -paragraph. Implementors should anticipate future expansion in the -semantics and syntax of the first parameter to "=begin"/"=end"/"=for". +C<m/\A:?[-a-zA-Z0-9_]+\z/>. Implementors should anticipate future +expansion in the semantics and syntax of the first parameter +to "=begin"/"=end"/"=for". =item "=end formatname" @@ -337,29 +332,6 @@ then "text..." will constitute a data paragraph. There is no way to use "=for formatname text..." to express "text..." as a verbatim paragraph. -=item "=encoding encodingname" - -This command, which should occur early in the document (at least -before any non-US-ASCII data!), declares that this document is -encoded in the encoding I<encodingname>, which must be -an encoding name that L<Encode> recognizes. (Encode's list -of supported encodings, in L<Encode::Supported>, is useful here.) -If the Pod parser cannot decode the declared encoding, it -should emit a warning and may abort parsing the document -altogether. - -A document having more than one "=encoding" line should be -considered an error. Pod processors may silently tolerate this if -the not-first "=encoding" lines are just duplicates of the -first one (e.g., if there's a "=encoding utf8" line, and later on -another "=encoding utf8" line). But Pod processors should complain if -there are contradictory "=encoding" lines in the same document -(e.g., if there is a "=encoding utf8" early in the document and -"=encoding big5" later). Pod processors that recognize BOMs -may also complain if they see an "=encoding" line -that contradicts the BOM (e.g., if a document with a UTF-16LE -BOM has an "=encoding shiftjis" line). - =back If a Pod processor sees any command other than the ones listed @@ -416,7 +388,7 @@ formatting code. 
Examples: B<< $foo->bar(); >> With this syntax, the whitespace character(s) after the "CE<lt><<" -and before the ">>" (or whatever letter) are I<not> renderable. They +and before the ">>" (or whatever letter) are I<not> renderable -- they do not signify whitespace, are merely part of the formatting codes themselves. That is, these are all synonymous: @@ -430,18 +402,6 @@ themselves. That is, these are all synonymous: and so on. -Finally, the multiple-angle-bracket form does I<not> alter the interpretation -of nested formatting codes, meaning that the following four example lines are -identical in meaning: - - B<example: C<$a E<lt>=E<gt> $b>> - - B<example: C<< $a <=> $b >>> - - B<example: C<< $a E<lt>=E<gt> $b >>> - - B<<< example: C<< $a E<lt>=E<gt> $b >> >>> - =back In parsing Pod, a notably tricky part is the correct parsing of @@ -503,7 +463,7 @@ L</Notes on Implementing Pod Processors>. This formatting code is syntactically simple, but semantically complex. What it means is that each space in the printable -content of this code signifies a non-breaking space. +content of this code signifies a nonbreaking space. Consider: @@ -514,7 +474,7 @@ Consider: Both signify the monospace (c[ode] style) text consisting of "$x", one space, "?", one space, ":", one space, "$z". The difference is that in the latter, with the S code, those spaces -are not "normal" spaces, but instead are non-breaking spaces. +are not "normal" spaces, but instead are nonbreaking spaces. =back @@ -539,7 +499,7 @@ a "-". This was so that this: would parse as equivalent to this: - C<$foo-E<gt>bar> + C<$foo-E<lt>bar> instead of as equivalent to a "C" formatting code containing only "$foo-", and then a "bar>" outside the "C" formatting code. This @@ -629,7 +589,7 @@ UTF-16. If the file begins with the three literal byte values 0xEF 0xBB 0xBF =for comment - If toke.c is modified to support UTF-32, add mention of those here. + If toke.c is modified to support UTF32, add mention of those here. 
=item * @@ -684,13 +644,13 @@ text identifying its name and version number, and the name and version numbers of any modules it might be using to process the Pod. Minimal examples: - %% POD::Pod2PS v3.14159, using POD::Parser v1.92 + %% POD::Pod2PS v3.14159, using POD::Parser v1.92 - <!-- Pod::HTML v3.14159, using POD::Parser v1.92 --> + <!-- Pod::HTML v3.14159, using POD::Parser v1.92 --> - {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08} + {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08} - .\" Pod::Man version 3.14159, using POD::Parser version 1.92 + .\" Pod::Man version 3.14159, using POD::Parser version 1.92 Formatters may also insert additional comments, including: the release date of the Pod formatter program, the contact address for @@ -741,7 +701,7 @@ period-space-space or period-newline sequences). Pod parsers should not, by default, try to coerce apostrophe (') and quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to turn backtick (`) into anything else but a single backtick character -(distinct from an open quote character!), nor "--" into anything but +(distinct from an openquote character!), nor "--" into anything but two minus signs. They I<must never> do any of those things to text in CE<lt>...> formatting codes, and never I<ever> to text in verbatim paragraphs. @@ -749,10 +709,10 @@ paragraphs. =item * When rendering Pod to a format that has two kinds of hyphens (-), one -that's a non-breaking hyphen, and another that's a breakable hyphen +that's a nonbreaking hyphen, and another that's a breakable hyphen (as in "object-oriented", which can be split across lines as "object-", newline, "oriented"), formatters are encouraged to -generally translate "-" to non-breaking hyphen, but may apply +generally translate "-" to nonbreaking hyphen, but may apply heuristics to convert some of these to breaking hyphens. =item * @@ -976,7 +936,7 @@ for idiosyncratic mappings of Unicode-to-I<my_escapes>. 
=item * -It is up to individual Pod formatter to display good judgement when +It is up to individual Pod formatter to display good judgment when confronted with an unrenderable character (which is distinct from an unknown EE<lt>thing> sequence that the parser couldn't resolve to anything, renderable or not). It is good practice to map Latin letters @@ -1009,15 +969,15 @@ EE<lt>euro>1,000,000 Solution|Million::Euros>". =item * -Some Pod formatters output to formats that implement non-breaking +Some Pod formatters output to formats that implement nonbreaking spaces as an individual character (which I'll call "NBSP"), and -others output to formats that implement non-breaking spaces just as +others output to formats that implement nonbreaking spaces just as spaces wrapped in a "don't break this across lines" code. Note that at the level of Pod, both sorts of codes can occur: Pod can contain a NBSP character (whether as a literal, or as a "EE<lt>160>" or "EE<lt>nbsp>" code); and Pod can contain "SE<lt>foo IE<lt>barE<gt> baz>" codes, where "mere spaces" (character 32) in -such codes are taken to represent non-breaking spaces. Pod +such codes are taken to represent nonbreaking spaces. Pod parsers should consider supporting the optional parsing of "SE<lt>foo IE<lt>barE<gt> baz>" as if it were "fooI<NBSP>IE<lt>barE<gt>I<NBSP>baz", and, going the other way, the @@ -1134,20 +1094,20 @@ link text. Note that link text may contain formatting.) =item Second: -The possibly inferred link-text; i.e., if there was no real link +The possibly inferred link-text -- i.e., if there was no real link text, then this is the text that we'll infer in its place. (E.g., for "LE<lt>Getopt::Std>", the inferred link text is "Getopt::Std".) =item Third: The name or URL, or undef if none. (E.g., in "LE<lt>Perl -Functions|perlfunc>", the name (also sometimes called the page) +Functions|perlfunc>", the name -- also sometimes called the page -- is "perlfunc". In "LE<lt>/CAVEATS>", the name is undef.) 
=item Fourth: The section (AKA "item" in older perlpods), or undef if none. E.g., -in "LE<lt>Getopt::Std/DESCRIPTIONE<gt>", "DESCRIPTION" is the section. (Note +in L<Getopt::Std/DESCRIPTION>, "DESCRIPTION" is the section. (Note that this is not the same as a manpage section like the "5" in "man 5 crontab". "Section Foo" in the Pod sense means the part of the text that's introduced by the heading or item whose text is "Foo".) @@ -1178,61 +1138,52 @@ a requirement that these be passed as an actual list or array.) For example: L<Foo::Bar> - => undef, # link text - "Foo::Bar", # possibly inferred link text - "Foo::Bar", # name - undef, # section - 'pod', # what sort of link - "Foo::Bar" # original content + => undef, # link text + "Foo::Bar", # possibly inferred link text + "Foo::Bar", # name + undef, # section + 'pod', # what sort of link + "Foo::Bar" # original content L<Perlport's section on NL's|perlport/Newlines> - => "Perlport's section on NL's", # link text - "Perlport's section on NL's", # possibly inferred link text - "perlport", # name - "Newlines", # section - 'pod', # what sort of link - "Perlport's section on NL's|perlport/Newlines" - # original content + => "Perlport's section on NL's", # link text + "Perlport's section on NL's", # possibly inferred link text + "perlport", # name + "Newlines", # section + 'pod', # what sort of link + "Perlport's section on NL's|perlport/Newlines" # orig. 
content L<perlport/Newlines> - => undef, # link text - '"Newlines" in perlport', # possibly inferred link text - "perlport", # name - "Newlines", # section - 'pod', # what sort of link - "perlport/Newlines" # original content + => undef, # link text + '"Newlines" in perlport', # possibly inferred link text + "perlport", # name + "Newlines", # section + 'pod', # what sort of link + "perlport/Newlines" # original content L<crontab(5)/"DESCRIPTION"> - => undef, # link text - '"DESCRIPTION" in crontab(5)', # possibly inferred link text - "crontab(5)", # name - "DESCRIPTION", # section - 'man', # what sort of link - 'crontab(5)/"DESCRIPTION"' # original content + => undef, # link text + '"DESCRIPTION" in crontab(5)', # possibly inferred link text + "crontab(5)", # name + "DESCRIPTION", # section + 'man', # what sort of link + 'crontab(5)/"DESCRIPTION"' # original content L</Object Attributes> - => undef, # link text - '"Object Attributes"', # possibly inferred link text - undef, # name - "Object Attributes", # section - 'pod', # what sort of link - "/Object Attributes" # original content + => undef, # link text + '"Object Attributes"', # possibly inferred link text + undef, # name + "Object Attributes", # section + 'pod', # what sort of link + "/Object Attributes" # original content L<http://www.perl.org/> - => undef, # link text - "http://www.perl.org/", # possibly inferred link text - "http://www.perl.org/", # name - undef, # section - 'url', # what sort of link - "http://www.perl.org/" # original content - - L<Perl.org|http://www.perl.org/> - => "Perl.org", # link text - "http://www.perl.org/", # possibly inferred link text - "http://www.perl.org/", # name - undef, # section - 'url', # what sort of link - "Perl.org|http://www.perl.org/" # original content + => undef, # link text + "http://www.perl.org/", # possibly inferred link text + "http://www.perl.org/", # name + undef, # section + 'url', # what sort of link + "http://www.perl.org/" # original content Note that 
you can distinguish URL-links from anything else by the fact that they match C<m/\A\w+:[^:\s]\S*\z/>. So @@ -1302,14 +1253,23 @@ browsers to decide. =item * +Authors wanting to link to a particular (absolute) URL, must do so +only with "LE<lt>scheme:...>" codes (like +LE<lt>http://www.perl.org>), and must not attempt "LE<lt>Some Site +Name|scheme:...>" codes. This restriction avoids many problems +in parsing and rendering LE<lt>...> codes. + +=item * + In a C<LE<lt>text|...E<gt>> code, text may contain formatting codes for formatting or for EE<lt>...> escapes, as in: L<B<ummE<234>stuff>|...> For C<LE<lt>...E<gt>> codes without a "name|" part, only -C<EE<lt>...E<gt>> and C<ZE<lt>E<gt>> codes may occur. That is, -authors should not use "C<LE<lt>BE<lt>Foo::BarE<gt>E<gt>>". +C<EE<lt>...E<gt>> and C<ZE<lt>E<gt>> codes may occur -- no +other formatting codes. That is, authors should not use +"C<LE<lt>BE<lt>Foo::BarE<gt>E<gt>>". Note, however, that formatting codes and ZE<lt>>'s can occur in any and all parts of an LE<lt>...> (i.e., in I<name>, I<section>, I<text>, @@ -1336,29 +1296,28 @@ that case, formatters will have to just ignore that formatting. At time of writing, C<LE<lt>nameE<gt>> values are of two types: either the name of a Pod page like C<LE<lt>Foo::BarE<gt>> (which might be a real Perl module or program in an @INC / PATH -directory, or a .pod file in those places); or the name of a Unix +directory, or a .pod file in those places); or the name of a UNIX man page, like C<LE<lt>crontab(5)E<gt>>. In theory, C<LE<lt>chmodE<gt>> in ambiguous between a Pod page called "chmod", or the Unix man page "chmod" (in whatever man-section). However, the presence of a string in parens, as in "crontab(5)", is sufficient to signal that what is being discussed is not a Pod page, and so is presumably a -Unix man page. The distinction is of no importance to many +UNIX man page. 
The distinction is of no importance to many Pod processors, but some processors that render to hypertext formats may need to distinguish them in order to know how to render a given C<LE<lt>fooE<gt>> code. =item * -Previous versions of perlpod allowed for a C<LE<lt>sectionE<gt>> syntax (as in -C<LE<lt>Object AttributesE<gt>>), which was not easily distinguishable from -C<LE<lt>nameE<gt>> syntax and for C<LE<lt>"section"E<gt>> which was only -slightly less ambiguous. This syntax is no longer in the specification, and -has been replaced by the C<LE<lt>/sectionE<gt>> syntax (where the slash was -formerly optional). Pod parsers should tolerate the C<LE<lt>"section"E<gt>> -syntax, for a while at least. The suggested heuristic for distinguishing -C<LE<lt>sectionE<gt>> from C<LE<lt>nameE<gt>> is that if it contains any -whitespace, it's a I<section>. Pod processors should warn about this being -deprecated syntax. +Previous versions of perlpod allowed for a C<LE<lt>sectionE<gt>> syntax +(as in "C<LE<lt>Object AttributesE<gt>>"), which was not easily distinguishable +from C<LE<lt>nameE<gt>> syntax. This syntax is no longer in the +specification, and has been replaced by the C<LE<lt>"section"E<gt>> syntax +(where the quotes were formerly optional). Pod parsers should tolerate +the C<LE<lt>sectionE<gt>> syntax, for a while at least. The suggested +heuristic for distinguishing C<LE<lt>sectionE<gt>> from C<LE<lt>nameE<gt>> +is that if it contains any whitespace, it's a I<section>. Pod processors +may warn about this being deprecated syntax. =back @@ -1559,7 +1518,7 @@ probably want to format it like so: Ut Enim -But (for the foreseeable future), Pod does not provide any way for Pod +But (for the forseeable future), Pod does not provide any way for Pod authors to distinguish which grouping is meant by the above "=item"-cluster structure. So formatters should format it like so: @@ -1879,7 +1838,7 @@ currently open region has the formatname "inner", not "outer". 
(It just happens that "outer" is the format name of a higher-up region.) This is an error. Processors must by default report this as an error, and may halt processing the document containing that error. A corollary of this is that -regions cannot "overlap". That is, the latter block above does not represent +regions cannot "overlap" -- i.e., the latter block above does not represent a region called "outer" which contains X and Y, overlapping a region called "inner" which contains Y and Z. But because it is invalid (as all apparently overlapping regions would be), it doesn't represent that, or |