path: root/usr.bin/mandoc
Age  Author
Commit message
2023-11-24  Ingo Schwarze
1. Do not put ASCII_HYPH (0x1c) into the tag file.
   That happened when tagging a string containing '-' on an input text line, most commonly in man(7) .TP next line scope.
2. Do not let "\-" end the tag.
In both cases, translate ASCII_HYPH and "\-" to plain '-' for output. For example, this improves handling of unbound.conf(5). These two bugs were found thanks to a posting by weerd@.
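The translation described in this commit can be sketched as follows. This is a minimal illustration, not the mandoc code: the function name tag_for_output is hypothetical, and only the value 0x1c for ASCII_HYPH is taken from the commit message.

```python
ASCII_HYPH = "\x1c"  # value quoted in the commit message


def tag_for_output(tag):
    """Translate ASCII_HYPH and the escape sequence "\\-" to a plain '-'.

    Hypothetical helper for illustration only.
    """
    return tag.replace(ASCII_HYPH, "-").replace("\\-", "-")


if __name__ == "__main__":
    print(tag_for_output("so\\-rcvbuf"))
```

With this translation, neither the internal breakable-hyphen marker nor the "\-" escape reaches the tag file or cuts the tag short.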
2023-11-13  Ingo Schwarze
Reduce the man(7) default global indentation from 7n, which was an oddity
in groff-1.01 to groff-1.22.4, to 5n for compatibility with Version 7 AT&T UNIX, 4.3BSD-Reno, groff-1.23.0, and all versions of mdoc(7). OK jmc@ millert@
2023-10-24  Ingo Schwarze
Implement the man(7) .MR macro, a 2023 GNU extension.
The syntax and semantics are almost identical to mdoc(7) .Xr. This will be needed for reading the groff manual pages once our port is updated to 1.23, and the Linux Manual Pages Project is also determined to start using it sooner or later. I did not advocate for this new macro, but since we want to remain able to read all manual pages found in the wild, there is little choice but to support it. At least it is easy to do: they basically copied .Xr.
2023-10-23  Ingo Schwarze
Support some escape sequences, in particular character escape sequences,
inside \w arguments, and skip most other escape sequences when measuring the output length in this way because most escape sequences contribute little or nothing to text width: for example, consider font escapes in terminal output. This implementation is very rudimentary. In particular, it assumes that every character has the same width. No attempt is made to detect double-width or zero-width Unicode characters or to take dependencies on output devices or fonts into account. These limitations are hard to avoid because mandoc has to interpolate \w at the parsing stage when the output device is not yet known. I really do not want the content of the syntax tree to depend on the output device. Feature requested by Paul <Eggert at cs dot ucla dot edu>, who also submitted a patch, but i chose to commit this very different patch with almost the same functionality. His input was still very valuable because complete support for \w is out of the question, and consequently, the main task is identifying subsets of the feature that are needed for real-world manual pages and can be supported without uprooting the whole forest.
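A rough sketch of the measuring rule described above, assuming every character is one column wide. The regular expression covers only a few escape forms (\(xx, \[name], \e, \-, and \f font escapes) and treats every other escape as contributing no width. Both the function name text_width and this choice of subset are assumptions made for illustration, not the mandoc implementation.

```python
import re

# Simplified subset of roff escape syntax; real parsing has many more cases.
ESCAPE = re.compile(
    r"\\(?P<char>\(..|\[[^\]]*\]|[e\-])"       # character escapes: one column
    r"|\\(?P<font>f(?:\(..|\[[^\]]*\]|.))"     # font escapes: no width
    r"|\\(?P<other>.)"                         # anything else: skipped here
)


def text_width(text):
    """Count columns, assuming every character has the same width."""
    width = 0
    pos = 0
    while pos < len(text):
        match = ESCAPE.match(text, pos)
        if match is None:
            width += 1          # ordinary character
            pos += 1
            continue
        if match.group("char") is not None:
            width += 1          # a character escape prints one character
        pos = match.end()
    return width
```

Because no output device is consulted, the result is the same for terminal and HTML output, which matches the stated goal of keeping the syntax tree device-independent.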
2023-10-22  Ingo Schwarze
While doing delayed expansion of escape sequences in macro arguments,
correctly check for failure of the in-place expansion function. If an argument not only does recursive delayed expansion but infinitely recursive delayed expansion, this bug could result in an ESCAPE_EXPAND assertion failure. Thanks to Eric van Gyzen <vangyzen at FreeBSD> for finding this bug by inspecting FreeBSD source code.
2023-10-21  Ingo Schwarze
When parsing a macro argument results in delayed escape sequence
expansion, re-check for all contained escape sequences whether they need delayed expansion, not just for the particular escape sequences that triggered delayed expansion in the first place. This is needed because delayed expansion can result in strings containing nested escape sequences recursively needing delayed expansion, too. This fixes an assertion failure in krb5_openlog(3), see: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882 Thanks to Wolfram Schneider <wosch at FreeBSD> for reporting the bug and to Baptiste Daroussin <bapt at FreeBSD> for forwarding the report.
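The need to re-check the whole string after each expansion can be illustrated with a small sketch. It assumes, for illustration only, that the escapes in question are user-defined string references of the form \*[name], and it uses an arbitrary iteration limit to report non-terminating recursion as a failure. The names expand_strings and EXPAND_LIMIT are hypothetical and do not come from mandoc.

```python
import re

STRING_REF = re.compile(r"\\\*\[([^\]]*)\]")
EXPAND_LIMIT = 1000  # hypothetical guard against infinite recursion


def expand_strings(text, strings, limit=EXPAND_LIMIT):
    """Expand \\*[name] references until none remain.

    Returns the expanded text, or None if expansion does not
    terminate within the limit.
    """
    for _ in range(limit):
        match = STRING_REF.search(text)      # re-scan the whole string
        if match is None:
            return text
        value = strings.get(match.group(1), "")
        text = text[:match.start()] + value + text[match.end():]
    return None                              # caller must check for failure
```

Scanning the full string again after every substitution is what catches an expansion whose result itself contains another reference, and checking for the None result is the step the caller must not omit.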
2023-10-18  Ingo Schwarze
Support the GNU-specific syntax ".IP \\[bu]" for bullet lists in man(7)
pages that Alejandro Colomar recommends in the "Lists" subsection of https://man7.org/linux/man-pages/man7/man-pages.7.html#STYLE_GUIDE . For example, this will improve HTML formatting of the first list in the subsection "Feature test macros understood by glibc" on the page https://manpages.debian.org/bookworm/manpages/ftm.7.en.html . Issue reported by Alejandro Colomar <alx at kernel dot org>.
2023-10-18  Ingo Schwarze
Better document the purpose and features of the file mandoc.css
and the purpose and limitations of the embedded stylesheet. Triggered by a conversation with Alejandro Colomar <alx at kernel dot org>.
2023-09-04  Ingo Schwarze
Fix a bug where the wrong digit was used for prioritizing filenames
in the standard man(1) mode that formats a single resulting page if the respective manpath contained digits, like X11R6 does. Fortunately, this bug did not trigger for any Xenocara manual page.
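The commit message does not show the code involved. The sketch below only illustrates why picking a digit out of a full path goes wrong when the manpath itself contains digits. Both function names and the example path are hypothetical.

```python
import posixpath
import re


def first_digit(path):
    """Naive approach: the first digit anywhere in the path."""
    match = re.search(r"\d", path)
    return match.group(0) if match else None


def section_from_dir(path):
    """Take the section from the "manN" directory holding the file."""
    directory = posixpath.basename(posixpath.dirname(path))
    if directory.startswith("man") and len(directory) > 3:
        return directory[3:]
    return None


if __name__ == "__main__":
    page = "/usr/X11R6/man/man3/XCreateWindow.3"
    print(first_digit(page), section_from_dir(page))
```

For the example path, the naive scan finds the "1" in X11R6, whereas the directory-based lookup finds section 3.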
2023-04-28  Ingo Schwarze
Do not rewrite MAN_LP and MAN_P to MAN_PP because doing that causes
confusing warning messages complaining about macros that don't even appear in the input file. As a welcome side effect, this also shortens the code... Fixing a minibug reported by Alejandro Colomar <alx dot manpages at gmail dot com>.
2022-12-26  Jason McIntyre
spelling fixes; from paul tagliamonte
amendments to his diff are noted on tech
2022-12-22  Klemens Nanni
Denote multiple arguments with 'arg ...' not 'args'
A few programs used the plural in their synopsis, which doesn't read as clearly as the obvious triple-dot notation. mdoc(7) .Ar defaults to "file ..." if no arguments are given and consistent use of 'arg ...' matches that behaviour. Clean up a few markups of the same argument so the text keeps reading naturally; omit unhelpful parts like 'if optional arguments are given, they are passed along' for tools like time(1) and timeout(1) that obviously execute commands with whatever arguments were given -- just like doas(1) which doesn't mention arguments in its DESCRIPTION in the first place. For expr(1) the difference between 'expressions' and 'expression ...' is crucial, as arguments must be passed as individual words. Feedback millert jmc schwarze deraadt OK jmc
2022-09-11  Ingo Schwarze
Finally expand and delete the macro SCALE_VS_INIT().
It's nothing but obfuscation and only used at three places in a single file. Removing it also makes the code three lines shorter. The ugliness was already pointed out six years ago by mmcc@.
2022-08-28  Ingo Schwarze
Stop skipping vertical space after boxed tables.
Skipping such space used to be a bug in GNU tbl(1), and a kludge was added to mandoc to produce identical output. The bug was fixed in groff commit 8818c07c Jul 30 2022 gbranden@ https://savannah.gnu.org/bugs/index.php?49390 Consequently, now is the time to get rid of the kludge.
2022-08-28  Ingo Schwarze
Stop unconditionally emitting vertical space before .TS (table start).
Same change as in groff commit 7ec36dc9 Jul 30 2022 gbranden@
For more details, see https://savannah.gnu.org/bugs/index.php?62841
This change makes sense because:
* It improves the formatting of more pages than it degrades.
* Existing manual pages are wildly inconsistent in which behaviour they expect: apparently few manual page authors understood the old rules.
* It simplifies the rules of how .TS behaves in man(7) and makes them more similar to how it behaves in mdoc(7).
* It improves flexibility, making it possible for a table to immediately follow preceding text without a blank line, which some existing pages want to use, for example XCreateWindow(3).
2022-08-19  Ingo Schwarze
Up to version 1.22.4, groff_mdoc(7) only considered the first word
when comparing section headers. For example, ".Sh SEE ELSEWHERE" and ".Sh SEE Em ALSO" were considered instances of a SEE ALSO section. In groff-current, exact matches with no sub-macros are required. Adjust mandoc behaviour. While here, also fix a very minor mandoc bug, even though no detrimental effect of the bug on formatting is known. While using sub-macros in the .Sh HEAD is bad style, the parsers accept it, so setting the section attribute on the HEAD needs to act recursively.
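The difference between the two matching rules can be shown with the examples from the commit message. This is a sketch under the assumption that the header arguments are available as a plain string. The function names are hypothetical, and the real parsers work on syntax tree nodes rather than strings.

```python
def is_see_also_first_word(head_args):
    """Old rule: only the first word of the header is compared."""
    words = head_args.split()
    return bool(words) and words[0] == "SEE"


def is_see_also_exact(head_args):
    """New rule: exact match, so sub-macros such as Em disqualify."""
    return head_args.split() == ["SEE", "ALSO"]


if __name__ == "__main__":
    for head in ("SEE ALSO", "SEE ELSEWHERE", "SEE Em ALSO"):
        print(head, is_see_also_first_word(head), is_see_also_exact(head))
```

Under the old rule all three headers count as SEE ALSO; under the new rule only the first one does.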
2022-08-16  Ingo Schwarze
Restore the traditional behaviour of the man(7) single-font
macros .B, .I, .SM, and .SB that the next-line scope extends to the end of the next logical input line and is not extended if that line ends with a \c (no-space) escape sequence. While improving a loosely related feature in the man(7) .TP macro, a regression entered the groff codebase in groff commit 3549fd9f (28-Apr-2017) caused by the usual sloppiness of Bjarni Ingi Gislason. Since that time, groff wrongly had \c extend next-line scope to a second line for these macros. In man.c rev. 1.127 (25-Aug-2018) i synched mandoc behaviour with groff in this respect, unfortunately failing to notice the recent regression in groff. The groff regression was finally fixed by gbranden@ in commit 09c028f3 (07-Jun-2022). With the present commit, mandoc is back in sync with both GNU and Heirloom roff regarding the interaction of single-font macros with \c.
2022-08-16  Ingo Schwarze
When starting a new input line, even when continuing the same output
line, use the current output position as the reference position for tabs on that input line. This brings mandoc in line with the behaviour of GNU, Heirloom, and Plan 9 roff.
2022-08-16  Ingo Schwarze
Even though the constant ASCII_ESC is only used in the roff pre-parser roff.c,
move it to the top level include file mandoc.h to reduce the risk of causing clashes when introducing new ASCII_* constants in the future.
2022-08-15  Ingo Schwarze
Simplify handling of no-fill mode in man(7) by inspecting NODE_NOFILL
at the beginning of the node handler, in the same way as it is done in the mdoc(7) node handler. As a side effect, this also fixes a bug: if an input line contained nothing but an escape sequence producing no output whatsoever (for example, \fR), the old code incorrectly emitted a blank line anyway, whereas the new code only emits such a blank line if the input line actually produces output (even invisible zero-width output). To make the distinction, the ASCII_NBRZW -> lastcol -> term_newln() mechanism established in term.c rev. 1.149 is used.
2022-08-15  Ingo Schwarze
Distinguish between escape sequences that produce no output
whatsoever (for example \fR) and escape sequences that produce invisible zero-width output (for example \&). No, i'm not joking, groff does make that distinction, and it has consequences in some situations, for example for vertical spacing in no-fill mode. Heirloom and Plan 9 behaviour is subtly different, but in case of doubt, we want to follow groff. While this fixes the behaviour for the majority of escape sequences, in particular for those most likely to occur in practice, it is not perfect yet because some of the more exotic ESCAPE_IGNORE sequences are actually of the "no output whatsoever" type but treated as "invisible zero-width" for now. With the new ASCII_NBRZW mechanism in place, switching them over one by one when the need arises will no longer be very difficult.
2022-08-15  Ingo Schwarze
In GNU, Heirloom, and Plan 9 roff, tab positions apply to *input* lines,
not to *output* lines. In particular, if an input line gets broken in fill mode and a tab occurs in the second output line, it advances to a position of at least (width of the first output line) + (width of a space character even though this is never printed) + (width of the part of the second output line that precedes the tab). Implement the same logic in mandoc. Again, do not use tabs in filled text: they have surprising effects, including this one.
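The arithmetic in this commit message can be worked through with a small sketch. It assumes evenly spaced tab stops and widths measured in character cells. The function name, the default tab interval of 5, and the sample numbers are assumptions made for illustration.

```python
def tab_target(first_line_width, before_tab_width,
               tab_interval=5, space_width=1):
    """Return (stop on the input line, column on the second output line).

    The reference position counts the first output line, the space
    that is never printed, and the text preceding the tab.
    """
    position = first_line_width + space_width + before_tab_width
    stop = (position // tab_interval + 1) * tab_interval
    column = stop - first_line_width - space_width
    return stop, column


if __name__ == "__main__":
    print(tab_target(60, 4))
```

With a first output line of 60 columns and 4 columns of text before the tab, the reference position is 60 + 1 + 4 = 65, the next stop is 70, and the text after the tab starts at column 9 of the second output line rather than at column 5.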
2022-08-15  Ingo Schwarze
In GNU, Heirloom, and Plan 9 roff, literal tab characters are
non-breakable in exactly the same way as "\ ". That is, the preceding word, the tab character, and the following word are always kept together on the same output line. If filling is enabled and an output line break is required before the end of the following word, the break occurs before the beginning of the preceding word. Make mandoc behave in the same way. Of course, using literal tab characters in filled text remains a bad idea, and the "WARNING: tab in filled text" remains unchanged.
2022-08-09  Ingo Schwarze
prevent breakable hyphens in segment identifiers
from being turned into underscores; bug reported by <Eldred dot fr> Habert
2022-08-04  Ingo Schwarze
For clarity and consistency, refer to ".Bx 4.0" rather than ".Bx 4".
Also, mention /usr/ucb/man because /usr/bin/man did not provide -f in 4.0BSD.
2022-08-02  Ingo Schwarze
If the body of a man(7) .MT or .UR block is empty, do not emit a warning.
Leaving the body empty is legitimate in this case if the author only wants to display a mail address or URI without providing a link text. Output modules already handle this correctly: terminal output shows just the URI without an accompanying text, HTML output uses the URI for *both* the href= attribute and as the content of the <a> element. The documentation was also wrong and claimed that an .MT or .UR block with an empty body would produce no output. As explained above, this isn't true. Bogus warning reported by Alejandro Colomar <alx dot manpages at gmail dot com>.
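The HTML behaviour described here, where the URI serves as both the href= attribute and the element content, might look like this in outline. The function name ur_to_html is hypothetical, and the real output may carry additional attributes that the commit message does not mention.

```python
from html import escape


def ur_to_html(uri, body=""):
    """Render a .UR block; an empty body falls back to the URI itself."""
    text = body if body else uri
    return '<a href="{}">{}</a>'.format(escape(uri, quote=True),
                                        escape(text))


if __name__ == "__main__":
    print(ur_to_html("https://example.org/"))
    print(ur_to_html("https://example.org/", "the example site"))
```

An empty body is therefore a meaningful input with well-defined output, which is why warning about it was bogus.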
2022-07-06  Ingo Schwarze
For accessibility, label the last two widgets in the search form.
Patch from Anna Vyalkova <cyber at sysrq dot in>, significantly tweaked by me.
2022-07-06  Ingo Schwarze
https://www.w3.org/WAI/ARIA/apg/practices/names-and-descriptions/ says:
"Start names with a capital letter; it helps some screen readers speak them with appropriate inflection." Anna Vyalkova already did that correctly when sending patches, but i ruined it when committing, so fix it now.
2022-07-06  Ingo Schwarze
improve the description of header.html and footer.html
2022-07-06  Ingo Schwarze
assign the ARIA role "doc-subtitle" to the .Nd element;
discussed with Anna Vyalkova <cyber at sysrq dot in>
2022-07-06  Ingo Schwarze
While the HTML standard allows multiple <h1> elements in the same
document, <h1> is intended for top level headers, and most of the sections in a manual page can hardly be considered top-level. It is more usual to use <h1> only for the main title of the document or for the site name. Consequently, move .Sh/.SH from <h1> to <h2> and .Ss/.SS from <h2> to <h3>, freeing <h1> for use by header.html in man.cgi(8). Discussed with Anna Vyalkova <cyber at sysrq dot in>.
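The new mapping from section macros to heading elements amounts to a small table. The sketch below is illustrative only: the names HEADING_ELEMENT and heading_to_html are hypothetical, and real output carries further attributes.

```python
from html import escape

# Mapping after this commit; <h1> is left free for header.html.
HEADING_ELEMENT = {
    ".Sh": "h2", ".SH": "h2",
    ".Ss": "h3", ".SS": "h3",
}


def heading_to_html(macro, title):
    """Wrap a section title in the element assigned to its macro."""
    element = HEADING_ELEMENT[macro]
    return "<{0}>{1}</{0}>".format(element, escape(title))


if __name__ == "__main__":
    print(heading_to_html(".Sh", "NAME"))
```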
2022-07-05  Ingo Schwarze
Finally get rid of the archaic <table> markup for header and footer lines
and use flexbox CSS instead. Improve accessibility by adding role and aria-label attributes to these header and footer lines. Using ideas from both Anna Vyalkova <cyber at sysrq dot in> and myself. As a welcome side effect, this also resolves the long-standing issue that the rendering was always 65em wide, requiring horizontal scrolling when the window was narrower. Now, rendering nicely adapts to browser windows of arbitrary narrowness.
2022-07-05  Ingo Schwarze
Somehow, the content of header.html ended up
before and outside the <header> element. Fix this by moving it into the <header> element where it belongs. While here, also wrap footer.html in a <footer> element.
2022-07-04  Ingo Schwarze
Improve accessibility of man.cgi(8) in various respects,
in particular adding <header>, <main>, and <nav> elements and role and aria-label attributes in several places. Patch from Anna Vyalkova <cyber at sysrq dot in>, minimally tweaked by me.
2022-07-04  Ingo Schwarze
Repair "make man.cgi" which got accidentally broken in the previous
commit to the Makefile. The man.cgi binary now uses roff_escape.o, too.
2022-07-04  Ingo Schwarze
Put the HTML comment containing the Copyright header (if any)
between the <head> and the <body> rather than before the <head> because the <meta charset="utf-8"/> element ought to be within the first 1024 bytes of the HTML code. Issue found with validator.w3.org.
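The 1024-byte constraint can be checked mechanically. The sketch below is a simplified test that looks for the literal byte sequence "<meta charset=" and the end of that tag. It is not a replacement for validator.w3.org, and the function name is hypothetical.

```python
def charset_declared_early(html, limit=1024):
    """True if the <meta charset=...> tag ends within the first bytes."""
    start = html.lower().find(b"<meta charset=")
    if start == -1:
        return False
    end = html.find(b">", start)
    return end != -1 and end < limit


if __name__ == "__main__":
    comment = b"<!-- " + b"x" * 2000 + b" -->\n"
    head = b'<head>\n<meta charset="utf-8"/>\n</head>\n'
    print(charset_declared_early(b"<html>\n" + comment + head))
    print(charset_declared_early(b"<html>\n" + head + comment))
```

A long copyright comment placed before the <head> pushes the declaration past the limit, while the same comment placed after the <head> does not.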
2022-07-03  Ingo Schwarze
Instead of the custom <div class="manual-text">, use the standard
HTML <main> element. The benefit is that it has the ARIA landmark role "main" by default. To ease the transition for people using their own CSS file instead of mandoc.css, retain the custom class for now. I had this idea in a discussion with Anna Vyalkova <cyber at sysrq dot in>. Patch from Anna, slightly tweaked by me.
2022-06-28  Jonathan Gray
spelling
2022-06-26  Ingo Schwarze
In groff commit 78e66624 on May 7 20:15:33 2021 +1000,
G. Branden Robinson changed the -T ascii rendering of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ". Follow suit in mandoc.
2022-06-25  Ingo Schwarze
If an .Xr macro contains a section argument, write an aria-label attribute
such that users of screen readers aren't forced to listen to lengthy and distracting readings like "mdoc, left parenthesis, 7, right parenthesis". Based on a patch from Anna Vyalkova <cyber at sysrq dot in>, significantly tweaked by me.
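A sketch of the idea: the visible text keeps the conventional name(section) form while the aria-label gives screen readers something shorter to speak. The exact label wording ("mdoc, section 7"), the class attribute, and the function name are assumptions for illustration, since the commit message does not quote the generated markup.

```python
from html import escape


def xr_to_html(name, section=None):
    """Render a cross reference, adding aria-label if a section is given."""
    if section is None:
        return '<a class="Xr">{}</a>'.format(escape(name))
    text = "{}({})".format(name, section)
    label = "{}, section {}".format(name, section)  # assumed wording
    return '<a class="Xr" aria-label="{}">{}</a>'.format(
        escape(label, quote=True), escape(text))


if __name__ == "__main__":
    print(xr_to_html("mdoc", "7"))
```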
2022-06-24  Ingo Schwarze
Improve accessibility of -T html -O toc output by using the <nav> element
in the DPUB-ARIA doc-toc role. Patch from Anna Vyalkova <cyber at sysrq dot in> slightly tweaked by me. This is hopefully the start of a collaboration to improve accessibility of Unix manual pages using the WAI-ARIA, HTML-ARIA, and DPUB-ARIA standards. Progress appears to be possible without changing *anything* with respect to the way manual pages are written. Instead, it seems sufficient to properly translate semantic cues already implied by existing mdoc(7) markup into the appropriate HTML elements and ARIA attributes. Overall, the total length of HTML output is likely to increase slightly, but not much.
2022-06-22  Ingo Schwarze
Delete the statement that the default stylesheet only used CSS1
because that has no longer been true for some time now. I would certainly like to adhere to a coherent standard and state which one that is. Unfortunately, the W3C deliberately smashed the CSS standard into pieces such that a coherent standard no longer exists and such that statements about standard conformance have become next to meaningless. Consequently, i now remain reluctantly silent regarding CSS standard(s) conformance. Going back to CSS2.1, published in 2011, which was the last CSS standard in the proper sense of the word, is not an option because it has gaping holes in functionality and is no longer adequate for use on today's WWW.
2022-06-08  Ingo Schwarze
When looking for the next block to tag, we aren't interested in children
of the current block but really want the next block instead. This fixes a segfault reported by Evan Silberman <evan at jklol dot net> on bugs@.
2022-06-08  Ingo Schwarze
Surprisingly, every escape sequence can also be used as an argument
delimiter for an outer escape sequence, in which case the delimiting escape sequence retains its syntax but usually ignores its argument and loses its inherent effect. Add rudimentary support for this syntax quirk in order to improve parsing compatibility with groff.
2022-06-07  Ingo Schwarze
Split the excessively generic diagnostic message "invalid escape sequence"
into the more specific messages "invalid escape argument delimiter" and "invalid escape sequence argument".
2022-06-07  Ingo Schwarze
Purge duplicate error reporting from the .tr request parser:
the error was already reported earlier when roff_expand() called roff_escape().
2022-06-06  Ingo Schwarze
To better match groff parsing, reject digits and some mathematical
operators as argument delimiters for some escape sequences that take numerical arguments, in the same way as it had already been done for \h. Argument delimiter parsing for escape sequences taking numerical arguments is not perfect yet. In particular, when a character representing a scaling unit is abused as the argument delimiter, parsing for that character becomes context-dependent, and it is no longer possible to find the end of the escape sequence without calling the full numerical expression parser, which i refrain from attempting in this commit. For now, continuing to misparse insane constructions like \Bc1c+1cc (which is valid in groff and resolves to "1" because 1c+1c = two centimeters is a valid numerical expression and 'c' is also a valid delimiter) is a small price to pay for keeping complexity at bay and for not losing focus in the ongoing series of refinements.
2022-06-06  Ingo Schwarze
Allow arbitrary argument delimiters for \C, like groff does.
The restriction of only allowing ' as the delimiter was introduced by kristaps@ on 2011/04/09 when he first supported \C. For most other escape sequences, similar restrictions were relaxed later on, but for the rarely used \C, it was apparently forgotten. While here, reject empty character names: they are never valid.
2022-06-05  Ingo Schwarze
With the improved escape sequence parser, it becomes easy to also improve
diagnostics. Distinguish "incomplete escape sequence", "invalid special character", and "unknown special character" from the generic "invalid escape sequence", also promoting them from WARNING to ERROR because incomplete escape sequences are severe syntax violations and because encountering an invalid or unknown special character makes it likely that part of the document content intended by the authors gets lost.
2022-06-05  Ingo Schwarze
Small cleanup of error reporting:
call mandoc_msg() only once at the end, not sometimes in the middle, classify incomplete, non-expanding escape sequences as ESCAPE_ERROR, and also reduce the number of return statements; no formatting change intended.