path: root/usr.bin/mandoc/mdoc_validate.c
2024-09-20  Jonathan Gray
remove unneeded semicolons; checked by millert@
2022-06-08  Ingo Schwarze
When looking for the next block to tag, we aren't interested in children of the current block but really want the next block instead. This fixes a segfault reported by Evan Silberman <evan at jklol dot net> on bugs@.
2021-10-04  Ingo Schwarze
store the operating system name obtained from uname(3) in the adequate struct together with similar state data rather than in a function-scope static variable, such that it can be free(3)d in roff_man_free(); no functional change
2021-07-18  Ingo Schwarze
Support auto-tagging for ".It Va".
This combination is somewhat rare because few libraries expose so many global variables that they need a list to enumerate them, but when the idiom does occur, tagging the variable names is generally useful. For example, this helps awk(1), dc(1), make(1), rc.subr(8), ... Missing feature reported and patch reviewed, tested, and OK'ed by kn@.
2020-10-30  Ingo Schwarze
Promote section headers that can be used unmodified as fragment identifiers from TAG_WEAK to TAG_STRONG, such that for example ...#DESCRIPTION always works. Suggested by Aman Verma on the discuss@ list.
2020-04-26  Ingo Schwarze
While we do not recommend the idiom ".Fl Fl long" for long options because it is an abuse of semantic macros for device-specific presentational effects, this idiom is so widespread that it makes sense to convert it to the recommended ".Fl \-long" during the validation phase. For example, this improves HTML formatting in pages where authors have used the dubious .Fl Fl. Feature suggested by Steffen Nurpmeso <steffen at sdaoden dot eu> on freebsd-hackers.
2020-04-24  Ingo Schwarze
provide a STYLE message when mandoc knows the file name and the extension disagrees with the section number given in the .Dt or .TH macro; feature suggested and patch tested by jmc@
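The check described in this commit can be sketched in a few lines of C. This is only an illustration under stated assumptions, not mandoc's code: the function name `section_matches_ext` is hypothetical, and the exact string comparison is an assumption (the real validator may treat section suffixes differently).

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not a mandoc function.
 * Return 1 if the file name extension equals the section string,
 * 0 if it differs, and -1 if the file name has no extension.
 */
int
section_matches_ext(const char *fname, const char *sec)
{
	const char	*base, *dot;

	assert(fname != NULL && sec != NULL);

	/* Only look at the last path component. */
	base = strrchr(fname, '/');
	base = base == NULL ? fname : base + 1;

	/* The extension is whatever follows the last dot. */
	dot = strrchr(base, '.');
	if (dot == NULL || dot[1] == '\0')
		return -1;

	/* Assumption: exact comparison, so "3p" does not match "3". */
	return strcmp(dot + 1, sec) == 0;
}
```

A caller would emit the STYLE message only when the result is 0, and stay silent when the file name is unknown or has no extension.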
2020-04-18  Ingo Schwarze
When a .Tg is attached to a paragraph, attach the permalink to the first word, or the first few words if they are short.
2020-04-08  Ingo Schwarze
Use a separate node->tag attribute rather than abusing the node->string attribute for the purpose. No functional change intended. The purpose is to make it possible to later attach tags to text nodes.
2020-04-06  Ingo Schwarze
Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.
In HTML output, improve the logic for writing inside permalinks: skip them when there is no child content or when there is a risk that the children might contain flow content.
2020-04-02  Ingo Schwarze
Copy tagged strings before marking hyphens as breakable.
For example, this makes ":tCo-processes" work in ksh(1).
2020-04-01  Ingo Schwarze
Just like we are already doing it in HTML output, automatically tag section and subsection headers in terminal output, too. Even though admittedly, commands like "/SEE" and "/ Subsec" work, too, there is no downside, and besides, with the recent improvements in the tagging framework, implementation cost is negligible.
2020-03-13  Ingo Schwarze
Split tagging into a validation part including prioritization in tag.{h,c} and {mdoc,man}_validate.c and into a formatting part including command line argument checking in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.
Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.
Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.
Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children, defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.
The necessity for such changes was first discussed with kn@, but i didn't bother him with a request to review the resulting -673/+782 line patch.
2020-02-27  Ingo Schwarze
Fully support explicit tagging of .Sh and .Ss.
This fixes the offset of two lines in terminal output and this improves HTML output by putting the id= attribute and <a> element into the respective <h1> or <h2> element rather than writing an additional <mark> element. To that end, introduce node flags NODE_ID (to make the node a link target, for example by writing an HTML id= attribute or by calling tag_put()) and NODE_HREF (to make the node a link source, used only in HTML output, used only to write an <a class="permalink"> element).
In particular:
* In the validator, generalize the concept of the "next node" such that it also works before .Sh and .Ss.
* If the first argument of .Tg is empty, don't forget to complain if there are additional arguments, which will be ignored.
* In the terminal formatter, support writing of explicit tags for all kinds of nodes, not just for .Tg.
* In deroff(), allow nodes to have an explicit string representation even when they aren't text nodes. Use this for explicitly tagged section headers. Surprisingly, this is sufficient to make HTML output work, without explicit code changes in the HTML formatter.
* In syntax tree output, display NODE_ID and NODE_HREF.
2020-02-27  Ingo Schwarze
Introduce the concept of nodes that are semantically transparent: they are skipped when looking for previous or following high-level macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm and .Tg, and man(7) .DT and .PD. Use this concept for a variety of improved decisions in various validators and formatters.
While here,
* remove a few const qualifiers on struct arguments that caused trouble;
* get rid of some more Yoda notation in the vicinity;
* and apply some other stylistic improvements in the vicinity.
I found this class of issues while considering .Tg patches from kn@.
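The idea of skipping semantically transparent nodes can be illustrated with a small sibling-list walk. This is a minimal sketch under stated assumptions: `struct node`, the flag name `NODE_TRANSPARENT`, and the helpers `next_opaque()` and `prev_opaque()` are hypothetical and chosen for illustration; they are not taken from the mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_TRANSPARENT 0x01	/* hypothetical flag name */

/* Simplified stand-in for a syntax tree node with sibling links. */
struct node {
	struct node	*prev;
	struct node	*next;
	const char	*name;
	int		 flags;
};

/* Return the next sibling that is not transparent, or NULL. */
struct node *
next_opaque(struct node *n)
{
	assert(n != NULL);
	for (n = n->next; n != NULL; n = n->next)
		if ((n->flags & NODE_TRANSPARENT) == 0)
			break;
	return n;
}

/* Return the previous sibling that is not transparent, or NULL. */
struct node *
prev_opaque(struct node *n)
{
	assert(n != NULL);
	for (n = n->prev; n != NULL; n = n->prev)
		if ((n->flags & NODE_TRANSPARENT) == 0)
			break;
	return n;
}
```

With helpers of this kind, a validator that asks "what follows this .Pp?" gets the same answer whether or not a .Tg or .Sm sits in between.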
2020-01-19  Ingo Schwarze
Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place as defining a term. Please only use it when automatic tagging does not work. Manual page authors will not be required to add the new macro; using it remains optional. HTML output is still rudimentary in this version and will be polished later. Thanks to kn@ for reminding me that i have been considering since BSDCan 2014 whether something like this might be useful. Given that possibilities of making automatic tagging better are running out and there are still several situations where automatic tagging cannot do the job, i think the time is now ripe. Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.
2020-01-19  Ingo Schwarze
Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro: without an argument, use the empty string, and always concatenate all arguments, no matter their number. This allows reducing the number of arguments of mandoc_normdate() and some other simplifications, at the same time polishing some error messages by adding the name of the macro in question.
2019-09-13  Ingo Schwarze
Improve validation of function names:
1. Relax checking to accept function types of the form "ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
2. Tighten checking to require the closing parenthesis.
2019-06-27  Ingo Schwarze
Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared to handle that. Make sure it always returns an allocated string. While here, simplify the code by handling the "quick" attribute inside mandoc_normdate() rather than at multiple callsites. Triggered by deraadt@ pointing out that snprintf(3) error handling was incomplete in time2a().
2019-03-13  Ingo Schwarze
Contrary to what the NetBSD attribute(3) manual page suggests, using __dead instead of __attribute__((__noreturn__)) actually hinders portability rather than helping it. Given that mandoc already uses __attribute__ in several files and that in the portable version, ./configure already contains rudimentary support for ignoring it on platforms that do not support it, use __attribute__ directly. This is expected to fix build failures that Stephen Gregoratto <dev at sgregoratto dot me> reported from Arch and Debian Linux.
2019-03-11  Ingo Schwarze
mark check_abort() and post_abort() as __dead; based on a patch by Christos@ Zoulas at NetBSD
2019-03-04  Ingo Schwarze
When the -S option is given to man(1) and the requested manual page name is not found and the requested architecture is unknown, complain about the architecture rather than about the manual page name:
    $ man -S vax cpu
    man: Unknown architecture "vax".
    $ man -S sparc64 foobar
    man: No entry for foobar in the manual.
Friendlier error message suggested by jmc@, who also OK'ed the patch.
2019-03-04  Ingo Schwarze
Fix the last straggler where the struct roff_node "line" member was abused to detect an input line break; instead, use the NODE_LINE flag to improve robustness.
2018-12-31  Ingo Schwarze
Use the new flag NODE_NOFILL in the validators, which is sometimes simpler and always more robust. In particular, move the nesting warnings for .EX and .EE from man_state(), where they were misplaced, to the man(7) validator.
2018-12-31  Ingo Schwarze
Cleanup, no functional change:
Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too, instead of the old MDOC_LITERAL, which was an alias for the former MAN_LITERAL.
2018-12-31  Ingo Schwarze
Cleanup, minus 15 LOC, no functional change:
Simplify the way the man(7) and mdoc(7) validators are called. Reset the parser state with a common function before calling them. There is no need to again reset the parser state afterwards, the parsers are no longer used after validation. This allows getting rid of man_node_validate() and mdoc_node_validate() as separate functions.
2018-12-30  Ingo Schwarze
Cleanup, no functional change:
The struct roff_man used to be a bad mixture of internal parser state and public parsing results. Move the public results to the parsing result struct roff_meta, which is already public. Move the rest of struct roff_man to the parser-internal header roff_int.h. Since the validators need access to the parser state, call them from the top level parser during mparse_result() rather than from the main programs, also reducing code duplication. This keeps parser internal state out of the main programs (five in mandoc portable) and out of eight formatters.
2018-12-14  Ingo Schwarze
Almost mechanical diff to remove the "struct mparse *" argument from mandoc_msg(), where it is no longer used. While here, rename mandoc_vmsg() to mandoc_msg() and retire the old version: There is really no point in having another function merely to save "%s" in a few places. Minus 140 lines of code.
2018-12-04  Ingo Schwarze
Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all combinations are handled, and are handled in a systematic manner. This resolves some erratic duplicate handling, handles a number of missing cases, and improves diagnostics in various respects. Move validation of .br and .sp to the roff validation module rather than doing that twice in the mdoc and man validation modules. Move the node relinking function to the roff library where it belongs. In validation functions, only look at the node itself, at previous nodes, and at descendants, not at following nodes or ancestors, such that only nodes are inspected which are already validated.
2018-12-03  Ingo Schwarze
In the validators, translate obsolete macro aliases (Lp, Ot, LP, P) to the standard forms (Pp, Ft, PP) up front, such that later code does not need to look for the obsolete versions. This reduces the risk of incomplete handling.
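The up-front translation can be pictured as a single canonicalization step applied to each node before any other check runs. The following is a sketch only: the `enum macro` values and the function `canonical_macro()` are hypothetical names for illustration, not mandoc's token definitions.

```c
#include <assert.h>

/* Hypothetical token set; real parsers define many more macros. */
enum macro {
	M_Lp, M_Pp,		/* mdoc(7) paragraph: alias, standard */
	M_Ot, M_Ft,		/* mdoc(7) function type: alias, standard */
	M_LP, M_P, M_PP,	/* man(7) paragraph: two aliases, standard */
	M_Sh,			/* an unrelated macro */
	M_MAX
};

/* Map obsolete aliases to their standard forms; leave the rest alone. */
enum macro
canonical_macro(enum macro tok)
{
	assert(tok >= M_Lp && tok < M_MAX);
	switch (tok) {
	case M_Lp:
		return M_Pp;
	case M_Ot:
		return M_Ft;
	case M_LP:
	case M_P:
		return M_PP;
	default:
		return tok;
	}
}
```

Once every node has passed through such a step, later validation code only needs cases for Pp, Ft, and PP, which is the reduction in risk that the commit message describes.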
2018-08-17  Ingo Schwarze
Remove more pointer arithmetic passing via regions outside the array that is undefined according to the C standard. Robert Elz <kre at munnari dot oz dot au> pointed out i wasn't quite done yet.
2018-08-16  Ingo Schwarze
Do not calculate a pointer to a memory location before the beginning of a static array. Christos Zoulas, Robert Elz, and Andreas Gustafsson point out that is undefined behaviour by the C standard even if we never access the pointer.
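The class of bug described here typically arises when a table is indexed by token numbers that do not start at zero. A minimal sketch, assuming a hypothetical token numbering (`enum tok`, `tok_names`, and `tok_name()` are illustrative names, not mandoc identifiers):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical numbering: the first macro token is not zero. */
enum tok { TOK_FIRST = 100, TOK_DD = TOK_FIRST, TOK_DT, TOK_OS, TOK_MAX };

static const char *const tok_names[TOK_MAX - TOK_FIRST] = {
	"Dd", "Dt", "Os"
};

/*
 * Tempting but undefined, even if base[] is only ever indexed
 * with valid tokens, because the pointer itself lies before the
 * beginning of the array:
 *
 *	const char *const *base = tok_names - TOK_FIRST;
 *	return base[tok];
 *
 * Well defined: adjust the index instead of the pointer.
 */
const char *
tok_name(enum tok tok)
{
	assert(tok >= TOK_FIRST && tok < TOK_MAX);
	return tok_names[tok - TOK_FIRST];
}
```

The C standard only permits pointer arithmetic that stays within an array or lands one past its last element, so subtracting the offset from the index keeps every intermediate value valid.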
2018-08-01  Ingo Schwarze
Fix an off-by-one string read access that could happen if an empty string argument preceded a string argument beginning with "--". Found by Leah Neukirchen <leah at vuxu dot org> with -Wpointer-compare.
2018-08-01  Ingo Schwarze
Avoid a read access one byte beyond the end of an allocated string which occurred in situations like ".Fl a Cm --"; found by Leah Neukirchen <leah at vuxu dot org> with valgrind on Void Linux.
2018-04-11  Ingo Schwarze
preserve comments before .Dd when converting mdoc(7) to man(7) with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>
2018-04-05  Ingo Schwarze
use the portable \(lq and \(rq internally rather than \(Lq and \(Rq
2018-03-16  Ingo Schwarze
Ouch, fix previous: In the edge case of a single-character string containing nothing but a single hyphen, the pointer got incremented twice at one point, causing a read overrun found by naddy@.
2018-03-16  Ingo Schwarze
Style message about bad input encoding of em-dashes as -- instead of \(em.
Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.
2018-02-06  Ingo Schwarze
Delete the "no blank before trailing delimiter" check from the partial explicit macros. Leah Neukirchen <leah at vuxu dot org> rightfully points out that the check makes no sense for these macros.
2017-09-12  Ingo Schwarze
Do not segfault when there are two .Dt macros, the first without an architecture argument and the second with an invalid one. Bug found by jsg@ with afl(1).
2017-08-02  Ingo Schwarze
No longer use names that only occur in the SYNOPSIS section as names for man(1) lookup. For OpenBSD base and Xenocara, that functionality was never intended to be required, and i just fixed the last handful of offenders using it - not counting the horribly ill-designed interfaces engine(3) and lh_new(3) which are impossible to properly document in the first place. Of course, apropos(1) and whatis(1) continue to use SYNOPSIS .Nm, .Fn, and .Fo macros, so "man -k ENGINE_get_load_privkey_function" still works.
This change also gets rid of a few bogus warnings "cross reference to self" which actually are *not* to self, like in yp(8).
This former functionality was intended to help third-party software in the ports tree and on non-OpenBSD systems containing manual pages with incomplete or corrupt NAME sections. But it turned out it did more harm than good, and caused more confusion than relief, specifically for third party manuals and for maintainers of mandoc-portable on other operating systems. So kill it. Problems reported, among others, by Yuri Pankov (illumos). OK jmc@
2017-07-31  Ingo Schwarze
Fix an out of bounds read access to a constant array that caused segfaults on certain hardened versions of glibc. Triggered by .sp or blank lines right before .SS or .SH, or before the first .Sh. Found the hard way by Dr. Markus Waldner on Debian and by Leah Neukirchen on Void Linux.
2017-07-20  Ingo Schwarze
correctly handle letters in .Nx arguments; improves for example getpgid(2), ac(8), ldconfig(8), mount_ffs(8), sa(8), ttyflags(8), ...
2017-07-15  Ingo Schwarze
If -column, -diag, -inset, -item, or -ohang lists have a -width, don't just talk about ignoring it, actually do ignore it. No change for terminal output, improves HTML output.
2017-07-03  Ingo Schwarze
report trailing delimiters after macros where they are usually a mistake; the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>
2017-07-02  Ingo Schwarze
add warning "cross reference to self"; inspired by mdoclint
2017-07-01  Ingo Schwarze
Basic reporting of .Xrs to manual pages that don't exist in the base system, inspired by mdoclint(1). We are able to do this because (1) the -mdoc parser, the -Tlint validator, and the man(1) manual page lookup code are all in the same program and (2) the mandoc.db(5) database format allows fast lookup. Feedback from, previous versions tested by, and OK jmc@. A few features will be added to this in the tree, step by step.
2017-06-29  Ingo Schwarze
warn about some non-portable idioms in .Bl -column; triggered by a question from Yuri Pankov (illumos)
2017-06-27  Ingo Schwarze
warn about .Ns macros that have no effect because they are followed by an isolated closing delimiter; inspired by mdoclint
2017-06-25  Ingo Schwarze
Catch typos in .Sh names; suggested by jmc@.
I'm using a very simple, linear time / zero space fuzzy string matching heuristic rather than a full Levenshtein metric, to keep the code both simple and fast.
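To make the trade-off concrete, here is one way a linear time, zero extra space heuristic of this kind could look. It is a sketch under stated assumptions, not the function from the commit: the name `similar()`, the threshold of three edits, and the one-character lookahead rule are all choices made for this illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if s1 and s2 differ by at most MAXDIST single-character
 * edits that a one-character lookahead can detect, 0 otherwise.
 * One pass over both strings, no allocation, no distance matrix.
 */
#define MAXDIST 3	/* assumed threshold for this sketch */

int
similar(const char *s1, const char *s2)
{
	int	 dist = 0;

	assert(s1 != NULL && s2 != NULL);
	while (*s1 != '\0' && *s2 != '\0') {
		if (*s1 == *s2) {
			s1++;
			s2++;
			continue;
		}
		if (++dist > MAXDIST)
			return 0;
		if (s1[1] == s2[1]) {
			/* Substitution: skip the mismatching pair. */
			s1++;
			s2++;
		} else if (*s1 == s2[1]) {
			/* Extra character in s2. */
			s2++;
		} else if (s1[1] == *s2) {
			/* Extra character in s1. */
			s1++;
		} else
			return 0;
	}
	/* Whatever remains on either side counts as edits, too. */
	dist += (int)(strlen(s1) + strlen(s2));
	return dist <= MAXDIST;
}
```

A validator could compare an unknown section title against each standard name and suggest the first one for which `similar()` returns 1. Unlike a full Levenshtein computation, this misses some edit combinations, such as two adjacent wrong characters, which is the price of keeping the check simple and fast.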