path: root/usr.bin/mandoc/roff.c
Age  Author
Commit message
2012-07-07  Ingo Schwarze
Support the .cc request; code by kristaps@, tests by me.
Needed for sqlite3(1) as reported by espie@.
2012-06-02  Ingo Schwarze
In groff, trying to redefine standard man(7) macros before .TH has no effect;
after .TH, it works. Trying to redefine standard mdoc(7) macros before .Dd works when calling groff with the -mdoc command line option, but does not when calling groff with -mandoc; after .Dd, it always works. Arguably, one might call that buggy behaviour in groff, but it is very unlikely that anybody will change groff in this respect (certainly, i'm not volunteering). So let's be bug-compatible. This fixes the vertical spacing in sox(1).
2012-05-31  Ingo Schwarze
Fix blank line handling in .if.
In particular, two cases were wrong:
- single-line .if with trailing whitespace gave no blank line
- multiline .if with \{ but without \{\ gave no blank line
While here, simplify roff_cond() by partially reordering the code.
2011-10-24  Ingo Schwarze
Handle infinite recursion the same way as groff:
When string expansion exceeds the recursion limit, drop the whole input line, instead of leaving just the string unexpanded. This fixes src/regress/usr.bin/mandoc/roff/string/infinite.in. ok kristaps@
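The limit-then-drop behaviour described in this entry can be sketched roughly as follows. This is an illustration in Python, not the code in roff.c: the names `EXPAND_LIMIT`, `STRING_REF` and `expand_line` are hypothetical, the limit value is made up (the log does not state the real one), the pattern is a simplification of roff string syntax that ignores escaped backslashes, and treating undefined strings as empty is an assumption.

```python
import re

EXPAND_LIMIT = 1000  # hypothetical value; the log does not state the real limit

# Simplified pattern for \*x, \*(xx and \*[name] string references.
STRING_REF = re.compile(r"\\\*(?:\[([^\]]+)\]|\((..)|(.))")


def expand_line(line, strings, limit=EXPAND_LIMIT):
    """Expand string references in line; return None to drop the line."""
    expansions = 0
    while True:
        match = STRING_REF.search(line)
        if match is None:
            return line  # nothing left to expand
        expansions += 1
        if expansions > limit:
            return None  # limit exceeded: drop the whole input line
        name = match.group(1) or match.group(2) or match.group(3)
        # Assumption: an undefined string expands to the empty string.
        value = strings.get(name, "")
        line = line[:match.start()] + value + line[match.end():]
```

A string defined in terms of itself, such as `{"xx": r"\*(xx"}`, never runs out of references, so the counter eventually exceeds the limit and the caller receives `None` instead of a partially expanded line.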
2011-09-19  Ingo Schwarze
Breaking the line at a hyphen is only allowed if the hyphen
is both preceded and followed by an alphabetic character. This fixes about a dozen places in base.
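The rule in this entry is small enough to state as code. The following Python sketch is only an illustration of the stated condition; `breakable_hyphens` is a hypothetical name, and using `str.isalpha()` as the test for "alphabetic" is an assumption about what the C code checks.

```python
def breakable_hyphens(word):
    """Return the indexes of hyphens at which a line break is allowed."""
    return [
        i
        for i, ch in enumerate(word)
        if ch == "-"
        and 0 < i < len(word) - 1      # must have a neighbour on both sides
        and word[i - 1].isalpha()      # preceded by an alphabetic character
        and word[i + 1].isalpha()      # followed by an alphabetic character
    ]
```

Under this rule a word like `well-known` may break at its hyphen, while `-1`, `a-1` and `x--y` offer no break point.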
2011-09-18  Ingo Schwarze
Fix another regression introduced in 1.11.7:
If a string is defined in terms of itself, the REPARSE_LIMIT in read.c used to break the cycle. This no longer works since all the work is now done in the function roff_res(), looping indefinitely. Make this loop finite by arbitrarily limiting the number of times one string may be expanded; when that limit is reached, leave the remaining string references unexpanded. This changes behaviour compared to 1.11.5, where the whole line would have been dropped. The new behaviour is better because it loses less information. We don't want to imitate groff-1.20.1 behaviour anyway because groff aborts parsing of the whole file.
2011-09-18  Ingo Schwarze
sync to version 1.11.7 from kristaps@
main new feature: support the roff(7) .tr request
plus various bugfixes and some refactoring
regressions are so minor that it's better to get this in and fix them in the tree
2011-09-18  Ingo Schwarze
sync to version 1.11.5:
adding an implementation of the eqn(7) language by kristaps@
So far, only .EQ/.EN blocks are handled, in-line equations are not, and rendering is not yet very pretty, but the parser is fairly complete.
2011-07-31  Ingo Schwarze
Workaround to prevent misrendering of \*(-- as "O-" in pod2man(1)-generated manuals;
this fixes more than 500 manuals in base alone. As a real fix, .tr will be supported after unlock. OK kristaps@ to put in the workaround for now
2011-07-07  Ingo Schwarze
Fix a bogus "unknown macro" error reported in the pod2man(1) preamble:
- Actually let roff_parse() recognize ".\}" as a cond block end request.
- Do not rewrite "\}" to the zero-width space "\&" because that prevents recognition of immediately preceding macros; use normal blanks instead.
- To avoid a vertical spacing regression in pod2man(1) manuals, drop one vertical spacing request just before NAME.
From kristaps@.
2011-07-05  Ingo Schwarze
Sync to bsd.lv (all coded by kristaps@):
- mdoc(7): fix an assertion if the first line after .Bd -column starts with a blank, and some simplifications in mdoc_argv.c
- man(7): literal mode ends at .SH and .SS (bug reported by naddy@)
- allow .RS/.RE blocks to nest (bug reported by dcoppa@ and gsoares@)
- improve vertical spacing of man(7) blocks
- roff(7): clear user-defined strings when starting a new file
- correct ID tags in -T[x]html
2011-05-29  Ingo Schwarze
Merge release 1.11.3, almost all code by kristaps@:
* Unicode output support (no Unicode input yet, though).
* Refactoring: completely handle predefined strings in roff.c.
  - New function mandoc_escape() replaces a2roffdeco() and mandoc_special().
  - Start using mandoc_getarg() in mdoc_argv.c.
  - Clean up parsing of delimiters in mdoc(7).
* And many minor fixes and lots of cleanup.
2011-04-24  Ingo Schwarze
User defined macros may invoke high-level macros.
The latter got lost due to a regression in bsd.lv rev. 1.130.
2011-04-24  Ingo Schwarze
Merge version 1.11.1:
Again lots of cleanup and maintenance work by kristaps@.
- simplify error reporting: less function pointers, more mandoc_[v]msg
- main: split document parsing out of main.c into read.c
- roff, mdoc, man: improved recognition of control characters
- roff: better handling of if/else stack overflows
- roff: add some predefined strings for backward compatibility
- mdoc, man: empty sections are not errors
- mdoc: move delimiter handling to libmdoc
- some header restructuring and some minor features and fixes
This merge causes two minor regressions that i will fix in separate commits right afterwards.
2011-04-21  Ingo Schwarze
Merge version 1.10.10:
lots of cleanup and maintenance work by kristaps@.
- move some main.c globals into struct curparse
- move mandoc_*alloc to mandoc.h such that all code can use them
- make mandoc_isdelim available to formatting frontends
- dissolve mdoc_strings.c, move the code where it is used
- make all error reporting functions void, their return values were useless
- and various minor cleanups and fixes
2011-04-05  Ingo Schwarze
On .de macro lines, after the new macro name, space and tab are equivalent.
Bug reported by Tristan dot LeGuern at gmail dot com in fvwm(1). tweaks and ok kristaps@; earlier version looked good to espie@ as well
2011-03-20  Ingo Schwarze
Import the foundation for eqn(7) support.
Written by kristaps@. For now, i'm adding one line to each of the four frontends to just pass the input text through to the output, not yet interpreting any of the eqn keywords.
2011-01-25  Ingo Schwarze
Ignore .ns (no-space mode), .ps (change point size), .ta (tab control)
for now. All of these just cause a bit too much or too little whitespace, but no serious formatting problems.
2011-01-20  Ingo Schwarze
When finding the roff .it request (line trap),
make it clear that you cannot use mandoc to format that page (yet). Triggered by a report from brad@.
2011-01-12  Ingo Schwarze
Implement the roff .rm request (remove macro).
Using the new roff_getname() function, this is really simple. Breaks mandoc of the habit of reporting an error in each pod2man(1) preamble. Reminded by a report from brad@.
2011-01-10  Ingo Schwarze
Refactoring in preparation for .rm support:
Unify parsing of names given as roff request arguments into a new function roff_getname(), which is rather different from the parsing function for normal arguments, mandoc_getarg(), because names cannot be quoted and cannot contain whitespace or escaped characters. The new function now throws an ERROR when finding escaped characters in a name. "I'm fine with this." kristaps@
2011-01-04  Ingo Schwarze
Merge kristaps@'s cleaner tbl integration, removing mine;
there are still a few bugs, but fixing these will be easier in tree.
2011-01-03  Ingo Schwarze
Calling a macro with fewer arguments than it is defined with is OK;
the remaining ones default to the empty string, not to NULL. Regression reported and fix tested by kristaps@.
2011-01-03  Ingo Schwarze
Unify roff macro argument parsing (in roff.c, roff_userdef()) and man macro
argument parsing (in man_argv.c, man_args()), both having different bugs, to use one common macro argument parser (in mandoc.c, mandoc_getarg()), because from the point of view of roff, man macros are just roff macros, hence their arguments are parsed in exactly the same way.
While doing so, fix these bugs:
* Escaped blanks (i.e. those preceded by an odd number of backslashes) were mishandled as argument separators in unquoted arguments to user-defined roff macros.
* Unescaped blanks preceded by an even number of backslashes were not recognized as argument separators in unquoted arguments to man macros.
* Escaped backslashes (i.e. pairs of backslashes) were not reduced to single backslashes both in unquoted and quoted arguments both to user-defined roff macros and to man macros.
* Escaped quotes (i.e. pairs of quotes inside quoted arguments) were not reduced to single quotes in man macros.
OK kristaps@
Note that mdoc macro argument parsing is yet another beast for no good reason and is probably afflicted by similar bugs. But i don't attempt to fix that right now because it is intricately entangled with lots of unrelated high-level mdoc(7) functionality, like delimiter handling and column list phrase handling. Disentangling that would waste too much time now.
2010-12-21  Ingo Schwarze
Kristaps questioned the efficiency of the algorithm used in roff.c r1.23.
And indeed, this optimization (using suggestions by Joerg Sonnenberger) saves about 40% of the processing time needed for the roff_res() function when processing typical manuals. No functional change, and the new code is not harder to understand. ok kristaps@
2010-12-09  Ingo Schwarze
Roff only interpolates \* strings when the leading backslash is not escaped.
Kristaps@ agrees with the idea, even though he didn't review the final patch.
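One way to read "the leading backslash is not escaped" is that the backslash before the `*` must close a run of backslashes of odd length, since each pair in the run escapes itself. The Python sketch below illustrates that reading only; `unescaped_string_refs` is a hypothetical helper and does not mirror the actual patch.

```python
def unescaped_string_refs(line):
    """Return indexes of backslashes that start an interpolated \\* reference."""
    hits = []
    i = 0
    while i < len(line):
        if line[i] != "\\":
            i += 1
            continue
        # Measure the run of consecutive backslashes starting at i.
        j = i
        while j < len(line) and line[j] == "\\":
            j += 1
        run = j - i
        # Pairs of backslashes escape each other, so only an odd run leaves
        # a live backslash (the last one) that can introduce \*.
        if run % 2 == 1 and j < len(line) and line[j] == "*":
            hits.append(j - 1)
        i = j
    return hits
```

With this reading, `\*(xx` and `\\\*(xx` each contain one reference to interpolate, while `\\*(xx` contains none.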
2010-12-07  Ingo Schwarze
Complete the merge of bsd.lv version 1.10.7:
No more functional changes, just sync ordering, comments and white space.
2010-11-28  Ingo Schwarze
To avoid FATAL errors, we have been parsing and ignoring the roff
requests .am, .ami, .am1, .dei, and .rm for a long time. Since ignoring them can (rarely) cause information loss and serious misformatting, throw an ERROR: NOT IMPLEMENTED when finding them. Implementing them would not be too difficult, but they are so rare in practice that i can find better use for my time right now.
In this context,
- Put the string "NOT IMPLEMENTED" into two other error messages as well, to distinguish them from those caused by broken input.
- Print the string "unknown macro" once, not twice in the error message associated with MANDOCERR_MACRO, and begin printing the buffer at the point where the unknown macro really is, not at the start of line.
2010-11-28  Ingo Schwarze
Parse and ignore the .ad, .hy, .nh, and .ne roff requests.
Ignoring these can neither cause information loss nor serious formatting issues. As they are frequently used by pod2man(1), this considerably reduces ERROR noise from mandoc -Tlint for the Perl manuals.
2010-11-27  Ingo Schwarze
Two related bugfixes:
1) When using a user-defined string of length 0 as a macro, do not access memory before the start of the string (segfault).
2) When beginning to define a user-defined macro, initialize the string representing the macro to the empty string, not to the NULL pointer, such that, in case the macro turns out to not have any content, like in
     .de IX
     ..
   the macro will be defined and empty instead of undefined.
This avoids large numbers of bogus mandoc ERROR messages about undefined macros (which are actually defined and empty), in particular in man(7) code generated from pod2man(1), for example in Perl and OpenSSL.
2010-11-25  Ingo Schwarze
Make .de1 a synonym for .de, not .ig as it was before.
The .de1 instruction is a GNU extension not found in traditional roff and not even in old groff, defined as "define a macro that will be executed with traditional roff compatibility mode switched off during macro execution". Since we ran into it in the wild, we have been parsing and ignoring it for a long time. Now that we have proper .de support, we can as well use the contents, even though we don't implement compatibility mode at all.
2010-11-25  Ingo Schwarze
Support quoting of arguments passed to user-defined macros,
such that arguments can contain blank characters. Also support escaping of quote characters by doubling them. For example, the argument "a""b c." resolves to: a"b c.
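The quoting rules in this entry (quoted arguments may contain blanks, and a doubled quote inside quotes stands for one literal quote) can be sketched as below. This is a minimal Python illustration under those two rules only; `split_macro_args` is a hypothetical name, and backslash escapes, tabs, and everything else the real parser handles are deliberately left out.

```python
def split_macro_args(text):
    """Split a macro argument string on blanks, honouring "..." and ""."""
    args = []
    i, n = 0, len(text)
    while i < n:
        if text[i] == " ":
            i += 1  # skip separating blanks
            continue
        current = []
        if text[i] == '"':
            i += 1  # opening quote
            while i < n:
                if text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        current.append('"')  # doubled quote: one literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                current.append(text[i])
                i += 1
        else:
            while i < n and text[i] != " ":
                current.append(text[i])
                i += 1
        args.append("".join(current))
    return args
```

Applied to the example from the entry, `split_macro_args('"a""b c."')` yields the single argument `a"b c.`.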
2010-11-25  Ingo Schwarze
Implement the .de (define macro) roff instruction.
This fixes various Xenocara manuals. Do not define your own macros in new manuals, though: this code exists purely to cope with existing and old stuff. Like in both traditional and GNU roff, the .de and .ds (define string) roff instructions share the same string table, so one can abuse strings as macros and vice versa. This implementation supports multi-line user-defined macros and user-defined macros taking up to 9 arguments. Project started near the end of p2k10, now mature for production, but there is still room for future improvements in various respects.
2010-10-26  Ingo Schwarze
Warn developers that .so is fragile and suggest using ln(1) instead;
throwing a warning here was suggested by Joerg Sonnenberger.
2010-10-26  Ingo Schwarze
Support .so (low-level roff "switch source file"),
needed for Xenocara and various ports. Accept only relative paths and no ascension to the parent directory as suggested by Joerg Sonnenberger; code looked over by Joerg, too. Useful discussions with various people, among others espie@.
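The path restriction described here (relative paths only, no ascension to the parent directory) amounts to a small validity check. The Python sketch below is an assumption about how such a check could look; `so_path_allowed` is a hypothetical name, and the real code may test for ".." differently, for example as a plain substring.

```python
def so_path_allowed(path):
    """Accept only relative paths that never step up to a parent directory."""
    if not path or path.startswith("/"):
        return False  # empty or absolute paths are rejected
    # Reject any ".." path component; names merely containing dots are fine.
    return ".." not in path.split("/")
```

For instance, `man3/foo.3` passes, while `/etc/passwd`, `../x` and `a/../b` are rejected.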
2010-09-27  Ingo Schwarze
Merge the last bits of 1.10.6 (released today), most were already in:
* ignore double-.Pp
* ignore .Pp before .Bd and .Bl (unless -compact is specified)
* avoid double blank line upon .Pp, .br and friends in literal context
* cast enums to int when passing them to exit(3) to please lint(1)
While merging, fix a regression introduced by kristaps@: Outside literal mode, double blank lines must both be printed. To achieve this again after kristaps@ improvements in 1.10.6, treat such blank lines as .sp (instead of .Pp as in 1.10.5) and drop .Pp before .sp just like dropping .Pp before .Pp.
2010-09-13  Ingo Schwarze
Parse and ignore the \k, \o, \w, and \z roff escapes, and recursively
ignore embedded escapes and mathematical roff subexpressions. In roff copy mode, resolve "\\" to '\'. Allow ".xx\}" where xx is a macro to close roff conditional scope. Mandoc now handles the special character definitions in the pod2man(1) preamble, so remove the explicit redefinitions in chars.c/chars.in. From kristaps@. I have checked that this causes no relevant change to the Perl manuals. The only change introduced is that some non-ASCII characters rendered incorrectly before are now rendered incorrectly in a different way. For example, e accent aigu was "e", now is "e'" and c cedille was "c", now is "c,".
2010-08-20  Ingo Schwarze
Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want, so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind the scenes regarding error handling.
Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and Sascha Wildner (DragonFly BSD) agree with the general direction.
2010-07-31  Ingo Schwarze
Merge bsd.lv version 1.10.5: last larger batch of bug fixes before release.
NOT including Kristaps' .Bd -literal changes which cause regressions.
Features:
* -Tpdf now fully working
Bugfixes:
* proper handling of quoted strings by .ds in roff(7)
* allow empty .Dd
* make .Sm start no-spacing after the first output word
* underline .Ad
* minor fixes in -Thtml and some optimisations in terminal output.
2010-07-25  Ingo Schwarze
Sync to bsd.lv; in particular, pull in lots of bug fixes.
new features:
* support the .in macro in man(7)
* support minimal PDF output
* support .Sm in mdoc(7) HTML output
* support .Vb and .nf in man(7) HTML output
* complete the mdoc(7) manual
bug fixes:
* do not let mdoc(7) .Pp produce a newline before/after .Sh; reported by jmc@
* avoid double blank lines related to man(7) .sp and .br
* let man(7) .nf and .fi flush the line; reported by jsg@ and naddy@
* let "\ " produce a non-breaking space; reported by deraadt@
* discard \m colour escape sequences; reported by J.C. Roberts
* map undefined 1-character-escapes to the literal character itself
maintenance:
* express mdoc(7) arguments in terms of an enum for additional type-safety
* simplify mandoc_special() and a2roffdeco()
* use strcspn in term_word() in place of a manual loop
* minor optimisations in the -Tps and -Thtml formatting frontends
2010-07-13  Ingo Schwarze
Merge release 1.10.4 (all code by kristaps@), providing four new features:
1) Proper .Bk support: allow output line breaks at input line breaks, but keep input lines together in the output, finally fixing synopses like aucat(1), mail(1) and tmux(1).
2) Mostly finished -Tps (PostScript) output.
3) Implement -Thtml output for .Nm blocks and .Bk -words.
4) Allow iterative interpolation of user-defined roff(7) strings.
Also contains some minor bugfixes and some performance improvements.
2010-07-03  Ingo Schwarze
Rudimentary implementation of user-defined strings;
no time for more refinement right now. In particular, fixes terminfo(3) and mdoc.samples(7). ok kristaps@, who will add the HTML frontend bits
2010-06-27  Ingo Schwarze
Full .nr nS support, unbreaking the kernel manuals.
Kristaps coded this from scratch after reading my .nr patch; it is simpler and more powerful. Registers live in struct regset in regs.h, struct man and struct mdoc contain pointers to it. The nS register is cleared when parsing .Sh. Frontends respect the MDOC_SYNPRETTY flag set in mdoc node_alloc.
2010-06-26  Ingo Schwarze
merge release 1.10.2
* bug fixes:
  - interaction of ASCII_HYPH with special chars (found by Ulrich Spoerlein)
  - handling of roff conditionals (found by Ulrich Spoerlein)
  - .Bd -offset will no more default to 6n
* maintenance:
  - more caching of .Bd and .Bl arguments for efficiency
  - deconstify man(7) validation routines
  - add FreeBSD library names (provided by Ulrich Spoerlein)
* start PostScript font-switching
2010-06-06  Ingo Schwarze
Merge bsd.lv version 1.10.1 (to be released soon).
The main step forward is that this now has *much* better .Bl -column support, now supporting many manuals that previously errored out without producing any output.
Other fixes include:
* do not die from multiple list types, use the first and warn
* in .Bl without a type, default to -item
* various tweaks to .Dt
* fix .In, .Fd, .Ft, .Fn and .Fo formatting
* some documentation fixes and additions
* and fix a couple of bugs reported by Ulrich Spoerlein:
* better support for roff block-end "\}" without a preceding dot
* .In must not break the line outside SYNOPSIS
* spelling in some error messages
While merging, fix one regression in .In spacing that needs to go to bsd.lv, too.
2010-06-06  Ingo Schwarze
Merge bsd.lv release 1.10.0,
which is mostly the post-hackathon release, bringing in the OpenBSD changes to bsd.lv, but which also has a few additional minor fixes:
* .Lb is an in-line macro, not in_line_eoln
* .Bt, .Ud now warn when discarding arguments
* allow bad -man dates to flow verbatim into the front-ends
- so far all reported by Ulrich Spoerlein
* .Ar, .Fl and .Li starting with closing punctuation emit an empty element
* empty .Li macros print nothing, but may cause spacing
* proper EOS handling for .Bt, .Ex, .Rv, and .Ud.
* cleanup: collapse posts_xr into posts_wtext (which is the same)
* efficiency: very simple table lookup for roff.c
2010-05-20  Ingo Schwarze
Support nested roff instructions:
* allow roff_parseln() to be re-run
* allow roff_parseln() to manipulate the line buffer offset
* support the offset in the man and mdoc libraries
* adapt .if, .ie, .el, .ig, .am* and .de* support
* interpret some instructions even in conditional-negative context
Coded by kristaps during the last day of the mandoc hackathon.
To avoid regressions in the OpenBSD tree, commit this together with some small local additions:
* detect roff block end "\}" even on macro lines
* actually implement the ".if n" conditional
* ignore .ds, .rm and .tr in libroff
Also back my old .if/.ie/.el-handling out of libman, reverting: man.h 1.15, man.c 1.25, man_macro.c 1.15, man_validate.c 1.19, man_action.c 1.15, man_term.c 1.28, man_html.c 1.9.
2010-05-16  Ingo Schwarze
In theory, Kristaps never intended to write a roff parser,
but in practice, most real legacy man(7)uals are using so much low level roff that we can't really get away without at least partially handling some roff instructions. As doing this in man(7) only has become messy and as even some mdoc(7) pages need it, start a minimal partial roff preprocessor. As a first step, move handling of .am[i], .de[i] and .ig there. Do not use the roff preprocessor for new manuals! Now that we have three main parser libraries - roff, man and mdoc - each one having its own error handling is becoming messy, too. Thus, start unifying message handling in one central place, introducing a new generic function mmsg(). coded by kristaps@