path: root/usr.bin/mandoc/term.c
Age  Commit message  [Author]
2022-12-26  spelling fixes; from paul tagliamonte  [Jason McIntyre]
amendments to his diff are noted on tech
2022-08-16  When starting a new input line, even when continuing the same output  [Ingo Schwarze]
line, use the current output position as the reference position for tabs on that input line. This brings mandoc in line with the behaviour of GNU, Heirloom, and Plan 9 roff.
2022-08-15  Distinguish between escape sequences that produce no output  [Ingo Schwarze]
whatsoever (for example \fR) and escape sequences that produce invisible zero-width output (for example \&). No, i'm not joking, groff does make that distinction, and it has consequences in some situations, for example for vertical spacing in no-fill mode. Heirloom and Plan 9 behaviour is subtly different, but in case of doubt, we want to follow groff. While this fixes the behaviour for the majority of escape sequences, in particular for those most likely to occur in practice, it is not perfect yet because some of the more exotic ESCAPE_IGNORE sequences are actually of the "no output whatsoever" type but treated as "invisible zero-width" for now. With the new ASCII_NBRZW mechanism in place, switching them over one by one when the need arises will no longer be very difficult.
2022-08-15  In GNU, Heirloom, and Plan 9 roff, tab positions apply to *input* lines,  [Ingo Schwarze]
not to *output* lines. In particular, if an input line gets broken in fill mode and a tab occurs in the second output line, it advances to a position of at least (width of the first output line) + (width of a space character even though this is never printed) + (width of the part of the second output line that precedes the tab). Implement the same logic in mandoc. Again, do not use tabs in filled text: they have surprising effects, including this one.
2022-08-15  In GNU, Heirloom, and Plan 9 roff, literal tab characters are  [Ingo Schwarze]
non-breakable in exactly the same way as "\ ". That is, the preceding word, the tab character, and the following word are always kept together on the same output line. If filling is enabled and an output line break is required before the end of the following word, the break occurs before the beginning of the preceding word. Make mandoc behave in the same way. Of course, using literal tab characters in filled text remains a bad idea, and the "WARNING: tab in filled text" remains unchanged.
2022-04-27  Fix three bugs regarding the interaction of \z and \h:  [Ingo Schwarze]
1. The combination \z\h is a no-op whatever the argument may be. In the past, the \z only affected the first space character generated by the \h, which was wrong.
2. For the combination \zX\h with a positive argument, the first space resulting from the \h is not printed but consumed by the \z.
3. For the combination \zX\h with a negative argument, application of the \z needs to be completed before the \h can be started. In the past, if this combination occurred at the beginning of an output line, the \h backed up to the beginning of the line and after that, the \z attempted to back up even further, triggering an assertion.
Bugs found during an audit of assignments to termp->col that i started after the bugfix tbl_term.c rev. 1.65. The assertion triggered by bug 3 was *not* yet found by afl(1).
2022-01-10  When rendering the \h (horizontal motion) low-level roff(7) escape  [Ingo Schwarze]
sequence in -T ps and -T pdf output mode, use an appropriate horizontal distance by correctly using the term_len() utility function. Output from the -T ascii, -T utf8, and -T html modes was already correct and remains unchanged. Lennart Jablonka <hummsmith42 at gmail dot com> found and reported this unit conversion bug (misinterpreting AFM units as if they were en units) when rendering scdoc-generated manuals (which is a low quality generator, but that's no excuse for mandoc misformatting \h) on Alpine Linux. Lennart also tested this patch.
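The unit mix-up described in this entry can be illustrated with a minimal sketch. It assumes that one en corresponds to the width of a space on the output device; `struct device` and `len_to_device()` are hypothetical names for illustration, not the mandoc API, and the value 250 is only borrowed from the 2019-01-15 entry as a plausible AFM space width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output device: only the width of a space matters here. */
struct device {
	size_t	 space_width;	/* in the device's own units */
};

/*
 * Convert a length given in en units into device units,
 * assuming that one en is as wide as one space on this device.
 */
static size_t
len_to_device(const struct device *dev, size_t en)
{
	assert(dev->space_width > 0);
	return dev->space_width * en;
}
```

On a character-cell device with a space width of 1 the conversion changes nothing, which would explain why a missing conversion goes unnoticed there and only shows up on devices that measure in finer units.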
2021-10-04  Provide a cleanup function for the term_tab module, freeing memory  [Ingo Schwarze]
and resetting the internal state to the initial state. Call this function from the proper place in term_free(). With the way the module is currently used, this does not imply any functional change, but doing proper cleanup is more robust, makes it easier during code review to understand what is going on, and makes it explicit that there is no memory leak.
2021-08-10  Support two-character font names (BI, CW, CR, CB, CI)  [Ingo Schwarze]
in the tbl(7) layout font modifier. Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead, which simplifies and unifies some code. While here, also support CB and CI in roff(7) \f escape sequences and in roff(7) .ft requests for all output modes. Using those is certainly not recommended because portability is limited even with groff, but supporting them makes some existing third-party manual pages look better, in particular in HTML output mode. Bug-compatible with groff as far as i'm aware, except that i consider font names starting with the '\n' (ASCII 0x0a line feed) character so insane that i decided to not support them. Missing feature reported by nabijaczleweli dot xyz in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002. I used none of the code from the initial patch submitted by nabijaczleweli, but some of their ideas. Final patch tested by them, too.
2020-09-02  Do not indent by SIZE_MAX/2 when .ce occurs inside explicit no-fill mode.  [Ingo Schwarze]
While here, drop two unused arguments from the function term_field(); the related work was already done by term_fill() before this commit. I found the bug in an afl run that was performed by Jan Schreiber <jes at posteo dot de>.
2019-06-03  Explicitly state that the cases in the inner switch in term_fill()  [Ingo Schwarze]
are exhaustive. While there is no bug, being explicit has no downside and is potentially safer for the future. Michal Nowak <mnowak at startmail dot com> reported that gcc 4.4.4 and 7.4.0 on illumos throw -Wuninitialized false positives.
2019-01-15  In PostScript and PDF output, one AFM unit is not nearly enough  [Ingo Schwarze]
inter-word spacing, let's try again with 250 AFM units. Regression caused during my recent term_flushln() reorg in rev. 1.138, reported by brynet@ (sorry and many thanks for reporting).
2019-01-04  Implement centering and adjustment to the right margin directly in  [Ingo Schwarze]
the terminal filling routine, controlled by new flags TERMP_CENTER and TERMP_RIGHT. This became possible by the recent term_flushln() rewrite. No functional change yet, but to be used by upcoming commits.
2019-01-03  Rewrite the line filling function for terminal output yet again.  [Ingo Schwarze]
This function has always been among the most complicated parts of mandoc, and it repeatedly needed substantial functional enhancements. The present rewrite is required to prepare for the implementation of simultaneous filling and centering of output lines.
The previous implementation looked at each word in turn and printed it to the output stream as soon as it was found to still fit on the current output line. Obviously, that approach neither allows centering nor adjustment to the right margin.
The new implementation first decides which part of the paragraph to put onto the current output line, also measuring the display width of that part, even if that part consists of multiple words including intervening whitespace. This will allow moving the whole output line to the right as desired before printing it, for example to center it or to adjust it to the right margin.
The function is split into three parts, each much shorter, solving a better defined task, much easier to understand and better commented:
1. the steering function term_flushln() looping over output lines;
2. the calculation function term_fill() looping over input characters;
3. and the output function term_field() looping over printed characters.
No functional change yet.
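The "measure first, print later" idea from this entry can be sketched roughly as follows. This is not the code of term_fill() or term_field(): `fill_measure()`, `left_padding()` and `enum align` are hypothetical names, and the sketch assumes pre-split words in which every byte is one column wide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum align { ALIGN_LEFT, ALIGN_CENTER, ALIGN_RIGHT };

/*
 * Decide how many of the n words fit into maxwidth columns when
 * separated by single blanks, and store the width used in *used.
 * The first word is always taken, even if it is too long.
 */
static size_t
fill_measure(const char *const *words, size_t n, size_t maxwidth,
    size_t *used)
{
	size_t	 i, len, w;

	assert(used != NULL);
	w = 0;
	for (i = 0; i < n; i++) {
		len = strlen(words[i]);
		if (i > 0 && w + 1 + len > maxwidth)
			break;
		w += (i > 0) + len;	/* blank before all but the first */
	}
	*used = w;
	return i;
}

/* Blank columns to print before a line of the measured width. */
static size_t
left_padding(size_t maxwidth, size_t used, enum align a)
{
	size_t	 room;

	room = used < maxwidth ? maxwidth - used : 0;
	switch (a) {
	case ALIGN_CENTER:
		return room / 2;
	case ALIGN_RIGHT:
		return room;
	case ALIGN_LEFT:
		break;
	}
	return 0;
}
```

Because the width is known before anything is printed, the caller can shift the whole line to the right, which a word-by-word printing loop cannot do.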
2018-12-15  Several improvements to escape sequence handling.  [Ingo Schwarze]
* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character) and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput) and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape sequences rather than as special characters, to avoid defining bogus forms with square brackets.
* For special characters with one-byte names, do not define bogus forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
2018-10-25  Implement the \f(CW and \f(CR (constant width font) escape sequences  [Ingo Schwarze]
for HTML output. Somewhat relevant because pod2man(1) relies on this. Missing feature reported by Pali dot Rohar at gmail dot com. Note that constant width font was already correctly selected before this when required by semantic markup. Only attempting physical markup with the low-level escape sequence was ineffective.
2018-08-16  Implement the \*(.T predefined string (interpolate device name)  [Ingo Schwarze]
by allowing the preprocessor to pass it through to the formatters. Used for example by the groff_char(7) manual page.
2017-07-28  use & to check if a bit is set in a flag; pointed out by clang  [Florian Obser]
OK schwarze
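The log does not say which operator the code used before this fix. As a generic illustration of the idiom it names, testing a bit with bitwise `&` might look like this; the flag names are hypothetical.

```c
/* Hypothetical flag bits, for illustration only. */
#define FLAG_NOSPACE	(1 << 0)
#define FLAG_NOBREAK	(1 << 1)

/*
 * Return 1 if bit is set in flags, 0 otherwise.
 * A logical test such as (flags && bit) would instead be true
 * whenever both values are non-zero, whichever bits are set.
 */
static int
flag_is_set(int flags, int bit)
{
	return (flags & bit) != 0;
}
```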
2017-06-14  implement so-called absolute horizontal motion: \h'|...',  [Ingo Schwarze]
used for example by zoem(1)
2017-06-14  let \l use the right fill character  [Ingo Schwarze]
2017-06-14  improve rounding rules for scaling units  [Ingo Schwarze]
in horizontal orientation in the terminal formatter
2017-06-14  implement the roff(7) \p (break output line) escape sequence  [Ingo Schwarze]
2017-06-12  Implement automatic line breaking  [Ingo Schwarze]
inside individual table cells that contain text blocks. This cures overlong lines in various Xenocara manuals.
2017-06-08  make the internal a2roffsu() interface more powerful by returning  [Ingo Schwarze]
a pointer to the end of the parsed data, making it easier to parse subsequent bytes
2017-06-07  Prepare the terminal driver for filling multiple columns in parallel,  [Ingo Schwarze]
second step: make the per-column byte pointer persistent across term_flushln() calls, such that a subsequent call can continue at the point where the previous call left. If more than one column is in use, return from term_flushln() when the column is full, rather than breaking the output line. No functional change, because nothing sets up multiple columns yet.
2017-06-07  Prepare the terminal driver for filling multiple columns in parallel,  [Ingo Schwarze]
first step: split column data out of the terminal state struct into a new column state struct and use an array of such column state structs. No functional change.
2017-06-07  The \h escape sequence provides another method for moving backwards,  [Ingo Schwarze]
and after that, previously written output gets overwritten, but overwriting with blanks does *not* erase previously written content. Yes, manual pages exist that are crazy enough to rely on that...
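The overwrite rule described in this entry can be sketched with a simple line buffer. This is a simplification under stated assumptions, not the mandoc implementation: `struct line`, `line_init()` and `line_put()` are hypothetical, the line has a fixed width, and a non-blank character written over another simply replaces it.

```c
#include <stddef.h>
#include <string.h>

#define LINE_COLS	16	/* hypothetical fixed line width */

struct line {
	char	 cols[LINE_COLS + 1];
};

/* Start with a line consisting of blanks only. */
static void
line_init(struct line *l)
{
	memset(l->cols, ' ', LINE_COLS);
	l->cols[LINE_COLS] = '\0';
}

/*
 * Write text starting at column col.  Blanks in text advance the
 * position but leave whatever was written there before untouched.
 */
static void
line_put(struct line *l, size_t col, const char *text)
{
	for (; *text != '\0' && col < LINE_COLS; text++, col++)
		if (*text != ' ')
			l->cols[col] = *text;
}
```

Writing "abcdef" at column 0 and then "  X" at column 2 leaves "abcdXf": the two blanks pass over "cd" without erasing them.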
2017-06-04  Implement the roff(7) .mc (right margin character) request.  [Ingo Schwarze]
The Tcl/Tk manual pages use this extensively. Delete the TERM_MAXMARGIN hack, it breaks .mc inside .nf; instead, implement a proper TERMP_BRNEVER flag.
2017-06-04  Make term_flushln() simpler and more robust:  [Ingo Schwarze]
Eliminate the "overstep" state variable. The information is already contained in "viscol". Minus 60 lines of code, no functional change intended.
2017-06-02  Partial implementation of \l (horizontal line drawing function).  [Ingo Schwarze]
A full implementation would require access to output device properties and state variables (both only available after the main parser has finalized the parse tree) before numerical expansions in the roff preprocessor (i.e., before the main parser is even started). Not trying to pull that stunt right now because the static-width implementation committed here is sufficient for tcl-style manual pages and already more complicated than i would have suspected.
2017-06-01  Minimal implementation of the \h (horizontal motion) escape sequence.  [Ingo Schwarze]
Good enough to cope with the average DocBook insanity.
2017-05-07  Basic implementation of the roff(7) .ta (define tab stops) request.  [Ingo Schwarze]
This is the first feature made possible by the parser reorganization. Improves the formatting of the SYNOPSIS in many Xenocara GL manuals. Also important for ports, as reported by many, including naddy@.
2017-01-08  Fix an assertion failure caused by \z\[u00FF] with -Tps/-Tpdf.  [Ingo Schwarze]
Reported by jsg@ after an afl(1) run long ago.
2016-08-10  Fix assertion failures caused by whitespace inside \o'' (overstrike)  [Ingo Schwarze]
sequences that jsg@ found with afl(1):
* Avoid writing \t\b in term.c.
* Handle trailing \b in term_ps.c.
2016-03-20  " the the " -> " the ", or in a couple of cases replace the superfluous  [Kenneth R Westerback]
"the" with the obviously intended word. Started with a "the the" spotted by Mihal Mazurek.
2016-01-07  This code wasted memory by allocating sizeof(enum termfont *)  [Ingo Schwarze]
where only sizeof(enum termfont) is needed. Fixes CID 1288941. From christos@ via wiz@, both at NetBSD.
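One common way to avoid this class of mistake is to take the element size from the pointer being assigned rather than spelling out a type. The sketch below uses a hypothetical `enum font` and `font_stack_alloc()`; it is not the code that was fixed.

```c
#include <stdlib.h>

/* Hypothetical font enum, standing in for the real one. */
enum font { FONT_NONE, FONT_BOLD, FONT_ITALIC };

/*
 * Allocate a zeroed array of n fonts.  sizeof(*stack) is the size
 * of one element; sizeof(enum font *) would be the size of a
 * pointer, which is typically larger and wastes memory.
 */
static enum font *
font_stack_alloc(size_t n)
{
	enum font	*stack;

	stack = calloc(n, sizeof(*stack));
	return stack;
}
```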
2015-10-23  apply bold and italic to all non-ASCII Unicode codepoints,  [Ingo Schwarze]
fixing input like \fB\('e; issue reported by bentley@
2015-10-13  Major character table cleanup:  [Ingo Schwarze]
* Use ohash(3) rather than a hand-rolled hash table.
* Make the character table static in the chars.c module: There is no need to pass a pointer around, we most certainly never want to use two different character tables concurrently.
* No need to keep the characters in a separate file chars.in; that merely encourages downstream porters to mess with them.
* Sort the characters to agree with the mandoc_chars(7) manual page.
* Specify Unicode codepoints in hex, not decimal (that's the detail that originally triggered this patch).
No functional change, minus 100 LOC, and i don't see a performance change.
2015-10-12  To make the code more readable, delete 283 /* FALLTHROUGH */ comments  [Ingo Schwarze]
that were right between two adjacent case statements. Keep only those 24 where the first case actually executes some code before falling through to the next case.
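The distinction drawn in this entry can be shown with a small, hypothetical `classify()` function: adjacent case labels need no comment, while a case that runs code before continuing into the next one keeps it.

```c
/* Hypothetical example; the return values carry no meaning. */
static int
classify(int c)
{
	int	 score;

	score = 0;
	switch (c) {
	case ' ':
	case '\t':
		/* Adjacent labels sharing code: no comment needed. */
		return 1;
	case '\n':
		score = 10;
		/* FALLTHROUGH */
	case '\r':
		return score + 2;
	default:
		return 0;
	}
}
```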
2015-10-06  modernize style: "return" is not a function; ok cmp(1)  [Ingo Schwarze]
2015-09-26  /* NOTREACHED */ after abort() is silly, delete it  [Ingo Schwarze]
2015-09-21  Trailing whitespace is significant when determining the width of a tag  [Ingo Schwarze]
in mdoc(7) .Bl -tag and man(7) .TP, but not in man(7) .IP. Quirk reported by Jan Stary <hans at stare dot cz> on ports@.
2015-08-30  Drop leading, internal, and trailing blank characters in \o (overstrike)  [Ingo Schwarze]
escape sequences; that's cleaner for all output modes, and it's required to prevent the PostScript/PDF formatter from dying on assertions. Bug found by jsg@ with afl.
2015-04-29  Replace the kludge for the \z escape sequence by an actual  [Ingo Schwarze]
implementation. As a side effect, minus ten lines of code. As another side effect, this also fixes the assertion failure that used to be triggered by "\z\o'ab'c" at the beginning of an output line, found by jsg@ with afl (test case 022/Apr27).
2015-04-04  Rounding rules for horizontal scaling widths are more complicated.  [Ingo Schwarze]
There is a first rounding to basic units on the input side. After that, rounding rules differ between requests and macros. Requests round to the nearest possible character position. Macros round to the next character position to the left. Implement that by changing the return value of term_hspan() to basic units and leaving the second scaling and rounding stage to the formatters instead of doing it in the terminal handler. Improves for example argtable2(3).
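The second rounding stage described in this entry can be sketched as two small helpers. The constant of 24 basic units per character position is an assumption made for this illustration, as are the function names; the sketch also handles non-negative widths only and rounds exact halves upward, which the log does not specify.

```c
#include <stddef.h>

/* Assumed number of basic units per character position. */
#define UNITS_PER_COL	24

/* Requests: round to the nearest character position. */
static size_t
round_request(size_t bu)
{
	return (bu + UNITS_PER_COL / 2) / UNITS_PER_COL;
}

/* Macros: round to the next character position to the left. */
static size_t
round_macro(size_t bu)
{
	return bu / UNITS_PER_COL;
}
```

With these assumptions, a width of 36 basic units becomes two character positions for a request but only one for a macro.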
2015-04-02  Third step towards parser unification:  [Ingo Schwarze]
Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta. Written on the train from London to Exeter on the way to p2k15.
2015-03-09  prevent the skipvsp flag from creeping past actual text  [Ingo Schwarze]
2015-01-31  Use relative offsets instead of absolute pointers for the terminal  [Ingo Schwarze]
font stack. The latter fail after the stack is grown with realloc(). Fixing an assertion failure found by jsg@ with afl some time ago (test case number 51).
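Why offsets survive realloc() while pointers into the array do not can be sketched with a small growable stack. `struct font_stack` and its helpers are hypothetical and store plain integers; they only illustrate the technique named in this entry.

```c
#include <assert.h>
#include <stdlib.h>

struct font_stack {
	int	*fonts;	/* growable array, may move on realloc() */
	size_t	 size;	/* number of allocated entries */
	size_t	 cur;	/* index of the current entry, not a pointer */
};

/* Set up a stack holding one default entry; return 0 on success. */
static int
stack_init(struct font_stack *s)
{
	s->fonts = calloc(1, sizeof(*s->fonts));
	if (s->fonts == NULL)
		return -1;
	s->size = 1;
	s->cur = 0;
	return 0;
}

/* Push a font, growing the array if needed; return 0 on success. */
static int
stack_push(struct font_stack *s, int font)
{
	int	*n;

	assert(s->cur < s->size);
	if (s->cur + 1 == s->size) {
		n = realloc(s->fonts, 2 * s->size * sizeof(*n));
		if (n == NULL)
			return -1;
		s->fonts = n;	/* the old address may now be invalid */
		s->size *= 2;
	}
	s->fonts[++s->cur] = font;
	return 0;
}

/* The index stays valid even if the array has been moved. */
static int
stack_top(const struct font_stack *s)
{
	return s->fonts[s->cur];
}
```

A saved `int *` to the current entry would dangle as soon as realloc() moved the array, whereas `cur` is simply re-applied to the new base address.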
2015-01-21  Rudimentary implementation of the roff(7) \o escape sequence (overstrike).  [Ingo Schwarze]
This is of some relevance because the pod2man(1) preamble abuses it for the Icelandic letter Thorn, instead of simply using \(TP and \(Tp. Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).
2014-12-24  Support negative indentations for mdoc(7) displays and lists.  [Ingo Schwarze]
Not exactly recommended for use, rather for groff compatibility. While here, introduce similar SHRT_MAX limits as in man(7), fixing a few cases of infinite output found by jsg@ with afl.