author | Todd C. Miller <millert@cvs.openbsd.org> | 2000-04-09 07:58:38 +0000 |
---|---|---|
committer | Todd C. Miller <millert@cvs.openbsd.org> | 2000-04-09 07:58:38 +0000 |
commit | 098fe4a0b368c914c7d1f7ce086634958df8796a | |
tree | 35c4467b0223be7d6cd8bf4a8d03b0010b342e2a /gnu/usr.bin/groff/grohtml/design.ms | |
parent | 972922b0b73ac8052cf5ab98e029ac4e27c752f3 |
groff 1.15
Diffstat (limited to 'gnu/usr.bin/groff/grohtml/design.ms')
-rw-r--r-- | gnu/usr.bin/groff/grohtml/design.ms | 156 |
1 files changed, 156 insertions, 0 deletions
diff --git a/gnu/usr.bin/groff/grohtml/design.ms b/gnu/usr.bin/groff/grohtml/design.ms
new file mode 100644
index 00000000000..e62e2233096
--- /dev/null
+++ b/gnu/usr.bin/groff/grohtml/design.ms
@@ -0,0 +1,156 @@
+.nr PS 12
+.nr VS 14
+.LP
+.TL
+Design of grohtml
+.sp 1i
+.SH
+What is grohtml
+.LP
+Grohtml is a back end for groff which generates html.
+The aim of grohtml is to produce respectable html given
+fairly typical groff input.
+.SH
+Limitations of grohtml
+.LP
+Although basic text can be translated
+in a straightforward fashion there are some areas where grohtml
+has to try and guess text relationships. In particular whenever
+grohtml encounters text tables, indented paragraphs or
+two column mode it will try and utilize the html table construct
+to preserve columns. Grohtml also attempts to work out which
+lines should be automatically formatted by the browser.
+Because it relies on reasonable guesses, it is right most of the time
+but will occasionally make mistakes.
+.PP
+Tbl, pic and eqn output is also rendered as images, which may be
+considered a limitation.
+.SH
+Overview of html.cc
+.LP
+This file briefly provides an overview of how html.cc operates.
+The html device driver works as follows:
+.IP (i) .5i
+firstly it creates a linked list of all words on a page.
+.IP (ii) .5i
+it runs through the page and finds the leftmost margin. Later
+on when generating the page it removes the margin.
+.IP (iii) .5i
+it scans a page and builds two kinds of regions: ascii text and graphical.
+The graphical regions consist of tbl's, eqn's, pic's
+(basically anything that cannot be textually displayed).
+It will scan through a page to find lines (such as footer rules)
+and places these into tiny graphical regions. Certain fonts
+are also treated as a graphical region, as html has no easy
+equivalent. For example Greek math symbols.
+.LP
+Finally all graphical regions are translated into png files and
+all text regions into html text.
+.PP
+To give grohtml a sporting chance of accurately deciding which
+is a graphical region and which is text, the front end programs
+tbl, eqn, pic have all been tweaked to encapsulate pictures, tables
+and equations with the following lines:
+.sp
+.nf
+\f[CR]\&.if '\\*(.T'html' \\X(graphic-start(\c
+
+\&.if '\\*(.T'html' \\X(graphic-end(\c
+\fP
+.fi
+.sp
+these appear to grohtml as:
+.sp
+.nf
+\f[CR]\&x X graphic-start
+
+\&...
+
+\&x X graphic-end\fP
+.fi
+.sp
+.LP
+In addition to graphic-start and graphic-end there are two
+other "special characters" which are used.
+.sp
+\f[CR]\&x X index:N\fP
+.sp
+where N is a number. The purpose of this sequence is to stop
+devhtml from automatically producing links to headings which
+have a header level >N.
+The line:
+.sp
+\f[CR]\&x X html:STRING\fR
+.sp
+.LP
+allows a STRING to be passed through to the output file with
+no processing whatsoever. I.e. it allows users to include html
+commands, via macro, such as:
+.sp
+\f[CR]\&.URL "Latest Emacs" "ftp://somewonderful.gnu.software"\fP
+.sp
+.LP
+Where the URL macro bundles the info into STRING above.
+For more info consult: \f[CR]tmac/tmac.arkup\fP.
+.PP
+While scanning through a page the html device copies headings and titles
+into a list of links which are later written to the beginning
+of the html document.
+.SH
+Table handling code
+.LP
+Provided that the -t option is not present when grohtml is run the grohtml
+driver will attempt to find textual tables and generate html tables.
+This allows .RS and .RE commands to operate with auto formatting. It also
+should allow grohtml to process .2C correctly. However, the table handling code
+has to examine the troff output and \fIguess\fR when a table starts and
+finishes. It is well to know the limitations of this approach as it
+sometimes makes the wrong decision.
+.LP
+Here are some of the rules that grohtml uses for terminating a html table:
+.LP
+.IP "(i)" .5i
+A table will be terminated when grohtml finds a line which is all in bold
+font (it believes that this is a header which is outside of a table).
+This might be considered incorrect behaviour especially if you use .2C
+which generates a heading on the left column when the corresponding
+right row is blank.
+.IP "(ii)" .5i
+A table is terminated when grohtml sees that the complete line
+has been spanned by words. I.e. no gaps exist.
+.IP "(nb)" .5i
+the documentation about these rules is particularly incomplete and needs finishing
+when time permits.
+.SH
+To do
+.LP
+.IP (i) .5i
+finish working out the max and min x, y, extents for splines.
+.IP (ii) .5i
+check and test thoroughly all the character descriptions in devhtml
+(originally taken from devX100)
+.IP (iii) .5i
+improve tmac.arkup
+.IP (iv) .5i
+also improve documentation.
+.IP (v) .5i
+fix the bugs which are exposed by Eric Raymond's pic guide,
+\fBMaking Pictures With GNU PIC\fR. It appears that grohtml becomes confused
+about which sections of the document are text and which sections need
+to be rendered as an image.
+.IP (vi) .5i
+it would be nice to modularise the source. A natural division might be
+to extract the table handling code from html.cc into table.cc.
+The table.cc could be expanded to recognise output from tbl and try
+and generate html tables with lines/rules/boxes. The code as it stands
+should cope with very simple plain text tables. But of course at present
+it does not get a chance to do this because the output of gtbl is
+bracketed by \fCgraphic-start\fR and \fCgraphic-end\fR.
+.IP (vii) .5i
+introduce anti aliasing for the images as mentioned by Werner.
+.SH
+Dependencies
+.LP
+Grohtml is dependent upon grops and gs, which are invoked to
+generate all png files. Png files are generated whenever a table, picture,
+equation or line is encountered.
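Note on the added design document: the left-margin step and the two table-termination rules it describes can be illustrated with a small sketch. This is not code from html.cc (which is C++). It is a minimal Python model, and every name in it (Word, min_h, max_h, max_gap, left_margin, strip_margin, terminates_table) is hypothetical. In particular, reading rule (ii) as "no gap between neighbouring words is wider than max_gap" is an assumption, since the document itself says the rules are incompletely documented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    """One word of troff output; all field names here are hypothetical."""
    text: str
    min_h: int          # left edge in device units
    max_h: int          # right edge in device units
    bold: bool = False


def left_margin(page):
    """Smallest left edge of any word on the page (0 for an empty page)."""
    return min((w.min_h for line in page for w in line), default=0)


def strip_margin(page, margin):
    """Return a copy of the page with every word shifted left by margin."""
    return [[replace(w, min_h=w.min_h - margin, max_h=w.max_h - margin)
             for w in line] for line in page]


def line_has_gaps(line, max_gap):
    """True if two neighbouring words are more than max_gap apart."""
    words = sorted(line, key=lambda w: w.min_h)
    return any(b.min_h - a.max_h > max_gap for a, b in zip(words, words[1:]))


def terminates_table(line, max_gap):
    """Assumed reading of rules (i) and (ii): an all-bold line, or a line
    whose words leave no column-sized gap, ends the current table."""
    if not line:
        return False
    all_bold = all(w.bold for w in line)
    return all_bold or not line_has_gaps(line, max_gap)


# A three-line page: a two-column row, a bold heading, a plain text line.
page = [
    [Word("left", 100, 130), Word("right", 300, 330)],
    [Word("Heading", 100, 200, bold=True)],
    [Word("plain", 100, 140), Word("text", 150, 190)],
]
shifted = strip_margin(page, left_margin(page))
```

In this model only the first line keeps a table open. The bold heading ends it under rule (i) and the gap-free line ends it under rule (ii), which is also how the .2C heading problem mentioned in the document would arise.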