author | Ingo Schwarze <schwarze@cvs.openbsd.org> | 2010-05-26 02:39:59 +0000 |
---|---|---|
committer | Ingo Schwarze <schwarze@cvs.openbsd.org> | 2010-05-26 02:39:59 +0000 |
commit | 71c147c136372d4309b120eda4306853560b848a (patch) | |
tree | 1b4efa482538f8fc7431ab27c2147a028931f03e /usr.bin/mandoc/mandoc.c | |
parent | c03090c978c31a26b24683e4bbdef4271bbf5b3a (diff) |
When a word does not fully fit onto the output line, but it contains
at least one hyphen, we already had support for breaking the line at the
last fitting hyphen. This patch improves this functionality by only
breaking at hyphens in free-form text, and by not breaking at hyphens
* at the beginning or end of a word or
* immediately preceded or followed by another hyphen or
* escaped by a preceding backslash.
Before this patch, differences in break-at-hyphen support were one
of the major sources of noise in automatic comparisons to mdoc(7)
groff output. Now, the remaining differences are hard to find among
the noise coming from other sources.
Where there are still differences, what we do seems to be better than
what groff does; see e.g. the chio(1) exchange and position commands
for one of the now rare examples.
idea and coding by kristaps@
Besides, this was the last substantial code difference left
between bsd.lv and openbsd.org. We are now in full sync.
Diffstat (limited to 'usr.bin/mandoc/mandoc.c')
-rw-r--r-- | usr.bin/mandoc/mandoc.c | 30 |
1 files changed, 29 insertions, 1 deletions
diff --git a/usr.bin/mandoc/mandoc.c b/usr.bin/mandoc/mandoc.c
index 92c65e9d2e1..4f534f1589b 100644
--- a/usr.bin/mandoc/mandoc.c
+++ b/usr.bin/mandoc/mandoc.c
@@ -1,4 +1,4 @@
-/* $Id: mandoc.c,v 1.11 2010/05/15 15:37:53 schwarze Exp $ */
+/* $Id: mandoc.c,v 1.12 2010/05/26 02:39:58 schwarze Exp $ */
 /*
  * Copyright (c) 2008, 2009 Kristaps Dzonsons <kristaps@kth.se>
  *
@@ -336,3 +336,31 @@ mandoc_eos(const char *p, size_t sz)
 
 	return(0);
 }
+
+
+int
+mandoc_hyph(const char *start, const char *c)
+{
+
+	/*
+	 * Choose whether to break at a hyphenated character.  We only
+	 * do this if it's free-standing within a word.
+	 */
+
+	/* Skip first/last character of buffer. */
+	if (c == start || '\0' == *(c + 1))
+		return(0);
+	/* Skip first/last character of word. */
+	if ('\t' == *(c + 1) || '\t' == *(c - 1))
+		return(0);
+	if (' ' == *(c + 1) || ' ' == *(c - 1))
+		return(0);
+	/* Skip double invocations. */
+	if ('-' == *(c + 1) || '-' == *(c - 1))
+		return(0);
+	/* Skip escapes. */
+	if ('\\' == *(c - 1))
+		return(0);
+
+	return(1);
+}
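
To illustrate how the new predicate could be used to pick "the last fitting hyphen" mentioned in the commit message, here is a minimal, self-contained sketch. The body of mandoc_hyph() is copied from the patch above. The helper last_break() and its `room` parameter are hypothetical and are not part of mandoc; the actual caller in the terminal formatter may work quite differently.

```c
#include <stddef.h>
#include <string.h>

/* Copied from the patch: may we break the line after the hyphen at c? */
static int
mandoc_hyph(const char *start, const char *c)
{
	/* Skip first/last character of buffer. */
	if (c == start || '\0' == *(c + 1))
		return(0);
	/* Skip first/last character of word. */
	if ('\t' == *(c + 1) || '\t' == *(c - 1))
		return(0);
	if (' ' == *(c + 1) || ' ' == *(c - 1))
		return(0);
	/* Skip double invocations. */
	if ('-' == *(c + 1) || '-' == *(c - 1))
		return(0);
	/* Skip escapes. */
	if ('\\' == *(c - 1))
		return(0);

	return(1);
}

/*
 * Hypothetical helper, for illustration only: return the offset of
 * the last breakable hyphen among the first `room` bytes of the
 * NUL-terminated string buf, or -1 if there is none.
 */
static int
last_break(const char *buf, size_t room)
{
	size_t	 len, i;
	int	 best;

	len = strlen(buf);
	best = -1;
	for (i = 0; i < len && i < room; i++)
		if ('-' == buf[i] && mandoc_hyph(buf, buf + i))
			best = (int)i;

	return(best);
}
```

Under these assumptions, last_break("self-evident-truth", 10) selects the hyphen at offset 4, while leading hyphens as in "-flag", doubled hyphens as in "--", and escaped hyphens as in "foo\-bar" are never chosen.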