src - OpenBSD base system

author	Ingo Schwarze <schwarze@cvs.openbsd.org>	2014-12-19 04:57:12 +0000
committer	Ingo Schwarze <schwarze@cvs.openbsd.org>	2014-12-19 04:57:12 +0000
commit	471959573eec2067dcde4dd2b29ca931a17c6983 (patch)
tree	f3c1c78ee8e3c7ebca409b6762ba714c943f1f02 /usr.bin/mandoc
parent	40ac43966a02a9df27bd6166381399205a4db7a0 (diff)

Rewrite the low-level UTF-8 parser from scratch.

The old parser accepted invalid byte sequences such as 0xc080-0xc1bf, 0xe08080-0xe09fbf, 0xeda080-0xedbfbf, and 0xf0808080-0xf08fbfbf and produced valid roff Unicode escape sequences from them; the algorithm also contained strong defenses against any attempt to fix it. The rewrite cures an assertion failure in the terminal formatter that was triggered by sneaking in ASCII 0x08 (backspace) "encoded" as an (invalid) multibyte UTF-8 sequence, found by jsg@ with afl. As a bonus, the new algorithm also reduces the code in the function by about 20%.
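
For illustration only, here is a minimal sketch of a strict decoder that rejects the byte ranges named above. It is not the code from preconv.c: the function name utf8_decode_strict and its interface are hypothetical, and the approach (decode first, then compare the result against the smallest code point that legitimately needs that many bytes) is one common way to do it, assumed here rather than taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from preconv.c.
 * Decode one UTF-8 sequence from buf[0..len-1].
 * On success, store the code point in *cp and return the number
 * of bytes consumed.  Return 0 if the sequence is invalid or truncated.
 */
static size_t
utf8_decode_strict(const unsigned char *buf, size_t len, uint32_t *cp)
{
	uint32_t accum, min;
	size_t need, i;

	if (len == 0)
		return 0;

	/* The lead byte fixes the sequence length and the minimum value. */
	if (buf[0] < 0x80) {
		*cp = buf[0];
		return 1;
	} else if ((buf[0] & 0xe0) == 0xc0) {
		need = 2; min = 0x80; accum = buf[0] & 0x1f;
	} else if ((buf[0] & 0xf0) == 0xe0) {
		need = 3; min = 0x800; accum = buf[0] & 0x0f;
	} else if ((buf[0] & 0xf8) == 0xf0) {
		need = 4; min = 0x10000; accum = buf[0] & 0x07;
	} else
		return 0;	/* stray continuation byte or 0xf8-0xff */

	if (len < need)
		return 0;	/* truncated sequence */

	/* Every following byte must look like 10xxxxxx. */
	for (i = 1; i < need; i++) {
		if ((buf[i] & 0xc0) != 0x80)
			return 0;
		accum = (accum << 6) | (buf[i] & 0x3f);
	}

	if (accum < min)
		return 0;	/* overlong, e.g. 0xc088 for backspace */
	if (accum >= 0xd800 && accum <= 0xdfff)
		return 0;	/* UTF-16 surrogates, 0xeda080-0xedbfbf */
	if (accum > 0x10ffff)
		return 0;	/* beyond the Unicode range */

	*cp = accum;
	return need;
}
```

With a check like this, the two-byte sequence 0xc0 0x88 is rejected instead of being decoded to 0x08, so a backspace cannot be smuggled past the input stage in multibyte disguise.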

Diffstat (limited to 'usr.bin/mandoc')

-rw-r--r--	usr.bin/mandoc/preconv.c	135

1 file changed, 59 insertions, 76 deletions

