author Ingo Schwarze <schwarze@cvs.openbsd.org> 2015-10-13 23:30:43 +0000
committer Ingo Schwarze <schwarze@cvs.openbsd.org> 2015-10-13 23:30:43 +0000
commit 9925e5168895883876a7016b4e1fb1e97961064f
tree 37f1c77c0a8e22ce3db34a9f414ae931714075e8 /regress
parent c50fea35705e12062d913205519e3a1b4b80703a
Reject the escape sequences \[uD800] to \[uDFFF] in the parser.
These surrogates are not valid Unicode codepoints, so treat them just like any other undefined character escapes: Warn about them and do not produce output. Issue noticed while talking to stsp@, semarie@, and bentley@.
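
The check described above can be sketched in C as follows. This is a minimal illustration, not the mandoc parser's actual code: the function name parse_unicode_escape and its return convention are hypothetical, and it takes only the hex digits that follow the `u` in a `\[uXXXX]` escape name. It rejects the surrogate range U+D800 to U+DFFF the same way it rejects any other out-of-range value.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not taken from mandoc): convert the hex digits
 * of a \[uXXXX] escape to a code point.  Returns the code point, or -1
 * if the escape should be treated as invalid.
 * Note: strtol() also accepts leading blanks, a sign, and "0x"; a real
 * parser would validate the digits more strictly.
 */
static long
parse_unicode_escape(const char *hex)
{
	char *end;
	long cp;

	if (*hex == '\0')
		return -1;
	cp = strtol(hex, &end, 16);
	if (*end != '\0')
		return -1;		/* trailing non-hex characters */
	if (cp < 0 || cp > 0x10FFFF)
		return -1;		/* outside the Unicode range */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return -1;		/* surrogate: reject, as in this commit */
	return cp;
}
```

With this sketch, "D7FF" and "E000" are accepted, while "D800" and "DFFF" yield -1, which a caller could turn into a warning without producing output.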
Diffstat (limited to 'regress')
-rw-r--r-- regress/usr.bin/mandoc/char/unicode/input.out_ascii 4
-rw-r--r-- regress/usr.bin/mandoc/char/unicode/input.out_lint 2
-rw-r--r-- regress/usr.bin/mandoc/char/unicode/input.out_utf8 4
3 files changed, 6 insertions, 4 deletions
diff --git a/regress/usr.bin/mandoc/char/unicode/input.out_ascii b/regress/usr.bin/mandoc/char/unicode/input.out_ascii
index a9946d1b528..7711574c61d 100644
--- a/regress/usr.bin/mandoc/char/unicode/input.out_ascii
+++ b/regress/usr.bin/mandoc/char/unicode/input.out_ascii
@@ -37,8 +37,8 @@ DDEESSCCRRIIPPTTIIOONN
U+CFFF 0xecbfbf <?><?> end of last normal middle byte
U+D000 0xed8080 <?><?> begin of strange middle byte
U+D7FF 0xed9fbf <?><?> highest public three-byte
- U+D800 0xeda080 <?>??? lowest surrogate
- U+DFFF 0xedbfbf <?>??? highest surrogate
+ U+D800 0xeda080 ??? lowest surrogate
+ U+DFFF 0xedbfbf ??? highest surrogate
U+E000 0xee8080 <?><?> lowest private use
U+FFFF 0xefbfbf <?><?> highest three-byte
diff --git a/regress/usr.bin/mandoc/char/unicode/input.out_lint b/regress/usr.bin/mandoc/char/unicode/input.out_lint
index 77b6161cbab..8ac05edcef0 100644
--- a/regress/usr.bin/mandoc/char/unicode/input.out_lint
+++ b/regress/usr.bin/mandoc/char/unicode/input.out_lint
@@ -24,9 +24,11 @@ mandoc: input.in:34:19: ERROR: skipping bad character: 0xbf
mandoc: input.in:41:25: ERROR: skipping bad character: 0xed
mandoc: input.in:41:26: ERROR: skipping bad character: 0xa0
mandoc: input.in:41:27: ERROR: skipping bad character: 0x80
+mandoc: input.in:41:17: WARNING: invalid escape sequence: \[uD800]
mandoc: input.in:42:25: ERROR: skipping bad character: 0xed
mandoc: input.in:42:26: ERROR: skipping bad character: 0xbf
mandoc: input.in:42:27: ERROR: skipping bad character: 0xbf
+mandoc: input.in:42:17: WARNING: invalid escape sequence: \[uDFFF]
mandoc: input.in:50:19: ERROR: skipping bad character: 0xf0
mandoc: input.in:50:20: ERROR: skipping bad character: 0x80
mandoc: input.in:50:21: ERROR: skipping bad character: 0x80
diff --git a/regress/usr.bin/mandoc/char/unicode/input.out_utf8 b/regress/usr.bin/mandoc/char/unicode/input.out_utf8
index 44813b8d7ae..89aa6719533 100644
--- a/regress/usr.bin/mandoc/char/unicode/input.out_utf8
+++ b/regress/usr.bin/mandoc/char/unicode/input.out_utf8
@@ -37,8 +37,8 @@ DDEESSCCRRIIPPTTIIOONN
U+CFFF 0xecbfbf 쿿쿿 end of last normal middle byte
U+D000 0xed8080 퀀퀀 begin of strange middle byte
U+D7FF 0xed9fbf ퟿퟿ highest public three-byte
- U+D800 0xeda080 í €??? lowest surrogate
- U+DFFF 0xedbfbf í¿¿??? highest surrogate
+ U+D800 0xeda080 ??? lowest surrogate
+ U+DFFF 0xedbfbf ??? highest surrogate
U+E000 0xee8080  lowest private use
U+FFFF 0xefbfbf ï¿¿ï¿¿ highest three-byte
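
The byte column in the expected-output tables follows from the three-byte UTF-8 bit layout. The sketch below, with hypothetical function names utf8_pack3 and utf8_encode3 that are not part of mandoc, shows how U+D800 would map to 0xeda080 if the layout were applied blindly, and how an encoder can refuse surrogates instead.

```c
#include <stdint.h>

/*
 * Apply the raw three-byte layout 1110xxxx 10xxxxxx 10xxxxxx.
 * No validity checks; the caller must pass a value in 0x0800..0xFFFF.
 */
static void
utf8_pack3(uint32_t cp, unsigned char out[3])
{
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
}

/*
 * Checked variant: returns 3 on success, 0 if cp is outside the
 * three-byte range or is a surrogate (U+D800 to U+DFFF).
 */
static int
utf8_encode3(uint32_t cp, unsigned char out[3])
{
	if (cp < 0x0800 || cp > 0xFFFF)
		return 0;
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	utf8_pack3(cp, out);
	return 3;
}
```

Under these assumptions, U+D7FF packs to ed 9f bf and U+E000 to ee 80 80, matching the table rows, while the checked variant writes nothing for U+D800 and U+DFFF.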