src - OpenBSD base system

author	Jason McIntyre <jmc@cvs.openbsd.org>	2004-09-28 20:56:01 +0000
committer	Jason McIntyre <jmc@cvs.openbsd.org>	2004-09-28 20:56:01 +0000
commit	ae2f164a5236aa60cb3e19994eeefc1902464078
tree	cb65ab389e2b949b7fc730a7700471c717cfcc48
parent	9b74b5e09b835238cfc71092bf8b3681fcc81198

various fixes to make this page more readable/helpful;

also split into 2 sections (ere and bre) and add a list of the expressions supported (nicked/adapted from ed(1)); includes fixes/feedback from otto and jared;

Diffstat

-rw-r--r--	lib/libc/regex/re_format.7	732

1 files changed, 596 insertions, 136 deletions

diff --git a/lib/libc/regex/re_format.7 b/lib/libc/regex/re_format.7
index d84ea6e7615..e5f0933072d 100644
--- a/lib/libc/regex/re_format.7
+++ b/lib/libc/regex/re_format.7

@@ -1,4 +1,4 @@

-.\" $OpenBSD: re_format.7,v 1.11 2004/05/07 14:49:53 otto Exp $

+.\" $OpenBSD: re_format.7,v 1.12 2004/09/28 20:56:00 jmc Exp $

.\"

@@ -40,157 +40,257 @@

.Os

.Sh NAME

.Nm re_format

-.Nd POSIX 1003.2 regular expressions

+.Nd POSIX regular expressions

.Sh DESCRIPTION

-Regular expressions (``RE''s),

-as defined in POSIX 1003.2, come in two forms:

-modern REs (roughly those of

-.Xr egrep 1 ;

-1003.2 calls these ``extended'' REs)

-and obsolete REs (roughly those of

-.Xr ed 1 ;

-1003.2 ``basic'' REs).

-Obsolete REs mostly exist for backward compatibility in some old programs;

-they will be discussed at the end.

-1003.2 leaves some aspects of RE syntax and semantics open;

-`\(dg' marks decisions on these aspects that

-may not be fully portable to other 1003.2 implementations.

+Regular expressions (REs),

+as defined in

+.St -p1003.1-2003 ,

+come in two forms:

+basic regular expressions

+(BREs)

+and extended regular expressions

+(EREs).

+Both forms of regular expressions are supported

+by the interfaces described in

+.Xr regex 3 .

+Applications dealing with regular expressions

+may use one or the other form

+(or indeed both).

+For example,

+.Xr ed 1

+uses BREs,

+whilst

+.Xr egrep 1

+talks EREs.

+Consult the manual page for the specific application to find out which

+it uses.

+.Pp

+POSIX leaves some aspects of RE syntax and semantics open;

+.Sq **

+marks decisions on these aspects that

+may not be fully portable to other POSIX implementations.

.Pp

-A (modern) RE is one\(dg or more non-empty\(dg

+This manual page first describes regular expressions in general,

+specifically extended regular expressions,

+and then discusses differences between them and basic regular expressions.

+.Sh EXTENDED REGULAR EXPRESSIONS

+An ERE is one** or more non-empty**

.Em branches ,

-separated by `|'.

+separated by

+.Sq \*(Ba .

It matches anything that matches one of the branches.

.Pp

-A branch is one\(dg or more

+A branch is one** or more

.Em pieces ,

concatenated.

It matches a match for the first, followed by a match for the second, etc.

.Pp

A piece is an

.Em atom

-possibly followed by a single\(dg `*', `+', `?', or

+possibly followed by a single**

+.Sq * ,

+.Sq + ,

+.Sq ?\& ,

+or

.Em bound .

-An atom followed by `*' matches a sequence of 0 or more matches of the atom.

-An atom followed by `+' matches a sequence of 1 or more matches of the atom.

-An atom followed by `?' matches a sequence of 0 or 1 matches of the atom.

+An atom followed by

+.Sq *

+matches a sequence of 0 or more matches of the atom.

+An atom followed by

+.Sq +

+matches a sequence of 1 or more matches of the atom.

+An atom followed by

+.Sq ?\&

+matches a sequence of 0 or 1 matches of the atom.

.Pp

-A

-.Em bound

-is `{' followed by an unsigned decimal integer,

-possibly followed by `,'

+A bound is

+.Sq {

+followed by an unsigned decimal integer,

+possibly followed by

+.Sq ,\&

possibly followed by another unsigned decimal integer,

-always followed by `}'.

-The integers must lie between 0 and RE_DUP_MAX (255\(dg) inclusive,

+always followed by

+.Sq } .

+The integers must lie between 0 and

+.Dv RE_DUP_MAX

+(255**) inclusive,

and if there are two of them, the first may not exceed the second.

-An atom followed by a bound containing one integer \fIi\fR

+An atom followed by a bound containing one integer

+.Ar i

and no comma matches

-a sequence of exactly \fIi\fR matches of the atom.

+a sequence of exactly

+.Ar i

+matches of the atom.

An atom followed by a bound

-containing one integer \fIi\fR and a comma matches

-a sequence of \fIi\fR or more matches of the atom.

+containing one integer

+.Ar i

+and a comma matches

+a sequence of

+.Ar i

+or more matches of the atom.

An atom followed by a bound

-containing two integers \fIi\fR and \fIj\fR matches

-a sequence of \fIi\fR through \fIj\fR (inclusive) matches of the atom.

+containing two integers

+.Ar i

+and

+.Ar j

+matches a sequence of

+.Ar i

+through

+.Ar j

+(inclusive) matches of the atom.

.Pp

-An

-.Em atom

-is a regular expression enclosed in `()'

-(matching a match for the regular expression),

-an empty set of `()' (matching the null string)\(dg,

+An atom is a regular expression enclosed in

+.Sq ()

+(matching a match for the regular expression),

+an empty set of

+.Sq ()

+(matching the null string)**,

-.Em "bracket expression"

-(see below), `.'

-(matching any single character), `^' (matching the null string at the

-beginning of a line), `$' (matching the null string at the

-end of a line), a `\e' followed by one of the characters

-`^.[$()|*+?{\e'

+.Em bracket expression

+(see below),

+.Sq .\&

+(matching any single character),

+.Sq ^

+(matching the null string at the beginning of a line),

+.Sq $

+(matching the null string at the end of a line),

+.Sq \e

+followed by one of the characters

+.Sq ^.[$()|*+?{\e

(matching that character taken as an ordinary character),

-a `\e' followed by any other character\(dg

+.Sq \e

+followed by any other character**

(matching that character taken as an ordinary character,

-as if the `\e' had not been present\(dg),

+as if the

+.Sq \e

+had not been present**),

or a single character with no other significance (matching that character).

-A `{' followed by a character other than a digit is an ordinary

-character, not the beginning of a bound\(dg.

-It is illegal to end an RE with `\e'.

-.Pp

-.Em "bracket expression"

-is a list of characters enclosed in `[]'.

+.Sq {

+followed by a character other than a digit is an ordinary character,

+not the beginning of a bound**.

+It is illegal to end an RE with

+.Sq \e .

+.Pp

+A bracket expression is a list of characters enclosed in

+.Sq [] .

It normally matches any single character from the list (but see below).

-If the list begins with `^',

+If the list begins with

+.Sq ^ ,

it matches any single character

-(but see below)

.Em not

-from the rest of the list.

-If two characters in the list are separated by `\-', this is shorthand

-for the full

+from the rest of the list

+(but see below).

+If two characters in the list are separated by

+.Sq - ,

+this is shorthand for the full

.Em range

of characters between those two (inclusive) in the

-collating sequence,

-e.g., `[0-9]' in ASCII matches any decimal digit.

-It is illegal\(dg for two ranges to share an

-endpoint, e.g., `a-c-e'.

+collating sequence, e.g.\&

+.Sq [0-9]

+in ASCII matches any decimal digit.

+It is illegal** for two ranges to share an endpoint, e.g.\&

+.Sq a-c-e .

Ranges are very collating-sequence-dependent,

and portable programs should avoid relying on them.

.Pp

-To include a literal `]' in the list, make it the first character

-(following a possible `^').

-To include a literal `\-', make it the first or last character,

+To include a literal

+.Sq ]\&

+in the list, make it the first character

+(following a possible

+.Sq ^ ) .

+To include a literal

+.Sq - ,

+make it the first or last character,

or the second endpoint of a range.

-To use a literal `\-' as the first endpoint of a range,

-enclose it in `[.' and `.]' to make it a collating element (see below).

-With the exception of these and some combinations using `[' (see next

-paragraphs), all other special characters, including `\e', lose their

-special significance within a bracket expression.

+To use a literal

+.Sq -

+as the first endpoint of a range,

+enclose it in

+.Sq [.

+and

+.Sq .]

+to make it a collating element (see below).

+With the exception of these and some combinations using

+.Sq [

+(see next paragraphs),

+all other special characters, including

+.Sq \e ,

+lose their special significance within a bracket expression.

.Pp

-Within a bracket expression, a collating element (a character,

+Within a bracket expression, a collating element

+(a character,

a multi-character sequence that collates as if it were a single character,

or a collating-sequence name for either)

-enclosed in `[.' and `.]' stands for the

-sequence of characters of that collating element.

+enclosed in

+.Sq [.

+and

+.Sq .]

+stands for the sequence of characters of that collating element.

The sequence is a single element of the bracket expression's list.

A bracket expression containing a multi-character collating element

can thus match more than one character,

-e.g., if the collating sequence includes a `ch' collating element,

-then the RE `[[.ch.]]*c' matches the first five characters

-of `chchcc'.

+e.g. if the collating sequence includes a

+.Sq ch

+collating element,

+then the RE

+.Sq [[.ch.]]*c

+matches the first five characters of

+.Sq chchcc .

.Pp

-Within a bracket expression, a collating element enclosed in `[=' and

-`=]' is an equivalence class, standing for the sequences of characters

+Within a bracket expression, a collating element enclosed in

+.Sq [=

+and

+.Sq =]

+is an equivalence class, standing for the sequences of characters

of all collating elements equivalent to that one, including itself.

(If there are no other equivalent collating elements,

-the treatment is as if the enclosing delimiters were `[.' and `.]'.)

-For example, if o and \o'o^' are the members of an equivalence class,

-then `[[=o=]]', `[[=\o'o^'=]]', and `[o\o'o^']' are all synonymous.

-An equivalence class may not\(dg be an endpoint

-of a range.

+the treatment is as if the enclosing delimiters were

+.Sq [.

+and

+.Sq .] . )

+For example, if

+.Sq x

+and

+.Sq y

+are the members of an equivalence class,

+then

+.Sq [[=x=]] ,

+.Sq [[=y=]] ,

+and

+.Sq [xy]

+are all synonymous.

+An equivalence class may not** be an endpoint of a range.

.Pp

Within a bracket expression, the name of a

-.Em "character class"

+.Em character class

enclosed

-in `[:' and `:]' stands for the list of all characters belonging to that

-class.

+in

+.Sq [:

+and

+.Sq :]

+stands for the list of all characters belonging to that class.

Standard character class names are:

-.Pp

-.Bl -item -compact -offset indent

-.It

+.Bd -literal -offset indent

alnum digit punct

-.It

alpha graph space

-.It

blank lower upper

-.It

cntrl print xdigit

-.El

+.Ed

.Pp

These stand for the character classes defined in

.Xr ctype 3 .

A locale may provide others.

A character class may not be used as an endpoint of a range.

.Pp

-There are two special cases\(dg of bracket expressions:

-the bracket expressions `[[:<:]]' and `[[:>:]]' match the null string at

-the beginning and end of a word respectively.

+There are two special cases** of bracket expressions:

+the bracket expressions

+.Sq [[:<:]]

+and

+.Sq [[:>:]]

+match the null string at the beginning and end of a word, respectively.

A word is defined as a sequence of

characters starting and ending with a word character

which is neither preceded nor followed by

@@ -201,7 +301,7 @@ character (as defined by

.Xr ctype 3 )

or an underscore.

This is an extension,

-compatible with but not specified by POSIX 1003.2,

+compatible with but not specified by POSIX,

and should be used with

caution in software intended to be portable to other systems.

.Pp
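The bound and bracket-expression rules in the hunks above can be tried out with the regex(3) interfaces. The following is only a minimal sketch, not part of the commit: the helper re_group, struct span and NGROUPS are hypothetical names introduced for illustration, and the offsets assume the default C locale.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define NGROUPS 10	/* whole match plus up to nine subexpressions */

/* Byte offsets of a match; both are -1 when nothing matched. */
struct span {
	long so;	/* offset of the first matched byte */
	long eo;	/* offset just past the last matched byte */
};

/*
 * Hypothetical helper: compile `pattern` with `cflags`, match it against
 * `text`, and report where subexpression `group` matched (0 is the whole
 * RE).  Returns {-1, -1} on no match or if the RE does not compile.
 */
struct span
re_group(const char *pattern, int cflags, const char *text, size_t group)
{
	struct span s = { -1, -1 };
	regmatch_t pm[NGROUPS];
	regex_t re;

	assert(group < NGROUPS);
	if (regcomp(&re, pattern, cflags) != 0)
		return s;
	if (regexec(&re, text, NGROUPS, pm, 0) == 0) {
		s.so = (long)pm[group].rm_so;
		s.eo = (long)pm[group].rm_eo;
	}
	regfree(&re);
	return s;
}
```

Under these assumptions, re_group("a{2,3}", REG_EXTENDED, "caaaab", 0) should report offsets 1 and 4, since the bound accepts at most three of the four a's. A literal ']' placed first in the list, as in "[]a]", should match the ']' in "x]" at offsets 1 and 2.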

@@ -220,12 +320,22 @@ their lower-level component subexpressions.

Match lengths are measured in characters, not collating elements.

A null string is considered longer than no match at all.

For example,

-`bb*' matches the three middle characters of `abbbc',

-`(wee|week)(knights|nights)' matches all ten characters of `weeknights',

-when `(.*).*' is matched against `abc' the parenthesized subexpression

-matches all three characters, and

-when `(a*)*' is matched against `bc' both the whole RE and the parenthesized

-subexpression match the null string.

+.Sq bb*

+matches the three middle characters of

+.Sq abbbc ;

+.Sq (wee|week)(knights|nights)

+matches all ten characters of

+.Sq weeknights ;

+when

+.Sq (.*).*

+is matched against

+.Sq abc ,

+the parenthesized subexpression matches all three characters;

+and when

+.Sq (a*)*

+is matched against

+.Sq bc ,

+both the whole RE and the parenthesized subexpression match the null string.

.Pp

If case-independent matching is specified,

the effect is much as if all case distinctions had vanished from the
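The leftmost-longest examples in the hunk above ('bb*', '(wee|week)(knights|nights)' and '(.*).*') can be checked against regex(3). This is a hedged sketch rather than anything from the commit: re_group, struct span and NGROUPS are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define NGROUPS 10	/* whole match plus up to nine subexpressions */

/* Byte offsets of a match; both are -1 when nothing matched. */
struct span {
	long so;	/* offset of the first matched byte */
	long eo;	/* offset just past the last matched byte */
};

/*
 * Hypothetical helper: match the ERE `pattern` against `text` and report
 * where subexpression `group` matched (0 is the whole RE).
 * Returns {-1, -1} on no match or if the RE does not compile.
 */
struct span
re_group(const char *pattern, const char *text, size_t group)
{
	struct span s = { -1, -1 };
	regmatch_t pm[NGROUPS];
	regex_t re;

	assert(group < NGROUPS);
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return s;
	if (regexec(&re, text, NGROUPS, pm, 0) == 0) {
		s.so = (long)pm[group].rm_so;
		s.eo = (long)pm[group].rm_eo;
	}
	regfree(&re);
	return s;
}
```

With this helper, re_group("bb*", "abbbc", 0) should give offsets 1 and 4 (the three middle characters), and re_group("(.*).*", "abc", 1) should give 0 and 3, because the parenthesized subexpression takes priority over the trailing '.*'.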

@@ -233,64 +343,414 @@ alphabet.

When an alphabetic that exists in multiple cases appears as an

ordinary character outside a bracket expression, it is effectively

transformed into a bracket expression containing both cases,

-e.g., `x' becomes `[xX]'.

-When it appears inside a bracket expression, all case counterparts

-of it are added to the bracket expression, so that (e.g.) `[x]'

-becomes `[xX]' and `[^x]' becomes `[^xX]'.

+e.g.\&

+.Sq x

+becomes

+.Sq [xX] .

+When it appears inside a bracket expression,

+all case counterparts of it are added to the bracket expression,

+so that, for example,

+.Sq [x]

+becomes

+.Sq [xX]

+and

+.Sq [^x]

+becomes

+.Sq [^xX] .

.Pp

-No particular limit is imposed on the length of REs\(dg.

+No particular limit is imposed on the length of REs**.

Programs intended to be portable should not employ REs longer

than 256 bytes,

as an implementation can refuse to accept such REs and remain

POSIX-compliant.

.Pp

-Obsolete (``basic'') regular expressions differ in several respects.

-`|', `+', and `?' are ordinary characters and there is no equivalent

+The following is a list of extended regular expressions:

+.Bl -tag -width Ds

+.It Ar c

+Any character

+.Ar c

+not listed below matches itself.

+.It \e Ns Ar c

+Any backslash-escaped character

+.Ar c

+matches itself.

+.It \&.

+Matches any single character that is not a newline

+.Pq Sq \en .

+.It Bq Ar char-class

+Matches any single character in

+.Ar char-class .

+To include a

+.Ql \&]

+in

+.Ar char-class ,

+it must be the first character.

+A range of characters may be specified by separating the end characters

+of the range with a

+.Ql - ;

+e.g.\&

+.Ar a-z

+specifies the lower case characters.

+The following literal expressions can also be used in

+.Ar char-class

+to specify sets of characters:

+.Bd -unfilled -offset indent

+[:alnum:] [:cntrl:] [:lower:] [:space:]

+[:alpha:] [:digit:] [:print:] [:upper:]

+[:blank:] [:graph:] [:punct:] [:xdigit:]

+.Ed

+.Pp

+If

+.Ql -

+appears as the first or last character of

+.Ar char-class ,

+then it matches itself.

+All other characters in

+.Ar char-class

+match themselves.

+.Pp

+Patterns in

+.Ar char-class

+of the form

+.Eo [.

+.Ar col-elm

+.Ec .]\&

+or

+.Eo [=

+.Ar col-elm

+.Ec =]\& ,

+where

+.Ar col-elm

+is a collating element, are interpreted according to

+.Xr setlocale 3

+.Pq not currently supported .

+.It Bq ^ Ns Ar char-class

+Matches any single character, other than newline, not in

+.Ar char-class .

+.Ar char-class

+is defined as above.

+.It ^

+If

+.Sq ^

+is the first character of a regular expression, then it

+anchors the regular expression to the beginning of a line.

+Otherwise, it matches itself.

+.It $

+If

+.Sq $

+is the last character of a regular expression,

+it anchors the regular expression to the end of a line.

+Otherwise, it matches itself.

+.It [[:<:]]

+Anchors the single character regular expression or subexpression

+immediately following it to the beginning of a word.

+.It [[:>:]]

+Anchors the single character regular expression or subexpression

+immediately following it to the end of a word.

+.It Pq Ar re

+Defines a subexpression

+.Ar re .

+Any set of characters enclosed in parentheses

+matches whatever the set of characters without parentheses matches

+(that is a long-winded way of saying the constructs

+.Sq (re)

+and

+.Sq re

+match identically).

+.It *

+Matches the single character regular expression or subexpression

+immediately preceding it zero or more times.

+If

+.Sq *

+is the first character of a regular expression or subexpression,

+then it matches itself.

+The

+.Sq *

+operator sometimes yields unexpected results.

+For example, the regular expression

+.Ar b*

+matches the beginning of the string

+.Qq abbb

+(as opposed to the substring

+.Qq bbb ) ,

+since a null match is the only leftmost match.

+.It +

+Matches the single character regular expression

+or subexpression immediately preceding it

+one or more times.

+.It ?

+Matches the single character regular expression

+or subexpression immediately preceding it

+0 or 1 times.

+.Sm off

+.It Xo

+.Pf { Ar n , m No }\ \&

+.Pf { Ar n , No }\ \&

+.Pf { Ar n No }

+.Xc

+.Sm on

+Matches the single character regular expression or subexpression

+immediately preceding it at least

+.Ar n

+and at most

+.Ar m

+times.

+If

+.Ar m

+is omitted, then it matches at least

+.Ar n

+times.

+If the comma is also omitted, then it matches exactly

+.Ar n

+times.

+.It \*(Ba

+Used to separate patterns.

+For example,

+the pattern

+.Sq cat\*(Badog

+matches either

+.Sq cat

+or

+.Sq dog .

+.El

+.Sh BASIC REGULAR EXPRESSIONS

+Basic regular expressions differ in several respects:

+.Bl -bullet -offset 3n

+.It

+.Sq \*(Ba ,

+.Sq + ,

+and

+.Sq ?\&

+are ordinary characters and there is no equivalent

for their functionality.

-The delimiters for bounds are `\e{' and `\e}',

-with `{' and `}' by themselves ordinary characters.

-The parentheses for nested subexpressions are `\e(' and `\e)',

-with `(' and `)' by themselves ordinary characters.

-`^' is an ordinary character except at the beginning of the

-RE or\(dg the beginning of a parenthesized subexpression,

-`$' is an ordinary character except at the end of the

-RE or\(dg the end of a parenthesized subexpression,

-and `*' is an ordinary character if it appears at the beginning of the

+.It

+The delimiters for bounds are

+.Sq \e{

+and

+.Sq \e} ,

+with

+.Sq {

+and

+.Sq }

+by themselves ordinary characters.

+.It

+The parentheses for nested subexpressions are

+.Sq \e(

+and

+.Sq \e) ,

+with

+.Sq (

+and

+.Sq )\&

+by themselves ordinary characters.

+.It

+.Sq ^

+is an ordinary character except at the beginning of the

+RE or** the beginning of a parenthesized subexpression.

+.It

+.Sq $

+is an ordinary character except at the end of the

+RE or** the end of a parenthesized subexpression.

+.It

+.Sq *

+is an ordinary character if it appears at the beginning of the

RE or the beginning of a parenthesized subexpression

-(after a possible leading `^').

+(after a possible leading

+.Sq ^ ) .

+.It

Finally, there is one new type of atom, a

-.Em "back reference" :

-`\e' followed by a non-zero decimal digit

-.Em d

-matches the same sequence of characters

-matched by the

-.Em d Ns th

+.Em back-reference :

+.Sq \e

+followed by a non-zero decimal digit

+.Ar d

+matches the same sequence of characters matched by the

+.Ar d Ns th

parenthesized subexpression

(numbering subexpressions by the positions of their opening parentheses,

left to right),

-so that (e.g.) `\e([bc]\e)\e1' matches `bb' or `cc' but not `bc'.

+so that, for example,

+.Sq \e([bc]\e)\e1

+matches

+.Sq bb\&

+or

+.Sq cc

+but not

+.Sq bc .

+.El

+.Pp

+The following is a list of basic regular expressions:

+.Bl -tag -width Ds

+.It Ar c

+Any character

+.Ar c

+not listed below matches itself.

+.It \e Ns Ar c

+Any backslash-escaped character

+.Ar c ,

+except for

+.Sq { ,

+.Sq } ,

+.Sq \&( ,

+and

+.Sq \&) ,

+matches itself.

+.It \&.

+Matches any single character that is not a newline

+.Pq Sq \en .

+.It Bq Ar char-class

+Matches any single character in

+.Ar char-class .

+To include a

+.Ql \&]

+in

+.Ar char-class ,

+it must be the first character.

+A range of characters may be specified by separating the end characters

+of the range with a

+.Ql - ;

+e.g.\&

+.Ar a-z

+specifies the lower case characters.

+The following literal expressions can also be used in

+.Ar char-class

+to specify sets of characters:

+.Bd -unfilled -offset indent

+[:alnum:] [:cntrl:] [:lower:] [:space:]

+[:alpha:] [:digit:] [:print:] [:upper:]

+[:blank:] [:graph:] [:punct:] [:xdigit:]

+.Ed

+.Pp

+If

+.Ql -

+appears as the first or last character of

+.Ar char-class ,

+then it matches itself.

+All other characters in

+.Ar char-class

+match themselves.

+.Pp

+Patterns in

+.Ar char-class

+of the form

+.Eo [.

+.Ar col-elm

+.Ec .]\&

+or

+.Eo [=

+.Ar col-elm

+.Ec =]\& ,

+where

+.Ar col-elm

+is a collating element, are interpreted according to

+.Xr setlocale 3

+.Pq not currently supported .

+.It Bq ^ Ns Ar char-class

+Matches any single character, other than newline, not in

+.Ar char-class .

+.Ar char-class

+is defined as above.

+.It ^

+If

+.Sq ^

+is the first character of a regular expression, then it

+anchors the regular expression to the beginning of a line.

+Otherwise, it matches itself.

+.It $

+If

+.Sq $

+is the last character of a regular expression,

+it anchors the regular expression to the end of a line.

+Otherwise, it matches itself.

+.It [[:<:]]

+Anchors the single character regular expression or subexpression

+immediately following it to the beginning of a word.

+.It [[:>:]]

+Anchors the single character regular expression or subexpression

+immediately following it to the end of a word.

+.It \e( Ns Ar re Ns \e)

+Defines a subexpression

+.Ar re .

+Subexpressions may be nested.

+A subsequent back-reference of the form

+.Pf \e Ns Ar n ,

+where

+.Ar n

+is a number in the range [1,9], expands to the text matched by the

+.Ar n Ns th

+subexpression.

+For example, the regular expression

+.Ar \e(.*\e)\e1

+matches any string consisting of identical adjacent substrings.

+Subexpressions are ordered relative to their left delimiter.

+.It *

+Matches the single character regular expression or subexpression

+immediately preceding it zero or more times.

+If

+.Sq *

+is the first character of a regular expression or subexpression,

+then it matches itself.

+The

+.Sq *

+operator sometimes yields unexpected results.

+For example, the regular expression

+.Ar b*

+matches the beginning of the string

+.Qq abbb

+(as opposed to the substring

+.Qq bbb ) ,

+since a null match is the only leftmost match.

+.Sm off

+.It Xo

+.Pf \e{ Ar n , m No \e}\ \&

+.Pf \e{ Ar n , No \e}\ \&

+.Pf \e{ Ar n No \e}

+.Xc

+.Sm on

+Matches the single character regular expression or subexpression

+immediately preceding it at least

+.Ar n

+and at most

+.Ar m

+times.

+If

+.Ar m

+is omitted, then it matches at least

+.Ar n

+times.

+If the comma is also omitted, then it matches exactly

+.Ar n

+times.

+.El

.Sh SEE ALSO

+.Xr ctype 3 ,

.Xr regex 3

-.Pp

-POSIX 1003.2, section 2.8 (Regular Expression Notation).

+.Sh STANDARDS

+.St -p1003.1-2003 :

+Base Definitions, Chapter 9 (Regular Expressions).

.Sh BUGS

Having two kinds of REs is a botch.

.Pp

-The current 1003.2 spec says that `)' is an ordinary character in

-the absence of an unmatched `(';

+The current POSIX spec says that

+.Sq )\&

+is an ordinary character in the absence of an unmatched

+.Sq ( ;

this was an unintentional result of a wording error,

and change is likely.

Avoid relying on it.

.Pp

-Back references are a dreadful botch,

+Back-references are a dreadful botch,

posing major problems for efficient implementations.

They are also somewhat vaguely defined

(does

-`a\e(\e(b\e)*\e2\e)*d' match `abbbd'?).

+.Sq a\e(\e(b\e)*\e2\e)*d

+match

+.Sq abbbd ? ) .

Avoid using them.

.Pp

-1003.2's specification of case-independent matching is vague.

-The ``one case implies all cases'' definition given above

-is current consensus among implementors as to the right interpretation.

+POSIX's specification of case-independent matching is vague.

+The

+.Dq one case implies all cases

+definition given above

+is the current consensus among implementors as to the right interpretation.

.Pp

The syntax for word boundaries is incredibly ugly.
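The differences between BREs and EREs listed in the BASIC REGULAR EXPRESSIONS section can be observed by compiling the same pattern with and without REG_EXTENDED. The sketch below is an illustration only and assumes nothing beyond regex(3); the helper name re_matches is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `pattern` matches somewhere in `text`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * Pass 0 in `cflags` for a BRE or REG_EXTENDED for an ERE.
 */
int
re_matches(const char *pattern, int cflags, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, cflags) != 0)
		return -1;
	/* No offsets are needed here, so nmatch is 0 and pmatch is NULL. */
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

For example, re_matches("^a+$", 0, "a+") should return 1 because '+' is an ordinary character in a BRE, whereas the same pattern compiled with REG_EXTENDED should match "aaa" instead. The back-reference example from the text, written in C as "\\([bc]\\)\\1", should match "bb" and "cc" but not "bc".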