author Andrew Fresh <afresh1@cvs.openbsd.org> 2014-11-17 21:03:20 +0000
committer Andrew Fresh <afresh1@cvs.openbsd.org> 2014-11-17 21:03:20 +0000
commit 8e919c18ad6b483d5f3e40212752c5cced314c3b
tree 6296222ff29f1b5bdc9d7635e4dac5daceacfad6 /gnu/usr.bin/perl/pod
parent e6e16ef87b96de65fd8f668e9ace374bfe477e94
Regenerate unicore for perl-5.20.1
ok deraadt@ sthen@ espie@ miod@
Diffstat (limited to 'gnu/usr.bin/perl/pod')
-rw-r--r-- gnu/usr.bin/perl/pod/perl.pod 2
-rw-r--r-- gnu/usr.bin/perl/pod/perluniprops.pod 781
2 files changed, 482 insertions, 301 deletions
diff --git a/gnu/usr.bin/perl/pod/perl.pod b/gnu/usr.bin/perl/pod/perl.pod
index 31e5633bdc2..08d7916a899 100644
--- a/gnu/usr.bin/perl/pod/perl.pod
+++ b/gnu/usr.bin/perl/pod/perl.pod
@@ -34,7 +34,7 @@ For ease of access, the Perl manual has been split up into several sections.
# This section is parsed by Porting/pod_lib.pl for use by pod/buildtoc etc
-flag =g perluniprops perlmodlib perlapi perlintern
+flag =g perlmodlib perlapi perlintern
flag =go perltoc
flag =ro perlcn perljp perlko perltw
flag = perlvms
diff --git a/gnu/usr.bin/perl/pod/perluniprops.pod b/gnu/usr.bin/perl/pod/perluniprops.pod
index 096cb4856de..54372f02120 100644
--- a/gnu/usr.bin/perl/pod/perluniprops.pod
+++ b/gnu/usr.bin/perl/pod/perluniprops.pod
@@ -2,7 +2,7 @@
# !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
# This file is machine-generated by lib/unicore/mktables from the Unicode
-# database, Version 6.2.0. Any changes made here will be lost!
+# database, Version 6.3.0. Any changes made here will be lost!
To change this file, edit lib/unicore/mktables instead.
@@ -11,7 +11,7 @@ To change this file, edit lib/unicore/mktables instead.
=head1 NAME
-perluniprops - Index of Unicode Version 6.2.0 character properties in Perl
+perluniprops - Index of Unicode Version 6.3.0 character properties in Perl
=head1 DESCRIPTION
@@ -58,7 +58,7 @@ B<Compound forms> consist of two components, separated by an equals sign or a
colon. The first component is the property name, and the second component is
the particular value of the property to match against, for example,
C<\p{Script: Greek}> and C<\p{Script=Greek}> both mean to match characters
-whose Script property is Greek.
+whose Script property value is Greek.
B<Single forms>, like C<\p{Greek}>, are mostly Perl-defined shortcuts for
their equivalent compound forms. The table shows these equivalences. (In our
@@ -76,11 +76,13 @@ for improved legibility.
Also, white space, hyphens, and underscores are normally ignored
everywhere between the {braces}, and hence can be freely added or removed
even if the C</x> modifier hasn't been specified on the regular expression.
-But a 'B<T>' at the beginning of an entry in the table below
+But in the table below a 'B<T>' at the beginning of an entry
means that tighter (stricter) rules are used for that entry:
=over 4
+=over 4
+
=item Single form (C<\p{name}>) tighter rules:
White space, hyphens, and underscores ARE significant
@@ -108,11 +110,15 @@ adjacent to (but within) the braces and the colon or equal sign.
=back
+=back
+
Some properties are considered obsolete by Unicode, but still available.
There are several varieties of obsolescence:
=over 4
+=over 4
+
=item Stabilized
A property may be stabilized. Such a determination does not indicate
@@ -156,6 +162,8 @@ some of these extensions to be removed without warning, replaced by another
property with the same name that means something different. Use the
equivalent shown instead.
+=back
+
Matches in the Block property have shortcuts that begin with "In_". For
example, C<\p{Block=Latin1}> can be written as C<\p{In_Latin1}>. For
@@ -173,11 +181,14 @@ about this.
The table below has two columns. The left column contains the C<\p{}>
constructs to look up, possibly preceded by the flags mentioned above; and
the right column contains information about them, like a description, or
-synonyms. It shows both the single and compound forms for each property that
-has them. If the left column is a short name for a property, the right column
-will give its longer, more descriptive name; and if the left column is the
-longest name, the right column will show any equivalent shortest name, in both
-single and compound forms if applicable.
+synonyms. The table shows both the single and compound forms for each
+property that has them. If the left column is a short name for a property,
+the right column will give its longer, more descriptive name; and if the left
+column is the longest name, the right column will show any equivalent shortest
+name, in both single and compound forms if applicable.
+
+If braces are not needed to specify a property (e.g., C<\pL>), the left
+column contains both forms, with and without braces.
The right column will also caution you if a property means something different
than what might normally be expected.
@@ -185,18 +196,15 @@ than what might normally be expected.
All single forms are Perl extensions; a few compound forms are as well, and
are noted as such.
-Numbers in (parentheses) indicate the total number of code points matched by
-the property. For emphasis, those properties that match no code points at all
-are listed as well in a separate section following the table.
+Numbers in (parentheses) indicate the total number of Unicode code points
+matched by the property. For emphasis, those properties that match no code
+points at all are listed as well in a separate section following the table.
Most properties match the same code points regardless of whether C<"/i">
case-insensitive matching is specified or not. But a few properties are
-affected. These are shown with the notation
-
- (/i= other_property)
-
+affected. These are shown with the notation S<C<(/i= I<other_property>)>>
in the second column. Under case-insensitive matching they match the
-same code pode points as the property "other_property".
+same code pode points as the property I<other_property>.
There is no description given for most non-Perl defined properties (See
L<http://www.unicode.org/reports/tr44/> for that).
@@ -233,20 +241,34 @@ B<Legend summary:>
=over 4
-=item Z<>B<*> is a wild-card
+=item *
+
+B<*> is a wild-card
+
+=item *
+
+B<(\d+)> in the info column gives the number of Unicode code points matched
+by this property.
+
+=item *
+
+B<D> means this is deprecated.
+
+=item *
-=item B<(\d+)> in the info column gives the number of code points matched by
-this property.
+B<O> means this is obsolete.
-=item B<D> means this is deprecated.
+=item *
-=item B<O> means this is obsolete.
+B<S> means this is stabilized.
-=item B<S> means this is stabilized.
+=item *
-=item B<T> means tighter (stricter) name matching applies.
+B<T> means tighter (stricter) name matching applies.
-=item B<X> means use of this form is discouraged, and may not be
+=item *
+
+B<X> means use of this form is discouraged, and may not be
stable.
=back
@@ -268,10 +290,13 @@ stable.
T \p{Age: 6.0} \p{Age=V6_0} (2088)
T \p{Age: 6.1} \p{Age=V6_1} (732)
T \p{Age: 6.2} \p{Age=V6_2} (1)
- \p{Age: NA} \p{Age=Unassigned} (864_348)
+ T \p{Age: 6.3} \p{Age=V6_3} (5)
+ \p{Age: NA} \p{Age=Unassigned} (864_343 plus all
+ above-Unicode code points)
\p{Age: Unassigned} Code point's usage has not been assigned
in any Unicode release thus far. (Short:
- \p{Age=NA}) (864_348)
+ \p{Age=NA}) (864_343 plus all above-
+ Unicode code points)
\p{Age: V1_1} Code point's usage introduced in version
1.1 (33_979)
\p{Age: V2_0} Code point's usage was introduced in
@@ -313,6 +338,9 @@ stable.
\p{Age: V6_2} Code point's usage was introduced in
version 6.2; See also Property
'Present_In' (1)
+ \p{Age: V6_3} Code point's usage was introduced in
+ version 6.3; See also Property
+ 'Present_In' (5)
\p{AHex} \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
(22)
\p{AHex: *} \p{ASCII_Hex_Digit: *}
@@ -320,12 +348,15 @@ stable.
Alchemical_Symbols}) (128)
X \p{Alchemical_Symbols} \p{Block=Alchemical_Symbols} (Short:
\p{InAlchemical}) (128)
- \p{All} \p{Any} (1_114_112)
+ \p{All} All code points, including those above
+ Unicode. Same as qr/./s (1_114_112 plus
+ all above-Unicode code points)
\p{Alnum} Alphabetic and (decimal) Numeric (102_619)
\p{Alpha} \p{Alphabetic=Y} (102_159)
\p{Alpha: *} \p{Alphabetic: *}
\p{Alphabetic} \p{Alpha} (= \p{Alphabetic=Y}) (102_159)
- \p{Alphabetic: N*} (Short: \p{Alpha=N}, \P{Alpha}) (1_011_953)
+ \p{Alphabetic: N*} (Short: \p{Alpha=N}, \P{Alpha}) (1_011_953
+ plus all above-Unicode code points)
\p{Alphabetic: Y*} (Short: \p{Alpha=Y}, \p{Alpha}) (102_159)
X \p{Alphabetic_PF} \p{Alphabetic_Presentation_Forms} (=
\p{Block=Alphabetic_Presentation_Forms})
@@ -341,11 +372,12 @@ stable.
\p{InAncientGreekMusic}) (80)
X \p{Ancient_Greek_Numbers} \p{Block=Ancient_Greek_Numbers} (80)
X \p{Ancient_Symbols} \p{Block=Ancient_Symbols} (64)
- \p{Any} [\x{0000}-\x{10FFFF}] (1_114_112)
+ \p{Any} All Unicode code points: [\x{0000}-
+ \x{10FFFF}] (1_114_112)
\p{Arab} \p{Arabic} (= \p{Script=Arabic}) (NOT
- \p{Block=Arabic}) (1235)
+ \p{Block=Arabic}) (1236)
\p{Arabic} \p{Script=Arabic} (Short: \p{Arab}; NOT
- \p{Block=Arabic}) (1235)
+ \p{Block=Arabic}) (1236)
X \p{Arabic_Ext_A} \p{Arabic_Extended_A} (= \p{Block=
Arabic_Extended_A}) (96)
X \p{Arabic_Extended_A} \p{Block=Arabic_Extended_A} (Short:
@@ -384,9 +416,10 @@ stable.
\p{ASCII} \p{Block=Basic_Latin} [[:ASCII:]] (128)
\p{ASCII_Hex_Digit} \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
(22)
- \p{ASCII_Hex_Digit: N*} (Short: \p{AHex=N}, \P{AHex}) (1_114_090)
+ \p{ASCII_Hex_Digit: N*} (Short: \p{AHex=N}, \P{AHex}) (1_114_090
+ plus all above-Unicode code points)
\p{ASCII_Hex_Digit: Y*} (Short: \p{AHex=Y}, \p{AHex}) (22)
- \p{Assigned} All assigned code points (249_698)
+ \p{Assigned} All assigned code points (249_703)
\p{Avestan} \p{Script=Avestan} (Short: \p{Avst}; NOT
\p{Block=Avestan}) (61)
\p{Avst} \p{Avestan} (= \p{Script=Avestan}) (NOT
@@ -413,57 +446,81 @@ stable.
\p{Block=Bengali}) (92)
\p{Bengali} \p{Script=Bengali} (Short: \p{Beng}; NOT
\p{Block=Bengali}) (92)
- \p{Bidi_C} \p{Bidi_Control} (= \p{Bidi_Control=Y}) (7)
+ \p{Bidi_C} \p{Bidi_Control} (= \p{Bidi_Control=Y})
+ (12)
\p{Bidi_C: *} \p{Bidi_Control: *}
\p{Bidi_Class: AL} \p{Bidi_Class=Arabic_Letter} (1438)
\p{Bidi_Class: AN} \p{Bidi_Class=Arabic_Number} (49)
\p{Bidi_Class: Arabic_Letter} (Short: \p{Bc=AL}) (1438)
\p{Bidi_Class: Arabic_Number} (Short: \p{Bc=AN}) (49)
\p{Bidi_Class: B} \p{Bidi_Class=Paragraph_Separator} (7)
- \p{Bidi_Class: BN} \p{Bidi_Class=Boundary_Neutral} (4015)
- \p{Bidi_Class: Boundary_Neutral} (Short: \p{Bc=BN}) (4015)
+ \p{Bidi_Class: BN} \p{Bidi_Class=Boundary_Neutral} (4012)
+ \p{Bidi_Class: Boundary_Neutral} (Short: \p{Bc=BN}) (4012)
\p{Bidi_Class: Common_Separator} (Short: \p{Bc=CS}) (15)
\p{Bidi_Class: CS} \p{Bidi_Class=Common_Separator} (15)
\p{Bidi_Class: EN} \p{Bidi_Class=European_Number} (131)
\p{Bidi_Class: ES} \p{Bidi_Class=European_Separator} (12)
- \p{Bidi_Class: ET} \p{Bidi_Class=European_Terminator} (66)
+ \p{Bidi_Class: ET} \p{Bidi_Class=European_Terminator} (87)
\p{Bidi_Class: European_Number} (Short: \p{Bc=EN}) (131)
\p{Bidi_Class: European_Separator} (Short: \p{Bc=ES}) (12)
- \p{Bidi_Class: European_Terminator} (Short: \p{Bc=ET}) (66)
- \p{Bidi_Class: L} \p{Bidi_Class=Left_To_Right} (1_098_530)
- \p{Bidi_Class: Left_To_Right} (Short: \p{Bc=L}) (1_098_530)
+ \p{Bidi_Class: European_Terminator} (Short: \p{Bc=ET}) (87)
+ \p{Bidi_Class: First_Strong_Isolate} (Short: \p{Bc=FSI}) (1)
+ \p{Bidi_Class: FSI} \p{Bidi_Class=First_Strong_Isolate} (1)
+ \p{Bidi_Class: L} \p{Bidi_Class=Left_To_Right} (1_098_508
+ plus all above-Unicode code points)
+ \p{Bidi_Class: Left_To_Right} (Short: \p{Bc=L}) (1_098_508 plus
+ all above-Unicode code points)
\p{Bidi_Class: Left_To_Right_Embedding} (Short: \p{Bc=LRE}) (1)
+ \p{Bidi_Class: Left_To_Right_Isolate} (Short: \p{Bc=LRI}) (1)
\p{Bidi_Class: Left_To_Right_Override} (Short: \p{Bc=LRO}) (1)
\p{Bidi_Class: LRE} \p{Bidi_Class=Left_To_Right_Embedding} (1)
+ \p{Bidi_Class: LRI} \p{Bidi_Class=Left_To_Right_Isolate} (1)
\p{Bidi_Class: LRO} \p{Bidi_Class=Left_To_Right_Override} (1)
- \p{Bidi_Class: Nonspacing_Mark} (Short: \p{Bc=NSM}) (1290)
- \p{Bidi_Class: NSM} \p{Bidi_Class=Nonspacing_Mark} (1290)
+ \p{Bidi_Class: Nonspacing_Mark} (Short: \p{Bc=NSM}) (1291)
+ \p{Bidi_Class: NSM} \p{Bidi_Class=Nonspacing_Mark} (1291)
\p{Bidi_Class: ON} \p{Bidi_Class=Other_Neutral} (4447)
\p{Bidi_Class: Other_Neutral} (Short: \p{Bc=ON}) (4447)
\p{Bidi_Class: Paragraph_Separator} (Short: \p{Bc=B}) (7)
\p{Bidi_Class: PDF} \p{Bidi_Class=Pop_Directional_Format} (1)
+ \p{Bidi_Class: PDI} \p{Bidi_Class=Pop_Directional_Isolate} (1)
\p{Bidi_Class: Pop_Directional_Format} (Short: \p{Bc=PDF}) (1)
+ \p{Bidi_Class: Pop_Directional_Isolate} (Short: \p{Bc=PDI}) (1)
\p{Bidi_Class: R} \p{Bidi_Class=Right_To_Left} (4086)
\p{Bidi_Class: Right_To_Left} (Short: \p{Bc=R}) (4086)
\p{Bidi_Class: Right_To_Left_Embedding} (Short: \p{Bc=RLE}) (1)
+ \p{Bidi_Class: Right_To_Left_Isolate} (Short: \p{Bc=RLI}) (1)
\p{Bidi_Class: Right_To_Left_Override} (Short: \p{Bc=RLO}) (1)
\p{Bidi_Class: RLE} \p{Bidi_Class=Right_To_Left_Embedding} (1)
+ \p{Bidi_Class: RLI} \p{Bidi_Class=Right_To_Left_Isolate} (1)
\p{Bidi_Class: RLO} \p{Bidi_Class=Right_To_Left_Override} (1)
\p{Bidi_Class: S} \p{Bidi_Class=Segment_Separator} (3)
\p{Bidi_Class: Segment_Separator} (Short: \p{Bc=S}) (3)
- \p{Bidi_Class: White_Space} (Short: \p{Bc=WS}) (18)
- \p{Bidi_Class: WS} \p{Bidi_Class=White_Space} (18)
- \p{Bidi_Control} \p{Bidi_Control=Y} (Short: \p{BidiC}) (7)
- \p{Bidi_Control: N*} (Short: \p{BidiC=N}, \P{BidiC}) (1_114_105)
- \p{Bidi_Control: Y*} (Short: \p{BidiC=Y}, \p{BidiC}) (7)
+ \p{Bidi_Class: White_Space} (Short: \p{Bc=WS}) (17)
+ \p{Bidi_Class: WS} \p{Bidi_Class=White_Space} (17)
+ \p{Bidi_Control} \p{Bidi_Control=Y} (Short: \p{BidiC}) (12)
+ \p{Bidi_Control: N*} (Short: \p{BidiC=N}, \P{BidiC}) (1_114_100
+ plus all above-Unicode code points)
+ \p{Bidi_Control: Y*} (Short: \p{BidiC=Y}, \p{BidiC}) (12)
\p{Bidi_M} \p{Bidi_Mirrored} (= \p{Bidi_Mirrored=Y})
(545)
\p{Bidi_M: *} \p{Bidi_Mirrored: *}
\p{Bidi_Mirrored} \p{Bidi_Mirrored=Y} (Short: \p{BidiM})
(545)
- \p{Bidi_Mirrored: N*} (Short: \p{BidiM=N}, \P{BidiM}) (1_113_567)
+ \p{Bidi_Mirrored: N*} (Short: \p{BidiM=N}, \P{BidiM}) (1_113_567
+ plus all above-Unicode code points)
\p{Bidi_Mirrored: Y*} (Short: \p{BidiM=Y}, \p{BidiM}) (545)
- \p{Blank} \h, Horizontal white space (19)
+ \p{Bidi_Paired_Bracket_Type: C} \p{Bidi_Paired_Bracket_Type=Close}
+ (60)
+ \p{Bidi_Paired_Bracket_Type: Close} (Short: \p{Bpt=C}) (60)
+ \p{Bidi_Paired_Bracket_Type: N} \p{Bidi_Paired_Bracket_Type=None}
+ (1_113_992 plus all above-Unicode code
+ points)
+ \p{Bidi_Paired_Bracket_Type: None} (Short: \p{Bpt=N}) (1_113_992
+ plus all above-Unicode code points)
+ \p{Bidi_Paired_Bracket_Type: O} \p{Bidi_Paired_Bracket_Type=Open}
+ (60)
+ \p{Bidi_Paired_Bracket_Type: Open} (Short: \p{Bpt=O}) (60)
+ \p{Blank} \h, Horizontal white space (18)
\p{Blk: *} \p{Block: *}
\p{Block: Aegean_Numbers} (Single: \p{InAegeanNumbers}) (64)
\p{Block: Alchemical} \p{Block=Alchemical_Symbols} (128)
@@ -905,13 +962,15 @@ stable.
\p{Block: Myanmar_Ext_A} \p{Block=Myanmar_Extended_A} (32)
\p{Block: Myanmar_Extended_A} (Short: \p{Blk=MyanmarExtA},
\p{InMyanmarExtA}) (32)
- \p{Block: NB} \p{Block=No_Block} (860_672)
+ \p{Block: NB} \p{Block=No_Block} (860_672 plus all
+ above-Unicode code points)
\p{Block: New_Tai_Lue} (Single: \p{InNewTaiLue}; NOT
\p{New_Tai_Lue} NOR \p{Is_New_Tai_Lue})
(96)
\p{Block: NKo} (Single: \p{InNKo}; NOT \p{Nko} NOR
\p{Is_NKo}) (64)
- \p{Block: No_Block} (Short: \p{Blk=NB}, \p{InNB}) (860_672)
+ \p{Block: No_Block} (Short: \p{Blk=NB}, \p{InNB}) (860_672
+ plus all above-Unicode code points)
\p{Block: Number_Forms} (Single: \p{InNumberForms}) (64)
\p{Block: OCR} \p{Block=Optical_Character_Recognition}
(32)
@@ -1102,6 +1161,7 @@ stable.
X \p{Bopomofo_Extended} \p{Block=Bopomofo_Extended} (Short:
\p{InBopomofoExt}) (32)
X \p{Box_Drawing} \p{Block=Box_Drawing} (128)
+ \p{Bpt: *} \p{Bidi_Paired_Bracket_Type: *}
\p{Brah} \p{Brahmi} (= \p{Script=Brahmi}) (NOT
\p{Block=Brahmi}) (108)
\p{Brahmi} \p{Script=Brahmi} (Short: \p{Brah}; NOT
@@ -1122,8 +1182,9 @@ stable.
Byzantine_Musical_Symbols}) (256)
X \p{Byzantine_Musical_Symbols} \p{Block=Byzantine_Musical_Symbols}
(Short: \p{InByzantineMusic}) (256)
- \p{C} \p{Other} (= \p{General_Category=Other})
- (1_004_134)
+ \p{C} \pC \p{Other} (= \p{General_Category=Other})
+ (1_004_135 plus all above-Unicode code
+ points)
\p{Cakm} \p{Chakma} (= \p{Script=Chakma}) (NOT
\p{Block=Chakma}) (67)
\p{Canadian_Aboriginal} \p{Script=Canadian_Aboriginal} (Short:
@@ -1133,7 +1194,8 @@ stable.
Unified_Canadian_Aboriginal_Syllabics})
(640)
T \p{Canonical_Combining_Class: 0} \p{Canonical_Combining_Class=
- Not_Reordered} (1_113_459)
+ Not_Reordered} (1_113_459 plus all
+ above-Unicode code points)
T \p{Canonical_Combining_Class: 1} \p{Canonical_Combining_Class=
Overlay} (26)
T \p{Canonical_Combining_Class: 7} \p{Canonical_Combining_Class=
@@ -1336,9 +1398,11 @@ stable.
\p{Canonical_Combining_Class: NK} \p{Canonical_Combining_Class=
Nukta} (13)
\p{Canonical_Combining_Class: Not_Reordered} (Short: \p{Ccc=NR})
- (1_113_459)
+ (1_113_459 plus all above-Unicode code
+ points)
\p{Canonical_Combining_Class: NR} \p{Canonical_Combining_Class=
- Not_Reordered} (1_113_459)
+ Not_Reordered} (1_113_459 plus all
+ above-Unicode code points)
\p{Canonical_Combining_Class: Nukta} (Short: \p{Ccc=NK}) (13)
\p{Canonical_Combining_Class: OV} \p{Canonical_Combining_Class=
Overlay} (26)
@@ -1355,11 +1419,13 @@ stable.
\p{Block=Carian}) (49)
\p{Carian} \p{Script=Carian} (Short: \p{Cari}; NOT
\p{Block=Carian}) (49)
- \p{Case_Ignorable} \p{Case_Ignorable=Y} (Short: \p{CI}) (1799)
- \p{Case_Ignorable: N*} (Short: \p{CI=N}, \P{CI}) (1_112_313)
- \p{Case_Ignorable: Y*} (Short: \p{CI=Y}, \p{CI}) (1799)
+ \p{Case_Ignorable} \p{Case_Ignorable=Y} (Short: \p{CI}) (1806)
+ \p{Case_Ignorable: N*} (Short: \p{CI=N}, \P{CI}) (1_112_306 plus
+ all above-Unicode code points)
+ \p{Case_Ignorable: Y*} (Short: \p{CI=Y}, \p{CI}) (1806)
\p{Cased} \p{Cased=Y} (3448)
- \p{Cased: N*} (Single: \P{Cased}) (1_110_664)
+ \p{Cased: N*} (Single: \P{Cased}) (1_110_664 plus all
+ above-Unicode code points)
\p{Cased: Y*} (Single: \p{Cased}) (3448)
\p{Cased_Letter} \p{General_Category=Cased_Letter} (Short:
\p{LC}) (3223)
@@ -1371,49 +1437,55 @@ stable.
\p{Composition_Exclusion=Y}) (81)
\p{CE: *} \p{Composition_Exclusion: *}
\p{Cf} \p{Format} (= \p{General_Category=Format})
- (139)
+ (145)
\p{Chakma} \p{Script=Chakma} (Short: \p{Cakm}; NOT
\p{Block=Chakma}) (67)
\p{Cham} \p{Script=Cham} (NOT \p{Block=Cham}) (83)
\p{Changes_When_Casefolded} \p{Changes_When_Casefolded=Y} (Short:
\p{CWCF}) (1107)
\p{Changes_When_Casefolded: N*} (Short: \p{CWCF=N}, \P{CWCF})
- (1_113_005)
+ (1_113_005 plus all above-Unicode code
+ points)
\p{Changes_When_Casefolded: Y*} (Short: \p{CWCF=Y}, \p{CWCF})
(1107)
\p{Changes_When_Casemapped} \p{Changes_When_Casemapped=Y} (Short:
\p{CWCM}) (2138)
\p{Changes_When_Casemapped: N*} (Short: \p{CWCM=N}, \P{CWCM})
- (1_111_974)
+ (1_111_974 plus all above-Unicode code
+ points)
\p{Changes_When_Casemapped: Y*} (Short: \p{CWCM=Y}, \p{CWCM})
(2138)
\p{Changes_When_Lowercased} \p{Changes_When_Lowercased=Y} (Short:
\p{CWL}) (1043)
\p{Changes_When_Lowercased: N*} (Short: \p{CWL=N}, \P{CWL})
- (1_113_069)
+ (1_113_069 plus all above-Unicode code
+ points)
\p{Changes_When_Lowercased: Y*} (Short: \p{CWL=Y}, \p{CWL}) (1043)
\p{Changes_When_NFKC_Casefolded} \p{Changes_When_NFKC_Casefolded=
- Y} (Short: \p{CWKCF}) (9944)
+ Y} (Short: \p{CWKCF}) (9946)
\p{Changes_When_NFKC_Casefolded: N*} (Short: \p{CWKCF=N},
- \P{CWKCF}) (1_104_168)
+ \P{CWKCF}) (1_104_166 plus all above-
+ Unicode code points)
\p{Changes_When_NFKC_Casefolded: Y*} (Short: \p{CWKCF=Y},
- \p{CWKCF}) (9944)
+ \p{CWKCF}) (9946)
\p{Changes_When_Titlecased} \p{Changes_When_Titlecased=Y} (Short:
\p{CWT}) (1099)
\p{Changes_When_Titlecased: N*} (Short: \p{CWT=N}, \P{CWT})
- (1_113_013)
+ (1_113_013 plus all above-Unicode code
+ points)
\p{Changes_When_Titlecased: Y*} (Short: \p{CWT=Y}, \p{CWT}) (1099)
\p{Changes_When_Uppercased} \p{Changes_When_Uppercased=Y} (Short:
\p{CWU}) (1126)
\p{Changes_When_Uppercased: N*} (Short: \p{CWU=N}, \P{CWU})
- (1_112_986)
+ (1_112_986 plus all above-Unicode code
+ points)
\p{Changes_When_Uppercased: Y*} (Short: \p{CWU=Y}, \p{CWU}) (1126)
\p{Cher} \p{Cherokee} (= \p{Script=Cherokee}) (NOT
\p{Block=Cherokee}) (85)
\p{Cherokee} \p{Script=Cherokee} (Short: \p{Cher}; NOT
\p{Block=Cherokee}) (85)
\p{CI} \p{Case_Ignorable} (= \p{Case_Ignorable=
- Y}) (1799)
+ Y}) (1806)
\p{CI: *} \p{Case_Ignorable: *}
X \p{CJK} \p{CJK_Unified_Ideographs} (= \p{Block=
CJK_Unified_Ideographs}) (20_992)
@@ -1482,9 +1554,10 @@ stable.
CJK_Unified_Ideographs_Extension_D}
(Short: \p{InCJKExtD}) (224)
\p{Close_Punctuation} \p{General_Category=Close_Punctuation}
- (Short: \p{Pe}) (71)
+ (Short: \p{Pe}) (73)
\p{Cn} \p{Unassigned} (= \p{General_Category=
- Unassigned}) (864_414)
+ Unassigned}) (864_409 plus all above-
+ Unicode code points)
\p{Cntrl} \p{General_Category=Control} Control
characters (Short: \p{Cc}) (65)
\p{Co} \p{Private_Use} (= \p{General_Category=
@@ -1509,7 +1582,7 @@ stable.
Symbols} (= \p{Block=
Combining_Diacritical_Marks_For_-
Symbols}) (48)
- \p{Common} \p{Script=Common} (Short: \p{Zyyy}) (6413)
+ \p{Common} \p{Script=Common} (Short: \p{Zyyy}) (6418)
X \p{Common_Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms}
(Short: \p{InIndicNumberForms}) (16)
\p{Comp_Ex} \p{Full_Composition_Exclusion} (=
@@ -1519,7 +1592,8 @@ stable.
Hangul_Compatibility_Jamo}) (96)
\p{Composition_Exclusion} \p{Composition_Exclusion=Y} (Short:
\p{CE}) (81)
- \p{Composition_Exclusion: N*} (Short: \p{CE=N}, \P{CE}) (1_114_031)
+ \p{Composition_Exclusion: N*} (Short: \p{CE=N}, \P{CE}) (1_114_031
+ plus all above-Unicode code points)
\p{Composition_Exclusion: Y*} (Short: \p{CE=Y}, \p{CE}) (81)
\p{Connector_Punctuation} \p{General_Category=
Connector_Punctuation} (Short: \p{Pc})
@@ -1557,7 +1631,7 @@ stable.
\p{CWCM: *} \p{Changes_When_Casemapped: *}
\p{CWKCF} \p{Changes_When_NFKC_Casefolded} (=
\p{Changes_When_NFKC_Casefolded=Y})
- (9944)
+ (9946)
\p{CWKCF: *} \p{Changes_When_NFKC_Casefolded: *}
\p{CWL} \p{Changes_When_Lowercased} (=
\p{Changes_When_Lowercased=Y}) (1043)
@@ -1589,7 +1663,8 @@ stable.
\p{Cyrl} \p{Cyrillic} (= \p{Script=Cyrillic}) (NOT
\p{Block=Cyrillic}) (417)
\p{Dash} \p{Dash=Y} (27)
- \p{Dash: N*} (Single: \P{Dash}) (1_114_085)
+ \p{Dash: N*} (Single: \P{Dash}) (1_114_085 plus all
+ above-Unicode code points)
\p{Dash: Y*} (Single: \p{Dash}) (27)
\p{Dash_Punctuation} \p{General_Category=Dash_Punctuation}
(Short: \p{Pd}) (23)
@@ -1622,7 +1697,8 @@ stable.
\p{Decomposition_Type: Non_Canonical} Union of all non-canonical
decompositions (Short: \p{Dt=NonCanon})
(Perl extension) (3655)
- \p{Decomposition_Type: None} (Short: \p{Dt=None}) (1_097_232)
+ \p{Decomposition_Type: None} (Short: \p{Dt=None}) (1_097_232 plus
+ all above-Unicode code points)
\p{Decomposition_Type: Small} (Short: \p{Dt=Sml}) (26)
\p{Decomposition_Type: Sml} \p{Decomposition_Type=Small} (26)
\p{Decomposition_Type: Sqr} \p{Decomposition_Type=Square} (284)
@@ -1634,15 +1710,17 @@ stable.
\p{Decomposition_Type: Vertical} (Short: \p{Dt=Vert}) (35)
\p{Decomposition_Type: Wide} (Short: \p{Dt=Wide}) (104)
\p{Default_Ignorable_Code_Point} \p{Default_Ignorable_Code_Point=
- Y} (Short: \p{DI}) (4167)
+ Y} (Short: \p{DI}) (4169)
\p{Default_Ignorable_Code_Point: N*} (Short: \p{DI=N}, \P{DI})
- (1_109_945)
+ (1_109_943 plus all above-Unicode code
+ points)
\p{Default_Ignorable_Code_Point: Y*} (Short: \p{DI=Y}, \p{DI})
- (4167)
+ (4169)
\p{Dep} \p{Deprecated} (= \p{Deprecated=Y}) (111)
\p{Dep: *} \p{Deprecated: *}
\p{Deprecated} \p{Deprecated=Y} (Short: \p{Dep}) (111)
- \p{Deprecated: N*} (Short: \p{Dep=N}, \P{Dep}) (1_114_001)
+ \p{Deprecated: N*} (Short: \p{Dep=N}, \P{Dep}) (1_114_001
+ plus all above-Unicode code points)
\p{Deprecated: Y*} (Short: \p{Dep=Y}, \p{Dep}) (111)
\p{Deseret} \p{Script=Deseret} (Short: \p{Dsrt}) (80)
\p{Deva} \p{Devanagari} (= \p{Script=Devanagari})
@@ -1655,12 +1733,13 @@ stable.
\p{InDevanagariExt}) (32)
\p{DI} \p{Default_Ignorable_Code_Point} (=
\p{Default_Ignorable_Code_Point=Y})
- (4167)
+ (4169)
\p{DI: *} \p{Default_Ignorable_Code_Point: *}
\p{Dia} \p{Diacritic} (= \p{Diacritic=Y}) (693)
\p{Dia: *} \p{Diacritic: *}
\p{Diacritic} \p{Diacritic=Y} (Short: \p{Dia}) (693)
- \p{Diacritic: N*} (Short: \p{Dia=N}, \P{Dia}) (1_113_419)
+ \p{Diacritic: N*} (Short: \p{Dia=N}, \P{Dia}) (1_113_419
+ plus all above-Unicode code points)
\p{Diacritic: Y*} (Short: \p{Dia=Y}, \p{Dia}) (693)
X \p{Diacriticals} \p{Combining_Diacritical_Marks} (=
\p{Block=Combining_Diacritical_Marks})
@@ -1691,10 +1770,12 @@ stable.
\p{East_Asian_Width: Fullwidth} (Short: \p{Ea=F}) (104)
\p{East_Asian_Width: H} \p{East_Asian_Width=Halfwidth} (123)
\p{East_Asian_Width: Halfwidth} (Short: \p{Ea=H}) (123)
- \p{East_Asian_Width: N} \p{East_Asian_Width=Neutral} (801_894)
+ \p{East_Asian_Width: N} \p{East_Asian_Width=Neutral} (801_894 plus
+ all above-Unicode code points)
\p{East_Asian_Width: Na} \p{East_Asian_Width=Narrow} (111)
\p{East_Asian_Width: Narrow} (Short: \p{Ea=Na}) (111)
- \p{East_Asian_Width: Neutral} (Short: \p{Ea=N}) (801_894)
+ \p{East_Asian_Width: Neutral} (Short: \p{Ea=N}) (801_894 plus all
+ above-Unicode code points)
\p{East_Asian_Width: W} \p{East_Asian_Width=Wide} (173_134)
\p{East_Asian_Width: Wide} (Short: \p{Ea=W}) (173_134)
\p{Egyp} \p{Egyptian_Hieroglyphs} (= \p{Script=
@@ -1747,28 +1828,32 @@ stable.
\p{Ext} \p{Extender} (= \p{Extender=Y}) (31)
\p{Ext: *} \p{Extender: *}
\p{Extender} \p{Extender=Y} (Short: \p{Ext}) (31)
- \p{Extender: N*} (Short: \p{Ext=N}, \P{Ext}) (1_114_081)
+ \p{Extender: N*} (Short: \p{Ext=N}, \P{Ext}) (1_114_081
+ plus all above-Unicode code points)
\p{Extender: Y*} (Short: \p{Ext=Y}, \p{Ext}) (31)
\p{Final_Punctuation} \p{General_Category=Final_Punctuation}
(Short: \p{Pf}) (10)
\p{Format} \p{General_Category=Format} (Short:
- \p{Cf}) (139)
+ \p{Cf}) (145)
\p{Full_Composition_Exclusion} \p{Full_Composition_Exclusion=Y}
(Short: \p{CompEx}) (1120)
\p{Full_Composition_Exclusion: N*} (Short: \p{CompEx=N},
- \P{CompEx}) (1_112_992)
+ \P{CompEx}) (1_112_992 plus all above-
+ Unicode code points)
\p{Full_Composition_Exclusion: Y*} (Short: \p{CompEx=Y},
\p{CompEx}) (1120)
\p{Gc: *} \p{General_Category: *}
\p{GCB: *} \p{Grapheme_Cluster_Break: *}
- \p{General_Category: C} \p{General_Category=Other} (1_004_134)
+ \p{General_Category: C} \p{General_Category=Other} (1_004_135 plus
+ all above-Unicode code points)
\p{General_Category: Cased_Letter} [\p{Ll}\p{Lu}\p{Lt}] (Short:
\p{Gc=LC}, \p{LC}) (3223)
\p{General_Category: Cc} \p{General_Category=Control} (65)
- \p{General_Category: Cf} \p{General_Category=Format} (139)
+ \p{General_Category: Cf} \p{General_Category=Format} (145)
\p{General_Category: Close_Punctuation} (Short: \p{Gc=Pe}, \p{Pe})
- (71)
- \p{General_Category: Cn} \p{General_Category=Unassigned} (864_414)
+ (73)
+ \p{General_Category: Cn} \p{General_Category=Unassigned} (864_409
+ plus all above-Unicode code points)
\p{General_Category: Cntrl} \p{General_Category=Control} (65)
\p{General_Category: Co} \p{General_Category=Private_Use} (137_468)
\p{General_Category: Combining_Mark} \p{General_Category=Mark}
@@ -1789,7 +1874,7 @@ stable.
(12)
\p{General_Category: Final_Punctuation} (Short: \p{Gc=Pf}, \p{Pf})
(10)
- \p{General_Category: Format} (Short: \p{Gc=Cf}, \p{Cf}) (139)
+ \p{General_Category: Format} (Short: \p{Gc=Cf}, \p{Cf}) (145)
\p{General_Category: Initial_Punctuation} (Short: \p{Gc=Pi},
\p{Pi}) (12)
\p{General_Category: L} \p{General_Category=Letter} (101_013)
@@ -1816,11 +1901,11 @@ stable.
(1441)
\p{General_Category: M} \p{General_Category=Mark} (1645)
\p{General_Category: Mark} (Short: \p{Gc=M}, \p{M}) (1645)
- \p{General_Category: Math_Symbol} (Short: \p{Gc=Sm}, \p{Sm}) (952)
- \p{General_Category: Mc} \p{General_Category=Spacing_Mark} (353)
+ \p{General_Category: Math_Symbol} (Short: \p{Gc=Sm}, \p{Sm}) (948)
+ \p{General_Category: Mc} \p{General_Category=Spacing_Mark} (352)
\p{General_Category: Me} \p{General_Category=Enclosing_Mark} (12)
\p{General_Category: Mn} \p{General_Category=Nonspacing_Mark}
- (1280)
+ (1281)
\p{General_Category: Modifier_Letter} (Short: \p{Gc=Lm}, \p{Lm})
(237)
\p{General_Category: Modifier_Symbol} (Short: \p{Gc=Sk}, \p{Sk})
@@ -1830,11 +1915,12 @@ stable.
\p{General_Category: Nl} \p{General_Category=Letter_Number} (224)
\p{General_Category: No} \p{General_Category=Other_Number} (464)
\p{General_Category: Nonspacing_Mark} (Short: \p{Gc=Mn}, \p{Mn})
- (1280)
+ (1281)
\p{General_Category: Number} (Short: \p{Gc=N}, \p{N}) (1148)
\p{General_Category: Open_Punctuation} (Short: \p{Gc=Ps}, \p{Ps})
- (72)
- \p{General_Category: Other} (Short: \p{Gc=C}, \p{C}) (1_004_134)
+ (74)
+ \p{General_Category: Other} (Short: \p{Gc=C}, \p{C}) (1_004_135
+ plus all above-Unicode code points)
\p{General_Category: Other_Letter} (Short: \p{Gc=Lo}, \p{Lo})
(97_553)
\p{General_Category: Other_Number} (Short: \p{Gc=No}, \p{No}) (464)
@@ -1842,14 +1928,14 @@ stable.
(434)
\p{General_Category: Other_Symbol} (Short: \p{Gc=So}, \p{So})
(4404)
- \p{General_Category: P} \p{General_Category=Punctuation} (632)
+ \p{General_Category: P} \p{General_Category=Punctuation} (636)
\p{General_Category: Paragraph_Separator} (Short: \p{Gc=Zp},
\p{Zp}) (1)
\p{General_Category: Pc} \p{General_Category=
Connector_Punctuation} (10)
\p{General_Category: Pd} \p{General_Category=Dash_Punctuation} (23)
\p{General_Category: Pe} \p{General_Category=Close_Punctuation}
- (71)
+ (73)
\p{General_Category: Pf} \p{General_Category=Final_Punctuation}
(10)
\p{General_Category: Pi} \p{General_Category=Initial_Punctuation}
@@ -1858,31 +1944,32 @@ stable.
(434)
\p{General_Category: Private_Use} (Short: \p{Gc=Co}, \p{Co})
(137_468)
- \p{General_Category: Ps} \p{General_Category=Open_Punctuation} (72)
- \p{General_Category: Punct} \p{General_Category=Punctuation} (632)
- \p{General_Category: Punctuation} (Short: \p{Gc=P}, \p{P}) (632)
- \p{General_Category: S} \p{General_Category=Symbol} (5520)
+ \p{General_Category: Ps} \p{General_Category=Open_Punctuation} (74)
+ \p{General_Category: Punct} \p{General_Category=Punctuation} (636)
+ \p{General_Category: Punctuation} (Short: \p{Gc=P}, \p{P}) (636)
+ \p{General_Category: S} \p{General_Category=Symbol} (5516)
\p{General_Category: Sc} \p{General_Category=Currency_Symbol} (49)
- \p{General_Category: Separator} (Short: \p{Gc=Z}, \p{Z}) (20)
+ \p{General_Category: Separator} (Short: \p{Gc=Z}, \p{Z}) (19)
\p{General_Category: Sk} \p{General_Category=Modifier_Symbol} (115)
- \p{General_Category: Sm} \p{General_Category=Math_Symbol} (952)
+ \p{General_Category: Sm} \p{General_Category=Math_Symbol} (948)
\p{General_Category: So} \p{General_Category=Other_Symbol} (4404)
\p{General_Category: Space_Separator} (Short: \p{Gc=Zs}, \p{Zs})
- (18)
- \p{General_Category: Spacing_Mark} (Short: \p{Gc=Mc}, \p{Mc}) (353)
+ (17)
+ \p{General_Category: Spacing_Mark} (Short: \p{Gc=Mc}, \p{Mc}) (352)
\p{General_Category: Surrogate} (Short: \p{Gc=Cs}, \p{Cs}) (2048)
- \p{General_Category: Symbol} (Short: \p{Gc=S}, \p{S}) (5520)
+ \p{General_Category: Symbol} (Short: \p{Gc=S}, \p{S}) (5516)
\p{General_Category: Titlecase_Letter} (Short: \p{Gc=Lt}, \p{Lt};
/i= General_Category=Cased_Letter) (31)
\p{General_Category: Unassigned} (Short: \p{Gc=Cn}, \p{Cn})
- (864_414)
+ (864_409 plus all above-Unicode code
+ points)
\p{General_Category: Uppercase_Letter} (Short: \p{Gc=Lu}, \p{Lu};
/i= General_Category=Cased_Letter) (1441)
- \p{General_Category: Z} \p{General_Category=Separator} (20)
+ \p{General_Category: Z} \p{General_Category=Separator} (19)
\p{General_Category: Zl} \p{General_Category=Line_Separator} (1)
\p{General_Category: Zp} \p{General_Category=Paragraph_Separator}
(1)
- \p{General_Category: Zs} \p{General_Category=Space_Separator} (18)
+ \p{General_Category: Zs} \p{General_Category=Space_Separator} (17)
X \p{General_Punctuation} \p{Block=General_Punctuation} (Short:
\p{InPunctuation}) (112)
X \p{Geometric_Shapes} \p{Block=Geometric_Shapes} (96)
@@ -1903,29 +1990,31 @@ stable.
\p{Gothic} \p{Script=Gothic} (Short: \p{Goth}; NOT
\p{Block=Gothic}) (27)
\p{Gr_Base} \p{Grapheme_Base} (= \p{Grapheme_Base=Y})
- (108_661)
+ (108_659)
\p{Gr_Base: *} \p{Grapheme_Base: *}
\p{Gr_Ext} \p{Grapheme_Extend} (= \p{Grapheme_Extend=
- Y}) (1317)
+ Y}) (1318)
\p{Gr_Ext: *} \p{Grapheme_Extend: *}
- \p{Graph} Characters that are graphical (247_565)
+ \p{Graph} Characters that are graphical (247_571)
\p{Grapheme_Base} \p{Grapheme_Base=Y} (Short: \p{GrBase})
- (108_661)
+ (108_659)
\p{Grapheme_Base: N*} (Short: \p{GrBase=N}, \P{GrBase})
- (1_005_451)
- \p{Grapheme_Base: Y*} (Short: \p{GrBase=Y}, \p{GrBase}) (108_661)
+ (1_005_453 plus all above-Unicode code
+ points)
+ \p{Grapheme_Base: Y*} (Short: \p{GrBase=Y}, \p{GrBase}) (108_659)
\p{Grapheme_Cluster_Break: CN} \p{Grapheme_Cluster_Break=Control}
- (6023)
- \p{Grapheme_Cluster_Break: Control} (Short: \p{GCB=CN}) (6023)
+ (6025)
+ \p{Grapheme_Cluster_Break: Control} (Short: \p{GCB=CN}) (6025)
\p{Grapheme_Cluster_Break: CR} (Short: \p{GCB=CR}) (1)
\p{Grapheme_Cluster_Break: EX} \p{Grapheme_Cluster_Break=Extend}
- (1317)
- \p{Grapheme_Cluster_Break: Extend} (Short: \p{GCB=EX}) (1317)
+ (1318)
+ \p{Grapheme_Cluster_Break: Extend} (Short: \p{GCB=EX}) (1318)
\p{Grapheme_Cluster_Break: L} (Short: \p{GCB=L}) (125)
\p{Grapheme_Cluster_Break: LF} (Short: \p{GCB=LF}) (1)
\p{Grapheme_Cluster_Break: LV} (Short: \p{GCB=LV}) (399)
\p{Grapheme_Cluster_Break: LVT} (Short: \p{GCB=LVT}) (10_773)
- \p{Grapheme_Cluster_Break: Other} (Short: \p{GCB=XX}) (1_094_924)
+ \p{Grapheme_Cluster_Break: Other} (Short: \p{GCB=XX}) (1_094_922
+ plus all above-Unicode code points)
\p{Grapheme_Cluster_Break: PP} \p{Grapheme_Cluster_Break=Prepend}
(0)
\p{Grapheme_Cluster_Break: Prepend} (Short: \p{GCB=PP}) (0)
@@ -1934,16 +2023,18 @@ stable.
\p{Grapheme_Cluster_Break: RI} \p{Grapheme_Cluster_Break=
Regional_Indicator} (26)
\p{Grapheme_Cluster_Break: SM} \p{Grapheme_Cluster_Break=
- SpacingMark} (291)
- \p{Grapheme_Cluster_Break: SpacingMark} (Short: \p{GCB=SM}) (291)
+ SpacingMark} (290)
+ \p{Grapheme_Cluster_Break: SpacingMark} (Short: \p{GCB=SM}) (290)
\p{Grapheme_Cluster_Break: T} (Short: \p{GCB=T}) (137)
\p{Grapheme_Cluster_Break: V} (Short: \p{GCB=V}) (95)
\p{Grapheme_Cluster_Break: XX} \p{Grapheme_Cluster_Break=Other}
- (1_094_924)
+ (1_094_922 plus all above-Unicode code
+ points)
\p{Grapheme_Extend} \p{Grapheme_Extend=Y} (Short: \p{GrExt})
- (1317)
- \p{Grapheme_Extend: N*} (Short: \p{GrExt=N}, \P{GrExt}) (1_112_795)
- \p{Grapheme_Extend: Y*} (Short: \p{GrExt=Y}, \p{GrExt}) (1317)
+ (1318)
+ \p{Grapheme_Extend: N*} (Short: \p{GrExt=N}, \P{GrExt}) (1_112_794
+ plus all above-Unicode code points)
+ \p{Grapheme_Extend: Y*} (Short: \p{GrExt=Y}, \p{GrExt}) (1318)
\p{Greek} \p{Script=Greek} (Short: \p{Grek}; NOT
\p{Greek_And_Coptic}) (511)
X \p{Greek_And_Coptic} \p{Block=Greek_And_Coptic} (Short:
@@ -1994,9 +2085,11 @@ stable.
\p{Hangul_Syllable_Type: LVT_Syllable} (Short: \p{Hst=LVT})
(10_773)
\p{Hangul_Syllable_Type: NA} \p{Hangul_Syllable_Type=
- Not_Applicable} (1_102_583)
+ Not_Applicable} (1_102_583 plus all
+ above-Unicode code points)
\p{Hangul_Syllable_Type: Not_Applicable} (Short: \p{Hst=NA})
- (1_102_583)
+ (1_102_583 plus all above-Unicode code
+ points)
\p{Hangul_Syllable_Type: T} \p{Hangul_Syllable_Type=Trailing_Jamo}
(137)
\p{Hangul_Syllable_Type: Trailing_Jamo} (Short: \p{Hst=T}) (137)
@@ -2017,7 +2110,8 @@ stable.
\p{Hex} \p{XDigit} (= \p{Hex_Digit=Y}) (44)
\p{Hex: *} \p{Hex_Digit: *}
\p{Hex_Digit} \p{XDigit} (= \p{Hex_Digit=Y}) (44)
- \p{Hex_Digit: N*} (Short: \p{Hex=N}, \P{Hex}) (1_114_068)
+ \p{Hex_Digit: N*} (Short: \p{Hex=N}, \P{Hex}) (1_114_068
+ plus all above-Unicode code points)
\p{Hex_Digit: Y*} (Short: \p{Hex=Y}, \p{Hex}) (44)
X \p{High_Private_Use_Surrogates} \p{Block=
High_Private_Use_Surrogates} (Short:
@@ -2030,22 +2124,25 @@ stable.
\p{Block=Hiragana}) (91)
\p{Hiragana} \p{Script=Hiragana} (Short: \p{Hira}; NOT
\p{Block=Hiragana}) (91)
- \p{HorizSpace} \p{Blank} (19)
+ \p{HorizSpace} \p{Blank} (18)
\p{Hst: *} \p{Hangul_Syllable_Type: *}
D \p{Hyphen} \p{Hyphen=Y} (11)
D \p{Hyphen: N*} Supplanted by Line_Break property values;
see www.unicode.org/reports/tr14
- (Single: \P{Hyphen}) (1_114_101)
+ (Single: \P{Hyphen}) (1_114_101 plus all
+ above-Unicode code points)
D \p{Hyphen: Y*} Supplanted by Line_Break property values;
see www.unicode.org/reports/tr14
(Single: \p{Hyphen}) (11)
\p{ID_Continue} \p{ID_Continue=Y} (Short: \p{IDC}; NOT
\p{Ideographic_Description_Characters})
(103_355)
- \p{ID_Continue: N*} (Short: \p{IDC=N}, \P{IDC}) (1_010_757)
+ \p{ID_Continue: N*} (Short: \p{IDC=N}, \P{IDC}) (1_010_757
+ plus all above-Unicode code points)
\p{ID_Continue: Y*} (Short: \p{IDC=Y}, \p{IDC}) (103_355)
\p{ID_Start} \p{ID_Start=Y} (Short: \p{IDS}) (101_240)
- \p{ID_Start: N*} (Short: \p{IDS=N}, \P{IDS}) (1_012_872)
+ \p{ID_Start: N*} (Short: \p{IDS=N}, \P{IDS}) (1_012_872
+ plus all above-Unicode code points)
\p{ID_Start: Y*} (Short: \p{IDS=Y}, \p{IDS}) (101_240)
\p{IDC} \p{ID_Continue} (= \p{ID_Continue=Y}) (NOT
\p{Ideographic_Description_Characters})
@@ -2056,7 +2153,8 @@ stable.
\p{Ideo: *} \p{Ideographic: *}
\p{Ideographic} \p{Ideographic=Y} (Short: \p{Ideo})
(75_633)
- \p{Ideographic: N*} (Short: \p{Ideo=N}, \P{Ideo}) (1_038_479)
+ \p{Ideographic: N*} (Short: \p{Ideo=N}, \P{Ideo}) (1_038_479
+ plus all above-Unicode code points)
\p{Ideographic: Y*} (Short: \p{Ideo=Y}, \p{Ideo}) (75_633)
X \p{Ideographic_Description_Characters} \p{Block=
Ideographic_Description_Characters}
@@ -2066,12 +2164,14 @@ stable.
\p{IDS_Binary_Operator} \p{IDS_Binary_Operator=Y} (Short:
\p{IDSB}) (10)
\p{IDS_Binary_Operator: N*} (Short: \p{IDSB=N}, \P{IDSB})
- (1_114_102)
+ (1_114_102 plus all above-Unicode code
+ points)
\p{IDS_Binary_Operator: Y*} (Short: \p{IDSB=Y}, \p{IDSB}) (10)
\p{IDS_Trinary_Operator} \p{IDS_Trinary_Operator=Y} (Short:
\p{IDST}) (2)
\p{IDS_Trinary_Operator: N*} (Short: \p{IDST=N}, \P{IDST})
- (1_114_110)
+ (1_114_110 plus all above-Unicode code
+ points)
\p{IDS_Trinary_Operator: Y*} (Short: \p{IDST=Y}, \p{IDST}) (2)
\p{IDSB} \p{IDS_Binary_Operator} (=
\p{IDS_Binary_Operator=Y}) (10)
@@ -2114,14 +2214,15 @@ stable.
X \p{Jamo_Ext_B} \p{Hangul_Jamo_Extended_B} (= \p{Block=
Hangul_Jamo_Extended_B}) (80)
\p{Java} \p{Javanese} (= \p{Script=Javanese}) (NOT
- \p{Block=Javanese}) (91)
+ \p{Block=Javanese}) (90)
\p{Javanese} \p{Script=Javanese} (Short: \p{Java}; NOT
- \p{Block=Javanese}) (91)
+ \p{Block=Javanese}) (90)
\p{Jg: *} \p{Joining_Group: *}
\p{Join_C} \p{Join_Control} (= \p{Join_Control=Y}) (2)
\p{Join_C: *} \p{Join_Control: *}
\p{Join_Control} \p{Join_Control=Y} (Short: \p{JoinC}) (2)
- \p{Join_Control: N*} (Short: \p{JoinC=N}, \P{JoinC}) (1_114_110)
+ \p{Join_Control: N*} (Short: \p{JoinC=N}, \P{JoinC}) (1_114_110
+ plus all above-Unicode code points)
\p{Join_Control: Y*} (Short: \p{JoinC=Y}, \p{JoinC}) (2)
\p{Joining_Group: Ain} (Short: \p{Jg=Ain}) (7)
\p{Joining_Group: Alaph} (Short: \p{Jg=Alaph}) (1)
@@ -2155,7 +2256,8 @@ stable.
\p{Joining_Group: Meem} (Short: \p{Jg=Meem}) (4)
\p{Joining_Group: Mim} (Short: \p{Jg=Mim}) (1)
\p{Joining_Group: No_Joining_Group} (Short: \p{Jg=NoJoiningGroup})
- (1_113_870)
+ (1_113_870 plus all above-Unicode code
+ points)
\p{Joining_Group: Noon} (Short: \p{Jg=Noon}) (8)
\p{Joining_Group: Nun} (Short: \p{Jg=Nun}) (1)
\p{Joining_Group: Nya} (Short: \p{Jg=Nya}) (1)
@@ -2186,18 +2288,20 @@ stable.
\p{Joining_Group: Yudh_He} (Short: \p{Jg=YudhHe}) (1)
\p{Joining_Group: Zain} (Short: \p{Jg=Zain}) (1)
\p{Joining_Group: Zhain} (Short: \p{Jg=Zhain}) (1)
- \p{Joining_Type: C} \p{Joining_Type=Join_Causing} (3)
- \p{Joining_Type: D} \p{Joining_Type=Dual_Joining} (215)
- \p{Joining_Type: Dual_Joining} (Short: \p{Jt=D}) (215)
- \p{Joining_Type: Join_Causing} (Short: \p{Jt=C}) (3)
- \p{Joining_Type: L} \p{Joining_Type=Left_Joining} (0)
- \p{Joining_Type: Left_Joining} (Short: \p{Jt=L}) (0)
- \p{Joining_Type: Non_Joining} (Short: \p{Jt=U}) (1_112_389)
+ \p{Joining_Type: C} \p{Joining_Type=Join_Causing} (4)
+ \p{Joining_Type: D} \p{Joining_Type=Dual_Joining} (389)
+ \p{Joining_Type: Dual_Joining} (Short: \p{Jt=D}) (389)
+ \p{Joining_Type: Join_Causing} (Short: \p{Jt=C}) (4)
+ \p{Joining_Type: L} \p{Joining_Type=Left_Joining} (1)
+ \p{Joining_Type: Left_Joining} (Short: \p{Jt=L}) (1)
+ \p{Joining_Type: Non_Joining} (Short: \p{Jt=U}) (1_112_211 plus
+ all above-Unicode code points)
\p{Joining_Type: R} \p{Joining_Type=Right_Joining} (82)
\p{Joining_Type: Right_Joining} (Short: \p{Jt=R}) (82)
- \p{Joining_Type: T} \p{Joining_Type=Transparent} (1423)
- \p{Joining_Type: Transparent} (Short: \p{Jt=T}) (1423)
- \p{Joining_Type: U} \p{Joining_Type=Non_Joining} (1_112_389)
+ \p{Joining_Type: T} \p{Joining_Type=Transparent} (1425)
+ \p{Joining_Type: Transparent} (Short: \p{Jt=T}) (1425)
+ \p{Joining_Type: U} \p{Joining_Type=Non_Joining} (1_112_211
+ plus all above-Unicode code points)
\p{Jt: *} \p{Joining_Type: *}
\p{Kaithi} \p{Script=Kaithi} (Short: \p{Kthi}; NOT
\p{Block=Kaithi}) (66)
@@ -2237,7 +2341,7 @@ stable.
\p{Block=Kannada}) (86)
\p{Kthi} \p{Kaithi} (= \p{Script=Kaithi}) (NOT
\p{Block=Kaithi}) (66)
- \p{L} \p{Letter} (= \p{General_Category=Letter})
+ \p{L} \pL \p{Letter} (= \p{General_Category=Letter})
(101_013)
X \p{L&} \p{Cased_Letter} (= \p{General_Category=
Cased_Letter}) (3223)
@@ -2301,10 +2405,10 @@ stable.
\p{Line_Break: Alphabetic} (Short: \p{Lb=AL}) (15_355)
\p{Line_Break: Ambiguous} (Short: \p{Lb=AI}) (687)
\p{Line_Break: B2} \p{Line_Break=Break_Both} (3)
- \p{Line_Break: BA} \p{Line_Break=Break_After} (151)
+ \p{Line_Break: BA} \p{Line_Break=Break_After} (152)
\p{Line_Break: BB} \p{Line_Break=Break_Before} (19)
\p{Line_Break: BK} \p{Line_Break=Mandatory_Break} (4)
- \p{Line_Break: Break_After} (Short: \p{Lb=BA}) (151)
+ \p{Line_Break: Break_After} (Short: \p{Lb=BA}) (152)
\p{Line_Break: Break_Before} (Short: \p{Lb=BB}) (19)
\p{Line_Break: Break_Both} (Short: \p{Lb=B2}) (3)
\p{Line_Break: Break_Symbols} (Short: \p{Lb=SY}) (1)
@@ -2315,8 +2419,8 @@ stable.
\p{Line_Break: CL} \p{Line_Break=Close_Punctuation} (87)
\p{Line_Break: Close_Parenthesis} (Short: \p{Lb=CP}) (2)
\p{Line_Break: Close_Punctuation} (Short: \p{Lb=CL}) (87)
- \p{Line_Break: CM} \p{Line_Break=Combining_Mark} (1628)
- \p{Line_Break: Combining_Mark} (Short: \p{Lb=CM}) (1628)
+ \p{Line_Break: CM} \p{Line_Break=Combining_Mark} (1634)
+ \p{Line_Break: Combining_Mark} (Short: \p{Lb=CM}) (1634)
\p{Line_Break: Complex_Context} (Short: \p{Lb=SA}) (665)
\p{Line_Break: Conditional_Japanese_Starter} (Short: \p{Lb=CJ})
(51)
@@ -2333,8 +2437,8 @@ stable.
\p{Line_Break: HL} \p{Line_Break=Hebrew_Letter} (74)
\p{Line_Break: HY} \p{Line_Break=Hyphen} (1)
\p{Line_Break: Hyphen} (Short: \p{Lb=HY}) (1)
- \p{Line_Break: ID} \p{Line_Break=Ideographic} (162_700)
- \p{Line_Break: Ideographic} (Short: \p{Lb=ID}) (162_700)
+ \p{Line_Break: ID} \p{Line_Break=Ideographic} (162_698)
+ \p{Line_Break: Ideographic} (Short: \p{Lb=ID}) (162_698)
\p{Line_Break: IN} \p{Line_Break=Inseparable} (4)
\p{Line_Break: Infix_Numeric} (Short: \p{Lb=IS}) (13)
\p{Line_Break: Inseparable} (Short: \p{Lb=IN}) (4)
@@ -2356,8 +2460,8 @@ stable.
\p{Line_Break: Open_Punctuation} (Short: \p{Lb=OP}) (81)
\p{Line_Break: PO} \p{Line_Break=Postfix_Numeric} (28)
\p{Line_Break: Postfix_Numeric} (Short: \p{Lb=PO}) (28)
- \p{Line_Break: PR} \p{Line_Break=Prefix_Numeric} (46)
- \p{Line_Break: Prefix_Numeric} (Short: \p{Lb=PR}) (46)
+ \p{Line_Break: PR} \p{Line_Break=Prefix_Numeric} (67)
+ \p{Line_Break: Prefix_Numeric} (Short: \p{Lb=PR}) (67)
\p{Line_Break: QU} \p{Line_Break=Quotation} (34)
\p{Line_Break: Quotation} (Short: \p{Lb=QU}) (34)
\p{Line_Break: Regional_Indicator} (Short: \p{Lb=RI}) (26)
@@ -2371,10 +2475,12 @@ stable.
and therefore shouldn't be the basis for
line breaking (Short: \p{Lb=SG}) (2048)
\p{Line_Break: SY} \p{Line_Break=Break_Symbols} (1)
- \p{Line_Break: Unknown} (Short: \p{Lb=XX}) (918_337)
+ \p{Line_Break: Unknown} (Short: \p{Lb=XX}) (918_311 plus all
+ above-Unicode code points)
\p{Line_Break: WJ} \p{Line_Break=Word_Joiner} (2)
\p{Line_Break: Word_Joiner} (Short: \p{Lb=WJ}) (2)
- \p{Line_Break: XX} \p{Line_Break=Unknown} (918_337)
+ \p{Line_Break: XX} \p{Line_Break=Unknown} (918_311 plus all
+ above-Unicode code points)
\p{Line_Break: ZW} \p{Line_Break=ZWSpace} (1)
\p{Line_Break: ZWSpace} (Short: \p{Lb=ZW}) (1)
\p{Line_Separator} \p{General_Category=Line_Separator}
@@ -2398,7 +2504,8 @@ stable.
\p{Logical_Order_Exception} \p{Logical_Order_Exception=Y} (Short:
\p{LOE}) (15)
\p{Logical_Order_Exception: N*} (Short: \p{LOE=N}, \P{LOE})
- (1_114_097)
+ (1_114_097 plus all above-Unicode code
+ points)
\p{Logical_Order_Exception: Y*} (Short: \p{LOE=Y}, \p{LOE}) (15)
X \p{Low_Surrogates} \p{Block=Low_Surrogates} (1024)
\p{Lower} \p{Lowercase=Y} (/i= Cased=Yes) (1934)
@@ -2406,7 +2513,8 @@ stable.
\p{Lowercase} \p{Lower} (= \p{Lowercase=Y}) (/i= Cased=
Yes) (1934)
\p{Lowercase: N*} (Short: \p{Lower=N}, \P{Lower}; /i= Cased=
- No) (1_112_178)
+ No) (1_112_178 plus all above-Unicode
+ code points)
\p{Lowercase: Y*} (Short: \p{Lower=Y}, \p{Lower}; /i= Cased=
Yes) (1934)
\p{Lowercase_Letter} \p{General_Category=Lowercase_Letter}
@@ -2427,7 +2535,7 @@ stable.
\p{Block=Lydian}) (27)
\p{Lydian} \p{Script=Lydian} (Short: \p{Lydi}; NOT
\p{Block=Lydian}) (27)
- \p{M} \p{Mark} (= \p{General_Category=Mark})
+ \p{M} \pM \p{Mark} (= \p{General_Category=Mark})
(1645)
X \p{Mahjong} \p{Mahjong_Tiles} (= \p{Block=
Mahjong_Tiles}) (48)
@@ -2442,7 +2550,8 @@ stable.
\p{Mark} \p{General_Category=Mark} (Short: \p{M})
(1645)
\p{Math} \p{Math=Y} (2310)
- \p{Math: N*} (Single: \P{Math}) (1_111_802)
+ \p{Math: N*} (Single: \P{Math}) (1_111_802 plus all
+ above-Unicode code points)
\p{Math: Y*} (Single: \p{Math}) (2310)
X \p{Math_Alphanum} \p{Mathematical_Alphanumeric_Symbols} (=
\p{Block=
@@ -2451,14 +2560,14 @@ stable.
X \p{Math_Operators} \p{Mathematical_Operators} (= \p{Block=
Mathematical_Operators}) (256)
\p{Math_Symbol} \p{General_Category=Math_Symbol} (Short:
- \p{Sm}) (952)
+ \p{Sm}) (948)
X \p{Mathematical_Alphanumeric_Symbols} \p{Block=
Mathematical_Alphanumeric_Symbols}
(Short: \p{InMathAlphanum}) (1024)
X \p{Mathematical_Operators} \p{Block=Mathematical_Operators}
(Short: \p{InMathOperators}) (256)
\p{Mc} \p{Spacing_Mark} (= \p{General_Category=
- Spacing_Mark}) (353)
+ Spacing_Mark}) (352)
\p{Me} \p{Enclosing_Mark} (= \p{General_Category=
Enclosing_Mark}) (12)
\p{Meetei_Mayek} \p{Script=Meetei_Mayek} (Short: \p{Mtei};
@@ -2517,7 +2626,7 @@ stable.
(NOT \p{Block=Malayalam}) (98)
\p{Mn} \p{Nonspacing_Mark} (=
\p{General_Category=Nonspacing_Mark})
- (1280)
+ (1281)
\p{Modifier_Letter} \p{General_Category=Modifier_Letter}
(Short: \p{Lm}) (237)
X \p{Modifier_Letters} \p{Spacing_Modifier_Letters} (= \p{Block=
@@ -2544,10 +2653,11 @@ stable.
\p{InMyanmarExtA}) (32)
\p{Mymr} \p{Myanmar} (= \p{Script=Myanmar}) (NOT
\p{Block=Myanmar}) (188)
- \p{N} \p{Number} (= \p{General_Category=Number})
+ \p{N} \pN \p{Number} (= \p{General_Category=Number})
(1148)
X \p{NB} \p{No_Block} (= \p{Block=No_Block})
- (860_672)
+ (860_672 plus all above-Unicode code
+ points)
\p{NChar} \p{Noncharacter_Code_Point} (=
\p{Noncharacter_Code_Point=Y}) (66)
\p{NChar: *} \p{Noncharacter_Code_Point: *}
@@ -2566,10 +2676,12 @@ stable.
(1120)
\p{NFC_Quick_Check: Y} \p{NFC_Quick_Check=Yes} (NOT
\p{NFC_Quick_Check} NOR \p{NFC_QC})
- (1_112_888)
+ (1_112_888 plus all above-Unicode code
+ points)
\p{NFC_Quick_Check: Yes} (Short: \p{NFCQC=Y}; NOT
\p{NFC_Quick_Check} NOR \p{NFC_QC})
- (1_112_888)
+ (1_112_888 plus all above-Unicode code
+ points)
\p{NFD_QC: *} \p{NFD_Quick_Check: *}
\p{NFD_Quick_Check: N} \p{NFD_Quick_Check=No} (NOT
\P{NFD_Quick_Check} NOR \P{NFD_QC})
@@ -2579,10 +2691,12 @@ stable.
(13_225)
\p{NFD_Quick_Check: Y} \p{NFD_Quick_Check=Yes} (NOT
\p{NFD_Quick_Check} NOR \p{NFD_QC})
- (1_100_887)
+ (1_100_887 plus all above-Unicode code
+ points)
\p{NFD_Quick_Check: Yes} (Short: \p{NFDQC=Y}; NOT
\p{NFD_Quick_Check} NOR \p{NFD_QC})
- (1_100_887)
+ (1_100_887 plus all above-Unicode code
+ points)
\p{NFKC_QC: *} \p{NFKC_Quick_Check: *}
\p{NFKC_Quick_Check: M} \p{NFKC_Quick_Check=Maybe} (104)
\p{NFKC_Quick_Check: Maybe} (Short: \p{NFKCQC=M}) (104)
@@ -2594,10 +2708,12 @@ stable.
(4787)
\p{NFKC_Quick_Check: Y} \p{NFKC_Quick_Check=Yes} (NOT
\p{NFKC_Quick_Check} NOR \p{NFKC_QC})
- (1_109_221)
+ (1_109_221 plus all above-Unicode code
+ points)
\p{NFKC_Quick_Check: Yes} (Short: \p{NFKCQC=Y}; NOT
\p{NFKC_Quick_Check} NOR \p{NFKC_QC})
- (1_109_221)
+ (1_109_221 plus all above-Unicode code
+ points)
\p{NFKD_QC: *} \p{NFKD_Quick_Check: *}
\p{NFKD_Quick_Check: N} \p{NFKD_Quick_Check=No} (NOT
\P{NFKD_Quick_Check} NOR \P{NFKD_QC})
@@ -2607,10 +2723,12 @@ stable.
(16_880)
\p{NFKD_Quick_Check: Y} \p{NFKD_Quick_Check=Yes} (NOT
\p{NFKD_Quick_Check} NOR \p{NFKD_QC})
- (1_097_232)
+ (1_097_232 plus all above-Unicode code
+ points)
\p{NFKD_Quick_Check: Yes} (Short: \p{NFKDQC=Y}; NOT
\p{NFKD_Quick_Check} NOR \p{NFKD_QC})
- (1_097_232)
+ (1_097_232 plus all above-Unicode code
+ points)
\p{Nko} \p{Script=Nko} (NOT \p{NKo}) (59)
\p{Nkoo} \p{Nko} (= \p{Script=Nko}) (NOT \p{NKo})
(59)
@@ -2619,15 +2737,17 @@ stable.
\p{No} \p{Other_Number} (= \p{General_Category=
Other_Number}) (464)
X \p{No_Block} \p{Block=No_Block} (Short: \p{InNB})
- (860_672)
+ (860_672 plus all above-Unicode code
+ points)
\p{Noncharacter_Code_Point} \p{Noncharacter_Code_Point=Y} (Short:
\p{NChar}) (66)
\p{Noncharacter_Code_Point: N*} (Short: \p{NChar=N}, \P{NChar})
- (1_114_046)
+ (1_114_046 plus all above-Unicode code
+ points)
\p{Noncharacter_Code_Point: Y*} (Short: \p{NChar=Y}, \p{NChar})
(66)
\p{Nonspacing_Mark} \p{General_Category=Nonspacing_Mark}
- (Short: \p{Mn}) (1280)
+ (Short: \p{Mn}) (1281)
\p{Nt: *} \p{Numeric_Type: *}
\p{Number} \p{General_Category=Number} (Short: \p{N})
(1148)
@@ -2636,10 +2756,10 @@ stable.
\p{Numeric_Type: Decimal} (Short: \p{Nt=De}) (460)
\p{Numeric_Type: Di} \p{Numeric_Type=Digit} (128)
\p{Numeric_Type: Digit} (Short: \p{Nt=Di}) (128)
- \p{Numeric_Type: None} (Short: \p{Nt=None}) (1_112_883)
+ \p{Numeric_Type: None} (Short: \p{Nt=None}) (1_112_883 plus all
+ above-Unicode code points)
\p{Numeric_Type: Nu} \p{Numeric_Type=Numeric} (641)
\p{Numeric_Type: Numeric} (Short: \p{Nt=Nu}) (641)
- T \p{Numeric_Value: -1} (Short: \p{Nv=-1}) (2)
T \p{Numeric_Value: -1/2} (Short: \p{Nv=-1/2}) (1)
T \p{Numeric_Value: 0} (Short: \p{Nv=0}) (60)
T \p{Numeric_Value: 1/16} (Short: \p{Nv=1/16}) (3)
@@ -2664,9 +2784,9 @@ stable.
T \p{Numeric_Value: 7/8} (Short: \p{Nv=7/8}) (1)
T \p{Numeric_Value: 1} (Short: \p{Nv=1}) (97)
T \p{Numeric_Value: 3/2} (Short: \p{Nv=3/2}) (1)
- T \p{Numeric_Value: 2} (Short: \p{Nv=2}) (100)
+ T \p{Numeric_Value: 2} (Short: \p{Nv=2}) (101)
T \p{Numeric_Value: 5/2} (Short: \p{Nv=5/2}) (1)
- T \p{Numeric_Value: 3} (Short: \p{Nv=3}) (102)
+ T \p{Numeric_Value: 3} (Short: \p{Nv=3}) (103)
T \p{Numeric_Value: 7/2} (Short: \p{Nv=7/2}) (1)
T \p{Numeric_Value: 4} (Short: \p{Nv=4}) (93)
T \p{Numeric_Value: 9/2} (Short: \p{Nv=9/2}) (1)
@@ -2758,7 +2878,8 @@ stable.
(2)
T \p{Numeric_Value: 1000000000000} (= 1.0e+12) (Short: \p{Nv=
1000000000000}) (1)
- \p{Numeric_Value: NaN} (Short: \p{Nv=NaN}) (1_112_883)
+ \p{Numeric_Value: NaN} (Short: \p{Nv=NaN}) (1_112_883 plus all
+ above-Unicode code points)
\p{Nv: *} \p{Numeric_Value: *}
X \p{OCR} \p{Optical_Character_Recognition} (=
\p{Block=Optical_Character_Recognition})
@@ -2778,7 +2899,7 @@ stable.
\p{Old_Turkic} \p{Script=Old_Turkic} (Short: \p{Orkh};
NOT \p{Block=Old_Turkic}) (73)
\p{Open_Punctuation} \p{General_Category=Open_Punctuation}
- (Short: \p{Ps}) (72)
+ (Short: \p{Ps}) (74)
X \p{Optical_Character_Recognition} \p{Block=
Optical_Character_Recognition} (Short:
\p{InOCR}) (32)
@@ -2793,7 +2914,8 @@ stable.
\p{Osmanya} \p{Script=Osmanya} (Short: \p{Osma}; NOT
\p{Block=Osmanya}) (40)
\p{Other} \p{General_Category=Other} (Short: \p{C})
- (1_004_134)
+ (1_004_135 plus all above-Unicode code
+ points)
\p{Other_Letter} \p{General_Category=Other_Letter} (Short:
\p{Lo}) (97_553)
\p{Other_Number} \p{General_Category=Other_Number} (Short:
@@ -2802,9 +2924,9 @@ stable.
(Short: \p{Po}) (434)
\p{Other_Symbol} \p{General_Category=Other_Symbol} (Short:
\p{So}) (4404)
- \p{P} \p{Punct} (= \p{General_Category=
+ \p{P} \pP \p{Punct} (= \p{General_Category=
Punctuation}) (NOT
- \p{General_Punctuation}) (632)
+ \p{General_Punctuation}) (636)
\p{Paragraph_Separator} \p{General_Category=Paragraph_Separator}
(Short: \p{Zp}) (1)
\p{Pat_Syn} \p{Pattern_Syntax} (= \p{Pattern_Syntax=
@@ -2816,12 +2938,14 @@ stable.
\p{Pattern_Syntax} \p{Pattern_Syntax=Y} (Short: \p{PatSyn})
(2760)
\p{Pattern_Syntax: N*} (Short: \p{PatSyn=N}, \P{PatSyn})
- (1_111_352)
+ (1_111_352 plus all above-Unicode code
+ points)
\p{Pattern_Syntax: Y*} (Short: \p{PatSyn=Y}, \p{PatSyn}) (2760)
\p{Pattern_White_Space} \p{Pattern_White_Space=Y} (Short:
\p{PatWS}) (11)
\p{Pattern_White_Space: N*} (Short: \p{PatWS=N}, \P{PatWS})
- (1_114_101)
+ (1_114_101 plus all above-Unicode code
+ points)
\p{Pattern_White_Space: Y*} (Short: \p{PatWS=Y}, \p{PatWS}) (11)
\p{Pc} \p{Connector_Punctuation} (=
\p{General_Category=
@@ -2831,7 +2955,7 @@ stable.
(23)
\p{Pe} \p{Close_Punctuation} (=
\p{General_Category=Close_Punctuation})
- (71)
+ (73)
\p{PerlSpace} \s, restricted to ASCII = [ \f\n\r\t] plus
vertical tab (6)
\p{PerlWord} \w, restricted to ASCII = [A-Za-z0-9_] (63)
@@ -2881,12 +3005,12 @@ stable.
NAK, SYN, ETB, CAN, EOM, SUB, ESC, FS,
GS, RS, US, and DEL (33)
\p{PosixDigit} [0-9] (10)
- \p{PosixGraph} [-!"#$%&'()*+,./:;<>?@[\\]^_`{|}~0-9A-Za-
+ \p{PosixGraph} [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~0-9A-Za-
z] (94)
\p{PosixLower} [a-z] (/i= PosixAlpha) (26)
- \p{PosixPrint} [- 0-9A-Za-
- z!"#$%&'()*+,./:;<>?@[\\]^_`{|}~] (95)
- \p{PosixPunct} [-!"#$%&'()*+,./:;<>?@[\\]^_`{|}~] (32)
+ \p{PosixPrint} [- 0-9A-Za-z!"#$%&'()*+,./:;<=
+ >?@[\\]^_`{|}~] (95)
+ \p{PosixPunct} [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~] (32)
\p{PosixSpace} \t, \n, \cK, \f, \r, and ' '. (\cK is
vertical tab) (6)
\p{PosixUpper} [A-Z] (/i= PosixAlpha) (26)
@@ -2934,10 +3058,14 @@ stable.
T \p{Present_In: 6.2} Code point's usage introduced in version
6.2 or earlier (Short: \p{In=6.2}) (Perl
extension) (249_764)
+ T \p{Present_In: 6.3} Code point's usage introduced in version
+ 6.3 or earlier (Short: \p{In=6.3}) (Perl
+ extension) (249_769)
\p{Present_In: Unassigned} \p{Age=Unassigned} (Short: \p{In=
- Unassigned}) (Perl extension) (864_348)
+ Unassigned}) (Perl extension) (864_343
+ plus all above-Unicode code points)
\p{Print} Characters that are graphical plus space
- characters (but no controls) (247_583)
+ characters (but no controls) (247_588)
\p{Private_Use} \p{General_Category=Private_Use} (Short:
\p{Co}; NOT \p{Private_Use_Area})
(137_468)
@@ -2948,14 +3076,14 @@ stable.
Inscriptional_Parthian}) (30)
\p{Ps} \p{Open_Punctuation} (=
\p{General_Category=Open_Punctuation})
- (72)
+ (74)
X \p{PUA} \p{Private_Use_Area} (= \p{Block=
Private_Use_Area}) (6400)
\p{Punct} \p{General_Category=Punctuation} (Short:
- \p{P}; NOT \p{General_Punctuation}) (632)
+ \p{P}; NOT \p{General_Punctuation}) (636)
\p{Punctuation} \p{Punct} (= \p{General_Category=
Punctuation}) (NOT
- \p{General_Punctuation}) (632)
+ \p{General_Punctuation}) (636)
\p{Qaac} \p{Coptic} (= \p{Script=Coptic}) (NOT
\p{Block=Coptic}) (137)
\p{Qaai} \p{Inherited} (= \p{Script=Inherited})
@@ -2965,10 +3093,12 @@ stable.
\p{QMark: *} \p{Quotation_Mark: *}
\p{Quotation_Mark} \p{Quotation_Mark=Y} (Short: \p{QMark})
(29)
- \p{Quotation_Mark: N*} (Short: \p{QMark=N}, \P{QMark}) (1_114_083)
+ \p{Quotation_Mark: N*} (Short: \p{QMark=N}, \P{QMark}) (1_114_083
+ plus all above-Unicode code points)
\p{Quotation_Mark: Y*} (Short: \p{QMark=Y}, \p{QMark}) (29)
\p{Radical} \p{Radical=Y} (329)
- \p{Radical: N*} (Single: \P{Radical}) (1_113_783)
+ \p{Radical: N*} (Single: \P{Radical}) (1_113_783 plus all
+ above-Unicode code points)
\p{Radical: Y*} (Single: \p{Radical}) (329)
\p{Rejang} \p{Script=Rejang} (Short: \p{Rjng}; NOT
\p{Block=Rejang}) (37)
@@ -2982,8 +3112,8 @@ stable.
\p{Block=Runic}) (78)
\p{Runr} \p{Runic} (= \p{Script=Runic}) (NOT
\p{Block=Runic}) (78)
- \p{S} \p{Symbol} (= \p{General_Category=Symbol})
- (5520)
+ \p{S} \pS \p{Symbol} (= \p{General_Category=Symbol})
+ (5516)
\p{Samaritan} \p{Script=Samaritan} (Short: \p{Samr}; NOT
\p{Block=Samaritan}) (61)
\p{Samr} \p{Samaritan} (= \p{Script=Samaritan})
@@ -2999,8 +3129,8 @@ stable.
\p{General_Category=Currency_Symbol})
(49)
\p{Sc: *} \p{Script: *}
- \p{Script: Arab} \p{Script=Arabic} (1235)
- \p{Script: Arabic} (Short: \p{Sc=Arab}, \p{Arab}) (1235)
+ \p{Script: Arab} \p{Script=Arabic} (1236)
+ \p{Script: Arabic} (Short: \p{Sc=Arab}, \p{Arab}) (1236)
\p{Script: Armenian} (Short: \p{Sc=Armn}, \p{Armn}) (91)
\p{Script: Armi} \p{Script=Imperial_Aramaic} (31)
\p{Script: Armn} \p{Script=Armenian} (91)
@@ -3034,7 +3164,7 @@ stable.
\p{Script: Cham} (Short: \p{Sc=Cham}, \p{Cham}) (83)
\p{Script: Cher} \p{Script=Cherokee} (85)
\p{Script: Cherokee} (Short: \p{Sc=Cher}, \p{Cher}) (85)
- \p{Script: Common} (Short: \p{Sc=Zyyy}, \p{Zyyy}) (6413)
+ \p{Script: Common} (Short: \p{Sc=Zyyy}, \p{Zyyy}) (6418)
\p{Script: Copt} \p{Script=Coptic} (137)
\p{Script: Coptic} (Short: \p{Sc=Copt}, \p{Copt}) (137)
\p{Script: Cprt} \p{Script=Cypriot} (55)
@@ -3080,8 +3210,8 @@ stable.
\p{Script: Inscriptional_Parthian} (Short: \p{Sc=Prti}, \p{Prti})
(30)
\p{Script: Ital} \p{Script=Old_Italic} (35)
- \p{Script: Java} \p{Script=Javanese} (91)
- \p{Script: Javanese} (Short: \p{Sc=Java}, \p{Java}) (91)
+ \p{Script: Java} \p{Script=Javanese} (90)
+ \p{Script: Javanese} (Short: \p{Sc=Java}, \p{Java}) (90)
\p{Script: Kaithi} (Short: \p{Sc=Kthi}, \p{Kthi}) (66)
\p{Script: Kali} \p{Script=Kayah_Li} (48)
\p{Script: Kana} \p{Script=Katakana} (300)
@@ -3199,7 +3329,8 @@ stable.
\p{Script: Tifinagh} (Short: \p{Sc=Tfng}, \p{Tfng}) (59)
\p{Script: Ugar} \p{Script=Ugaritic} (31)
\p{Script: Ugaritic} (Short: \p{Sc=Ugar}, \p{Ugar}) (31)
- \p{Script: Unknown} (Short: \p{Sc=Zzzz}, \p{Zzzz}) (1_003_930)
+ \p{Script: Unknown} (Short: \p{Sc=Zzzz}, \p{Zzzz}) (1_003_925
+ plus all above-Unicode code points)
\p{Script: Vai} (Short: \p{Sc=Vai}, \p{Vai}) (300)
\p{Script: Vaii} \p{Script=Vai} (300)
\p{Script: Xpeo} \p{Script=Old_Persian} (50)
@@ -3207,10 +3338,11 @@ stable.
\p{Script: Yi} (Short: \p{Sc=Yi}, \p{Yi}) (1220)
\p{Script: Yiii} \p{Script=Yi} (1220)
\p{Script: Zinh} \p{Script=Inherited} (523)
- \p{Script: Zyyy} \p{Script=Common} (6413)
- \p{Script: Zzzz} \p{Script=Unknown} (1_003_930)
- \p{Script_Extensions: Arab} \p{Script_Extensions=Arabic} (1262)
- \p{Script_Extensions: Arabic} (Short: \p{Scx=Arab}) (1262)
+ \p{Script: Zyyy} \p{Script=Common} (6418)
+ \p{Script: Zzzz} \p{Script=Unknown} (1_003_925 plus all
+ above-Unicode code points)
+ \p{Script_Extensions: Arab} \p{Script_Extensions=Arabic} (1263)
+ \p{Script_Extensions: Arabic} (Short: \p{Scx=Arab}) (1263)
\p{Script_Extensions: Armenian} (Short: \p{Scx=Armn}) (92)
\p{Script_Extensions: Armi} \p{Script_Extensions=Imperial_Aramaic}
(31)
@@ -3231,22 +3363,22 @@ stable.
\p{Script_Extensions: Brahmi} (Short: \p{Scx=Brah}) (108)
\p{Script_Extensions: Brai} \p{Script_Extensions=Braille} (256)
\p{Script_Extensions: Braille} (Short: \p{Scx=Brai}) (256)
- \p{Script_Extensions: Bugi} \p{Script_Extensions=Buginese} (30)
- \p{Script_Extensions: Buginese} (Short: \p{Scx=Bugi}) (30)
+ \p{Script_Extensions: Bugi} \p{Script_Extensions=Buginese} (31)
+ \p{Script_Extensions: Buginese} (Short: \p{Scx=Bugi}) (31)
\p{Script_Extensions: Buhd} \p{Script_Extensions=Buhid} (22)
\p{Script_Extensions: Buhid} (Short: \p{Scx=Buhd}) (22)
- \p{Script_Extensions: Cakm} \p{Script_Extensions=Chakma} (67)
+ \p{Script_Extensions: Cakm} \p{Script_Extensions=Chakma} (87)
\p{Script_Extensions: Canadian_Aboriginal} (Short: \p{Scx=Cans})
(710)
\p{Script_Extensions: Cans} \p{Script_Extensions=
Canadian_Aboriginal} (710)
\p{Script_Extensions: Cari} \p{Script_Extensions=Carian} (49)
\p{Script_Extensions: Carian} (Short: \p{Scx=Cari}) (49)
- \p{Script_Extensions: Chakma} (Short: \p{Scx=Cakm}) (67)
+ \p{Script_Extensions: Chakma} (Short: \p{Scx=Cakm}) (87)
\p{Script_Extensions: Cham} (Short: \p{Scx=Cham}) (83)
\p{Script_Extensions: Cher} \p{Script_Extensions=Cherokee} (85)
\p{Script_Extensions: Cherokee} (Short: \p{Scx=Cher}) (85)
- \p{Script_Extensions: Common} (Short: \p{Scx=Zyyy}) (6057)
+ \p{Script_Extensions: Common} (Short: \p{Scx=Zyyy}) (6061)
\p{Script_Extensions: Copt} \p{Script_Extensions=Coptic} (137)
\p{Script_Extensions: Coptic} (Short: \p{Scx=Copt}) (137)
\p{Script_Extensions: Cprt} \p{Script_Extensions=Cypriot} (112)
@@ -3295,7 +3427,7 @@ stable.
\p{Script_Extensions: Ital} \p{Script_Extensions=Old_Italic} (35)
\p{Script_Extensions: Java} \p{Script_Extensions=Javanese} (91)
\p{Script_Extensions: Javanese} (Short: \p{Scx=Java}) (91)
- \p{Script_Extensions: Kaithi} (Short: \p{Scx=Kthi}) (76)
+ \p{Script_Extensions: Kaithi} (Short: \p{Scx=Kthi}) (86)
\p{Script_Extensions: Kali} \p{Script_Extensions=Kayah_Li} (48)
\p{Script_Extensions: Kana} \p{Script_Extensions=Katakana} (565)
\p{Script_Extensions: Kannada} (Short: \p{Scx=Knda}) (86)
@@ -3306,7 +3438,7 @@ stable.
\p{Script_Extensions: Khmer} (Short: \p{Scx=Khmr}) (146)
\p{Script_Extensions: Khmr} \p{Script_Extensions=Khmer} (146)
\p{Script_Extensions: Knda} \p{Script_Extensions=Kannada} (86)
- \p{Script_Extensions: Kthi} \p{Script_Extensions=Kaithi} (76)
+ \p{Script_Extensions: Kthi} \p{Script_Extensions=Kaithi} (86)
\p{Script_Extensions: Lana} \p{Script_Extensions=Tai_Tham} (127)
\p{Script_Extensions: Lao} (Short: \p{Scx=Lao}) (67)
\p{Script_Extensions: Laoo} \p{Script_Extensions=Lao} (67)
@@ -3388,19 +3520,19 @@ stable.
\p{Script_Extensions: Sora_Sompeng} (Short: \p{Scx=Sora}) (35)
\p{Script_Extensions: Sund} \p{Script_Extensions=Sundanese} (72)
\p{Script_Extensions: Sundanese} (Short: \p{Scx=Sund}) (72)
- \p{Script_Extensions: Sylo} \p{Script_Extensions=Syloti_Nagri} (44)
- \p{Script_Extensions: Syloti_Nagri} (Short: \p{Scx=Sylo}) (44)
- \p{Script_Extensions: Syrc} \p{Script_Extensions=Syriac} (93)
- \p{Script_Extensions: Syriac} (Short: \p{Scx=Syrc}) (93)
+ \p{Script_Extensions: Sylo} \p{Script_Extensions=Syloti_Nagri} (54)
+ \p{Script_Extensions: Syloti_Nagri} (Short: \p{Scx=Sylo}) (54)
+ \p{Script_Extensions: Syrc} \p{Script_Extensions=Syriac} (94)
+ \p{Script_Extensions: Syriac} (Short: \p{Scx=Syrc}) (94)
\p{Script_Extensions: Tagalog} (Short: \p{Scx=Tglg}) (22)
\p{Script_Extensions: Tagb} \p{Script_Extensions=Tagbanwa} (20)
\p{Script_Extensions: Tagbanwa} (Short: \p{Scx=Tagb}) (20)
- \p{Script_Extensions: Tai_Le} (Short: \p{Scx=Tale}) (35)
+ \p{Script_Extensions: Tai_Le} (Short: \p{Scx=Tale}) (45)
\p{Script_Extensions: Tai_Tham} (Short: \p{Scx=Lana}) (127)
\p{Script_Extensions: Tai_Viet} (Short: \p{Scx=Tavt}) (72)
\p{Script_Extensions: Takr} \p{Script_Extensions=Takri} (78)
\p{Script_Extensions: Takri} (Short: \p{Scx=Takr}) (78)
- \p{Script_Extensions: Tale} \p{Script_Extensions=Tai_Le} (35)
+ \p{Script_Extensions: Tale} \p{Script_Extensions=Tai_Le} (45)
\p{Script_Extensions: Talu} \p{Script_Extensions=New_Tai_Lue} (83)
\p{Script_Extensions: Tamil} (Short: \p{Scx=Taml}) (72)
\p{Script_Extensions: Taml} \p{Script_Extensions=Tamil} (72)
@@ -3409,15 +3541,16 @@ stable.
\p{Script_Extensions: Telugu} (Short: \p{Scx=Telu}) (93)
\p{Script_Extensions: Tfng} \p{Script_Extensions=Tifinagh} (59)
\p{Script_Extensions: Tglg} \p{Script_Extensions=Tagalog} (22)
- \p{Script_Extensions: Thaa} \p{Script_Extensions=Thaana} (65)
- \p{Script_Extensions: Thaana} (Short: \p{Scx=Thaa}) (65)
+ \p{Script_Extensions: Thaa} \p{Script_Extensions=Thaana} (66)
+ \p{Script_Extensions: Thaana} (Short: \p{Scx=Thaa}) (66)
\p{Script_Extensions: Thai} (Short: \p{Scx=Thai}) (86)
\p{Script_Extensions: Tibetan} (Short: \p{Scx=Tibt}) (207)
\p{Script_Extensions: Tibt} \p{Script_Extensions=Tibetan} (207)
\p{Script_Extensions: Tifinagh} (Short: \p{Scx=Tfng}) (59)
\p{Script_Extensions: Ugar} \p{Script_Extensions=Ugaritic} (31)
\p{Script_Extensions: Ugaritic} (Short: \p{Scx=Ugar}) (31)
- \p{Script_Extensions: Unknown} (Short: \p{Scx=Zzzz}) (1_003_930)
+ \p{Script_Extensions: Unknown} (Short: \p{Scx=Zzzz}) (1_003_925
+ plus all above-Unicode code points)
\p{Script_Extensions: Vai} (Short: \p{Scx=Vai}) (300)
\p{Script_Extensions: Vaii} \p{Script_Extensions=Vai} (300)
\p{Script_Extensions: Xpeo} \p{Script_Extensions=Old_Persian} (50)
@@ -3425,21 +3558,22 @@ stable.
\p{Script_Extensions: Yi} (Short: \p{Scx=Yi}) (1246)
\p{Script_Extensions: Yiii} \p{Script_Extensions=Yi} (1246)
\p{Script_Extensions: Zinh} \p{Script_Extensions=Inherited} (459)
- \p{Script_Extensions: Zyyy} \p{Script_Extensions=Common} (6057)
+ \p{Script_Extensions: Zyyy} \p{Script_Extensions=Common} (6061)
\p{Script_Extensions: Zzzz} \p{Script_Extensions=Unknown}
- (1_003_930)
+ (1_003_925 plus all above-Unicode code
+ points)
\p{Scx: *} \p{Script_Extensions: *}
\p{SD} \p{Soft_Dotted} (= \p{Soft_Dotted=Y}) (46)
\p{SD: *} \p{Soft_Dotted: *}
\p{Sentence_Break: AT} \p{Sentence_Break=ATerm} (4)
\p{Sentence_Break: ATerm} (Short: \p{SB=AT}) (4)
- \p{Sentence_Break: CL} \p{Sentence_Break=Close} (177)
- \p{Sentence_Break: Close} (Short: \p{SB=CL}) (177)
+ \p{Sentence_Break: CL} \p{Sentence_Break=Close} (181)
+ \p{Sentence_Break: Close} (Short: \p{SB=CL}) (181)
\p{Sentence_Break: CR} (Short: \p{SB=CR}) (1)
\p{Sentence_Break: EX} \p{Sentence_Break=Extend} (1649)
\p{Sentence_Break: Extend} (Short: \p{SB=EX}) (1649)
- \p{Sentence_Break: FO} \p{Sentence_Break=Format} (137)
- \p{Sentence_Break: Format} (Short: \p{SB=FO}) (137)
+ \p{Sentence_Break: FO} \p{Sentence_Break=Format} (143)
+ \p{Sentence_Break: Format} (Short: \p{SB=FO}) (143)
\p{Sentence_Break: LE} \p{Sentence_Break=OLetter} (97_841)
\p{Sentence_Break: LF} (Short: \p{SB=LF}) (1)
\p{Sentence_Break: LO} \p{Sentence_Break=Lower} (1933)
@@ -3447,19 +3581,21 @@ stable.
\p{Sentence_Break: NU} \p{Sentence_Break=Numeric} (452)
\p{Sentence_Break: Numeric} (Short: \p{SB=NU}) (452)
\p{Sentence_Break: OLetter} (Short: \p{SB=LE}) (97_841)
- \p{Sentence_Break: Other} (Short: \p{SB=XX}) (1_010_273)
+ \p{Sentence_Break: Other} (Short: \p{SB=XX}) (1_010_264 plus all
+ above-Unicode code points)
\p{Sentence_Break: SC} \p{Sentence_Break=SContinue} (26)
\p{Sentence_Break: SContinue} (Short: \p{SB=SC}) (26)
\p{Sentence_Break: SE} \p{Sentence_Break=Sep} (3)
\p{Sentence_Break: Sep} (Short: \p{SB=SE}) (3)
- \p{Sentence_Break: Sp} (Short: \p{SB=Sp}) (21)
+ \p{Sentence_Break: Sp} (Short: \p{SB=Sp}) (20)
\p{Sentence_Break: ST} \p{Sentence_Break=STerm} (80)
\p{Sentence_Break: STerm} (Short: \p{SB=ST}) (80)
\p{Sentence_Break: UP} \p{Sentence_Break=Upper} (1514)
\p{Sentence_Break: Upper} (Short: \p{SB=UP}) (1514)
- \p{Sentence_Break: XX} \p{Sentence_Break=Other} (1_010_273)
+ \p{Sentence_Break: XX} \p{Sentence_Break=Other} (1_010_264 plus
+ all above-Unicode code points)
\p{Separator} \p{General_Category=Separator} (Short:
- \p{Z}) (20)
+ \p{Z}) (19)
\p{Sharada} \p{Script=Sharada} (Short: \p{Shrd}; NOT
\p{Block=Sharada}) (83)
\p{Shavian} \p{Script=Shavian} (Short: \p{Shaw}) (48)
@@ -3474,7 +3610,7 @@ stable.
\p{General_Category=Modifier_Symbol})
(115)
\p{Sm} \p{Math_Symbol} (= \p{General_Category=
- Math_Symbol}) (952)
+ Math_Symbol}) (948)
X \p{Small_Form_Variants} \p{Block=Small_Form_Variants} (Short:
\p{InSmallForms}) (32)
X \p{Small_Forms} \p{Small_Form_Variants} (= \p{Block=
@@ -3482,7 +3618,8 @@ stable.
\p{So} \p{Other_Symbol} (= \p{General_Category=
Other_Symbol}) (4404)
\p{Soft_Dotted} \p{Soft_Dotted=Y} (Short: \p{SD}) (46)
- \p{Soft_Dotted: N*} (Short: \p{SD=N}, \P{SD}) (1_114_066)
+ \p{Soft_Dotted: N*} (Short: \p{SD=N}, \P{SD}) (1_114_066 plus
+ all above-Unicode code points)
\p{Soft_Dotted: Y*} (Short: \p{SD=Y}, \p{SD}) (46)
\p{Sora} \p{Sora_Sompeng} (= \p{Script=
Sora_Sompeng}) (NOT \p{Block=
@@ -3490,18 +3627,19 @@ stable.
\p{Sora_Sompeng} \p{Script=Sora_Sompeng} (Short: \p{Sora};
NOT \p{Block=Sora_Sompeng}) (35)
\p{Space} \p{White_Space=Y} \s including beyond
- ASCII and vertical tab (26)
+ ASCII and vertical tab (25)
\p{Space: *} \p{White_Space: *}
\p{Space_Separator} \p{General_Category=Space_Separator}
- (Short: \p{Zs}) (18)
- \p{SpacePerl} \p{XPerlSpace} (26)
+ (Short: \p{Zs}) (17)
+ \p{SpacePerl} \p{XPerlSpace} (25)
\p{Spacing_Mark} \p{General_Category=Spacing_Mark} (Short:
- \p{Mc}) (353)
+ \p{Mc}) (352)
X \p{Spacing_Modifier_Letters} \p{Block=Spacing_Modifier_Letters}
(Short: \p{InModifierLetters}) (80)
X \p{Specials} \p{Block=Specials} (16)
\p{STerm} \p{STerm=Y} (83)
- \p{STerm: N*} (Single: \P{STerm}) (1_114_029)
+ \p{STerm: N*} (Single: \P{STerm}) (1_114_029 plus all
+ above-Unicode code points)
\p{STerm: Y*} (Single: \p{STerm}) (83)
\p{Sund} \p{Sundanese} (= \p{Script=Sundanese})
(NOT \p{Block=Sundanese}) (72)
@@ -3558,7 +3696,7 @@ stable.
\p{Syloti_Nagri} \p{Script=Syloti_Nagri} (Short: \p{Sylo};
NOT \p{Block=Syloti_Nagri}) (44)
\p{Symbol} \p{General_Category=Symbol} (Short: \p{S})
- (5520)
+ (5516)
\p{Syrc} \p{Syriac} (= \p{Script=Syriac}) (NOT
\p{Block=Syriac}) (77)
\p{Syriac} \p{Script=Syriac} (Short: \p{Syrc}; NOT
@@ -3604,7 +3742,8 @@ stable.
\p{Terminal_Punctuation} \p{Terminal_Punctuation=Y} (Short:
\p{Term}) (176)
\p{Terminal_Punctuation: N*} (Short: \p{Term=N}, \P{Term})
- (1_113_936)
+ (1_113_936 plus all above-Unicode code
+ points)
\p{Terminal_Punctuation: Y*} (Short: \p{Term=Y}, \p{Term}) (176)
\p{Tfng} \p{Tifinagh} (= \p{Script=Tifinagh}) (NOT
\p{Block=Tifinagh}) (59)
@@ -3647,7 +3786,9 @@ stable.
\p{Unified_Ideograph=Y}) (74_617)
\p{UIdeo: *} \p{Unified_Ideograph: *}
\p{Unassigned} \p{General_Category=Unassigned} (Short:
- \p{Cn}) (864_414)
+ \p{Cn}) (864_409 plus all above-Unicode
+ code points)
+ \p{Unicode} \p{Any} (1_114_112)
X \p{Unified_Canadian_Aboriginal_Syllabics} \p{Block=
Unified_Canadian_Aboriginal_Syllabics}
(Short: \p{InUCAS}) (640)
@@ -3657,16 +3798,19 @@ stable.
\p{Unified_Ideograph} \p{Unified_Ideograph=Y} (Short: \p{UIdeo})
(74_617)
\p{Unified_Ideograph: N*} (Short: \p{UIdeo=N}, \P{UIdeo})
- (1_039_495)
+ (1_039_495 plus all above-Unicode code
+ points)
\p{Unified_Ideograph: Y*} (Short: \p{UIdeo=Y}, \p{UIdeo}) (74_617)
\p{Unknown} \p{Script=Unknown} (Short: \p{Zzzz})
- (1_003_930)
+ (1_003_925 plus all above-Unicode code
+ points)
\p{Upper} \p{Uppercase=Y} (/i= Cased=Yes) (1483)
\p{Upper: *} \p{Uppercase: *}
\p{Uppercase} \p{Upper} (= \p{Uppercase=Y}) (/i= Cased=
Yes) (1483)
\p{Uppercase: N*} (Short: \p{Upper=N}, \P{Upper}; /i= Cased=
- No) (1_112_629)
+ No) (1_112_629 plus all above-Unicode
+ code points)
\p{Uppercase: Y*} (Short: \p{Upper=Y}, \p{Upper}; /i= Cased=
Yes) (1483)
\p{Uppercase_Letter} \p{General_Category=Uppercase_Letter}
@@ -3677,7 +3821,8 @@ stable.
Vai}) (300)
\p{Variation_Selector} \p{Variation_Selector=Y} (Short: \p{VS};
NOT \p{Variation_Selectors}) (259)
- \p{Variation_Selector: N*} (Short: \p{VS=N}, \P{VS}) (1_113_853)
+ \p{Variation_Selector: N*} (Short: \p{VS=N}, \P{VS}) (1_113_853
+ plus all above-Unicode code points)
\p{Variation_Selector: Y*} (Short: \p{VS=Y}, \p{VS}) (259)
X \p{Variation_Selectors} \p{Block=Variation_Selectors} (Short:
\p{InVS}) (16)
@@ -3698,46 +3843,57 @@ stable.
\p{Block=
Variation_Selectors_Supplement}) (240)
\p{WB: *} \p{Word_Break: *}
- \p{White_Space} \p{White_Space=Y} (Short: \p{WSpace}) (26)
+ \p{White_Space} \p{White_Space=Y} (Short: \p{WSpace}) (25)
\p{White_Space: N*} (Short: \p{Space=N}, \P{WSpace})
- (1_114_086)
- \p{White_Space: Y*} (Short: \p{Space=Y}, \p{WSpace}) (26)
+ (1_114_087 plus all above-Unicode code
+ points)
+ \p{White_Space: Y*} (Short: \p{Space=Y}, \p{WSpace}) (25)
\p{Word} \w, including beyond ASCII; = \p{Alnum} +
\pM + \p{Pc} (103_406)
- \p{Word_Break: ALetter} (Short: \p{WB=LE}) (24_941)
+ \p{Word_Break: ALetter} (Short: \p{WB=LE}) (24_867)
\p{Word_Break: CR} (Short: \p{WB=CR}) (1)
+ \p{Word_Break: Double_Quote} (Short: \p{WB=DQ}) (1)
+ \p{Word_Break: DQ} \p{Word_Break=Double_Quote} (1)
\p{Word_Break: EX} \p{Word_Break=ExtendNumLet} (10)
\p{Word_Break: Extend} (Short: \p{WB=Extend}) (1649)
\p{Word_Break: ExtendNumLet} (Short: \p{WB=EX}) (10)
- \p{Word_Break: FO} \p{Word_Break=Format} (136)
- \p{Word_Break: Format} (Short: \p{WB=FO}) (136)
+ \p{Word_Break: FO} \p{Word_Break=Format} (142)
+ \p{Word_Break: Format} (Short: \p{WB=FO}) (142)
+ \p{Word_Break: Hebrew_Letter} (Short: \p{WB=HL}) (74)
+ \p{Word_Break: HL} \p{Word_Break=Hebrew_Letter} (74)
\p{Word_Break: KA} \p{Word_Break=Katakana} (310)
\p{Word_Break: Katakana} (Short: \p{WB=KA}) (310)
- \p{Word_Break: LE} \p{Word_Break=ALetter} (24_941)
+ \p{Word_Break: LE} \p{Word_Break=ALetter} (24_867)
\p{Word_Break: LF} (Short: \p{WB=LF}) (1)
- \p{Word_Break: MB} \p{Word_Break=MidNumLet} (8)
- \p{Word_Break: MidLetter} (Short: \p{WB=ML}) (8)
+ \p{Word_Break: MB} \p{Word_Break=MidNumLet} (7)
+ \p{Word_Break: MidLetter} (Short: \p{WB=ML}) (9)
\p{Word_Break: MidNum} (Short: \p{WB=MN}) (15)
- \p{Word_Break: MidNumLet} (Short: \p{WB=MB}) (8)
- \p{Word_Break: ML} \p{Word_Break=MidLetter} (8)
+ \p{Word_Break: MidNumLet} (Short: \p{WB=MB}) (7)
+ \p{Word_Break: ML} \p{Word_Break=MidLetter} (9)
\p{Word_Break: MN} \p{Word_Break=MidNum} (15)
\p{Word_Break: Newline} (Short: \p{WB=NL}) (5)
\p{Word_Break: NL} \p{Word_Break=Newline} (5)
\p{Word_Break: NU} \p{Word_Break=Numeric} (451)
\p{Word_Break: Numeric} (Short: \p{WB=NU}) (451)
- \p{Word_Break: Other} (Short: \p{WB=XX}) (1_086_551)
+ \p{Word_Break: Other} (Short: \p{WB=XX}) (1_086_543 plus all
+ above-Unicode code points)
\p{Word_Break: Regional_Indicator} (Short: \p{WB=RI}) (26)
\p{Word_Break: RI} \p{Word_Break=Regional_Indicator} (26)
- \p{Word_Break: XX} \p{Word_Break=Other} (1_086_551)
- \p{WSpace} \p{White_Space} (= \p{White_Space=Y}) (26)
+ \p{Word_Break: Single_Quote} (Short: \p{WB=SQ}) (1)
+ \p{Word_Break: SQ} \p{Word_Break=Single_Quote} (1)
+ \p{Word_Break: XX} \p{Word_Break=Other} (1_086_543 plus all
+ above-Unicode code points)
+ \p{WSpace} \p{White_Space} (= \p{White_Space=Y}) (25)
\p{WSpace: *} \p{White_Space: *}
\p{XDigit} \p{Hex_Digit=Y} (Short: \p{Hex}) (44)
\p{XID_Continue} \p{XID_Continue=Y} (Short: \p{XIDC})
(103_336)
- \p{XID_Continue: N*} (Short: \p{XIDC=N}, \P{XIDC}) (1_010_776)
+ \p{XID_Continue: N*} (Short: \p{XIDC=N}, \P{XIDC}) (1_010_776
+ plus all above-Unicode code points)
\p{XID_Continue: Y*} (Short: \p{XIDC=Y}, \p{XIDC}) (103_336)
\p{XID_Start} \p{XID_Start=Y} (Short: \p{XIDS}) (101_217)
- \p{XID_Start: N*} (Short: \p{XIDS=N}, \P{XIDS}) (1_012_895)
+ \p{XID_Start: N*} (Short: \p{XIDS=N}, \P{XIDS}) (1_012_895
+ plus all above-Unicode code points)
\p{XID_Start: Y*} (Short: \p{XIDS=Y}, \p{XIDS}) (101_217)
\p{XIDC} \p{XID_Continue} (= \p{XID_Continue=Y})
(103_336)
@@ -3747,20 +3903,20 @@ stable.
\p{Xpeo} \p{Old_Persian} (= \p{Script=Old_Persian})
(NOT \p{Block=Old_Persian}) (50)
\p{XPerlSpace} \s, including beyond ASCII (Short:
- \p{SpacePerl}) (26)
+ \p{SpacePerl}) (25)
\p{XPosixAlnum} \p{Alnum} (102_619)
\p{XPosixAlpha} \p{Alpha} (= \p{Alphabetic=Y}) (102_159)
- \p{XPosixBlank} \p{Blank} (19)
+ \p{XPosixBlank} \p{Blank} (18)
\p{XPosixCntrl} \p{Cntrl} (= \p{General_Category=Control})
(65)
\p{XPosixDigit} \p{Digit} (= \p{General_Category=
Decimal_Number}) (460)
- \p{XPosixGraph} \p{Graph} (247_565)
+ \p{XPosixGraph} \p{Graph} (247_571)
\p{XPosixLower} \p{Lower} (= \p{Lowercase=Y}) (/i= Cased=
Yes) (1934)
- \p{XPosixPrint} \p{Print} (247_583)
- \p{XPosixPunct} \p{Punct} + ASCII-range \p{Symbol} (641)
- \p{XPosixSpace} \p{Space} (= \p{White_Space=Y}) (26)
+ \p{XPosixPrint} \p{Print} (247_588)
+ \p{XPosixPunct} \p{Punct} + ASCII-range \p{Symbol} (645)
+ \p{XPosixSpace} \p{Space} (= \p{White_Space=Y}) (25)
\p{XPosixUpper} \p{Upper} (= \p{Uppercase=Y}) (/i= Cased=
Yes) (1483)
\p{XPosixWord} \p{Word} (103_406)
@@ -3775,8 +3931,8 @@ stable.
Yijing_Hexagram_Symbols}) (64)
X \p{Yijing_Hexagram_Symbols} \p{Block=Yijing_Hexagram_Symbols}
(Short: \p{InYijing}) (64)
- \p{Z} \p{Separator} (= \p{General_Category=
- Separator}) (20)
+ \p{Z} \pZ \p{Separator} (= \p{General_Category=
+ Separator}) (19)
\p{Zinh} \p{Inherited} (= \p{Script=Inherited})
(523)
\p{Zl} \p{Line_Separator} (= \p{General_Category=
@@ -3786,14 +3942,15 @@ stable.
Paragraph_Separator}) (1)
\p{Zs} \p{Space_Separator} (=
\p{General_Category=Space_Separator})
- (18)
- \p{Zyyy} \p{Common} (= \p{Script=Common}) (6413)
+ (17)
+ \p{Zyyy} \p{Common} (= \p{Script=Common}) (6418)
\p{Zzzz} \p{Unknown} (= \p{Script=Unknown})
- (1_003_930)
+ (1_003_925 plus all above-Unicode code
+ points)
TX\p{_CanonDCIJ} (For internal use by Perl, not necessarily
stable) (= \p{Soft_Dotted=Y}) (46)
TX\p{_Case_Ignorable} (For internal use by Perl, not necessarily
- stable) (= \p{Case_Ignorable=Y}) (1799)
+ stable) (= \p{Case_Ignorable=Y}) (1806)
TX\p{_CombAbove} (For internal use by Perl, not necessarily
stable) (= \p{Canonical_Combining_Class=
Above}) (349)
@@ -3815,8 +3972,6 @@ them. In this version of Unicode, the following match zero code points:
=item \p{Grapheme_Cluster_Break=Prepend}
-=item \p{Joining_Type=Left_Joining}
-
=back
@@ -3864,12 +4019,15 @@ through \p{} and \P{}>, like B<D> or B<S>.
Age
AHex ASCII_Hex_Digit
- All Any. (Perl extension)
+ All (Perl extension). All code points,
+ including those above Unicode. Same as
+ qr/./s
Alnum (Perl extension). Alphabetic and
(decimal) Numeric
Alpha Alphabetic
Alphabetic (Short: Alpha)
- Any (Perl extension). [\x{0000}-\x{10FFFF}]
+ Any (Perl extension). All Unicode code
+ points: [\x{0000}-\x{10FFFF}]
ASCII Block=ASCII. (Perl extension).
[[:ASCII:]]
ASCII_Hex_Digit (Short: AHex)
@@ -3881,11 +4039,15 @@ through \p{} and \P{}>, like B<D> or B<S>.
Bidi_M Bidi_Mirrored
Bidi_Mirrored (Short: Bidi_M)
Bidi_Mirroring_Glyph (Short: bmg)
+ Bidi_Paired_Bracket (Short: bpb)
+ Bidi_Paired_Bracket_Type (Short: bpt)
Blank (Perl extension). \h, Horizontal white
space
Blk Block
Block (Short: blk)
Bmg Bidi_Mirroring_Glyph
+ Bpb Bidi_Paired_Bracket
+ Bpt Bidi_Paired_Bracket_Type
Canonical_Combining_Class (Short: ccc)
Case_Folding (Short: cf)
Case_Ignorable (Short: CI)
@@ -4015,13 +4177,13 @@ through \p{} and \P{}>, like B<D> or B<S>.
DLE, DC1, DC2, DC3, DC4, NAK, SYN, ETB,
CAN, EOM, SUB, ESC, FS, GS, RS, US, and DEL
PosixDigit (Perl extension). [0-9]
- PosixGraph (Perl extension). [-
- !"#$%&'()*+,./:;<>?@[\\]^_`{|}~0-9A-Za-z]
+ PosixGraph (Perl extension). [-!"#$%&'()*+,./:;<=
+ >?@[\\]^_`{|}~0-9A-Za-z]
PosixLower (Perl extension). [a-z]
PosixPrint (Perl extension). [- 0-9A-Za-
- z!"#$%&'()*+,./:;<>?@[\\]^_`{|}~]
- PosixPunct (Perl extension). [-
- !"#$%&'()*+,./:;<>?@[\\]^_`{|}~]
+ z!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~]
+ PosixPunct (Perl extension). [-!"#$%&'()*+,./:;<=
+ >?@[\\]^_`{|}~]
PosixSpace (Perl extension). \t, \n, \cK, \f, \r,
and ' '. (\cK is vertical tab)
PosixUpper (Perl extension). [A-Z]
@@ -4065,6 +4227,7 @@ through \p{} and \P{}>, like B<D> or B<S>.
Titlecase_Mapping (Short: tc)
Uc Uppercase_Mapping
UIdeo Unified_Ideograph
+ Unicode Any. (Perl extension)
Unicode_1_Name (Short: na1)
Unified_Ideograph (Short: UIdeo)
Upper Uppercase
@@ -4249,6 +4412,8 @@ Documentation of validation tests
=item F<auxiliary/WBTest.txt>
+=item F<BidiCharacterTest.txt>
+
=item F<BidiTest.txt>
=item F<NormTest.txt>
@@ -4289,6 +4454,12 @@ Named sequences proposed for inclusion in a later version of the Unicode Standar
+=item F<NamesList.html>
+
+Describes the format and contents of F<NamesList.txt>
+
+
+
=item F<NamesList.txt>
Annotated list of characters
@@ -4313,19 +4484,29 @@ Documentation
+=item F<StandardizedVariants.html>
+
+Provides a visual display of the standard variant sequences derived from F<StandardizedVariants.txt>.
+
+
+
=item F<StandardizedVariants.txt>
Certain glyph variations for character display are standardized. This lists the non-Unihan ones; the Unihan ones are also not used by Perl, and are in a separate Unicode data base L<http://www.unicode.org/ivd>
-=item F<USourceData.pdf>
-
=item F<USourceData.txt>
Documentation of status and cross reference of proposals for encoding by Unicode of Unihan characters
+
+=item F<USourceGlyphs.pdf>
+
+Pictures of the characters in F<USourceData.txt>
+
+
=back
=head1 SEE ALSO