AGE
public static final int AGE
String property Age.
Corresponds to UCharacter.getAge(int).
ALPHABETIC
public static final int ALPHABETIC
Binary property Alphabetic.
Property for UCharacter.isUAlphabetic(int), different from the property
in UCharacter.isLetter(int).
Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic.
ASCII_HEX_DIGIT
public static final int ASCII_HEX_DIGIT
Binary property ASCII_Hex_Digit (0-9 A-F a-f).
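As a rough illustration of the range listed for ASCII_Hex_Digit, the check below tests a code point against 0-9, A-F and a-f directly. The class and method names are hypothetical, and the sketch uses only the JDK so that it runs without ICU4J. With ICU4J available, the equivalent query would presumably be UCharacter.hasBinaryProperty(c, UProperty.ASCII_HEX_DIGIT).

```java
final class AsciiHexDigitSketch {
    // Hypothetical helper: true only for the 22 ASCII hex digits
    // (0-9, A-F, a-f); wider forms such as fullwidth letters are excluded.
    static boolean isAsciiHexDigit(int c) {
        return (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'F')
            || (c >= 'a' && c <= 'f');
    }

    public static void main(String[] args) {
        System.out.println(isAsciiHexDigit('f'));    // true
        System.out.println(isAsciiHexDigit('g'));    // false
        System.out.println(isAsciiHexDigit(0xFF21)); // false (FULLWIDTH LATIN CAPITAL LETTER A)
    }
}
```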
BIDI_CLASS
public static final int BIDI_CLASS
Enumerated property Bidi_Class.
Same as UCharacter.getDirection(int), returns UCharacterDirection values.
BIDI_CONTROL
public static final int BIDI_CONTROL
Binary property Bidi_Control.
Format controls which have specific functions in the Bidi Algorithm.
BIDI_MIRRORED
public static final int BIDI_MIRRORED
Binary property Bidi_Mirrored.
Characters that may change display in RTL text.
Property for UCharacter.isMirrored().
See the Unicode Bidirectional Algorithm, UAX #9.
BIDI_MIRRORING_GLYPH
public static final int BIDI_MIRRORING_GLYPH
String property Bidi_Mirroring_Glyph.
Corresponds to UCharacter.getMirror(int).
BINARY_LIMIT
public static final int BINARY_LIMIT
One more than the last constant for binary Unicode properties.
BINARY_START
public static final int BINARY_START
First constant for binary Unicode properties.
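The START and LIMIT constants describe a half-open range: START is the first constant and LIMIT is one more than the last. The sketch below shows how such a pair might be used to test membership and to loop over a range. The class name and the numeric values are placeholders chosen for illustration, not the real values of BINARY_START and BINARY_LIMIT.

```java
final class PropertyRangeSketch {
    // True when the property id lies in the half-open range [start, limit).
    static boolean inRange(int property, int start, int limit) {
        return property >= start && property < limit;
    }

    // Number of constants covered by [start, limit).
    static int count(int start, int limit) {
        return limit - start;
    }

    public static void main(String[] args) {
        int start = 0; // placeholder standing in for BINARY_START
        int limit = 5; // placeholder standing in for BINARY_LIMIT
        // The limit itself is excluded, so the loop uses a strict comparison.
        for (int p = start; p < limit; p++) {
            System.out.println("property id " + p);
        }
    }
}
```

The same pattern would apply to the INT, MASK, DOUBLE and STRING ranges, each with its own START and LIMIT pair.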
BLOCK
public static final int BLOCK
Enumerated property Block.
Same as UCharacter.UnicodeBlock.of(int), returns UCharacter.UnicodeBlock
values.
CANONICAL_COMBINING_CLASS
public static final int CANONICAL_COMBINING_CLASS
Enumerated property Canonical_Combining_Class.
Same as UCharacter.getCombiningClass(int), returns 8-bit numeric values.
CASE_FOLDING
public static final int CASE_FOLDING
String property Case_Folding.
Corresponds to UCharacter.foldCase(String, boolean).
CASE_SENSITIVE
public static final int CASE_SENSITIVE
Binary property Case_Sensitive.
Characters that are either the source of a case mapping or in the
target of a case mapping. Not the same as the general category
Cased_Letter.
DASH
public static final int DASH
Binary property Dash.
Variations of dashes.
DECOMPOSITION_TYPE
public static final int DECOMPOSITION_TYPE
Enumerated property Decomposition_Type.
Returns UCharacter.DecompositionType values.
DEFAULT_IGNORABLE_CODE_POINT
public static final int DEFAULT_IGNORABLE_CODE_POINT
Binary property Default_Ignorable_Code_Point (new).
Property that indicates codepoint is ignorable in most processing.
Codepoints (2060..206F, FFF0..FFFB, E0000..E0FFF) +
Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)
DEPRECATED
public static final int DEPRECATED
Binary property Deprecated (new).
The usage of deprecated characters is strongly discouraged.
DIACRITIC
public static final int DIACRITIC
Binary property Diacritic.
Characters that linguistically modify the meaning of another
character to which they apply.
DOUBLE_LIMIT
public static final int DOUBLE_LIMIT
One more than the last constant for double Unicode properties.
DOUBLE_START
public static final int DOUBLE_START
First constant for double Unicode properties.
EAST_ASIAN_WIDTH
public static final int EAST_ASIAN_WIDTH
Enumerated property East_Asian_Width.
See http://www.unicode.org/reports/tr11/
Returns UCharacter.EastAsianWidth values.
EXTENDER
public static final int EXTENDER
Binary property Extender.
Characters that extend the value or shape of a preceding alphabetic
character, e.g. length and iteration marks.
FULL_COMPOSITION_EXCLUSION
public static final int FULL_COMPOSITION_EXCLUSION
Binary property Full_Composition_Exclusion.
CompositionExclusions.txt + Singleton Decompositions +
Non-Starter Decompositions.
GENERAL_CATEGORY
public static final int GENERAL_CATEGORY
Enumerated property General_Category.
Same as UCharacter.getType(int), returns UCharacterCategory values.
GENERAL_CATEGORY_MASK
public static final int GENERAL_CATEGORY_MASK
Bitmask property General_Category_Mask.
This is the General_Category property returned as a bit mask.
When used in UCharacter.getIntPropertyValue(c),
returns bit masks for UCharacterCategory values where exactly one bit is set.
When used with UCharacter.getPropertyValueName() and UCharacter.getPropertyValueEnum(),
a multi-bit mask is used for sets of categories like "Letters".
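The two kinds of mask described above can be sketched with the JDK alone. This is an assumption-laden illustration: it uses java.lang.Character.getType and the JDK category numbers, which are not guaranteed to match the UCharacterCategory numbering, and the class and member names are hypothetical.

```java
final class CategoryMaskSketch {
    // Single-bit mask for the general category of c: exactly one bit is set.
    static int maskOf(int c) {
        return 1 << Character.getType(c);
    }

    // Multi-bit mask for the set of categories "Letters" (Lu, Ll, Lt, Lm, Lo).
    static final int LETTER_MASK =
          (1 << Character.UPPERCASE_LETTER)
        | (1 << Character.LOWERCASE_LETTER)
        | (1 << Character.TITLECASE_LETTER)
        | (1 << Character.MODIFIER_LETTER)
        | (1 << Character.OTHER_LETTER);

    // A code point is in the set when its single bit overlaps the set mask.
    static boolean isLetter(int c) {
        return (maskOf(c) & LETTER_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(Integer.bitCount(maskOf('A'))); // 1
        System.out.println(isLetter('A'));                 // true
        System.out.println(isLetter('1'));                 // false
    }
}
```

Testing set membership then costs one AND, which is the usual reason to expose a category as a bit mask.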
GRAPHEME_BASE
public static final int GRAPHEME_BASE
Binary property Grapheme_Base (new).
For programmatic determination of grapheme cluster boundaries.
[0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend-CGJ
GRAPHEME_CLUSTER_BREAK
public static final int GRAPHEME_CLUSTER_BREAK
Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1).
Used in UAX #29: Text Boundaries
(http://www.unicode.org/reports/tr29/)
Returns UCharacter.GraphemeClusterBreak values.
GRAPHEME_EXTEND
public static final int GRAPHEME_EXTEND
Binary property Grapheme_Extend (new).
For programmatic determination of grapheme cluster boundaries.
Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ
GRAPHEME_LINK
public static final int GRAPHEME_LINK
Binary property Grapheme_Link (new).
For programmatic determination of grapheme cluster boundaries.
HANGUL_SYLLABLE_TYPE
public static final int HANGUL_SYLLABLE_TYPE
Enumerated property Hangul_Syllable_Type (new in Unicode 4).
Returns UCharacter.HangulSyllableType values.
HEX_DIGIT
public static final int HEX_DIGIT
Binary property Hex_Digit.
Characters commonly used for hexadecimal numbers.
HYPHEN
public static final int HYPHEN
Binary property Hyphen.
Dashes used to mark connections between pieces of words, plus the
Katakana middle dot.
IDEOGRAPHIC
public static final int IDEOGRAPHIC
Binary property Ideographic.
CJKV ideographs.
IDS_BINARY_OPERATOR
public static final int IDS_BINARY_OPERATOR
Binary property IDS_Binary_Operator (new).
For programmatic determination of Ideographic Description Sequences.
IDS_TRINARY_OPERATOR
public static final int IDS_TRINARY_OPERATOR
Binary property IDS_Trinary_Operator (new).
For programmatic determination of Ideographic Description
Sequences.
ID_CONTINUE
public static final int ID_CONTINUE
Binary property ID_Continue.
Characters that can continue an identifier.
ID_Start+Mn+Mc+Nd+Pc
ID_START
public static final int ID_START
Binary property ID_Start.
Characters that can start an identifier.
Lu+Ll+Lt+Lm+Lo+Nl
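The category formulas given for ID_Start and ID_Continue can be written out as a small sketch. It assumes the formulas exactly as listed here and uses java.lang.Character.getType from the JDK, so it may disagree with ICU4J for code points that the actual properties add or remove. The class and method names are hypothetical.

```java
final class IdentifierCategorySketch {
    // ID_Start as listed: Lu+Ll+Lt+Lm+Lo+Nl.
    static boolean isIdStartByCategory(int c) {
        switch (Character.getType(c)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.LETTER_NUMBER:
                return true;
            default:
                return false;
        }
    }

    // ID_Continue as listed: ID_Start+Mn+Mc+Nd+Pc.
    static boolean isIdContinueByCategory(int c) {
        if (isIdStartByCategory(c)) {
            return true;
        }
        switch (Character.getType(c)) {
            case Character.NON_SPACING_MARK:
            case Character.COMBINING_SPACING_MARK:
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.CONNECTOR_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isIdStartByCategory('a'));    // true
        System.out.println(isIdStartByCategory('1'));    // false
        System.out.println(isIdContinueByCategory('1')); // true
        System.out.println(isIdContinueByCategory('_')); // true
    }
}
```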
INT_LIMIT
public static final int INT_LIMIT
One more than the last constant for enumerated/integer Unicode
properties.
INT_START
public static final int INT_START
First constant for enumerated/integer Unicode properties.
ISO_COMMENT
public static final int ISO_COMMENT
String property ISO_Comment.
Corresponds to UCharacter.getISOComment(int).
JOINING_GROUP
public static final int JOINING_GROUP
Enumerated property Joining_Group.
Returns UCharacter.JoiningGroup values.
JOINING_TYPE
public static final int JOINING_TYPE
Enumerated property Joining_Type.
Returns UCharacter.JoiningType values.
JOIN_CONTROL
public static final int JOIN_CONTROL
Binary property Join_Control.
Format controls for cursive joining and ligation.
LEAD_CANONICAL_COMBINING_CLASS
public static final int LEAD_CANONICAL_COMBINING_CLASS
Enumerated property Lead_Canonical_Combining_Class.
ICU-specific property for the ccc of the first code point
of the decomposition, or lccc(c)=ccc(NFD(c)[0]).
Useful for checking for canonically ordered text;
see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD .
Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.
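To make the notation NFD(c)[0] concrete, the sketch below finds the first and last code points of a canonical decomposition using java.text.Normalizer from the JDK. It only locates the code point whose combining class is reported. The JDK has no public ccc lookup, so that final step is left out here, and with ICU4J it would go through UCharacter.getCombiningClass(int). The class and method names are hypothetical.

```java
import java.text.Normalizer;

final class DecompositionEndsSketch {
    // NFD of a single code point, as a String.
    static String nfd(int c) {
        String s = new String(Character.toChars(c));
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // NFD(c)[0]: the code point consulted for the leading combining class.
    static int firstOfNfd(int c) {
        return nfd(c).codePointAt(0);
    }

    // NFD(c)[last]: the code point consulted for the trailing combining class.
    static int lastOfNfd(int c) {
        String d = nfd(c);
        return d.codePointBefore(d.length());
    }

    public static void main(String[] args) {
        int aUmlaut = 0x00E4; // decomposes to U+0061 U+0308
        System.out.printf("first=U+%04X last=U+%04X%n",
                firstOfNfd(aUmlaut), lastOfNfd(aUmlaut));
        // prints: first=U+0061 last=U+0308
    }
}
```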
LINE_BREAK
public static final int LINE_BREAK
Enumerated property Line_Break.
Returns UCharacter.LineBreak values.
LOGICAL_ORDER_EXCEPTION
public static final int LOGICAL_ORDER_EXCEPTION
Binary property Logical_Order_Exception (new).
Characters that do not use logical order and require special
handling in most processing.
LOWERCASE
public static final int LOWERCASE
Binary property Lowercase.
Same as UCharacter.isULowercase(int), different from
UCharacter.isLowerCase(int).
Ll+Other_Lowercase
LOWERCASE_MAPPING
public static final int LOWERCASE_MAPPING
String property Lowercase_Mapping.
Corresponds to UCharacter.toLowerCase(String).
MASK_LIMIT
public static final int MASK_LIMIT
One more than the last constant for bit-mask Unicode properties.
MASK_START
public static final int MASK_START
First constant for bit-mask Unicode properties.
MATH
public static final int MATH
Binary property Math.
Sm+Other_Math
NAME
public static final int NAME
String property Name.
Corresponds to UCharacter.getName(int).
NFC_INERT
public static final int NFC_INERT
Binary property NFC_Inert.
ICU-specific property for characters that are inert under NFC,
i.e., they do not interact with adjacent characters.
Used for example in normalizing transforms in incremental mode
to find the boundary of safely normalizable text despite possible
text additions.
NFC_QUICK_CHECK
public static final int NFC_QUICK_CHECK
Enumerated property NFC_Quick_Check.
Returns numeric values compatible with Normalizer.QuickCheckResult.
NFD_INERT
public static final int NFD_INERT
Binary property NFD_Inert.
ICU-specific property for characters that are inert under NFD,
i.e., they do not interact with adjacent characters.
Used for example in normalizing transforms in incremental mode
to find the boundary of safely normalizable text despite possible
text additions.
There is one such property per normalization form.
These properties are computed as follows. An inert character is:
a) unassigned, or ALL of the following:
b) of combining class 0.
c) not decomposed by this normalization form.
AND if NFC or NFKC,
d) can never compose with a previous character.
e) can never compose with a following character.
f) can never change if another character is added.
Example: a-breve might satisfy all but f, but if you
add an ogonek it changes to a-ogonek + breve.
See also com.ibm.text.UCD.NFSkippable in the ICU4J repository,
and icu/source/common/unormimp.h .
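The a-breve example for condition f can be reproduced with java.text.Normalizer from the JDK. This sketch only demonstrates the behavior that makes a-breve non-inert under NFC. It does not compute the inert properties themselves, and the class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.stream.Collectors;

final class InertExampleSketch {
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Renders a string as space-separated U+XXXX code points.
    static String hex(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String aBreve = "\u0103";          // LATIN SMALL LETTER A WITH BREVE
        String withOgonek = "\u0103\u0328"; // a-breve followed by COMBINING OGONEK

        // Alone, a-breve is already in NFC and stays unchanged.
        System.out.println(hex(nfc(aBreve)));     // U+0103
        // Appending an ogonek changes the earlier character:
        // the result is a-ogonek followed by a combining breve.
        System.out.println(hex(nfc(withOgonek))); // U+0105 U+0306
    }
}
```

The ogonek has a lower combining class than the breve, so canonical reordering moves it next to the base letter, where it composes first.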
NFD_QUICK_CHECK
public static final int NFD_QUICK_CHECK
Enumerated property NFD_Quick_Check.
Returns numeric values compatible with Normalizer.QuickCheckResult.
NFKC_INERT
public static final int NFKC_INERT
Binary property NFKC_Inert.
ICU-specific property for characters that are inert under NFKC,
i.e., they do not interact with adjacent characters.
Used for example in normalizing transforms in incremental mode
to find the boundary of safely normalizable text despite possible
text additions.
NFKC_QUICK_CHECK
public static final int NFKC_QUICK_CHECK
Enumerated property NFKC_Quick_Check.
Returns numeric values compatible with Normalizer.QuickCheckResult.
NFKD_INERT
public static final int NFKD_INERT
Binary property NFKD_Inert.
ICU-specific property for characters that are inert under NFKD,
i.e., they do not interact with adjacent characters.
Used for example in normalizing transforms in incremental mode
to find the boundary of safely normalizable text despite possible
text additions.
NFKD_QUICK_CHECK
public static final int NFKD_QUICK_CHECK
Enumerated property NFKD_Quick_Check.
Returns numeric values compatible with Normalizer.QuickCheckResult.
NONCHARACTER_CODE_POINT
public static final int NONCHARACTER_CODE_POINT
Binary property Noncharacter_Code_Point.
Code points that are explicitly defined as illegal for the encoding
of characters.
NUMERIC_TYPE
public static final int NUMERIC_TYPE
Enumerated property Numeric_Type.
Returns UCharacter.NumericType values.
NUMERIC_VALUE
public static final int NUMERIC_VALUE
Double property Numeric_Value.
Corresponds to UCharacter.getUnicodeNumericValue(int).
PATTERN_SYNTAX
public static final int PATTERN_SYNTAX
Binary property Pattern_Syntax (new in Unicode 4.1).
See UAX #31 Identifier and Pattern Syntax
(http://www.unicode.org/reports/tr31/)
PATTERN_WHITE_SPACE
public static final int PATTERN_WHITE_SPACE
Binary property Pattern_White_Space (new in Unicode 4.1).
See UAX #31 Identifier and Pattern Syntax
(http://www.unicode.org/reports/tr31/)
POSIX_ALNUM
public static final int POSIX_ALNUM
Binary property alnum (a C/POSIX character class).
Implemented according to the UTS #18 Annex C Standard Recommendation.
See the UCharacter class documentation.
POSIX_BLANK
public static final int POSIX_BLANK
Binary property blank (a C/POSIX character class).
Implemented according to the UTS #18 Annex C Standard Recommendation.
See the UCharacter class documentation.
POSIX_GRAPH
public static final int POSIX_GRAPH
Binary property graph (a C/POSIX character class).
Implemented according to the UTS #18 Annex C Standard Recommendation.
See the UCharacter class documentation.
POSIX_PRINT
public static final int POSIX_PRINT
Binary property print (a C/POSIX character class).
Implemented according to the UTS #18 Annex C Standard Recommendation.
See the UCharacter class documentation.
POSIX_XDIGIT
public static final int POSIX_XDIGIT
Binary property xdigit (a C/POSIX character class).
Implemented according to the UTS #18 Annex C Standard Recommendation.
See the UCharacter class documentation.
QUOTATION_MARK
public static final int QUOTATION_MARK
Binary property Quotation_Mark.
RADICAL
public static final int RADICAL
Binary property Radical (new).
For programmatic determination of Ideographic Description
Sequences.
SCRIPT
public static final int SCRIPT
Enumerated property Script.
Same as UScript.getScript(int), returns UScript values.
SEGMENT_STARTER
public static final int SEGMENT_STARTER
Binary property Segment_Starter.
ICU-specific property for characters that are starters in terms of
Unicode normalization and combining character sequences.
They have ccc=0 and do not occur in non-initial position of the
canonical decomposition of any character
(like the combining diaeresis in NFD(a-umlaut) and a Jamo T in NFD(Hangul LVT)).
ICU uses this property for segmenting a string for generating a set of
canonically equivalent strings, e.g. for canonical closure while
processing collation tailoring rules.
SENTENCE_BREAK
public static final int SENTENCE_BREAK
Enumerated property Sentence_Break (new in Unicode 4.1).
Used in UAX #29: Text Boundaries
(http://www.unicode.org/reports/tr29/)
Returns UCharacter.SentenceBreak values.
SIMPLE_CASE_FOLDING
public static final int SIMPLE_CASE_FOLDING
String property Simple_Case_Folding.
Corresponds to UCharacter.foldCase(int, boolean).
SIMPLE_LOWERCASE_MAPPING
public static final int SIMPLE_LOWERCASE_MAPPING
String property Simple_Lowercase_Mapping.
Corresponds to UCharacter.toLowerCase(int).
SIMPLE_TITLECASE_MAPPING
public static final int SIMPLE_TITLECASE_MAPPING
String property Simple_Titlecase_Mapping.
Corresponds to UCharacter.toTitleCase(int).
SIMPLE_UPPERCASE_MAPPING
public static final int SIMPLE_UPPERCASE_MAPPING
String property Simple_Uppercase_Mapping.
Corresponds to UCharacter.toUpperCase(int).
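The difference between a simple mapping (one code point to one code point) and a full mapping (which may change the length) is easiest to see with U+00DF, the German sharp s. The sketch uses the JDK methods Character.toUpperCase(int) and String.toUpperCase(Locale) as stand-ins, on the assumption that they behave like the UCharacter methods named in these entries for this character. The class and method names are hypothetical.

```java
import java.util.Locale;

final class SimpleVsFullCaseSketch {
    // Simple mapping: always returns a single code point.
    static int simpleUpper(int c) {
        return Character.toUpperCase(c);
    }

    // Full mapping: operates on strings and may expand one character to several.
    static String fullUpper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // U+00DF has no single-code-point uppercase, so it maps to itself.
        System.out.printf("U+%04X%n", simpleUpper(0x00DF)); // U+00DF
        // The full mapping expands it to two letters.
        System.out.println(fullUpper("\u00DF"));            // SS
    }
}
```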
SOFT_DOTTED
public static final int SOFT_DOTTED
Binary property Soft_Dotted (new).
Characters with a "soft dot", like i or j.
An accent placed on these characters causes the dot to disappear.
STRING_LIMIT
public static final int STRING_LIMIT
One more than the last constant for string Unicode properties.
STRING_START
public static final int STRING_START
First constant for string Unicode properties.
S_TERM
public static final int S_TERM
Binary property STerm (new in Unicode 4.0.1).
Sentence Terminal. Used in UAX #29: Text Boundaries
(http://www.unicode.org/reports/tr29/)
TERMINAL_PUNCTUATION
public static final int TERMINAL_PUNCTUATION
Binary property Terminal_Punctuation.
Punctuation characters that generally mark the end of textual
units.
TITLECASE_MAPPING
public static final int TITLECASE_MAPPING
String property Titlecase_Mapping.
Corresponds to UCharacter.toTitleCase(String).
TRAIL_CANONICAL_COMBINING_CLASS
public static final int TRAIL_CANONICAL_COMBINING_CLASS
Enumerated property Trail_Canonical_Combining_Class.
ICU-specific property for the ccc of the last code point
of the decomposition, or tccc(c)=ccc(NFD(c)[last]).
Useful for checking for canonically ordered text;
see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD .
Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.
UNICODE_1_NAME
public static final int UNICODE_1_NAME
String property Unicode_1_Name.
Corresponds to UCharacter.getName1_0(int).
UNIFIED_IDEOGRAPH
public static final int UNIFIED_IDEOGRAPH
Binary property Unified_Ideograph (new).
For programmatic determination of Ideographic Description
Sequences.
UPPERCASE
public static final int UPPERCASE
Binary property Uppercase.
Same as UCharacter.isUUppercase(), different from
UCharacter.isUpperCase().
Lu+Other_Uppercase
UPPERCASE_MAPPING
public static final int UPPERCASE_MAPPING
String property Uppercase_Mapping.
Corresponds to UCharacter.toUpperCase(String).
VARIATION_SELECTOR
public static final int VARIATION_SELECTOR
Binary property Variation_Selector (new in Unicode 4.0.1).
Indicates all those characters that qualify as Variation Selectors.
For details on the behavior of these characters,
see StandardizedVariants.html and section 15.6, Variation Selectors,
of the Unicode Standard.
WHITE_SPACE
public static final int WHITE_SPACE
Binary property White_Space.
Same as UCharacter.isUWhiteSpace(), different from
UCharacter.isSpace() and UCharacter.isWhitespace().
Space characters+TAB+CR+LF-ZWSP-ZWNBSP
WORD_BREAK
public static final int WORD_BREAK
Enumerated property Word_Break (new in Unicode 4.1).
Used in UAX #29: Text Boundaries
(http://www.unicode.org/reports/tr29/)
Returns UCharacter.WordBreak values.
XID_CONTINUE
public static final int XID_CONTINUE
Binary property XID_Continue.
ID_Continue modified to allow closure under normalization forms
NFKC and NFKD.
XID_START
public static final int XID_START
Binary property XID_Start.
ID_Start modified to allow closure under normalization forms NFKC
and NFKD.