net.sf.saxon.codenorm
Class Normalizer
public class Normalizer
extends Object
Implements Unicode Normalization Forms C, D, KC, KD.
Copyright (c) 1991-2005 Unicode, Inc.
For terms of use, see http://www.unicode.org/terms_of_use.html
For documentation, see UAX#15.
The Unicode Consortium makes no expressed or implied warranty of any
kind, and assumes no liability for errors or omissions.
No liability is assumed for incidental and consequential damages
in connection with or arising out of the use of the information here.
Author: Mark Davis
Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
Modified to remove dependency on ICU code: Michael Kay
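The four forms named above can be illustrated with a short, self-contained sketch. It does not call this class: it uses the JDK's java.text.Normalizer as a stand-in, on the assumption that both follow UAX #15 and therefore agree on these simple inputs. The class name NormalizationFormsDemo and its helper method are hypothetical.

```java
import java.text.Normalizer;

// Stand-in demo: uses java.text.Normalizer from the JDK, not net.sf.saxon.codenorm.Normalizer.
final class NormalizationFormsDemo {

    /** Normalizes text to the given UAX #15 form using the JDK implementation. */
    static String normalize(String text, Normalizer.Form form) {
        return Normalizer.normalize(text, form);
    }

    public static void main(String[] args) {
        String composed = "\u00E9";     // LATIN SMALL LETTER E WITH ACUTE, one code point
        String decomposed = "e\u0301";  // "e" followed by COMBINING ACUTE ACCENT
        String ligature = "\uFB01";     // LATIN SMALL LIGATURE FI

        // Form C composes; form D decomposes.
        System.out.println(normalize(decomposed, Normalizer.Form.NFC).equals(composed));  // prints true
        System.out.println(normalize(composed, Normalizer.Form.NFD).equals(decomposed));  // prints true

        // The K forms also apply compatibility mappings; C and D leave the ligature alone.
        System.out.println(normalize(ligature, Normalizer.Form.NFKC));                    // prints fi
        System.out.println(normalize(ligature, Normalizer.Form.NFC).equals(ligature));    // prints true
    }
}
```

With the class documented here, the equivalent call would presumably construct a Normalizer with one of the selectors C, D, KC, or KD and then invoke normalize(source).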
Field Summary
static byte | C | Normalization Form Selector
static byte | COMPATIBILITY_MASK | Masks for the form selector
static byte | COMPOSITION_MASK | Masks for the form selector
static byte | D | Normalization Form Selector
static byte | KC | Normalization Form Selector
static byte | KD | Normalization Form Selector
Method Summary
boolean | getExcluded(char ch) | Accessible only for testing.
String | getRawDecompositionMapping(char ch) | Accessible only for testing.
StringBuffer | normalize(CharSequence source, StringBuffer target) | Normalizes text according to the chosen form, replacing contents of the target buffer.
CharSequence | normalize(CharSequence source) | Normalizes text according to the chosen form.
public static final byte C
Normalization Form Selector
static final byte COMPATIBILITY_MASK
Masks for the form selector
static final byte COMPOSITION_MASK
Masks for the form selector
public static final byte D
Normalization Form Selector
public static final byte KC
Normalization Form Selector
public static final byte KD
Normalization Form Selector
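One way the two masks could relate to the four selectors is as independent bits, one for compatibility mapping and one for composition. The sketch below shows that encoding. The numeric values are assumptions made for illustration (this page does not document them), and the class FormSelectorSketch and its helper methods are hypothetical.

```java
// Illustrative only: the bit values below are assumed, not taken from this page.
final class FormSelectorSketch {
    static final byte COMPATIBILITY_MASK = 1; // assumed: set for the K forms
    static final byte COMPOSITION_MASK = 2;   // assumed: set for the composing forms

    static final byte D = 0;
    static final byte C = COMPOSITION_MASK;
    static final byte KD = COMPATIBILITY_MASK;
    static final byte KC = (byte) (COMPATIBILITY_MASK | COMPOSITION_MASK);

    /** True if the form applies compatibility decompositions (KC, KD). */
    static boolean isCompatibility(byte form) {
        return (form & COMPATIBILITY_MASK) != 0;
    }

    /** True if the form recomposes after decomposing (C, KC). */
    static boolean isComposing(byte form) {
        return (form & COMPOSITION_MASK) != 0;
    }

    public static void main(String[] args) {
        byte[] forms = {D, C, KD, KC};
        String[] names = {"D", "C", "KD", "KC"};
        for (int i = 0; i < forms.length; i++) {
            System.out.println(names[i] + ": compatibility=" + isCompatibility(forms[i])
                    + ", composing=" + isComposing(forms[i]));
        }
    }
}
```

Under this assumed encoding, a normalizer can decide which decomposition table to consult and whether to run a composition pass by testing the two bits separately.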
public Normalizer(byte form)
Create a normalizer for a given form.
boolean getExcluded(char ch)
Accessible only for testing.
String getRawDecompositionMapping(char ch)
Accessible only for testing.
public StringBuffer normalize(CharSequence source, StringBuffer target)
Normalizes text according to the chosen form,
replacing contents of the target buffer.
Parameters:
source - the original text, unnormalized
target - the resulting normalized text
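The "replacing contents of the target buffer" behaviour can be sketched as follows. This is not the implementation of this class: it is a minimal stand-in that delegates to the JDK's java.text.Normalizer with form NFC, and the names BufferNormalizeSketch and normalizeInto are hypothetical.

```java
import java.text.Normalizer;

// Stand-in sketch of a buffer-replacing normalize; not the Saxon implementation.
final class BufferNormalizeSketch {

    /**
     * Clears the target buffer, writes the NFC form of source into it,
     * and returns the same buffer so that calls can be chained.
     */
    static StringBuffer normalizeInto(CharSequence source, StringBuffer target) {
        target.setLength(0); // discard whatever the buffer held before
        target.append(Normalizer.normalize(source, Normalizer.Form.NFC));
        return target;
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer("stale contents");
        StringBuffer result = normalizeInto("e\u0301", buffer);
        System.out.println(result == buffer);  // prints true
        System.out.println(buffer.length());   // prints 1
    }
}
```

Reusing one buffer across many calls in this way avoids allocating a new result object for each string being normalized.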
public CharSequence normalize(CharSequence source)
Normalizes text according to the chosen form.
Parameters:
source - the original text, unnormalized
Returns:
the resulting normalized text