net.sf.saxon.codenorm

Class Normalizer

public class Normalizer extends Object

Implements Unicode Normalization Forms C, D, KC, and KD.
Copyright (c) 1991-2005 Unicode, Inc. For terms of use, see http://www.unicode.org/terms_of_use.html. For documentation, see UAX #15 (Unicode Normalization Forms).
The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.

Author: Mark Davis
Updates for supplementary code points: Vladimir Weinstein & Markus Scherer
Modified to remove dependency on ICU code: Michael Kay

Field Summary
static byte C
Normalization Form Selector
static byte COMPATIBILITY_MASK
Masks for the form selector
static byte COMPOSITION_MASK
Masks for the form selector
static byte D
Normalization Form Selector
static byte KC
Normalization Form Selector
static byte KD
Normalization Form Selector
Constructor Summary
Normalizer(byte form)
Create a normalizer for a given form.
Method Summary
boolean getExcluded(char ch)
Accessible only for testing.
String getRawDecompositionMapping(char ch)
Accessible only for testing.
StringBuffer normalize(CharSequence source, StringBuffer target)
Normalizes text according to the chosen form, replacing contents of the target buffer.
CharSequence normalize(CharSequence source)
Normalizes text according to the chosen form.

Field Detail

C

public static final byte C
Normalization Form Selector

COMPATIBILITY_MASK

static final byte COMPATIBILITY_MASK
Masks for the form selector

COMPOSITION_MASK

static final byte COMPOSITION_MASK
Masks for the form selector

D

public static final byte D
Normalization Form Selector

KC

public static final byte KC
Normalization Form Selector

KD

public static final byte KD
Normalization Form Selector
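
The selector and mask fields above suggest that each form is encoded as a combination of two bits, one for compatibility and one for composition. This page does not list the constant values, so the sketch below uses assumed values in a hypothetical class, FormSelectorSketch, purely to illustrate how such masks could be tested. Check the Saxon source before relying on any of these numbers.

```java
// Sketch only: the bit values below are assumptions, not taken from this page.
final class FormSelectorSketch {
    // Hypothetical mask values; the real constants may differ.
    static final byte COMPATIBILITY_MASK = 1;
    static final byte COMPOSITION_MASK = 2;

    // Assumed encoding of the four forms as combinations of the two masks.
    static final byte D = 0;
    static final byte C = COMPOSITION_MASK;
    static final byte KD = COMPATIBILITY_MASK;
    static final byte KC = (byte) (COMPATIBILITY_MASK + COMPOSITION_MASK);

    // True for the compatibility forms (KC, KD) under the assumed encoding.
    static boolean usesCompatibility(byte form) {
        return (form & COMPATIBILITY_MASK) != 0;
    }

    // True for the composing forms (C, KC) under the assumed encoding.
    static boolean composes(byte form) {
        return (form & COMPOSITION_MASK) != 0;
    }
}
```

Under this assumed encoding, FormSelectorSketch.composes(FormSelectorSketch.KC) and FormSelectorSketch.usesCompatibility(FormSelectorSketch.KC) both hold, while neither holds for D.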

Constructor Detail

Normalizer

public Normalizer(byte form)
Create a normalizer for a given form.

Method Detail

getExcluded

boolean getExcluded(char ch)
Accessible only for testing.

getRawDecompositionMapping

String getRawDecompositionMapping(char ch)
Accessible only for testing.

normalize

public StringBuffer normalize(CharSequence source, StringBuffer target)
Normalizes text according to the chosen form, replacing contents of the target buffer.

Parameters:
source - the original text, unnormalized
target - the resulting normalized text
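
The "replacing contents of the target buffer" contract can be sketched as follows. This example does not call the Saxon class. It uses the JDK's java.text.Normalizer as a stand-in, and the class BufferNormalizeSketch and method normalizeInto are hypothetical names introduced only for illustration. Whether the Saxon method returns the same buffer instance is an assumption here.

```java
import java.text.Normalizer;

// Sketch only: mimics the documented behavior with the JDK normalizer,
// not with net.sf.saxon.codenorm.Normalizer.
final class BufferNormalizeSketch {
    // Discards whatever the target held, then writes the normalized text into it.
    static StringBuffer normalizeInto(CharSequence source, StringBuffer target,
                                      Normalizer.Form form) {
        target.setLength(0); // replace, rather than append to, existing contents
        target.append(Normalizer.normalize(source, form));
        return target;       // assumed: the same buffer is handed back
    }
}
```

A caller that reuses one buffer across many strings avoids allocating a new result object each time, which is the likely motivation for this overload.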

normalize

public CharSequence normalize(CharSequence source)
Normalizes text according to the chosen form.

Parameters:
source - the original text, unnormalized

Returns:
the resulting normalized text
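
To see what the four forms do to the same input, the sketch below maps the form names used on this page (C, D, KC, KD) onto the JDK's java.text.Normalizer forms. It is a stand-in for illustration, not the Saxon implementation. FormComparisonSketch and apply are hypothetical names, and the string-based form selector is an assumption made to keep the example independent of the byte constants.

```java
import java.text.Normalizer;

// Sketch only: uses the JDK normalizer to illustrate the four forms.
final class FormComparisonSketch {
    // Normalizes source under the form named "C", "D", "KC", or "KD".
    static String apply(String form, CharSequence source) {
        switch (form) {
            case "C":
                return Normalizer.normalize(source, Normalizer.Form.NFC);
            case "D":
                return Normalizer.normalize(source, Normalizer.Form.NFD);
            case "KC":
                return Normalizer.normalize(source, Normalizer.Form.NFKC);
            case "KD":
                return Normalizer.normalize(source, Normalizer.Form.NFKD);
            default:
                throw new IllegalArgumentException("Unknown form: " + form);
        }
    }
}
```

With this helper, form D splits the precomposed character U+00E9 into "e" followed by the combining acute accent U+0301, and form C recombines them. The compatibility forms KC and KD additionally replace the ligature U+FB01 with the two letters "fi", whereas C and D leave it unchanged.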