net.sf.saxon.codenorm

Class NormalizerData

public class NormalizerData extends Object

Accesses the Normalization Data used for Forms C and D.
Copyright © 1998-1999 Unicode, Inc. All Rights Reserved.
The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information here.

Author: Mark Davis

Field Summary
static String copyright
static int NOT_COMPOSITE
Constant for use in getPairwiseComposition
Constructor Summary
NormalizerData(IntToIntHashMap canonicalClass, IntHashMap decompose, IntToIntHashMap compose, BitSet isCompatibility, BitSet isExcluded)
Only accessed by NormalizerBuilder.
Method Summary
int getCanonicalClass(int ch)
Gets the combining class of a character from the Unicode Character Database.
boolean getExcluded(char ch)
Just accessible for testing.
char getPairwiseComposition(int first, int second)
Returns the composite of the two characters.
String getRawDecompositionMapping(char ch)
Just accessible for testing.
void getRecursiveDecomposition(boolean canonical, int ch, StringBuffer buffer)
Gets recursive decomposition of a character from the Unicode Character Database.

Field Detail

copyright

static final String copyright

NOT_COMPOSITE

public static final int NOT_COMPOSITE
Constant for use in getPairwiseComposition

Constructor Detail

NormalizerData

NormalizerData(IntToIntHashMap canonicalClass, IntHashMap decompose, IntToIntHashMap compose, BitSet isCompatibility, BitSet isExcluded)
Only accessed by NormalizerBuilder.

Method Detail

getCanonicalClass

public int getCanonicalClass(int ch)
Gets the combining class of a character from the Unicode Character Database.

Parameters:
ch - the source character

Returns: value from 0 to 255
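
The combining class matters mainly because it drives canonical ordering of combining marks. NormalizerData instances are built only by NormalizerBuilder, so the following minimal sketch uses the JDK's java.text.Normalizer instead to show the effect. The class name CanonicalOrderingSketch and the helper nfd are hypothetical. It relies on U+0301 (combining acute) having combining class 230 and U+0323 (combining dot below) having class 220 in the Unicode Character Database.

```java
import java.text.Normalizer;

// Illustrative sketch only: it does not call NormalizerData.
final class CanonicalOrderingSketch {

    // Hypothetical helper: decompose to Normalization Form D.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // Acute (class 230) is typed before dot below (class 220).
        String input = "a\u0301\u0323";
        String ordered = nfd(input);

        // Canonical ordering sorts adjacent marks by ascending combining
        // class, so the dot below moves in front of the acute.
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < ordered.length(); i++) {
            hex.append(String.format("U+%04X ", (int) ordered.charAt(i)));
        }
        System.out.println(hex.toString().trim()); // prints U+0061 U+0323 U+0301
    }
}
```

A character with combining class 0 (a starter, such as 'a') is never reordered, which is why the base letter stays in first position.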

getExcluded

boolean getExcluded(char ch)
Just accessible for testing.

getPairwiseComposition

public char getPairwiseComposition(int first, int second)
Returns the composite of the two characters. If the two characters don't combine, returns NOT_COMPOSITE. Only has to worry about BMP characters, since those are the only ones that can ever compose.

Parameters:
first - first character (e.g. 'c')
second - second character (e.g. '¸' cedilla)

Returns: composite (e.g. 'ç')
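
The lookup described above can be pictured as a table keyed on the pair of characters. The following is a minimal sketch under stated assumptions, not the Saxon implementation: the class PairwiseCompositionSketch, the key packing, the two table entries, and the sentinel value '\uFFFF' are all hypothetical. The real value of NOT_COMPOSITE is not given on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a toy pairwise composition table.
final class PairwiseCompositionSketch {

    // Hypothetical sentinel; the actual NOT_COMPOSITE value may differ.
    static final char NOT_COMPOSITE = '\uFFFF';

    private static final Map<Integer, Character> COMPOSE = new HashMap<>();

    static {
        // c + combining cedilla (U+0327) composes to U+00E7
        COMPOSE.put(key('c', 0x0327), '\u00E7');
        // e + combining acute (U+0301) composes to U+00E9
        COMPOSE.put(key('e', 0x0301), '\u00E9');
    }

    // Pack two BMP code points into one int key (an assumed encoding).
    static int key(int first, int second) {
        return (first << 16) | second;
    }

    static char compose(int first, int second) {
        // Mirrors the documented assumption that only BMP characters compose.
        if (first < 0 || first > 0xFFFF || second < 0 || second > 0xFFFF) {
            return NOT_COMPOSITE;
        }
        Character composite = COMPOSE.get(key(first, second));
        return composite == null ? NOT_COMPOSITE : composite;
    }

    public static void main(String[] args) {
        System.out.println(compose('c', 0x0327) == '\u00E7');    // prints true
        System.out.println(compose('c', 'd') == NOT_COMPOSITE);  // prints true
    }
}
```

Callers compare the result against the sentinel before using it, so a pair that does not combine is left as two separate characters.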

getRawDecompositionMapping

String getRawDecompositionMapping(char ch)
Just accessible for testing.

getRecursiveDecomposition

public void getRecursiveDecomposition(boolean canonical, int ch, StringBuffer buffer)
Gets recursive decomposition of a character from the Unicode Character Database.

Parameters:
canonical - if true, selects the recursive canonical decomposition; otherwise selects the recursive compatibility and canonical decomposition
ch - the source character
buffer - buffer to be filled with the decomposition
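
To illustrate what "recursive" and the canonical flag mean here, the following minimal sketch decomposes characters against a tiny hand-written table. It is an assumption-laden toy, not the Saxon code: the class RecursiveDecompositionSketch, the method decompose, and the RAW and COMPATIBILITY tables are hypothetical, and only three mappings from the Unicode Character Database are included.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: recursive decomposition over a toy table.
final class RecursiveDecompositionSketch {

    private static final Map<Integer, String> RAW = new HashMap<>();
    private static final Set<Integer> COMPATIBILITY = new HashSet<>();

    static {
        RAW.put(0x01D5, "\u00DC\u0304"); // U+01D5 -> U+00DC + combining macron
        RAW.put(0x00DC, "U\u0308");      // U+00DC -> U + combining diaeresis
        RAW.put(0xFB01, "fi");           // U+FB01 (fi ligature) -> f + i
        COMPATIBILITY.add(0xFB01);       // the ligature mapping is compatibility-only
    }

    static void decompose(boolean canonical, int ch, StringBuffer buffer) {
        String raw = RAW.get(ch);
        // No mapping, or a compatibility mapping while only canonical
        // decomposition was requested: emit the character unchanged.
        if (raw == null || (canonical && COMPATIBILITY.contains(ch))) {
            buffer.appendCodePoint(ch);
            return;
        }
        // Otherwise decompose each character of the mapping in turn.
        for (int i = 0; i < raw.length(); ) {
            int cp = raw.codePointAt(i);
            decompose(canonical, cp, buffer);
            i += Character.charCount(cp);
        }
    }

    public static void main(String[] args) {
        StringBuffer canonicalOnly = new StringBuffer();
        decompose(true, 0xFB01, canonicalOnly);
        System.out.println(canonicalOnly.length()); // prints 1 (ligature kept)

        StringBuffer withCompatibility = new StringBuffer();
        decompose(false, 0xFB01, withCompatibility);
        System.out.println(withCompatibility);      // prints fi
    }
}
```

The two-level mapping for U+01D5 shows why recursion is needed: a single table lookup yields U+00DC plus a macron, and U+00DC must itself be decomposed before the buffer holds only fully decomposed characters.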