Package com.ibm.icu.text
Class CharsetRecog_2022
java.lang.Object
  com.ibm.icu.text.CharsetRecognizer
    com.ibm.icu.text.CharsetRecog_2022

Direct Known Subclasses:
CharsetRecog_2022.CharsetRecog_2022CN, CharsetRecog_2022.CharsetRecog_2022JP, CharsetRecog_2022.CharsetRecog_2022KR
abstract class CharsetRecog_2022 extends CharsetRecognizer
Class CharsetRecog_2022 is part of the ICU charset detection implementation. It is the superclass of the individual detectors for each detectable member of the ISO 2022 family of encodings. The individual detector classes are nested within this class.
Nested Class Summary
Nested Classes
Modifier and Type                 Class                                    Description
(package private) static class    CharsetRecog_2022.CharsetRecog_2022CN
(package private) static class    CharsetRecog_2022.CharsetRecog_2022JP
(package private) static class    CharsetRecog_2022.CharsetRecog_2022KR
-
Constructor Summary
Constructors
Constructor            Description
CharsetRecog_2022()
-
Method Summary
Modifier and Type        Method
(package private) int    match(byte[] text, int textLen, byte[][] escapeSequences)
    Matching function shared among the 2022 detectors (JP, CN, and KR). Counts the legal and unrecognized escape sequences in the sample of text, and computes a score based on the total number of sequences and the proportion that fit the encoding.
Methods inherited from class com.ibm.icu.text.CharsetRecognizer
getLanguage, getName, match
Method Detail
-
match
int match(byte[] text, int textLen, byte[][] escapeSequences)
Matching function shared among the 2022 detectors (JP, CN, and KR). Counts the legal and unrecognized escape sequences in the sample of text, and computes a score based on the total number of sequences and the proportion that fit the encoding.
Parameters:
text - the byte buffer containing the text to analyse
textLen - the size of the text, in bytes
escapeSequences - the byte escape sequences to test for
Returns:
match quality, in the range 0-100
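The description above can be illustrated with a minimal, standalone sketch. It is not the ICU source: the class name EscapeScoreSketch, the helper startsWith, and the exact scoring constants (the linear hit/miss ratio and the penalty for fewer than five recognized sequences) are assumptions chosen for illustration only. The sketch scans for ESC (0x1b) bytes, counts each one as a hit if it begins one of the supplied escape sequences and as a miss otherwise, then derives a 0-100 score from the proportion of hits and the total number seen.

```java
final class EscapeScoreSketch {

    // Hypothetical helper: true if seq occurs in text starting at offset pos.
    static boolean startsWith(byte[] text, int textLen, int pos, byte[] seq) {
        if (textLen - pos < seq.length) {
            return false;
        }
        for (int j = 0; j < seq.length; j++) {
            if (text[pos + j] != seq[j]) {
                return false;
            }
        }
        return true;
    }

    // Illustrative scoring only; the constants below are assumptions.
    static int score(byte[] text, int textLen, byte[][] escapeSequences) {
        int hits = 0;    // ESC bytes that begin a listed sequence
        int misses = 0;  // ESC bytes that begin no listed sequence
        int i = 0;
        while (i < textLen) {
            if (text[i] != 0x1b) {
                i++;
                continue;
            }
            int matchedLen = 0;
            for (byte[] seq : escapeSequences) {
                if (seq.length > 0 && startsWith(text, textLen, i, seq)) {
                    matchedLen = seq.length;
                    break;
                }
            }
            if (matchedLen > 0) {
                hits++;
                i += matchedLen;  // skip past the recognized sequence
            } else {
                misses++;
                i++;
            }
        }
        if (hits == 0) {
            return 0;
        }
        // Proportion: all hits gives 100, half or fewer hits gives 0.
        int quality = (100 * hits - 100 * misses) / (hits + misses);
        // Total number: assumed penalty when few sequences were recognized.
        if (hits < 5) {
            quality -= (5 - hits) * 10;
        }
        return Math.max(0, Math.min(100, quality));
    }

    public static void main(String[] args) {
        // ESC $ B and ESC ( B, two escape sequences used by ISO-2022-JP.
        byte[][] jp = { {0x1b, 0x24, 0x42}, {0x1b, 0x28, 0x42} };
        byte[] sample = {0x1b, 0x24, 0x42, 0x30, 0x21, 0x1b, 0x28, 0x42};
        System.out.println(score(sample, sample.length, jp));
    }
}
```

In this sketch a sample with no recognized escape sequence always scores 0, and a short sample is scored lower than a long one with the same hit ratio, which mirrors the "total number" and "proportion" factors named in the description.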