Package org.languagetool.tokenizers.eo
Class EsperantoWordTokenizer
java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.tokenizers.eo.EsperantoWordTokenizer
All Implemented Interfaces:
org.languagetool.tokenizers.Tokenizer
public class EsperantoWordTokenizer
extends org.languagetool.tokenizers.WordTokenizer
Constructor Summary
Constructors
EsperantoWordTokenizer()
Method Summary
Methods inherited from class org.languagetool.tokenizers.WordTokenizer
getProtocols, getTokenizingCharacters, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
Constructor Details
EsperantoWordTokenizer
public EsperantoWordTokenizer()
Method Details
tokenize
Tokenizes like WordTokenizer, except that words such as "dank'" keep the apostrophe as part of the word.
Specified by:
tokenize in interface org.languagetool.tokenizers.Tokenizer
Overrides:
tokenize in class org.languagetool.tokenizers.WordTokenizer
Parameters:
text - Text to tokenize
Returns:
List of tokens. Note: the special string EO@APOS is used to replace the apostrophe during tokenizing.
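The placeholder approach mentioned in the note can be illustrated with a minimal, self-contained sketch. It is not the LanguageTool implementation: the class name ApostrophePlaceholderSketch, the two regular expressions, and the simplified splitting rule are all assumptions made for illustration. Only the placeholder string EO@APOS comes from the description above. The idea is to swap a word-final apostrophe for the placeholder, split the text as usual, and then restore the apostrophe in each token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the actual EsperantoWordTokenizer code.
class ApostrophePlaceholderSketch {

    // Placeholder string named in the Javadoc note.
    private static final String PLACEHOLDER = "EO@APOS";

    // Assumed rule: an apostrophe directly after a letter and not
    // followed by a letter belongs to the preceding word.
    private static final Pattern ELISION = Pattern.compile("(?<=\\p{L})'(?!\\p{L})");

    // Simplified stand-in for a generic word tokenizer: a token is either
    // a run of letters, digits, or '@', or any other single character
    // (so whitespace and punctuation become tokens of their own).
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}@]+|.", Pattern.DOTALL);

    static List<String> tokenize(String text) {
        // Step 1: hide word-final apostrophes behind the placeholder.
        String protectedText = ELISION.matcher(text).replaceAll(PLACEHOLDER);

        // Step 2: split as usual, restoring the apostrophe in each token.
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(protectedText);
        while (m.find()) {
            tokens.add(m.group().replace(PLACEHOLDER, "'"));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Dank'" stays one token instead of being split at the apostrophe.
        System.out.println(tokenize("Dank' al vi."));
    }
}
```

One limitation of this sketch is that input already containing the literal text EO@APOS would be altered when the placeholder is restored. It also does not distinguish a closing single quote from an elision apostrophe.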