Class EsperantoWordTokenizer

java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.tokenizers.eo.EsperantoWordTokenizer
All Implemented Interfaces:
org.languagetool.tokenizers.Tokenizer

public class EsperantoWordTokenizer extends org.languagetool.tokenizers.WordTokenizer
  • Constructor Summary

    Constructors
    Constructor                  Description
    EsperantoWordTokenizer()
  • Method Summary

    Modifier and Type    Method                   Description
    List<String>         tokenize(String text)    Tokenizes like WordTokenizer, except that an apostrophe attached to a word, as in "dank'", is kept as part of that word's token.

    Methods inherited from class org.languagetool.tokenizers.WordTokenizer

    getProtocols, getTokenizingCharacters, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • EsperantoWordTokenizer

      public EsperantoWordTokenizer()
  • Method Details

    • tokenize

      public List<String> tokenize(String text)
      Tokenizes like WordTokenizer, except that an apostrophe attached to a word, as in "dank'", is kept as part of that word's token.
      Specified by:
      tokenize in interface org.languagetool.tokenizers.Tokenizer
      Overrides:
      tokenize in class org.languagetool.tokenizers.WordTokenizer
      Parameters:
      text - Text to tokenize
      Returns:
      List of tokens. Note: the special string EO@APOS is used to replace the apostrophe during tokenizing.
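
      The placeholder technique mentioned in the note can be illustrated with a minimal, self-contained sketch. This is not LanguageTool's implementation: the class name ApostropheSketch, the simplified delimiter set, and the rule "an apostrophe directly after a letter belongs to the word" are all assumptions made for illustration. Only the placeholder string EO@APOS comes from the documentation above. The sketch also assumes that delimiters are returned as tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch of placeholder-based tokenizing; not the LanguageTool source.
final class ApostropheSketch {

    // Placeholder named in the documentation; it must not contain any delimiter.
    private static final String PLACEHOLDER = "EO@APOS";

    // Assumed, simplified delimiter set. The apostrophe is a delimiter here,
    // which is why word-final apostrophes have to be protected first.
    private static final String DELIMITERS = " \t\n.,;:!?'\"()";

    static List<String> tokenize(String text) {
        // 1. Protect every apostrophe that directly follows a letter (e.g. "dank'").
        String protectedText = text.replaceAll("(?<=\\p{L})'", PLACEHOLDER);

        // 2. Split on the delimiters, keeping each delimiter as its own token.
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(protectedText, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            // 3. Put the apostrophe back before returning the token.
            tokens.add(st.nextToken().replace(PLACEHOLDER, "'"));
        }
        return tokens;
    }
}
```

      Under these assumptions, ApostropheSketch.tokenize("dank' al vi") yields the tokens "dank'", " ", "al", " ", "vi", while an apostrophe that does not follow a letter remains a separate token. The sketch has known limits: a closing single quote directly after a word would also be attached to that word, and input that already contains the literal text EO@APOS would be altered.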