Package org.languagetool.tokenizers.nl
Class DutchWordTokenizer
java.lang.Object
org.languagetool.tokenizers.WordTokenizer
org.languagetool.tokenizers.nl.DutchWordTokenizer
- All Implemented Interfaces:
org.languagetool.tokenizers.Tokenizer
public class DutchWordTokenizer
extends org.languagetool.tokenizers.WordTokenizer
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
private boolean endsWithQuote(String token)
private boolean startsWithQuote(String token)
tokenize(String text): Tokenizes just like WordTokenizer, except that words such as "oma's" that contain an apostrophe in the middle are not split at the apostrophe.
Methods inherited from class org.languagetool.tokenizers.WordTokenizer
getProtocols, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
-
Field Details
-
QUOTES
-
nlTokenizingChars
-
-
Constructor Details
-
DutchWordTokenizer
public DutchWordTokenizer()
-
-
Method Details
-
tokenize
Tokenizes just like WordTokenizer, except that words such as "oma's" that contain an apostrophe in the middle are not split at the apostrophe.
- Specified by:
tokenize in interface org.languagetool.tokenizers.Tokenizer
- Overrides:
tokenize in class org.languagetool.tokenizers.WordTokenizer
- Parameters:
text - Text to tokenize
- Returns:
List of tokens
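The behaviour described for tokenize can be illustrated with a small, self-contained sketch. It is not the LanguageTool implementation: the class name ApostropheTokenizerSketch, the DELIMITERS and QUOTES sets, and the peeling logic are all assumptions chosen for illustration. The idea is to split on a delimiter set that leaves out the apostrophe, then detach quote characters only at the start or end of a token, so that an apostrophe in the middle of a word such as "oma's" survives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative sketch only; names and character sets are hypothetical.
class ApostropheTokenizerSketch {

  // Assumed delimiter set. The apostrophe is deliberately not part of it.
  private static final String DELIMITERS = " \t\n.,;:!?()\"";
  // Assumed set of characters treated as quotes at a token's edges.
  private static final String QUOTES = "'\u2018\u2019";

  static boolean isQuote(char c) {
    return QUOTES.indexOf(c) >= 0;
  }

  static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    // returnDelims = true keeps whitespace and punctuation as tokens.
    StringTokenizer st = new StringTokenizer(text, DELIMITERS, true);
    while (st.hasMoreTokens()) {
      String token = st.nextToken();
      List<String> trailing = new ArrayList<>();
      // Detach quote characters at the start of the token.
      while (token.length() > 1 && isQuote(token.charAt(0))) {
        tokens.add(token.substring(0, 1));
        token = token.substring(1);
      }
      // Detach quote characters at the end of the token.
      while (token.length() > 1 && isQuote(token.charAt(token.length() - 1))) {
        trailing.add(0, token.substring(token.length() - 1));
        token = token.substring(0, token.length() - 1);
      }
      // Whatever remains keeps any apostrophe in its middle.
      tokens.add(token);
      tokens.addAll(trailing);
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(tokenize("'oma's' fiets"));
  }
}
```

In this sketch, a word wrapped in single quotes yields the quotes as separate tokens, while the inner apostrophe stays part of the word.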
-
startsWithQuote
-
endsWithQuote
-
getTokenizingCharacters
- Overrides:
getTokenizingCharacters in class org.languagetool.tokenizers.WordTokenizer
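As a rough illustration of why a subclass might override getTokenizingCharacters, the sketch below uses hypothetical stand-in classes (BaseTokenizerSketch and DutchLikeTokenizerSketch), not the LanguageTool classes. It assumes the base tokenizer splits on a delimiter string returned by getTokenizingCharacters and that the subclass returns the same set without the apostrophe. The actual character sets used by WordTokenizer and DutchWordTokenizer are not given on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for a base tokenizer; not the LanguageTool class.
class BaseTokenizerSketch {

  // Assumed delimiter set; subclasses may override it.
  String getTokenizingCharacters() {
    return " .,;:!?'";
  }

  List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    // The delimiter set comes from the overridable method above.
    StringTokenizer st = new StringTokenizer(text, getTokenizingCharacters(), true);
    while (st.hasMoreTokens()) {
      tokens.add(st.nextToken());
    }
    return tokens;
  }
}

// Hypothetical subclass that stops treating the apostrophe as a delimiter.
class DutchLikeTokenizerSketch extends BaseTokenizerSketch {

  @Override
  String getTokenizingCharacters() {
    return super.getTokenizingCharacters().replace("'", "");
  }

  public static void main(String[] args) {
    System.out.println(new BaseTokenizerSketch().tokenize("oma's fiets"));
    System.out.println(new DutchLikeTokenizerSketch().tokenize("oma's fiets"));
  }
}
```

Because tokenize reads its delimiters through the overridable method, changing the character set in the subclass is enough to change how the inherited splitting behaves.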
-