org.apache.lucene.analysis.ngram
Class NGramTokenizer
public class NGramTokenizer
Tokenizes the input into n-grams of the given size(s).
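As a rough illustration of what "n-grams of the given size(s)" means, the following plain-Java sketch lists every character n-gram of a string whose length lies between a minimum and a maximum size. It does not use Lucene: the class name NGramSketch and the method ngrams are hypothetical, and the output order (all grams of one size, then the next size) is an assumption of the sketch, not documented behavior of NGramTokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in, not the Lucene class.
final class NGramSketch {

    // Returns every character n-gram of text with minGram <= length <= maxGram.
    // Ordering (by size, then by start offset) is an assumption of this sketch.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            // Slide a window of the current size across the text.
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [ab, bc, cd, abc, bcd]
        System.out.println(ngrams("abcd", 2, 3));
    }
}
```

Input shorter than minGram yields no grams at all in this sketch.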
NGramTokenizer(Reader input) - Creates an NGramTokenizer with the default minimum and maximum n-gram sizes.
NGramTokenizer(Reader input, int minGram, int maxGram) - Creates an NGramTokenizer with the given minimum and maximum n-gram sizes.
Token next() - Returns the next token in the stream, or null at EOS.
DEFAULT_MAX_NGRAM_SIZE
public static final int DEFAULT_MAX_NGRAM_SIZE
DEFAULT_MIN_NGRAM_SIZE
public static final int DEFAULT_MIN_NGRAM_SIZE
NGramTokenizer
public NGramTokenizer(Reader input)
Creates an NGramTokenizer with the default minimum and maximum n-gram sizes.
input - Reader holding the input to be tokenized
NGramTokenizer
public NGramTokenizer(Reader input,
int minGram,
int maxGram)
Creates an NGramTokenizer with the given minimum and maximum n-gram sizes.
input - Reader holding the input to be tokenized
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
next
public final Token next()
throws IOException
Returns the next token in the stream, or null at EOS.
Specified by: next in class TokenStream
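The contract described above (construct from a Reader, call next() repeatedly, stop when it returns null) can be sketched with a small stand-in class. Everything here is hypothetical: PullGramSketch, its fields, and drain are invented for illustration, the stand-in returns String rather than Token, and it reads the whole input up front, which the real tokenizer may or may not do.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics only the documented contract:
// built from a Reader; next() yields one gram per call and null at end of stream.
final class PullGramSketch {
    private final String text;
    private final int maxGram;
    private int size;   // length of the grams currently being emitted
    private int start;  // start offset of the next gram

    PullGramSketch(Reader input, int minGram, int maxGram) throws IOException {
        // Assumption of this sketch: buffer the entire input before tokenizing.
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = input.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        this.text = sb.toString();
        this.size = minGram;
        this.maxGram = maxGram;
    }

    // Returns the next gram, or null once every size has been exhausted.
    String next() {
        while (size <= maxGram) {
            if (start + size <= text.length()) {
                String gram = text.substring(start, start + size);
                start++;
                return gram;
            }
            size++;     // move on to the next gram length
            start = 0;
        }
        return null;
    }

    // Typical consumer loop: pull until next() returns null.
    static List<String> drain(String input, int minGram, int maxGram) {
        try {
            PullGramSketch stream = new PullGramSketch(new StringReader(input), minGram, maxGram);
            List<String> out = new ArrayList<>();
            for (String gram = stream.next(); gram != null; gram = stream.next()) {
                out.add(gram);
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The for loop in drain is the part that carries over to real code: the null check is the only end-of-stream signal, so callers should not assume a fixed number of tokens.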
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.