org.apache.lucene.analysis
Class StopFilter
public final class StopFilter
Removes stop words from a token stream.
Constructor Summary
StopFilter(TokenStream in, Set stopWords) - Constructs a filter which removes words from the input TokenStream that are named in the Set.
StopFilter(TokenStream input, Set stopWords, boolean ignoreCase) - Construct a token stream filtering the given input.
StopFilter(TokenStream input, String[] stopWords) - Construct a token stream filtering the given input.
StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase) - Constructs a filter which removes words from the input TokenStream that are named in the array of words.
Method Summary
static Set makeStopSet(String[] stopWords) - Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static Set makeStopSet(String[] stopWords, boolean ignoreCase)
Token next() - Returns the next input Token whose termText() is not a stop word.
StopFilter
public StopFilter(TokenStream in,
Set stopWords)
Constructs a filter which removes words from the input
TokenStream that are named in the Set.
It is crucial that an efficient Set implementation is used
for maximum performance.
See Also: makeStopSet(java.lang.String[])
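The note about Set efficiency matters because the filter performs one membership test per token. The following is a minimal sketch of that idea using only java.util, assuming a hash-based set. The class SetStopSketch and the method removeStopWords are hypothetical names for illustration and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filtering idea; this is not Lucene's StopFilter.
class SetStopSketch {
    // Keeps only the terms that are absent from stopWords.
    static List<String> removeStopWords(List<String> terms, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            // One membership test per token: O(1) on average for a HashSet,
            // O(log n) for a TreeSet.
            if (!stopWords.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> terms = Arrays.asList("the", "history", "of", "a", "library");
        System.out.println(removeStopWords(terms, stopWords)); // prints [history, library]
    }
}
```

Because the lookup runs once for every token in the stream, a set with constant-time contains() keeps the per-token cost flat as the stop list grows.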
StopFilter
public StopFilter(TokenStream input,
Set stopWords,
boolean ignoreCase)
Construct a token stream filtering the given input.
Parameters:
input - the TokenStream to filter
stopWords - The set of stop words, as Strings. If ignoreCase is true, all strings should be lower cased.
ignoreCase - Ignore case when stopping. The stopWords set must be set up to contain only lower case words.
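The requirement that the set contain only lower case words can be illustrated with a small sketch. It assumes, as the parameter description implies, that only the incoming term is lower cased before lookup and that the set is used exactly as supplied. IgnoreCaseSketch and isStopWord are hypothetical names, not Lucene API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the ignoreCase contract; not Lucene code.
class IgnoreCaseSketch {
    // Returns true when the term would be dropped as a stop word.
    static boolean isStopWord(String term, Set<String> stopWords, boolean ignoreCase) {
        // Assumption: with ignoreCase, only the term is lower cased;
        // the set itself is consulted as given.
        String key = ignoreCase ? term.toLowerCase(Locale.ROOT) : term;
        return stopWords.contains(key);
    }

    public static void main(String[] args) {
        Set<String> lower = new HashSet<>(Arrays.asList("the"));
        Set<String> mixed = new HashSet<>(Arrays.asList("The"));
        System.out.println(isStopWord("The", lower, true));  // prints true
        System.out.println(isStopWord("The", lower, false)); // prints false
        // Pitfall: a mixed-case entry can never match a lower cased term.
        System.out.println(isStopWord("the", mixed, true));  // prints false
    }
}
```

The last call shows why a mixed-case set silently stops nothing when ignoreCase is true.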
StopFilter
public StopFilter(TokenStream input,
String[] stopWords)
Construct a token stream filtering the given input.
StopFilter
public StopFilter(TokenStream in,
String[] stopWords,
boolean ignoreCase)
Constructs a filter which removes words from the input
TokenStream that are named in the array of words.
makeStopSet
public static final Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words,
appropriate for passing into the StopFilter constructor.
This permits this stopWords construction to be cached once when
an Analyzer is constructed.
See Also: makeStopSet(java.lang.String[], boolean), passing false to ignoreCase
makeStopSet
public static final Set makeStopSet(String[] stopWords,
boolean ignoreCase)
Parameters:
stopWords - the array of stop words
ignoreCase - If true, all words are lower cased first.
Returns:
a Set containing the words
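As a rough illustration of what the two makeStopSet variants describe, the sketch below builds a set from an array, optionally lower casing each word first, and stores the result in a static field so that it is constructed only once. StopSetSketch, buildStopSet, and SHARED_STOP_WORDS are hypothetical names, and the use of HashSet is an assumption rather than a statement about Lucene's internals.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of building a stop set from an array; not Lucene code.
class StopSetSketch {
    // If ignoreCase is true, every word is lower cased before it is added.
    static Set<String> buildStopSet(String[] stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<>();
        for (String word : stopWords) {
            set.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return set;
    }

    // One-argument form: same as passing false for ignoreCase.
    static Set<String> buildStopSet(String[] stopWords) {
        return buildStopSet(stopWords, false);
    }

    // Built once and reused, in the spirit of caching the set
    // when an Analyzer is constructed.
    static final Set<String> SHARED_STOP_WORDS =
            buildStopSet(new String[] {"The", "And", "Of"}, true);

    public static void main(String[] args) {
        System.out.println(SHARED_STOP_WORDS.contains("the")); // prints true
        System.out.println(SHARED_STOP_WORDS.contains("The")); // prints false
    }
}
```

A set prepared this way can be handed to the Set-based constructors repeatedly without rebuilding it for each token stream.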
next
public final Token next()
throws IOException
Returns the next input Token whose termText() is not a stop word.
Specified by: next in class TokenStream
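The behavior described for next() amounts to pulling from the wrapped stream until a term that is not a stop word appears. A minimal sketch follows, using an Iterator of strings in place of a TokenStream and assuming that the end of the stream is signalled by returning null. NextLoopSketch and nextNonStop are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical sketch of the skip-until-kept loop; not Lucene's implementation.
class NextLoopSketch {
    // Returns the next term that is not a stop word,
    // or null once the input is exhausted (an assumption of this sketch).
    static String nextNonStop(Iterator<String> input, Set<String> stopWords) {
        while (input.hasNext()) {
            String term = input.next();
            if (!stopWords.contains(term)) {
                return term; // first surviving term ends the loop
            }
            // otherwise the stop word is silently skipped
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("a", "of"));
        Iterator<String> input = Arrays.asList("a", "tale", "of", "cities").iterator();
        System.out.println(nextNonStop(input, stopWords)); // prints tale
        System.out.println(nextNonStop(input, stopWords)); // prints cities
        System.out.println(nextNonStop(input, stopWords)); // prints null
    }
}
```

Each call may consume several input terms, so the caller sees only the terms that survive filtering.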
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.