org.apache.lucene.analysis.nl
public class DutchAnalyzer extends Analyzer
Field Summary | |
---|---|
static String[] | DUTCH_STOP_WORDS
List of typical Dutch stopwords. |
Constructor Summary | |
---|---|
DutchAnalyzer()
Builds an analyzer with the default stop words (DUTCH_STOP_WORDS)
and a few default entries for the stem exclusion table.
| |
DutchAnalyzer(String[] stopwords)
Builds an analyzer with the stop words given as a String array.
| |
DutchAnalyzer(HashSet stopwords)
Builds an analyzer with the stop words given as a HashSet.
| |
DutchAnalyzer(File stopwords)
Builds an analyzer with the stop words read from the given file.
|
Method Summary | |
---|---|
void | setStemDictionary(File stemdictFile)
Reads a stem dictionary file that overrides the stemming algorithm.
This is a text file that contains one entry per line:
word\tstem, i.e. two tab-separated words. |
void | setStemExclusionTable(String[] exclusionlist)
Builds an exclusion list from an array of Strings.
|
void | setStemExclusionTable(HashSet exclusionlist)
Builds an exclusion list from a HashSet. |
void | setStemExclusionTable(File exclusionlist)
Builds an exclusion list from the words contained in the given file. |
TokenStream | tokenStream(String fieldName, Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader.
|
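
The stem dictionary format described for setStemDictionary (one `word\tstem` pair per line) can be illustrated with plain Java. This is a minimal sketch and not Lucene's implementation: the class `StemDictionarySketch` and its `parse` method are hypothetical names, and skipping malformed lines is an assumption, since the summary does not say how such lines are handled.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Lucene.
final class StemDictionarySketch {

    // Reads "word<TAB>stem" lines into a map from word to stem.
    // Assumption: lines that do not hold exactly two tab-separated
    // fields are skipped silently.
    static Map<String, String> parse(Reader in) {
        Map<String, String> stems = new HashMap<>();
        BufferedReader reader = new BufferedReader(in);
        // lines() wraps any IOException in an UncheckedIOException.
        reader.lines().forEach(line -> {
            String[] parts = line.split("\t");
            if (parts.length == 2) {
                stems.put(parts[0], parts[1]);
            }
        });
        return stems;
    }

    public static void main(String[] args) {
        String sample = "fiets\tfiets\nkinderen\tkind\n";
        Map<String, String> stems = parse(new StringReader(sample));
        System.out.println(stems.get("kinderen")); // prints kind
    }
}
```

A map like this would let an irregular form such as "kinderen" resolve to "kind" directly instead of going through the rule-based stemmer.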
Parameters: stopwords
Parameters: stopwords
Parameters: stopwords
Parameters: exclusionlist
Returns: A TokenStream built from a StandardTokenizer filtered with StandardFilter, StopFilter, DutchStemFilter
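
The order of that chain (tokenize, drop stop words, then stem unless a word is on the exclusion list) can be sketched without Lucene. Everything below is a hypothetical stand-in: `AnalysisChainSketch` and `analyze` are illustrative names, splitting on non-letter characters only approximates StandardTokenizer, and the toy stemmer is not the Dutch stemming algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the filter chain; not Lucene code.
final class AnalysisChainSketch {

    static List<String> analyze(String text, Set<String> stopWords,
                                Set<String> stemExclusions,
                                UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        // Stand-in for StandardTokenizer: split on runs of non-letters.
        for (String token : text.split("[^\\p{L}]+")) {
            if (token.isEmpty() || stopWords.contains(token)) {
                continue; // stop-word filtering step
            }
            // Stemming step: excluded words pass through unchanged.
            out.add(stemExclusions.contains(token) ? token : stemmer.apply(token));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("de", "en"));
        Set<String> excluded = new HashSet<>(Arrays.asList("fietsen"));
        // Toy stemmer that strips a trailing "en"; an assumption for the demo.
        UnaryOperator<String> toy =
            w -> w.endsWith("en") ? w.substring(0, w.length() - 2) : w;
        // prints [boek, fietsen]
        System.out.println(analyze("de boeken en fietsen", stop, excluded, toy));
    }
}
```

Here "de" and "en" are removed as stop words, "boeken" is stemmed, and "fietsen" is left intact because it appears in the exclusion set.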