Classes derived from org.apache.lucene.analysis.TokenStream | |
class | Based on GermanStemFilter. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Construct a token stream filtering the given input. | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | BrazilianAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | An abstract base class for simple, character-oriented tokenizers. |
class | A filter that replaces accented characters in the ISO Latin 1 character set
(ISO-8859-1) by their unaccented equivalent. |
class | Emits the entire input as a single token. |
class | Removes words that are too long or too short from the stream. |
class | A LetterTokenizer is a tokenizer that divides text at non-letters. |
class | Normalizes token text to lower case. |
class | LowerCaseTokenizer performs the function of LetterTokenizer
and LowerCaseFilter together. |
class | Transforms the token stream as per the Porter stemming algorithm. |
class | Removes stop words from a token stream. |
class | A TokenFilter is a TokenStream whose input is another token stream. |
class | A Tokenizer is a TokenStream whose input is a Reader. |
class | A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Build a filter that removes words that are too long or too
short from the text. | |
Constructs a filter which removes words from the input
TokenStream that are named in the Set. | |
Construct a token stream filtering the given input. | |
Construct a token stream filtering the given input. | |
Constructs a filter which removes words from the input
TokenStream that are named in the array of words. | |
Construct a token stream filtering the given input. |
Fields of type org.apache.lucene.analysis.TokenStream | |
TokenStream | The source of tokens for this filter. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | Analyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided
Reader. |
TokenStream | KeywordAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided
Reader. |
TokenStream | PerFieldAnalyzerWrapper.tokenStream(String fieldName, Reader reader) |
TokenStream | SimpleAnalyzer.tokenStream(String fieldName, Reader reader) |
TokenStream | StopAnalyzer.tokenStream(String fieldName, Reader reader) Filters LowerCaseTokenizer with StopFilter. |
TokenStream | WhitespaceAnalyzer.tokenStream(String fieldName, Reader reader) |
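The pattern running through this section is a decorator chain: a Tokenizer produces tokens from a Reader, and each TokenFilter is itself a TokenStream whose input is another token stream. The sketch below illustrates that chaining with simplified stand-in classes written for this page (MiniTokenStream, MiniWhitespaceTokenizer, MiniLowerCaseFilter are hypothetical, not the Lucene types):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for TokenStream: yields one token per call.
abstract class MiniTokenStream {
    // Returns the next token, or null when the stream is exhausted.
    abstract String next();
}

// Stand-in for a Tokenizer: produces tokens from raw text.
class MiniWhitespaceTokenizer extends MiniTokenStream {
    private final Iterator<String> it;
    MiniWhitespaceTokenizer(String text) {
        List<String> parts = new ArrayList<>();
        for (String p : text.split("\\s+")) if (!p.isEmpty()) parts.add(p);
        it = parts.iterator();
    }
    String next() { return it.hasNext() ? it.next() : null; }
}

// Stand-in for a TokenFilter: a stream whose input is another stream.
class MiniLowerCaseFilter extends MiniTokenStream {
    private final MiniTokenStream input;
    MiniLowerCaseFilter(MiniTokenStream input) { this.input = input; }
    String next() {
        String t = input.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

public class ChainDemo {
    // Builds a tokenizer-plus-filter chain and drains it into a list.
    static List<String> tokens(String text) {
        MiniTokenStream ts = new MiniLowerCaseFilter(new MiniWhitespaceTokenizer(text));
        List<String> out = new ArrayList<>();
        for (String t = ts.next(); t != null; t = ts.next()) out.add(t);
        return out;
    }
    public static void main(String[] args) {
        System.out.println(tokens("The Quick  Brown FOX"));
    }
}
```

An analyzer's tokenStream(fieldName, reader) method assembles exactly this kind of chain; for example, StopAnalyzer stacks a StopFilter on a LowerCaseTokenizer.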
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A TokenFilter that uses java.text.BreakIterator to break each Thai Token into separate Tokens, one for each Thai word. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | ThaiAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided
Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A RussianLetterTokenizer is a tokenizer that extends LetterTokenizer by additionally looking up letters
in a given "russian charset". |
class | Normalizes token text to lower case, analyzing given ("russian") charset. |
class | A filter that stems Russian words. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | RussianAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A filter that stems Dutch words. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Builds a DutchStemFilter that uses an exclusion table. | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | DutchAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | Tokenizes the input into n-grams of the given size. |
class | Tokenizes the input into n-grams of the given size(s). |
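For context, an n-gram tokenizer of size n emits every contiguous n-character substring of its input. A minimal sketch of that behavior, as plain Java independent of the Lucene classes above:

```java
import java.util.ArrayList;
import java.util.List;

public class NGramDemo {
    // Emits every contiguous substring of length n, in order --
    // the output an n-gram tokenizer of a single fixed size produces.
    static List<String> ngrams(String text, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            out.add(text.substring(i, i + n));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("lucene", 2)); // lu, uc, ce, en, ne
    }
}
```

A tokenizer supporting a range of sizes simply repeats this for each n between a minimum and maximum gram size.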
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A filter that stems French words. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | FrenchAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods with parameter type org.apache.lucene.analysis.TokenStream | |
String | Highlights chosen terms in a text, extracting the most relevant section. |
String[] | Highlights chosen terms in a text, extracting the most relevant sections. |
String | Highlighter.getBestFragments(TokenStream tokenStream, String text, int maxNumFragments, String separator) Highlights terms in the text, extracting the most relevant sections and concatenating the chosen fragments with a separator (typically "..."). |
TextFragment[] | Highlighter.getBestTextFragments(TokenStream tokenStream, String text, boolean mergeContiguousFragments, int maxNumFragments) Low-level API to get the most relevant (formatted) sections of the document. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | A convenience method that tries a number of approaches to getting a token stream. |
TokenStream | |
TokenStream | |
TokenStream | |
TokenStream | Low-level API. |
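The "most relevant section" idea behind the Highlighter methods above can be illustrated with a toy fragment scorer: split the text into fragments, score each by how many query terms it contains, and keep the best. This is a simplified sketch written for this page (FragmentDemo and bestFragment are hypothetical names), not the Lucene Highlighter implementation:

```java
import java.util.Set;

public class FragmentDemo {
    // Splits text into fixed-size fragments and returns the one containing
    // the most occurrences of the query terms -- a toy version of
    // picking the "best fragment" for a query.
    static String bestFragment(String text, Set<String> terms, int fragSize) {
        String best = "";
        int bestScore = -1;
        for (int i = 0; i < text.length(); i += fragSize) {
            String frag = text.substring(i, Math.min(i + fragSize, text.length()));
            int score = 0;
            for (String w : frag.toLowerCase().split("\\W+")) {
                if (terms.contains(w)) score++;
            }
            if (score > bestScore) { bestScore = score; best = frag; }
        }
        return best;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library. The highlighter picks the best fragment for a query.";
        System.out.println(bestFragment(text, Set.of("highlighter", "fragment"), 40));
    }
}
```

The real Highlighter works on a TokenStream rather than raw words, which is why every method in this group takes one as a parameter.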
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | Normalizes token text to lower case, analyzing given ("greek") charset. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | GreekAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | Normalizes tokens extracted with StandardTokenizer. |
class | A grammar-based tokenizer constructed with JavaCC. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Constructs a filter over the given input.
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | StandardAnalyzer.tokenStream(String fieldName, Reader reader) |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A filter that stems words using a Snowball-generated stemmer. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Construct the named stemming filter. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | SnowballAnalyzer.tokenStream(String fieldName, Reader reader) |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | Injects additional tokens for synonyms of token terms fetched from the
underlying child stream; the child stream must deliver lowercase tokens
for synonyms to be found. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Creates an instance for the given underlying stream and synonym table. |
Methods with parameter type org.apache.lucene.analysis.TokenStream | |
void | Equivalent to addField(fieldName, stream, 1.0f). |
void | Iterates over the given token stream and adds the resulting terms to the index; equivalent to adding a tokenized, indexed, termVectorStored, unstored Lucene Field. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | MemoryIndex.keywordTokenStream(Collection keywords) Convenience method; creates and returns a token stream that generates a token for each keyword in the given collection, "as is", without any transforming text analysis. |
TokenStream | PatternAnalyzer.tokenStream(String fieldName, Reader reader) Creates a token stream that tokenizes all the text in the given Reader; this implementation forwards to tokenStream(String, String) and is less efficient than calling that method directly. |
TokenStream | PatternAnalyzer.tokenStream(String fieldName, String text) Creates a token stream that tokenizes the given string into token terms
(aka words). |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | A filter that stems German words. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Builds a GermanStemFilter that uses an exclusion table.
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | GermanAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | CzechAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | ChineseFilter: a filter with a stop word table. Rule: no digits are allowed. |
class | ChineseTokenizer: extracts tokens from the stream using Character.getType(), treating each Chinese character as a single token. The difference between the ChineseTokenizer and the CJKTokenizer (id=23545) is that they have different token parsing logic. |
Constructors with parameter type org.apache.lucene.analysis.TokenStream | |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | ChineseAnalyzer.tokenStream(String fieldName, Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Classes derived from org.apache.lucene.analysis.TokenStream | |
class | CJKTokenizer was modified from StopTokenizer, which does a decent job for most European languages. |
Methods with return type org.apache.lucene.analysis.TokenStream | |
TokenStream | CJKAnalyzer.tokenStream(String fieldName, Reader reader) Gets a token stream from the input. |