org.apache.lucene.misc
Class SweetSpotSimilarity
- Serializable
public class SweetSpotSimilarity
A similarity with a lengthNorm that provides for a "plateau" of
equally good lengths, and tf helper functions.
For lengthNorm, a global min/max can be specified to define the
plateau of lengths that should all have a norm of 1.0.
Below the min and above the max, the lengthNorm drops off in a
sqrt function.
A per field min/max can be specified if different fields have
different sweet spots.
For tf, baselineTf and hyperbolicTf functions are provided, which
subclasses can choose between.
float | baselineTf(float freq) - Implemented as:
(x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0.
|
float | hyperbolicTf(float freq) - Uses a hyperbolic tangent function that allows for a hard max...
|
float | lengthNorm(String fieldName, int numTerms) - Implemented as:
1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 )
This degrades to 1/sqrt(x) when min and max are both 1 and
steepness is 0.5
:TODO: potential optimization is to just flat out return 1.0f if numTerms
is between min and max.
|
void | setBaselineTfFactors(float base, float min) - Sets the baseline and minimum function variables for baselineTf
|
void | setHyperbolicTfFactors(float min, float max, double base, float xoffset) - Sets the function variables for the hyperbolicTf functions
|
void | setLengthNormFactors(String field, int min, int max, float steepness) - Sets the function variables used by lengthNorm for a specific named field
|
void | setLengthNormFactors(int min, int max, float steepness) - Sets the default function variables used by lengthNorm when no
field-specific variables have been set.
|
float | tf(int freq) - Delegates to baselineTf
|
coord , decodeNorm , encodeNorm , getDefault , getNormDecoder , idf , idf , idf , lengthNorm , queryNorm , setDefault , sloppyFreq , tf , tf |
SweetSpotSimilarity
public SweetSpotSimilarity()
baselineTf
public float baselineTf(float freq)
Implemented as:
(x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0.
This degrades to
sqrt(x)
when min and base are both 0
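The formula above can be sketched as a standalone function. This is a minimal illustration, not the library source: the class name BaselineTfSketch is hypothetical, and base and min are passed as arguments here rather than set through setBaselineTfFactors.

```java
public class BaselineTfSketch {

    /**
     * Illustrative re-statement of the documented formula:
     * (x <= min) ? base : sqrt(x + base^2 - min), with 0 mapped to 0.
     * Passing base and min as parameters is an assumption of this sketch.
     */
    static float baselineTf(float freq, float base, float min) {
        // Special case: a term that never occurs contributes nothing.
        if (freq == 0.0f) {
            return 0.0f;
        }
        // At or below min, every frequency scores the same baseline value.
        if (freq <= min) {
            return base;
        }
        // Above min, the score grows as a square root, continuous at freq == min.
        return (float) Math.sqrt(freq + (base * base) - min);
    }

    public static void main(String[] args) {
        // With base = 0 and min = 0 the function reduces to sqrt(x).
        System.out.println(baselineTf(4.0f, 0.0f, 0.0f));
        // With base = 1.5 and min = 5, low frequencies share the baseline.
        System.out.println(baselineTf(3.0f, 1.5f, 5.0f));
    }
}
```

Because the sqrt branch evaluates to base when freq equals min, the two branches meet without a jump.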
hyperbolicTf
public float hyperbolicTf(float freq)
Uses a hyperbolic tangent function that allows for a hard max...
tf(x)=min+(max-min)/2*(((base**(x-xoffset)-base**-(x-xoffset))/(base**(x-xoffset)+base**-(x-xoffset)))+1)
This code is provided as a convenience for subclasses that want
to use a hyperbolic tf function.
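As a rough illustration of the tf(x) formula above, the following standalone sketch evaluates it directly. The class name HyperbolicTfSketch is hypothetical, the four variables are passed as arguments rather than set through setHyperbolicTfFactors, and the NaN guard is an assumption of this sketch (the exponentials overflow for large inputs), not something the documentation above describes.

```java
public class HyperbolicTfSketch {

    /**
     * Illustrative evaluation of
     * min + (max-min)/2 * ((b^d - b^-d) / (b^d + b^-d) + 1), where d = x - xoffset.
     */
    static float hyperbolicTf(float freq, float min, float max,
                              double base, float xoffset) {
        final double d = (double) (freq - xoffset);
        final double up = Math.pow(base, d);    // base ** (x - xoffset)
        final double down = Math.pow(base, -d); // base ** -(x - xoffset)
        final float result =
            min + (float) ((max - min) / 2.0f * (((up - down) / (up + down)) + 1.0d));
        // Assumption of this sketch: when the exponentials overflow the ratio
        // becomes NaN, so fall back to the limit the curve approaches.
        if (Float.isNaN(result)) {
            return d > 0 ? max : min;
        }
        return result;
    }

    public static void main(String[] args) {
        // Using the documented defaults: min 0.0, max 2.0, base e, xoffset 10.0.
        for (float freq : new float[] {0f, 5f, 10f, 15f, 1000f}) {
            System.out.println(freq + " -> "
                + hyperbolicTf(freq, 0.0f, 2.0f, Math.E, 10.0f));
        }
    }
}
```

At freq equal to xoffset the result is the midpoint between min and max, which is why xoffset is described as the midpoint of the function.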
lengthNorm
public float lengthNorm(String fieldName,
int numTerms)
Implemented as:
1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 )
This degrades to
1/sqrt(x)
when min and max are both 1 and
steepness is 0.5
:TODO: potential optimization is to just flat out return 1.0f if numTerms
is between min and max.
- lengthNorm in interface DefaultSimilarity
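The lengthNorm formula above can be checked with a small standalone sketch. The class name LengthNormSketch is hypothetical, and min, max and steepness are passed as arguments here instead of being configured through setLengthNormFactors.

```java
public class LengthNormSketch {

    /**
     * Illustrative evaluation of
     * 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ).
     */
    static float lengthNorm(int numTerms, int min, int max, float steepness) {
        // Inside [min, max] the two distances sum to (max - min), so the
        // penalty is zero and the norm is 1.0: this is the plateau.
        final int penalty =
            Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return (float) (1.0 / Math.sqrt((steepness * penalty) + 1.0));
    }

    public static void main(String[] args) {
        // Plateau between 3 and 5 terms; shorter and longer fields score lower.
        for (int n : new int[] {1, 3, 4, 5, 10, 100}) {
            System.out.println(n + " terms -> " + lengthNorm(n, 3, 5, 0.5f));
        }
    }
}
```

With min and max both 1 and steepness 0.5, the penalty is 2(x-1), so the expression reduces to 1/sqrt(x) as stated above.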
setBaselineTfFactors
public void setBaselineTfFactors(float base,
float min)
Sets the baseline and minimum function variables for baselineTf
setHyperbolicTfFactors
public void setHyperbolicTfFactors(float min,
float max,
double base,
float xoffset)
Sets the function variables for the hyperbolicTf functions
min - the minimum tf value to ever be returned (default: 0.0)
max - the maximum tf value to ever be returned (default: 2.0)
base - the base value to be used in the exponential for the hyperbolic function (default: e)
xoffset - the midpoint of the hyperbolic function (default: 10.0)
setLengthNormFactors
public void setLengthNormFactors(String field,
int min,
int max,
float steepness)
Sets the function variables used by lengthNorm for a specific named field
setLengthNormFactors
public void setLengthNormFactors(int min,
int max,
float steepness)
Sets the default function variables used by lengthNorm when no
field-specific variables have been set.
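The relationship between the two setLengthNormFactors overloads can be sketched as a per-field lookup that falls back to the defaults. This is only an illustration of the described behavior, not the library's implementation: the class PerFieldNormFactorsSketch, the Factors holder, and the initial default values (1, 1, 0.5) are all assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class PerFieldNormFactorsSketch {

    /** Hypothetical holder for one set of lengthNorm variables. */
    static final class Factors {
        final int min;
        final int max;
        final float steepness;

        Factors(int min, int max, float steepness) {
            this.min = min;
            this.max = max;
            this.steepness = steepness;
        }
    }

    // Assumed starting defaults, chosen so an unconfigured field gets 1/sqrt(x).
    private Factors defaults = new Factors(1, 1, 0.5f);
    private final Map<String, Factors> perField = new HashMap<>();

    /** Sets the variables used when a field has no specific entry. */
    public void setLengthNormFactors(int min, int max, float steepness) {
        defaults = new Factors(min, max, steepness);
    }

    /** Sets the variables used for one named field. */
    public void setLengthNormFactors(String field, int min, int max, float steepness) {
        perField.put(field, new Factors(min, max, steepness));
    }

    public float lengthNorm(String fieldName, int numTerms) {
        // Field-specific variables win; otherwise use the defaults.
        final Factors f = perField.getOrDefault(fieldName, defaults);
        final int penalty =
            Math.abs(numTerms - f.min) + Math.abs(numTerms - f.max) - (f.max - f.min);
        return (float) (1.0 / Math.sqrt((f.steepness * penalty) + 1.0));
    }

    public static void main(String[] args) {
        PerFieldNormFactorsSketch sim = new PerFieldNormFactorsSketch();
        sim.setLengthNormFactors("title", 3, 5, 0.5f); // sweet spot for titles only
        System.out.println(sim.lengthNorm("title", 4)); // inside the title plateau
        System.out.println(sim.lengthNorm("body", 4));  // falls back to defaults
    }
}
```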
tf
public float tf(int freq)
Delegates to baselineTf
- tf in interface Similarity
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.