org.apache.lucene.search

Class DisjunctionMaxQuery

Implemented Interfaces:
Cloneable, Serializable

public class DisjunctionMaxQuery
extends Query

A query that generates the union of the documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie-breaking increment for any additional matching subqueries. This is useful when searching for a word in multiple fields with different boost factors (so that the fields cannot be combined equivalently into a single search field). We want the primary score to be the one associated with the highest boost, not the sum of the field scores (as BooleanQuery would give). If the query is "albino elephant", this ensures that "albino" matching one field and "elephant" matching another gets a higher score than "albino" matching both fields. To get this result, use both BooleanQuery and DisjunctionMaxQuery: for each term, a DisjunctionMaxQuery searches for it in each field, and the set of these DisjunctionMaxQuery instances is combined into a BooleanQuery. The tie-breaker capability allows results that include the same term in multiple fields to be judged better than results that include this term in only the best of those multiple fields, without confusing this with the better case of two different terms in the multiple fields.
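The scoring described above can be modeled in plain Java without calling Lucene. This is a minimal sketch and not Lucene's implementation: the class DisMaxScoreSketch, its method names, and the per-field scores (2.0 for a match in a boosted field, 1.0 for a match in the other field) are hypothetical, and coordination and normalization factors are ignored.

```java
/** Plain-Java model of the scoring described above. Hypothetical; does not use Lucene. */
class DisMaxScoreSketch {

    /**
     * Maximum sub-score plus tieBreaker times the remaining sub-scores.
     * Assumes non-negative scores; a field that does not match is modeled as 0.
     */
    static float disMax(float[] fieldScores, float tieBreaker) {
        float max = 0f;
        float sum = 0f;
        for (float s : fieldScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    /** Sums one disMax score per term, standing in for the enclosing BooleanQuery. */
    static float perTermDisMax(float[][] scoresByTermThenField, float tieBreaker) {
        float total = 0f;
        for (float[] fieldScores : scoresByTermThenField) {
            total += disMax(fieldScores, tieBreaker);
        }
        return total;
    }

    /** Sums every field score of every term: the plain summing behaviour the text contrasts with. */
    static float plainSum(float[][] scoresByTermThenField) {
        float total = 0f;
        for (float[] fieldScores : scoresByTermThenField) {
            for (float s : fieldScores) {
                total += s;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Rows are the terms "albino" and "elephant"; columns are two fields (made-up scores).
        float[][] splitAcrossFields = {{2f, 0f}, {0f, 1f}}; // albino in field 1, elephant in field 2
        float[][] albinoInBoth = {{2f, 1f}, {0f, 0f}};      // albino in both fields, no elephant
        System.out.println("per-term dismax: " + perTermDisMax(splitAcrossFields, 0.1f)
                + " vs " + perTermDisMax(albinoInBoth, 0.1f));
        System.out.println("plain sum:       " + plainSum(splitAcrossFields)
                + " vs " + plainSum(albinoInBoth));
    }
}
```

With these made-up scores the plain sum cannot separate the two documents, while the per-term arrangement ranks the document that matches both terms above the one that matches "albino" twice.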
Author:
Chuck Williams
See Also:
Serialized Form

Constructor Summary

DisjunctionMaxQuery(Collection disjuncts, float tieBreakerMultiplier)
Creates a new DisjunctionMaxQuery
DisjunctionMaxQuery(float tieBreakerMultiplier)
Creates a new empty DisjunctionMaxQuery.

Method Summary

void
add(Collection disjuncts)
Add a collection of disjuncts to this disjunction via Iterable
void
add(Query query)
Add a subquery to this disjunction
Object
clone()
Create a shallow copy of us -- used in rewriting if necessary
protected Weight
createWeight(Searcher searcher)
Expert: Constructs an appropriate Weight implementation for this query.
boolean
equals(Object o)
Return true iff we represent the same query as o
void
extractTerms(Set terms)
Expert: adds all terms occurring in this query to the terms set.
int
hashCode()
Compute a hash code for hashing us
Iterator
iterator()
An Iterator over the disjuncts
Query
rewrite(IndexReader reader)
Optimize our representation and our subqueries' representations
String
toString(String field)
Prettyprint us.

Methods inherited from class org.apache.lucene.search.Query

clone, combine, createWeight, extractTerms, getBoost, getSimilarity, mergeBooleanQueries, rewrite, setBoost, toString, toString, weight

Constructor Details

DisjunctionMaxQuery

public DisjunctionMaxQuery(Collection disjuncts,
                           float tieBreakerMultiplier)
Creates a new DisjunctionMaxQuery
Parameters:
disjuncts - a Collection of all the disjuncts to add
tieBreakerMultiplier - the weight to give to each matching non-maximum disjunct

DisjunctionMaxQuery

public DisjunctionMaxQuery(float tieBreakerMultiplier)
Creates a new empty DisjunctionMaxQuery. Use add() to add the subqueries.
Parameters:
tieBreakerMultiplier - the score of each non-maximum disjunct for a document is multiplied by this weight and added into the final score. If non-zero, the value should be small, on the order of 0.1, which says that 10 occurrences of a word in a lower-scored field that is also in a higher-scored field are just as good as a unique word in the lower-scored field (i.e., one that is not in any higher-scored field).
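To make the effect of tieBreakerMultiplier concrete, the following minimal sketch evaluates the rule stated above for three multipliers. It is plain Java and does not call Lucene; the class TieBreakerSketch, the score method, and the sub-scores 1.0 and 0.5 are hypothetical and chosen only for illustration.

```java
/** Plain-Java illustration of the tie breaker. Hypothetical; does not call Lucene. */
class TieBreakerSketch {

    /**
     * Best sub-score plus tieBreakerMultiplier times each non-maximum sub-score.
     * Assumes non-negative sub-scores; a non-matching disjunct is modeled as 0.
     */
    static float score(float tieBreakerMultiplier, float... subScores) {
        float max = 0f;
        float sum = 0f;
        for (float s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreakerMultiplier * (sum - max);
    }

    public static void main(String[] args) {
        // Two matching disjuncts with made-up scores 1.0 (best) and 0.5 (non-maximum).
        float[] multipliers = {0.0f, 0.1f, 1.0f};
        for (float m : multipliers) {
            System.out.println("multiplier " + m + " -> " + score(m, 1.0f, 0.5f));
        }
    }
}
```

Under this rule a multiplier of 0 keeps only the best sub-score, a multiplier of 1 reduces to a plain sum of the sub-scores, and a small value such as 0.1 lets additional matches nudge the score without dominating it.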

Method Details

add

public void add(Collection disjuncts)
Add a collection of disjuncts to this disjunction via Iterable

add

public void add(Query query)
Add a subquery to this disjunction
Parameters:
query - the disjunct added

clone

public Object clone()
Create a shallow copy of us -- used in rewriting if necessary
Overrides:
clone in class Query
Returns:
a copy of us (but reuse, don't copy, our subqueries)

createWeight

protected Weight createWeight(Searcher searcher)
            throws IOException
Expert: Constructs an appropriate Weight implementation for this query.

Only implemented by primitive queries, which re-write to themselves.

Overrides:
createWeight in class Query

equals

public boolean equals(Object o)
Return true iff we represent the same query as o
Parameters:
o - another object
Returns:
true iff o is a DisjunctionMaxQuery with the same boost and the same subqueries, in the same order, as us

extractTerms

public void extractTerms(Set terms)
Expert: adds all terms occurring in this query to the terms set.
Overrides:
extractTerms in class Query

hashCode

public int hashCode()
Compute a hash code for hashing us
Returns:
the hash code

iterator

public Iterator iterator()
An Iterator over the disjuncts

rewrite

public Query rewrite(IndexReader reader)
            throws IOException
Optimize our representation and our subqueries' representations
Overrides:
rewrite in class Query
Parameters:
reader - the IndexReader we query
Returns:
an optimized copy of us (which may not be a copy if there is nothing to optimize)

toString

public String toString(String field)
Prettyprint us.
Overrides:
toString in class Query
Parameters:
field - the field to which we are applied
Returns:
a string that shows what we do, of the form "(disjunct1 | disjunct2 | ... | disjunctn)^boost"

Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.