org.apache.lucene.benchmark.standard

Class StandardBenchmarker

Implemented Interfaces:
Benchmarker

public class StandardBenchmarker
extends AbstractBenchmarker
implements Benchmarker

Reads in the Reuters Collection (downloaded from http://www.daviddlewis.com/resources/testcollections/reuters21578/reuters21578.tar.gz) located in workingDir/reuters and indexes its documents using the StandardAnalyzer.

Runs a standard set of documents through an Indexer and then runs a standard set of queries against the index.

See Also:
org.apache.lucene.benchmark.standard.StandardBenchmarker.benchmark(java.io.File, org.apache.lucene.benchmark.BenchmarkOptions)

Field Summary

static String
INDEX_DIR
static String
SOURCE_DIR

Constructor Summary

StandardBenchmarker()

Method Summary

TestData[]
benchmark(File workingDir, BenchmarkOptions opts)
Benchmark according to the implementation, using the workingDir as the place to store things.
static Query[]
createQueries(List qs, Analyzer a)
Parse the strings containing Lucene queries.
static void
getAllFiles(File srcDir, FileFilter filter, List allFiles)
protected File
getSourceDirectory(File workingDir)
protected Document
makeDocument(File in, String[] tags, boolean stored, boolean tokenized, boolean tfv)
Parse the Reuters SGML and index the Date, Title, Dateline, and Body fields.
protected void
makeIndex(TestRunData trd, File srcDir, IndexWriter iw, boolean stored, boolean tokenized, boolean tfv, StandardOptions options)
Make index, and collect time data.
protected void
reset(File indexDir)
Remove existing index.
protected void
runBenchmark(TestData params, StandardOptions options)
Run benchmark using supplied parameters.
protected void
saveStream(InputStream is, File out, boolean closeInput)
Save a stream to a file.

Methods inherited from class org.apache.lucene.benchmark.AbstractBenchmarker

fullyDelete

Field Details

INDEX_DIR

public static final String INDEX_DIR

SOURCE_DIR

public static final String SOURCE_DIR

Constructor Details

StandardBenchmarker

public StandardBenchmarker()

Method Details

benchmark

public TestData[] benchmark(File workingDir,
                            BenchmarkOptions opts)
            throws Exception
Benchmark according to the implementation, using the workingDir as the place to store things.
Specified by:
benchmark in interface Benchmarker
Parameters:
workingDir - The File directory to store temporary data in for running the benchmark
opts - the BenchmarkOptions to use for this run
Returns:
The TestData used to run the benchmark.

createQueries

public static Query[] createQueries(List qs,
                                    Analyzer a)
Parse the strings containing Lucene queries.
Parameters:
qs - list of strings containing query expressions
a - analyzer to use when parsing queries
Returns:
array of Lucene queries

getAllFiles

public static void getAllFiles(File srcDir,
                               FileFilter filter,
                               List allFiles)
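
This page gives no description for getAllFiles, so the following is only a sketch of what the signature suggests: walk srcDir recursively and append every file accepted by filter to allFiles. The class name FileCollectorSketch, the method name collectFiles, the generic List<File> type, and the treatment of a null filter as "accept everything" are all assumptions made for illustration, not the Lucene implementation.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.List;

// Hypothetical helper; not part of the Lucene benchmark package.
final class FileCollectorSketch {
    private FileCollectorSketch() {}

    /**
     * Appends every regular file under srcDir that the filter accepts to allFiles.
     * Assumption: a null filter accepts every file.
     */
    static void collectFiles(File srcDir, FileFilter filter, List<File> allFiles) {
        File[] entries = srcDir.listFiles();
        if (entries == null) {
            return; // srcDir is not a directory, or it could not be read
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                collectFiles(entry, filter, allFiles); // descend into subdirectories
            } else if (filter == null || filter.accept(entry)) {
                allFiles.add(entry);
            }
        }
    }
}
```

Passing the list in as an output parameter, as the documented signature does, lets one list accumulate results across several source directories.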

getSourceDirectory

protected File getSourceDirectory(File workingDir)

makeDocument

protected Document makeDocument(File in,
                                String[] tags,
                                boolean stored,
                                boolean tokenized,
                                boolean tfv)
            throws Exception
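Parse the Reuters SGML and index the Date, Title, Dateline, and Body fields.
Parameters:
in - input file
stored - store values of fields
tokenized - tokenize fields
tfv - store term vectors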
Returns:
Lucene document
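
As a rough illustration of the parsing step, the sketch below pulls the text between matching SGML tags out of a string. It uses only the JDK and returns a plain Map instead of a Lucene Document. The class SgmlFieldSketch, the method extractFields, the regular-expression approach, and the assumption that tags holds bare element names such as DATE or BODY are all hypothetical; the actual makeDocument may parse the input differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the Lucene benchmark package.
final class SgmlFieldSketch {
    private SgmlFieldSketch() {}

    /**
     * Returns the trimmed text of the first <TAG>...</TAG> pair for each tag name.
     * Tags with no match are left out of the result.
     */
    static Map<String, String> extractFields(String sgml, String[] tags) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String tag : tags) {
            String quoted = Pattern.quote(tag);
            Pattern pattern = Pattern.compile(
                    "<" + quoted + ">(.*?)</" + quoted + ">",
                    Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(sgml);
            if (matcher.find()) {
                fields.put(tag, matcher.group(1).trim());
            }
        }
        return fields;
    }
}
```

In a real indexer each extracted value would then be added to the document as a field, with the stored, tokenized, and tfv flags deciding how that field is indexed.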

makeIndex

protected void makeIndex(TestRunData trd,
                         File srcDir,
                         IndexWriter iw,
                         boolean stored,
                         boolean tokenized,
                         boolean tfv,
                         StandardOptions options)
            throws Exception
Make index, and collect time data.
Parameters:
trd - run data to populate
srcDir - directory with source files
iw - index writer, already open
stored - store values of fields
tokenized - tokenize fields
tfv - store term vectors

reset

protected void reset(File indexDir)
            throws Exception
Remove existing index.
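
Removing an on-disk index generally means deleting the index directory and everything inside it. The sketch below shows one JDK-only way to do that. Whether reset delegates to the inherited fullyDelete method is not stated on this page, and the names IndexResetSketch and deleteRecursively are hypothetical.

```java
import java.io.File;

// Hypothetical helper; not part of the Lucene benchmark package.
final class IndexResetSketch {
    private IndexResetSketch() {}

    /**
     * Deletes target and, if it is a directory, everything beneath it.
     * Returns true when target no longer exists afterwards.
     */
    static boolean deleteRecursively(File target) {
        File[] children = target.listFiles(); // null for plain files and missing paths
        if (children != null) {
            for (File child : children) {
                if (!deleteRecursively(child)) {
                    return false; // stop at the first entry that cannot be removed
                }
            }
        }
        return !target.exists() || target.delete();
    }
}
```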

runBenchmark

protected void runBenchmark(TestData params,
                            StandardOptions options)
            throws Exception
Run benchmark using supplied parameters.
Parameters:
params - benchmark parameters

saveStream

protected void saveStream(InputStream is,
                          File out,
                          boolean closeInput)
            throws Exception
Save a stream to a file.
Parameters:
is - input stream
out - output file
closeInput - if true, close the input stream when done.
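
A minimal sketch of the documented contract, assuming a plain buffered byte copy: the input stream is written to the output file, and the input is closed only when closeInput is true. The class StreamSaverSketch, the method save, and the 8192-byte buffer size are illustrative choices, not the Lucene implementation.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; not part of the Lucene benchmark package.
final class StreamSaverSketch {
    private StreamSaverSketch() {}

    /** Copies all bytes from is into out; closes is afterwards only if closeInput is true. */
    static void save(InputStream is, File out, boolean closeInput) throws IOException {
        byte[] buffer = new byte[8192];
        try (OutputStream os = new FileOutputStream(out)) {
            int read;
            while ((read = is.read(buffer)) != -1) {
                os.write(buffer, 0, read);
            }
        } finally {
            if (closeInput) {
                is.close();
            }
        }
    }
}
```

Leaving the stream open (closeInput set to false) is useful when the caller keeps reading from it afterwards, for example when it wraps an archive that holds several entries.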

Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.