org.apache.lucene.benchmark.utils

Class ExtractReuters


public class ExtractReuters
extends Object

Splits the Reuters SGML documents into simple text files, each containing the title, date, dateline, and body of one document.
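To illustrate the kind of splitting described above, here is a minimal, self-contained sketch. It is not the Lucene implementation: the class `ReutersSplitSketch`, the method `extractFields`, and the regular expression are assumptions made for this example. The sketch only assumes that each field sits inside an SGML element named `TITLE`, `DATE`, `DATELINE`, or `BODY`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: pulls the four fields out of one
// Reuters-style SGML article held in memory.
final class ReutersSplitSketch {

    // One alternative per field; DOTALL lets a BODY span several lines.
    private static final Pattern FIELDS = Pattern.compile(
        "<TITLE>(.*?)</TITLE>|<DATE>(.*?)</DATE>"
            + "|<DATELINE>(.*?)</DATELINE>|<BODY>(.*?)</BODY>",
        Pattern.DOTALL);

    // Returns the trimmed field contents in the order they appear in the input.
    static List<String> extractFields(String article) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELDS.matcher(article);
        while (m.find()) {
            // Exactly one group is non-null for each match.
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    fields.add(m.group(g).trim());
                }
            }
        }
        return fields;
    }
}
```

A real extractor would also need to read each SGML file, separate the individual articles, decode character entities, and write one output file per article; the sketch leaves all of that out.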

Constructor Summary

ExtractReuters(File reutersDir, File outputDir)

Method Summary

void
extract()
protected void
extractFile(File sgmFile)
Override to change what is extracted from each file.
static void
main(String[] args)

Constructor Details

ExtractReuters

public ExtractReuters(File reutersDir,
                      File outputDir)

Method Details

extract

public void extract()

extractFile

protected void extractFile(File sgmFile)
Override to change what is extracted from each file.
Parameters:
sgmFile - the Reuters SGML file to extract
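The override hook can be pictured with the following sketch. It deliberately uses a stand-in class rather than the real `ExtractReuters`, so that it compiles without Lucene on the classpath. `ExtractorStandIn`, `TitleOnlyExtractor`, and the `log` field are hypothetical names. The sketch also assumes that `extract()` calls `extractFile` once per SGML file, which this page suggests but does not state.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Stand-in with the same method shape as ExtractReuters; NOT the Lucene class.
class ExtractorStandIn {
    // Records what was done, so the effect of an override is visible.
    final List<String> log = new ArrayList<>();
    private final File[] inputs;

    ExtractorStandIn(File... inputs) {
        this.inputs = inputs;
    }

    // Assumed driver: hands each input file to the overridable hook.
    public void extract() {
        for (File f : inputs) {
            extractFile(f);
        }
    }

    // Default per-file behaviour; subclasses may replace it.
    protected void extractFile(File sgmFile) {
        log.add("default:" + sgmFile.getName());
    }
}

// A subclass changes what is extracted by overriding only the hook.
class TitleOnlyExtractor extends ExtractorStandIn {
    TitleOnlyExtractor(File... inputs) {
        super(inputs);
    }

    @Override
    protected void extractFile(File sgmFile) {
        log.add("title-only:" + sgmFile.getName());
    }
}
```

With the real class, the same pattern would mean subclassing `ExtractReuters`, overriding `extractFile(File sgmFile)`, and calling `extract()` on the subclass instance.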

main

public static void main(String[] args)

Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.