org.galagosearch.core.parse
Class DocumentIndexWriter

java.lang.Object
  extended by org.galagosearch.core.parse.DocumentIndexWriter
All Implemented Interfaces:
org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Step

@InputClass(className="org.galagosearch.core.parse.Document")
public class DocumentIndexWriter
extends java.lang.Object
implements org.galagosearch.tupleflow.Processor<Document>

Writes document text and metadata to an index file in '.corpus' format. A '.corpus' file can be passed to UniversalParser as input for indexing, and the format also makes it quick to look up individual documents.
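
The sketch below illustrates the call order implied by the Processor<Document> interface: each document is passed to process, and close is called once at the end, even if a write fails. It does not use the Galago classes. Processor, Document, LineWriter, and writeAll are hypothetical stand-ins that mirror only the method signatures listed on this page, and the tab-separated output is an assumption for illustration, not the '.corpus' format.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class ProcessorLifecycleSketch {
    // Hypothetical stand-in for org.galagosearch.tupleflow.Processor<T>;
    // it mirrors only the two methods documented on this page.
    interface Processor<T> {
        void process(T object) throws IOException;
        void close() throws IOException;
    }

    // Hypothetical stand-in for Document; the field names are assumptions.
    static final class Document {
        final String identifier;
        final String text;

        Document(String identifier, String text) {
            this.identifier = identifier;
            this.text = text;
        }
    }

    // Hypothetical writer: emits one tab-separated line per document.
    // This is NOT the '.corpus' format, only a placeholder for it.
    static final class LineWriter implements Processor<Document> {
        private final Writer out;
        private boolean closed = false;

        LineWriter(Writer out) {
            this.out = out;
        }

        @Override
        public void process(Document document) throws IOException {
            if (closed) {
                throw new IOException("writer already closed");
            }
            out.write(document.identifier + "\t" + document.text + "\n");
        }

        @Override
        public void close() throws IOException {
            closed = true;
            out.close();
        }
    }

    // Feeds every document to the writer, then closes it exactly once.
    static String writeAll(List<Document> documents) throws IOException {
        StringWriter buffer = new StringWriter();
        Processor<Document> writer = new LineWriter(buffer);
        try {
            for (Document document : documents) {
                writer.process(document);
            }
        } finally {
            writer.close(); // runs even if process() throws
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        String written = writeAll(Arrays.asList(
                new Document("doc-1", "hello"),
                new Document("doc-2", "world")));
        System.out.print(written);
    }
}
```

In a real TupleFlow job the framework, not user code, would normally construct the writer from its TupleFlowParameters and drive these calls; the explicit loop here only makes the order visible.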

Author:
trevor

Constructor Summary
DocumentIndexWriter(org.galagosearch.tupleflow.TupleFlowParameters parameters)
           
 
Method Summary
 void close()
           
 void process(Document document)
           
static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters, org.galagosearch.tupleflow.execution.ErrorHandler handler)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentIndexWriter

public DocumentIndexWriter(org.galagosearch.tupleflow.TupleFlowParameters parameters)
                    throws java.io.FileNotFoundException,
                           java.io.IOException
Throws:
java.io.FileNotFoundException
java.io.IOException

Method Detail

close

public void close()
           throws java.io.IOException
Specified by:
close in interface org.galagosearch.tupleflow.Processor<Document>
Throws:
java.io.IOException

process

public void process(Document document)
             throws java.io.IOException
Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
Throws:
java.io.IOException

verify

public static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters,
                          org.galagosearch.tupleflow.execution.ErrorHandler handler)


Copyright © 2009. All Rights Reserved.