org.galagosearch.core.parse
Class DocumentSource

java.lang.Object
  extended by org.galagosearch.core.parse.DocumentSource
All Implemented Interfaces:
org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>, org.galagosearch.tupleflow.Source<org.galagosearch.core.types.DocumentSplit>, org.galagosearch.tupleflow.Step

@OutputClass(className="org.galagosearch.core.types.DocumentSplit")
public class DocumentSource
extends java.lang.Object
implements org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>

Splits a set of inputs into many DocumentSplit records. This step usually runs in a stage by itself at the beginning of a Galago pipeline. It is similar to FileSource, except that it autodetects file formats: ARC, TREC, TRECWEB and corpus files are recognized.
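
The description above does not say how a format is recognized. As a rough illustration only, the sketch below guesses a format label from the file name. The class FormatGuessSketch, the method guessFormat, the extensions (.arc, .trectext, .trecweb, .corpus), the .gz handling and the returned labels are all assumptions made for this example; none of them is taken from the Galago source, and the real class may well inspect file contents instead.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Galago. */
final class FormatGuessSketch {

    /**
     * Returns a format label guessed from the file name, or "unknown".
     * The extension-to-label mapping is an assumption of this sketch.
     */
    static String guessFormat(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);

        // Strip a trailing compression suffix so "a.arc.gz" is treated like "a.arc".
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - ".gz".length());
        }

        if (name.endsWith(".arc")) return "arc";
        if (name.endsWith(".trectext")) return "trec";
        if (name.endsWith(".trecweb")) return "trecweb";
        if (name.endsWith(".corpus")) return "corpus";
        return "unknown";
    }

    public static void main(String[] args) {
        String[] samples = {"crawl.arc.gz", "ap89.trectext", "notes.txt"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + guessFormat(sample));
        }
    }
}
```

A name-based check like this is cheap but fragile, which is one reason a real splitter might fall back to reading the first bytes of a file when the extension is missing or misleading.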

Author:
trevor

Field Summary
 org.galagosearch.tupleflow.Processor processor
           
 
Constructor Summary
DocumentSource(org.galagosearch.tupleflow.TupleFlowParameters parameters)
           
 
Method Summary
 void run()
           
 void setProcessor(org.galagosearch.tupleflow.Step processor)
           
static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters, org.galagosearch.tupleflow.execution.ErrorHandler handler)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

processor

public org.galagosearch.tupleflow.Processor processor

Constructor Detail

DocumentSource

public DocumentSource(org.galagosearch.tupleflow.TupleFlowParameters parameters)

Method Detail

run

public void run()
         throws java.io.IOException
Specified by:
run in interface org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>
Throws:
java.io.IOException

setProcessor

public void setProcessor(org.galagosearch.tupleflow.Step processor)
                  throws org.galagosearch.tupleflow.IncompatibleProcessorException
Specified by:
setProcessor in interface org.galagosearch.tupleflow.Source<org.galagosearch.core.types.DocumentSplit>
Throws:
org.galagosearch.tupleflow.IncompatibleProcessorException
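
The signatures of setProcessor and run suggest the usual source pattern: a downstream step is attached first, then run pushes records into it. The sketch below shows that calling order with self-contained stand-in types. SketchProcessor, SketchSource and CollectingProcessor are hypothetical names invented for this example; they are not the TupleFlow interfaces, and the assumption that run closes the downstream step when it finishes is not confirmed by this page.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a downstream processing step; hypothetical, not the TupleFlow interface. */
interface SketchProcessor<T> {
    void process(T item) throws IOException;
    void close() throws IOException;
}

/** Stand-in source that emits one record per input path. */
final class SketchSource {
    private final List<String> inputs;
    private SketchProcessor<String> processor;

    SketchSource(List<String> inputs) {
        this.inputs = inputs;
    }

    /** Attaches the downstream step that will receive the records. */
    void setProcessor(SketchProcessor<String> processor) {
        this.processor = processor;
    }

    /** Pushes every input to the downstream step, then closes it (assumed behavior). */
    void run() throws IOException {
        if (processor == null) {
            throw new IllegalStateException("setProcessor must be called before run");
        }
        for (String path : inputs) {
            processor.process(path);
        }
        processor.close();
    }
}

/** Downstream step that simply records what it receives. */
final class CollectingProcessor implements SketchProcessor<String> {
    final List<String> seen = new ArrayList<>();
    boolean closed = false;

    @Override
    public void process(String item) {
        seen.add(item);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

In this sketch the caller would construct a SketchSource, call setProcessor with a CollectingProcessor, and then call run; calling run first fails fast with an IllegalStateException instead of a NullPointerException.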

verify

public static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters,
                          org.galagosearch.tupleflow.execution.ErrorHandler handler)


Copyright © 2009. All Rights Reserved.