org.galagosearch.core.parse
Class DocumentSource
java.lang.Object
org.galagosearch.core.parse.DocumentSource
- All Implemented Interfaces:
- org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>, org.galagosearch.tupleflow.Source<org.galagosearch.core.types.DocumentSplit>, org.galagosearch.tupleflow.Step
@OutputClass(className="org.galagosearch.core.types.DocumentSplit")
public class DocumentSource
- extends java.lang.Object
- implements org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>
Splits a set of inputs into many DocumentSplit records.
This class usually runs in a stage by itself at the beginning of a Galago pipeline.
It is somewhat similar to FileSource, except that it can autodetect file formats.
This splitter can detect ARC, TREC, TRECWEB and corpus files.
- Author:
- trevor
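The page does not say how the format is autodetected. As a rough illustration only, the sketch below guesses a format from a file name's extension. The class FormatGuess, the extension strings, and the format labels are all hypothetical and are not taken from the Galago source; the actual detection rules in DocumentSource may differ.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of org.galagosearch.
final class FormatGuess {
    // Assumed extension-to-format table covering the four formats named above.
    private static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();
    static {
        BY_EXTENSION.put(".arc", "arc");
        BY_EXTENSION.put(".trectext", "trec");
        BY_EXTENSION.put(".trecweb", "trecweb");
        BY_EXTENSION.put(".corpus", "corpus");
    }

    private FormatGuess() {
    }

    /** Returns a format label for the file name, or null if the extension is not in the table. */
    static String guess(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : BY_EXTENSION.entrySet()) {
            if (name.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }
}
```

Under these assumptions, FormatGuess.guess("wt10g.arc") yields the label "arc", and a name with an unlisted extension yields null, which a caller could treat as "skip this file" or as an error.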
Field Summary
org.galagosearch.tupleflow.Processor processor

Constructor Summary
DocumentSource(org.galagosearch.tupleflow.TupleFlowParameters parameters)

Method Summary
void run()
void setProcessor(org.galagosearch.tupleflow.Step processor)
static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters, org.galagosearch.tupleflow.execution.ErrorHandler handler)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
processor
public org.galagosearch.tupleflow.Processor processor
DocumentSource
public DocumentSource(org.galagosearch.tupleflow.TupleFlowParameters parameters)
run
public void run()
throws java.io.IOException
- Specified by:
run in interface org.galagosearch.tupleflow.ExNihiloSource<org.galagosearch.core.types.DocumentSplit>
- Throws:
java.io.IOException
setProcessor
public void setProcessor(org.galagosearch.tupleflow.Step processor)
throws org.galagosearch.tupleflow.IncompatibleProcessorException
- Specified by:
setProcessor in interface org.galagosearch.tupleflow.Source<org.galagosearch.core.types.DocumentSplit>
- Throws:
org.galagosearch.tupleflow.IncompatibleProcessorException
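The setProcessor and run signatures suggest a push-style pattern: a downstream processor is attached first, and run then generates records and hands them to it. The sketch below illustrates that pattern with stand-in types only. SketchProcessor, SketchSplit, SketchSource, and CollectingProcessor are hypothetical and are not the TupleFlow interfaces. Closing the processor at the end of run is an assumption, and the checked exceptions declared by the real methods are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a downstream step; NOT org.galagosearch.tupleflow.Processor.
interface SketchProcessor<T> {
    void process(T item);
    void close();
}

// Hypothetical record loosely modeled on the idea of a document split.
final class SketchSplit {
    final String fileName;
    final String fileType;

    SketchSplit(String fileName, String fileType) {
        this.fileName = fileName;
        this.fileType = fileType;
    }
}

// A source that creates records from its own configuration, not from an upstream stage.
final class SketchSource {
    private final List<String> inputs;
    private SketchProcessor<SketchSplit> processor;

    SketchSource(List<String> inputs) {
        this.inputs = inputs;
    }

    void setProcessor(SketchProcessor<SketchSplit> processor) {
        this.processor = processor;
    }

    void run() {
        if (processor == null) {
            throw new IllegalStateException("setProcessor must be called before run");
        }
        // Emit one record per input; the file type is left as a placeholder here.
        for (String input : inputs) {
            processor.process(new SketchSplit(input, "unknown"));
        }
        // Assumption: the source signals end of data by closing the processor.
        processor.close();
    }
}

// Collects the file names it receives so the flow can be inspected.
final class CollectingProcessor implements SketchProcessor<SketchSplit> {
    final List<String> seen = new ArrayList<>();
    boolean closed = false;

    @Override
    public void process(SketchSplit split) {
        seen.add(split.fileName);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

In this sketch, a caller would construct a SketchSource with a list of input names, attach a CollectingProcessor through setProcessor, and call run; calling run before setProcessor raises an IllegalStateException.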
verify
public static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters,
org.galagosearch.tupleflow.execution.ErrorHandler handler)
Copyright © 2009. All Rights Reserved.