org.galagosearch.core.parse
Class DocumentDataNumberer

java.lang.Object
  extended by org.galagosearch.tupleflow.StandardStep<org.galagosearch.core.types.DocumentData,org.galagosearch.core.types.NumberedDocumentData>
      extended by org.galagosearch.core.parse.DocumentDataNumberer
All Implemented Interfaces:
org.galagosearch.tupleflow.Processor<org.galagosearch.core.types.DocumentData>, org.galagosearch.tupleflow.Source<org.galagosearch.core.types.NumberedDocumentData>, org.galagosearch.tupleflow.Step

@InputClass(className="org.galagosearch.core.types.DocumentData")
@OutputClass(className="org.galagosearch.core.types.NumberedDocumentData")
public class DocumentDataNumberer
extends org.galagosearch.tupleflow.StandardStep<org.galagosearch.core.types.DocumentData,org.galagosearch.core.types.NumberedDocumentData>

Sequentially numbers document data objects.

The purpose of this class is to assign a small number to each document. This would be simple if only one process were parsing documents, but in practice many processes parse documents at once. So we extract a DocumentData record from each document, collect the records into a single list, and number them in sequence. The resulting NumberedDocumentData records are then used to assign numbers to index postings.
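
The numbering idea described above can be illustrated with a minimal, self-contained sketch. It does not use the Galago classes: DocData, NumberedDocData, and Numberer are hypothetical stand-ins, their fields are invented for illustration, and starting the count at 0 is an assumption rather than something this page states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of sequential numbering; not the Galago implementation.
class NumberingSketch {

    // Stand-in for a DocumentData record (fields are assumptions).
    static final class DocData {
        final String identifier;
        final int textLength;

        DocData(String identifier, int textLength) {
            this.identifier = identifier;
            this.textLength = textLength;
        }
    }

    // Stand-in for a NumberedDocumentData record: the same data plus a number.
    static final class NumberedDocData {
        final String identifier;
        final int textLength;
        final int number;

        NumberedDocData(String identifier, int textLength, int number) {
            this.identifier = identifier;
            this.textLength = textLength;
            this.number = number;
        }
    }

    // A single numbering step: every record that reaches process() gets the
    // next integer. Because all records flow through one instance, the
    // numbers are unique even if many parsers produced the input.
    static final class Numberer {
        private int next = 0; // assumption: numbering starts at 0
        private final List<NumberedDocData> output = new ArrayList<>();

        void process(DocData data) {
            output.add(new NumberedDocData(data.identifier, data.textLength, next));
            next++;
        }

        List<NumberedDocData> output() {
            return output;
        }
    }

    public static void main(String[] args) {
        Numberer numberer = new Numberer();
        // Records gathered from several parsers, already merged into one list.
        numberer.process(new DocData("doc-a", 120));
        numberer.process(new DocData("doc-b", 45));
        numberer.process(new DocData("doc-c", 300));

        for (NumberedDocData d : numberer.output()) {
            System.out.println(d.number + " " + d.identifier);
        }
    }
}
```

In this sketch the uniqueness of the numbers comes from funnelling every record through one Numberer instance, which mirrors the "single list" described above. How the real class receives its input and passes records downstream is not shown here.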

Author:
trevor

Field Summary
 
Fields inherited from class org.galagosearch.tupleflow.StandardStep
processor
 
Constructor Summary
DocumentDataNumberer()
           
 
Method Summary
 void process(org.galagosearch.core.types.DocumentData data)
           
 
Methods inherited from class org.galagosearch.tupleflow.StandardStep
close, setProcessor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentDataNumberer

public DocumentDataNumberer()
Method Detail

process

public void process(org.galagosearch.core.types.DocumentData data)
             throws java.io.IOException
Specified by:
process in interface org.galagosearch.tupleflow.Processor<org.galagosearch.core.types.DocumentData>
Specified by:
process in class org.galagosearch.tupleflow.StandardStep<org.galagosearch.core.types.DocumentData,org.galagosearch.core.types.NumberedDocumentData>
Throws:
java.io.IOException


Copyright © 2009. All Rights Reserved.