org.galagosearch.core.parse
Class WordFilter

java.lang.Object
  extended by org.galagosearch.core.parse.WordFilter
All Implemented Interfaces:
org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>, org.galagosearch.tupleflow.Step

@InputClass(className="org.galagosearch.core.parse.Document")
@OutputClass(className="org.galagosearch.core.parse.Document")
public class WordFilter
extends java.lang.Object
implements org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>

WordFilter removes unwanted words from documents. Typically it is given a stopword list as a parameter and removes every listed word. It can also do the reverse and keep only the listed words, which is useful for building a small index tailored to a fixed set of experimental queries.
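The two modes described above can be illustrated with a minimal, self-contained sketch. It does not use the Galago classes: WordFilterSketch, filter, and the keepListed flag are hypothetical names chosen for illustration, and the real WordFilter may represent documents and removed terms differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for illustration; not the Galago WordFilter class. */
public class WordFilterSketch {
    /**
     * Returns the terms that survive filtering.
     * keepListed == false: drop every term found in words (stopword removal).
     * keepListed == true:  drop every term NOT found in words (keep-only list).
     */
    public static List<String> filter(List<String> terms, Set<String> words, boolean keepListed) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            boolean listed = words.contains(term);
            // A term survives when its list membership matches the requested mode.
            if (listed == keepListed) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("the", "quick", "fox", "and", "the", "dog");
        Set<String> words = new HashSet<>(Arrays.asList("the", "and"));
        System.out.println(filter(terms, words, false)); // prints [quick, fox, dog]
        System.out.println(filter(terms, words, true));  // prints [the, and, the]
    }
}
```

The same word set serves both modes; only the flag changes, which is why a single class can act either as a stopword remover or as a keep-only filter.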

Author:
trevor

Field Summary
 org.galagosearch.tupleflow.Processor<Document> processor
           
 
Constructor Summary
WordFilter(java.util.HashSet<java.lang.String> words)
           
WordFilter(org.galagosearch.tupleflow.TupleFlowParameters params)
           
 
Method Summary
 void close()
           
 void process(Document document)
           
 void setProcessor(org.galagosearch.tupleflow.Step processor)
           
static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters, org.galagosearch.tupleflow.execution.ErrorHandler handler)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

processor

public org.galagosearch.tupleflow.Processor<Document> processor

Constructor Detail

WordFilter

public WordFilter(java.util.HashSet<java.lang.String> words)

WordFilter

public WordFilter(org.galagosearch.tupleflow.TupleFlowParameters params)
           throws java.io.IOException
Throws:
java.io.IOException

Method Detail

process

public void process(Document document)
             throws java.io.IOException
Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
Throws:
java.io.IOException

close

public void close()
           throws java.io.IOException
Specified by:
close in interface org.galagosearch.tupleflow.Processor<Document>
Throws:
java.io.IOException

verify

public static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters,
                          org.galagosearch.tupleflow.execution.ErrorHandler handler)

setProcessor

public void setProcessor(org.galagosearch.tupleflow.Step processor)
                  throws org.galagosearch.tupleflow.IncompatibleProcessorException
Specified by:
setProcessor in interface org.galagosearch.tupleflow.Source<Document>
Throws:
org.galagosearch.tupleflow.IncompatibleProcessorException
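The signatures above (process, close, setProcessor, and the public processor field) suggest that a WordFilter sits in the middle of a chain: it receives documents, then hands them to a downstream stage. That reading is an inference from the signatures, not documented behavior. The sketch below shows the general pattern with simplified, hypothetical types (Stage, UpperCaseStage, CollectingStage); it does not use the tupleflow interfaces, and it omits the checked exceptions that the real methods declare.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of a processor chain; not the tupleflow API. */
public class ChainSketch {
    /** Simplified stand-in for a stage that accepts items and can be closed. */
    interface Stage<T> {
        void process(T item);
        void close();
    }

    /** Middle stage: transforms each item, then forwards it downstream. */
    static class UpperCaseStage implements Stage<String> {
        private Stage<String> next;

        /** Plays the role that setProcessor appears to play: wires in the next stage. */
        void setNext(Stage<String> next) {
            this.next = next;
        }

        public void process(String item) {
            next.process(item.toUpperCase(Locale.ROOT));
        }

        public void close() {
            next.close(); // propagate end-of-stream down the chain
        }
    }

    /** Final stage: collects whatever reaches the end of the chain. */
    static class CollectingStage implements Stage<String> {
        final List<String> items = new ArrayList<>();
        boolean closed = false;

        public void process(String item) {
            items.add(item);
        }

        public void close() {
            closed = true;
        }
    }

    /** Builds a two-stage chain, pushes the input through it, and returns the result. */
    public static List<String> run(List<String> input) {
        CollectingStage sink = new CollectingStage();
        UpperCaseStage middle = new UpperCaseStage();
        middle.setNext(sink);
        for (String item : input) {
            middle.process(item);
        }
        middle.close();
        return sink.items;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b"))); // prints [A, B]
    }
}
```

In this arrangement each stage only needs to know its immediate successor, so stages can be reordered or swapped without changing one another.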


Copyright © 2009. All Rights Reserved.