org.galagosearch.core.parse
Class WordFilter
java.lang.Object
org.galagosearch.core.parse.WordFilter
- All Implemented Interfaces:
- org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>, org.galagosearch.tupleflow.Step
@InputClass(className="org.galagosearch.core.parse.Document")
@OutputClass(className="org.galagosearch.core.parse.Document")
public class WordFilter
- extends java.lang.Object
- implements org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>
WordFilter removes unwanted words from documents. Typically it is given a
stopword list as a parameter and removes every listed word. It can also be used
the other way around, keeping only the listed words in the index, which is
useful for building a small index tailored to a fixed set of experimental
queries.
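The two modes described above can be illustrated with a minimal, self-contained sketch. It does not use the Galago classes: WordFilterSketch, its keepListWords flag, and its filter method are hypothetical names introduced only for this illustration, and dropping terms from the list (rather than, say, blanking them in place) is an assumption, not a statement about how WordFilter itself treats a Document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filtering idea; not the Galago implementation.
class WordFilterSketch {
    private final Set<String> words;
    // Assumed flag: false = remove listed words (stopword mode),
    // true = keep only listed words (restricted-vocabulary mode).
    private final boolean keepListWords;

    WordFilterSketch(Set<String> words, boolean keepListWords) {
        this.words = new HashSet<>(words);
        this.keepListWords = keepListWords;
    }

    // Returns a new list holding only the terms that survive the filter.
    List<String> filter(List<String> terms) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            boolean inList = words.contains(term);
            // A term survives when its list membership matches the mode.
            if (inList == keepListWords) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("the", "history", "of", "search");

        WordFilterSketch stopwords =
            new WordFilterSketch(new HashSet<>(Arrays.asList("the", "of")), false);
        System.out.println(stopwords.filter(terms)); // prints [history, search]

        WordFilterSketch keepOnly =
            new WordFilterSketch(new HashSet<>(Arrays.asList("search", "engine")), true);
        System.out.println(keepOnly.filter(terms)); // prints [search]
    }
}
```

A hash set is used here because membership checks are constant time on average, which matters when every term of every document is tested against the list.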
- Author:
- trevor
Constructor Summary
WordFilter(java.util.HashSet<java.lang.String> words)
WordFilter(org.galagosearch.tupleflow.TupleFlowParameters params)
Method Summary
void close()
void process(Document document)
void setProcessor(org.galagosearch.tupleflow.Step processor)
static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters, org.galagosearch.tupleflow.execution.ErrorHandler handler)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
processor
public org.galagosearch.tupleflow.Processor<Document> processor
WordFilter
public WordFilter(java.util.HashSet<java.lang.String> words)
WordFilter
public WordFilter(org.galagosearch.tupleflow.TupleFlowParameters params)
throws java.io.IOException
- Throws:
java.io.IOException
process
public void process(Document document)
throws java.io.IOException
- Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
- Throws:
java.io.IOException
close
public void close()
throws java.io.IOException
- Specified by:
close in interface org.galagosearch.tupleflow.Processor<Document>
- Throws:
java.io.IOException
verify
public static void verify(org.galagosearch.tupleflow.TupleFlowParameters parameters,
org.galagosearch.tupleflow.execution.ErrorHandler handler)
setProcessor
public void setProcessor(org.galagosearch.tupleflow.Step processor)
throws org.galagosearch.tupleflow.IncompatibleProcessorException
- Specified by:
setProcessor in interface org.galagosearch.tupleflow.Source<Document>
- Throws:
org.galagosearch.tupleflow.IncompatibleProcessorException
Copyright © 2009. All Rights Reserved.