org.galagosearch.core.parse
Class DateExtractor
java.lang.Object
org.galagosearch.tupleflow.StandardStep<Document,Document>
org.galagosearch.core.parse.DateExtractor
- All Implemented Interfaces:
- org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>, org.galagosearch.tupleflow.Step
@InputClass(className="org.galagosearch.core.parse.Document")
@OutputClass(className="org.galagosearch.core.types.DateExtent")
public class DateExtractor
- extends org.galagosearch.tupleflow.StandardStep<Document,Document>
A very crude extractor of dates from text.
This class searches for anything that looks like a year (1000-2999), then
searches around that year for a month name. A year is sufficient to emit
a date. Day of the month is currently not supported.
- Author:
- trevor
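The heuristic described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual Galago source: the class name DateSketch, the two-token search window, the "yyyy-MM" output format, the extract helper, and the convention of returning 0 when no month is found are all assumptions made for the example. Only the method names addMonth, isMonth, isYear, and getMonth and their signatures come from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the heuristic; not the real DateExtractor.
public class DateSketch {
    // Assumed search radius (in tokens) around a year.
    private static final int WINDOW = 2;

    // Lower-cased month name (long or short form) -> month number 1-12.
    private final Map<String, Integer> months = new HashMap<>();

    public DateSketch() {
        addMonth("january", "jan", 1);
        addMonth("february", "feb", 2);
        addMonth("march", "mar", 3);
        addMonth("april", "apr", 4);
        addMonth("may", "may", 5);
        addMonth("june", "jun", 6);
        addMonth("july", "jul", 7);
        addMonth("august", "aug", 8);
        addMonth("september", "sep", 9);
        addMonth("october", "oct", 10);
        addMonth("november", "nov", 11);
        addMonth("december", "dec", 12);
    }

    // Registers both spellings of a month under the same number.
    public void addMonth(String longMonth, String shortMonth, int value) {
        months.put(longMonth, value);
        months.put(shortMonth, value);
    }

    public boolean isMonth(String month) {
        return months.containsKey(month.toLowerCase(Locale.ROOT));
    }

    // Accepts exactly four ASCII digits whose value lies in 1000-2999.
    public boolean isYear(String year) {
        if (year.length() != 4) {
            return false;
        }
        for (int k = 0; k < 4; k++) {
            char c = year.charAt(k);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return year.charAt(0) == '1' || year.charAt(0) == '2';
    }

    // Looks for a month name within WINDOW tokens of position i.
    // Returns the month number, or 0 if none is found (assumed convention).
    public int getMonth(List<String> terms, int i) {
        int start = Math.max(0, i - WINDOW);
        int end = Math.min(terms.size() - 1, i + WINDOW);
        for (int j = start; j <= end; j++) {
            Integer m = months.get(terms.get(j).toLowerCase(Locale.ROOT));
            if (m != null) {
                return m;
            }
        }
        return 0;
    }

    // Emits one date per year token; the month is attached only when found.
    public List<String> extract(List<String> terms) {
        List<String> dates = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            String term = terms.get(i);
            if (isYear(term)) {
                int month = getMonth(terms, i);
                dates.add(month == 0 ? term : term + "-" + String.format("%02d", month));
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
            "published", "in", "march", "2004", "and", "revised", "later", "in", "2009");
        System.out.println(new DateSketch().extract(terms)); // prints [2004-03, 2009]
    }
}
```

In this sketch a year alone is enough to emit a date, matching the description above, and no attempt is made to find a day of the month. The real class emits DateExtent tuples to a downstream processor instead of returning strings.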
Fields inherited from class org.galagosearch.tupleflow.StandardStep
processor
Method Summary
void addMonth(java.lang.String longMonth, java.lang.String shortMonth, int value)
int getMonth(java.util.List<java.lang.String> terms, int i)
boolean isMonth(java.lang.String month)
boolean isYear(java.lang.String year)
void process(Document object)
Methods inherited from class org.galagosearch.tupleflow.StandardStep
close, setProcessor
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DateExtractor
public DateExtractor()
addMonth
public void addMonth(java.lang.String longMonth,
java.lang.String shortMonth,
int value)
isMonth
public boolean isMonth(java.lang.String month)
isYear
public boolean isYear(java.lang.String year)
getMonth
public int getMonth(java.util.List<java.lang.String> terms,
int i)
process
public void process(Document object)
throws java.io.IOException
- Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
- Specified by:
process in class org.galagosearch.tupleflow.StandardStep<Document,Document>
- Throws:
java.io.IOException
Copyright © 2009. All Rights Reserved.