org.galagosearch.core.parse
Class DateExtractor

java.lang.Object
  extended by org.galagosearch.tupleflow.StandardStep<Document,Document>
      extended by org.galagosearch.core.parse.DateExtractor
All Implemented Interfaces:
org.galagosearch.tupleflow.Processor<Document>, org.galagosearch.tupleflow.Source<Document>, org.galagosearch.tupleflow.Step

@InputClass(className="org.galagosearch.core.parse.Document")
@OutputClass(className="org.galagosearch.core.types.DateExtent")
public class DateExtractor
extends org.galagosearch.tupleflow.StandardStep<Document,Document>

A very crude extractor of dates from text. This class searches for anything that looks like a year (1000-2999), then searches around that year for a month name. A year alone is sufficient to emit a date; day of the month is not currently supported.
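
The heuristic described above can be illustrated with a minimal, standalone sketch. This is not the Galago implementation: the class name `DateSketch`, the `extract` method, the one-term search window on either side of the year, the lowercase matching, and the use of 0 to mean "no month found" are all assumptions made for illustration. Only the method names `addMonth`, `isMonth`, `isYear`, and `getMonth` and the 1000-2999 year range come from this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the year-then-month heuristic; not the real DateExtractor.
class DateSketch {
    private final Map<String, Integer> months = new HashMap<>();

    DateSketch() {
        String[][] names = {
            {"january", "jan"}, {"february", "feb"}, {"march", "mar"},
            {"april", "apr"}, {"may", "may"}, {"june", "jun"},
            {"july", "jul"}, {"august", "aug"}, {"september", "sep"},
            {"october", "oct"}, {"november", "nov"}, {"december", "dec"}
        };
        for (int m = 0; m < names.length; m++) {
            addMonth(names[m][0], names[m][1], m + 1);
        }
    }

    // Registers both spellings of a month under the same numeric value.
    void addMonth(String longMonth, String shortMonth, int value) {
        months.put(longMonth, value);
        months.put(shortMonth, value);
    }

    boolean isMonth(String month) {
        return months.containsKey(month.toLowerCase(Locale.ROOT));
    }

    // True for exactly four ASCII digits whose value lies in 1000-2999.
    boolean isYear(String year) {
        if (year.length() != 4) {
            return false;
        }
        for (int k = 0; k < 4; k++) {
            char c = year.charAt(k);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        int value = Integer.parseInt(year);
        return value >= 1000 && value <= 2999;
    }

    // Assumption: look one term before and one term after position i.
    // Returns 0 when neither neighbor is a month name.
    int getMonth(List<String> terms, int i) {
        if (i > 0 && isMonth(terms.get(i - 1))) {
            return months.get(terms.get(i - 1).toLowerCase(Locale.ROOT));
        }
        if (i + 1 < terms.size() && isMonth(terms.get(i + 1))) {
            return months.get(terms.get(i + 1).toLowerCase(Locale.ROOT));
        }
        return 0;
    }

    // Emits one {year, month} pair per year-like term; a year alone is enough.
    List<int[]> extract(List<String> terms) {
        List<int[]> dates = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            if (isYear(terms.get(i))) {
                dates.add(new int[] {Integer.parseInt(terms.get(i)), getMonth(terms, i)});
            }
        }
        return dates;
    }
}
```

With this sketch, the term list `published, in, March, 1999, and, revised, 2004` yields two pairs: `{1999, 3}` and `{2004, 0}`. The second pair shows the "year alone is sufficient" rule, since no month name is adjacent to `2004`.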

Author:
trevor

Field Summary
 
Fields inherited from class org.galagosearch.tupleflow.StandardStep
processor
 
Constructor Summary
DateExtractor()
           
 
Method Summary
 void addMonth(java.lang.String longMonth, java.lang.String shortMonth, int value)
           
 int getMonth(java.util.List<java.lang.String> terms, int i)
           
 boolean isMonth(java.lang.String month)
           
 boolean isYear(java.lang.String year)
           
 void process(Document object)
           
 
Methods inherited from class org.galagosearch.tupleflow.StandardStep
close, setProcessor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DateExtractor

public DateExtractor()
Method Detail

addMonth

public void addMonth(java.lang.String longMonth,
                     java.lang.String shortMonth,
                     int value)

isMonth

public boolean isMonth(java.lang.String month)

isYear

public boolean isYear(java.lang.String year)

getMonth

public int getMonth(java.util.List<java.lang.String> terms,
                    int i)

process

public void process(Document object)
             throws java.io.IOException
Specified by:
process in interface org.galagosearch.tupleflow.Processor<Document>
Specified by:
process in class org.galagosearch.tupleflow.StandardStep<Document,Document>
Throws:
java.io.IOException


Copyright © 2009. All Rights Reserved.