org.galagosearch.core.parse
Class TrecWebParser

java.lang.Object
  extended by org.galagosearch.core.parse.TrecWebParser
All Implemented Interfaces:
DocumentStreamParser

public class TrecWebParser
extends java.lang.Object
implements DocumentStreamParser

Author:
trevor

Constructor Summary
TrecWebParser(java.io.BufferedReader reader)
          Creates a new TrecWebParser that reads from the given reader.
 
Method Summary
 void close()
           
 Document nextDocument()
           
 java.lang.String readUrl()
           
 java.lang.String scrubUrl(java.lang.String url)
           
 java.lang.String waitFor(java.lang.String tag)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TrecWebParser

public TrecWebParser(java.io.BufferedReader reader)
              throws java.io.FileNotFoundException,
                     java.io.IOException
Creates a new TrecWebParser that reads from the given reader.

Throws:
java.io.FileNotFoundException
java.io.IOException
Method Detail

waitFor

public java.lang.String waitFor(java.lang.String tag)
                         throws java.io.IOException
Throws:
java.io.IOException
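
This page gives no description for waitFor, so the following is only a minimal sketch of one plausible reading of the signature: skip lines on the reader until one starts with the given tag, then return that line. The class WaitForSketch is a hypothetical stand-in, not the Galago class, and both the "starts with" match and the null return at end of input are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical stand-in, NOT org.galagosearch.core.parse.TrecWebParser.
// It only illustrates the assumed "skip lines until the tag appears" idea.
final class WaitForSketch {
    private final BufferedReader reader;

    WaitForSketch(BufferedReader reader) {
        this.reader = reader;
    }

    // Assumption: returns the first remaining line that starts with tag,
    // or null once the reader is exhausted.
    String waitFor(String tag) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(tag)) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String input = "junk\n<DOC>\n<DOCNO>WTX001-B01-1</DOCNO>\n</DOC>\n";
        WaitForSketch sketch =
                new WaitForSketch(new BufferedReader(new StringReader(input)));
        System.out.println(sketch.waitFor("<DOC>"));    // the "<DOC>" line
        System.out.println(sketch.waitFor("<DOCNO>"));  // the "<DOCNO>..." line
        System.out.println(sketch.waitFor("<DOCHDR>")); // null, tag never appears
    }
}
```

Because each call consumes lines from the shared reader, successive calls move forward through the stream and never revisit earlier lines.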

close

public void close()
           throws java.io.IOException
Throws:
java.io.IOException

scrubUrl

public java.lang.String scrubUrl(java.lang.String url)
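
The page does not say what scrubUrl does. As a hedged illustration of what URL scrubbing could involve, the sketch below lowercases the URL, drops a trailing "#", removes the default port ":80", and strips trailing slashes. These rules and the class ScrubUrlSketch are assumptions for illustration and may not match the actual method.

```java
import java.util.Locale;

// Hypothetical stand-in, NOT the Galago implementation.
final class ScrubUrlSketch {
    // Assumed normalization rules; the real scrubUrl may differ.
    static String scrubUrl(String url) {
        String result = url.toLowerCase(Locale.ROOT);
        // Drop a dangling fragment marker at the end.
        if (result.endsWith("#")) {
            result = result.substring(0, result.length() - 1);
        }
        // Remove the default HTTP port, mid-URL or at the end.
        result = result.replace(":80/", "/");
        if (result.endsWith(":80")) {
            result = result.substring(0, result.length() - 3);
        }
        // Strip any trailing slashes.
        while (result.endsWith("/")) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(scrubUrl("HTTP://Example.COM:80/Index.html/"));
        System.out.println(scrubUrl("http://example.com:80"));
    }
}
```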

readUrl

public java.lang.String readUrl()
                         throws java.io.IOException
Throws:
java.io.IOException

nextDocument

public Document nextDocument()
                      throws java.io.IOException
Specified by:
nextDocument in interface DocumentStreamParser
Throws:
java.io.IOException
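
A typical way to consume a parser with this shape is to call nextDocument() in a loop and close the parser afterwards. The sketch below shows that pattern against a hypothetical ParserLike interface that mirrors only the nextDocument() and close() signatures listed on this page, so that it runs without Galago on the classpath. It assumes nextDocument() returns null when the input is exhausted, which this page does not confirm.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring two TrecWebParser signatures.
// With Galago available, D would be Document and the parser a TrecWebParser.
interface ParserLike<D> {
    D nextDocument() throws IOException;
    void close() throws IOException;
}

final class DrainSketch {
    // Assumption: nextDocument() signals end of input by returning null.
    static <D> List<D> drain(ParserLike<D> parser) throws IOException {
        List<D> docs = new ArrayList<>();
        try {
            D doc;
            while ((doc = parser.nextDocument()) != null) {
                docs.add(doc);
            }
        } finally {
            parser.close(); // release the underlying reader even on failure
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        Iterator<String> source = Arrays.asList("doc-1", "doc-2").iterator();
        // Fake parser that hands out strings instead of Document objects.
        ParserLike<String> fake = new ParserLike<String>() {
            public String nextDocument() {
                return source.hasNext() ? source.next() : null;
            }
            public void close() { }
        };
        System.out.println(drain(fake)); // prints [doc-1, doc-2]
    }
}
```

Closing in a finally block matters here because both nextDocument() and close() are declared to throw java.io.IOException.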


Copyright © 2009. All Rights Reserved.