org.galagosearch.tupleflow
Class Sorter<T>

java.lang.Object
  extended by org.galagosearch.tupleflow.StandardStep<T,T>
      extended by org.galagosearch.tupleflow.Sorter<T>
All Implemented Interfaces:
java.util.EventListener, javax.management.NotificationListener, Processor<T>, Source<T>, Step

public class Sorter<T>
extends StandardStep<T,T>
implements javax.management.NotificationListener

This class sorts an incoming stream of objects in a specified order. When the Sorter is closed (by calling the close method), it sends the objects, in sorted order, to the next stage of the Processor chain.

Because more objects may be submitted to the Sorter than will fit in main memory, the Sorter may create temporary files to store partially sorted results. The path used for these temporary files is specified in the TempPath Java preferences variable.
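The spill-to-disk idea described above can be sketched as a small external merge sort. This is a minimal illustration, not the Sorter implementation: the class SpillingSortSketch, its limit parameter, and the one-string-per-line file format are all assumptions made for the example, and it uses the default JVM temporary directory rather than the TempPath preference.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

/** Hypothetical sketch: sorts single-line strings, spilling sorted runs to temp files. */
public class SpillingSortSketch {
    private final int limit;                              // assumed: max items held in memory
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> runs = new ArrayList<>();    // sorted runs already on disk

    public SpillingSortSketch(int limit) { this.limit = limit; }

    /** Buffers one item and spills a sorted run once the buffer is full. */
    public void process(String item) throws IOException {
        buffer.add(item);
        if (buffer.size() >= limit) flush();
    }

    /** Sorts the in-memory buffer and writes it to a new temporary file. */
    public void flush() throws IOException {
        if (buffer.isEmpty()) return;
        Collections.sort(buffer);
        Path run = Files.createTempFile("sorter-run", ".txt");
        Files.write(run, buffer, StandardCharsets.UTF_8);
        runs.add(run);
        buffer.clear();
    }

    /** One open run together with the line currently at its head. */
    private static final class Head implements Comparable<Head> {
        final String line;
        final BufferedReader reader;
        Head(String line, BufferedReader reader) { this.line = line; this.reader = reader; }
        @Override public int compareTo(Head other) { return line.compareTo(other.line); }
    }

    /** Merges all runs and passes every item, in sorted order, to the next stage. */
    public void close(Consumer<String> next) throws IOException {
        flush();
        PriorityQueue<Head> heap = new PriorityQueue<>();
        for (Path run : runs) {
            advance(heap, Files.newBufferedReader(run, StandardCharsets.UTF_8));
        }
        while (!heap.isEmpty()) {
            Head head = heap.poll();      // smallest remaining line across all runs
            next.accept(head.line);
            advance(heap, head.reader);
        }
        for (Path run : runs) Files.deleteIfExists(run);
        runs.clear();
    }

    /** Reads the next line of a run into the heap, or closes the run when it is exhausted. */
    private static void advance(PriorityQueue<Head> heap, BufferedReader reader) throws IOException {
        String line = reader.readLine();
        if (line == null) reader.close(); else heap.add(new Head(line, reader));
    }
}
```

With a limit of 2 and five inputs, the sketch writes three runs and merges them on close, so only the head of each run is held in memory during the merge.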

Sorters are often used to generate streams of data from which aggregate statistics are computed. For instance, suppose we want to compute the monthly sales of a particular corporation, separated by region. We can feed a set of transactions to the Sorter, each containing a dollar amount and the region it came from.

Once this list is sorted by region name, it is easy to add up the totals for each region, since all data for a given region is adjacent in the list.
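As an illustration of why adjacency helps, the sketch below sorts a list of transactions by region and then totals each region in a single pass. The Transaction type, the class name RegionTotalsSketch, and any sample values passed to it are hypothetical; they are not the data from the original example, and the code does not use the Sorter or Order classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: per-region totals from a list sorted by region. */
public class RegionTotalsSketch {
    /** Assumed transaction shape: a region name and a whole-dollar amount. */
    public static final class Transaction {
        public final String region;
        public final long dollars;
        public Transaction(String region, long dollars) {
            this.region = region;
            this.dollars = dollars;
        }
    }

    /** Sorts a copy of the input by region, then sums each run of equal regions. */
    public static Map<String, Long> totalsByRegion(List<Transaction> input) {
        List<Transaction> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Transaction t) -> t.region));

        Map<String, Long> totals = new LinkedHashMap<>();
        String current = null;
        long sum = 0;
        for (Transaction t : sorted) {
            // A change of region ends the current run, so its total is final.
            if (current != null && !current.equals(t.region)) {
                totals.put(current, sum);
                sum = 0;
            }
            current = t.region;
            sum += t.dollars;
        }
        if (current != null) totals.put(current, sum);   // emit the last run
        return totals;
    }
}
```

Because the input is sorted, the loop only ever tracks one running total at a time, no matter how many regions there are.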

In these kinds of aggregate applications, it may be more efficient to provide the Sorter with a Reducer object. A Reducer is an object that transforms n sorted objects of type T into some (hopefully) smaller number of objects, also of type T. In the example above, we could write a Reducer that combined the transactions from one region into a single transaction holding their total, which would be equivalent for this application.

Using a Reducer allows the application to buffer fewer items and, ideally, to rely less on the disk during sorting.
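A reducer of this kind might look like the sketch below. The SketchReducer interface is a hypothetical stand-in defined only for this example; it is not the signature of the Galago Reducer interface, which this page does not show. The Transaction type is likewise assumed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: collapsing sorted transactions into one per region. */
public class RegionReducerSketch {
    /** Assumed transaction shape: a region name and a whole-dollar amount. */
    public static final class Transaction {
        public final String region;
        public final long dollars;
        public Transaction(String region, long dollars) {
            this.region = region;
            this.dollars = dollars;
        }
    }

    /** Stand-in reducer contract: n sorted items in, at most n items out. */
    public interface SketchReducer<T> {
        List<T> reduce(List<T> sorted);
    }

    /** Merges each run of adjacent same-region transactions into a single total. */
    public static final SketchReducer<Transaction> SUM_BY_REGION = sorted -> {
        List<Transaction> out = new ArrayList<>();
        for (Transaction t : sorted) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).region.equals(t.region)) {
                // Same region as the previous output item: fold the amount into it.
                out.set(last, new Transaction(t.region, out.get(last).dollars + t.dollars));
            } else {
                out.add(t);
            }
        }
        return out;
    };
}
```

Applying such a reducer to each sorted batch before it is stored means the batch holds at most one item per region, which is what lets fewer items be buffered or written to disk.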

Author:
Trevor Strohman

Field Summary
 
Fields inherited from class org.galagosearch.tupleflow.StandardStep
processor
 
Constructor Summary
Sorter(int limit, Order<T> order)
           
Sorter(int limit, Order<T> order, Reducer<T> reducer)
           
Sorter(int limit, Order<T> order, Reducer<T> reducer, Processor<T> processor)
           
Sorter(Order<T> order)
           
Sorter(Order<T> order, Reducer<T> reducer)
           
Sorter(TupleFlowParameters parameters)
           
 
Method Summary
 void close()
           
 void flush()
           
 void flushIfNecessary()
           
static java.lang.String getInputClass(TupleFlowParameters parameters)
           
static java.lang.String getOutputClass(TupleFlowParameters parameters)
           
static java.lang.String[] getOutputOrder(TupleFlowParameters parameters)
           
 void handleNotification(javax.management.Notification notification, java.lang.Object handback)
           
 boolean needsFlush()
           
 void process(T object)
           
 void removeMemoryWarnings()
           
 void requestMemoryWarnings()
           
 java.lang.String toString()
           
static void verify(TupleFlowParameters fullParameters, ErrorHandler handler)
           
 
Methods inherited from class org.galagosearch.tupleflow.StandardStep
setProcessor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Sorter

public Sorter(Order<T> order)

Sorter

public Sorter(Order<T> order,
              Reducer<T> reducer)

Sorter

public Sorter(int limit,
              Order<T> order)

Sorter

public Sorter(int limit,
              Order<T> order,
              Reducer<T> reducer)

Sorter

public Sorter(int limit,
              Order<T> order,
              Reducer<T> reducer,
              Processor<T> processor)

Sorter

public Sorter(TupleFlowParameters parameters)
       throws java.lang.ClassNotFoundException,
              java.lang.InstantiationException,
              java.lang.IllegalAccessException,
              java.io.IOException
Throws:
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
java.io.IOException
Method Detail

requestMemoryWarnings

public void requestMemoryWarnings()

removeMemoryWarnings

public void removeMemoryWarnings()

handleNotification

public void handleNotification(javax.management.Notification notification,
                               java.lang.Object handback)
Specified by:
handleNotification in interface javax.management.NotificationListener

verify

public static void verify(TupleFlowParameters fullParameters,
                          ErrorHandler handler)

getInputClass

public static java.lang.String getInputClass(TupleFlowParameters parameters)

getOutputClass

public static java.lang.String getOutputClass(TupleFlowParameters parameters)

getOutputOrder

public static java.lang.String[] getOutputOrder(TupleFlowParameters parameters)

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

needsFlush

public boolean needsFlush()

process

public void process(T object)
             throws java.io.IOException
Specified by:
process in interface Processor<T>
Specified by:
process in class StandardStep<T,T>
Throws:
java.io.IOException

flushIfNecessary

public void flushIfNecessary()
                      throws java.io.IOException
Throws:
java.io.IOException

close

public void close()
           throws java.io.IOException
Specified by:
close in interface Processor<T>
Overrides:
close in class StandardStep<T,T>
Throws:
java.io.IOException

flush

public void flush()
           throws java.io.IOException
Throws:
java.io.IOException


Copyright © 2009. All Rights Reserved.