org.apache.mahout.clustering.topdown.postprocessor
Class ClusterOutputPostProcessor

java.lang.Object
  extended by org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor

public final class ClusterOutputPostProcessor
extends Object

This class reads the output of any clustering algorithm and creates a separate directory for each cluster. Each cluster directory is named after its clusterId, and every point is written to the directory of the cluster it belongs to.

This class implements a sequential algorithm and is appropriate for data that has been clustered sequentially.

Both the sequential and the non-sequential versions are invoked from ClusterOutputPostProcessorDriver.
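
The grouping described above can be illustrated with a minimal, JDK-only sketch. It is not the Mahout implementation: the class ClusterDirectorySketch, its methods, and the points.txt file name are hypothetical, and it writes plain text files on the local file system instead of working with org.apache.hadoop.fs.Path. It only shows the idea of one directory per clusterId, with each point appended to the directory of its cluster, and a map of cluster directories returned at the end.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only; this class is not part of Mahout. */
final class ClusterDirectorySketch {

  /** Hypothetical name of the file that holds the points of one cluster. */
  static final String POINTS_FILE = "points.txt";

  private ClusterDirectorySketch() {}

  /**
   * Sequentially walks the clustered points and appends each one to a file
   * under outputRoot/clusterId/.
   *
   * @param outputRoot      directory under which the cluster directories are created
   * @param clusteredPoints entries of the form {clusterId, point}
   * @return the cluster directories, keyed by cluster id
   */
  static Map<String, Path> writePointsByCluster(Path outputRoot, List<String[]> clusteredPoints) {
    Map<String, Path> clusterDirectories = new LinkedHashMap<>();
    try {
      for (String[] entry : clusteredPoints) {
        String clusterId = entry[0];
        String point = entry[1];
        // The directory name is the cluster id; it is created on first use.
        Path clusterDir = outputRoot.resolve(clusterId);
        Files.createDirectories(clusterDir);
        // Append the point as one line to the file of its cluster.
        Files.write(clusterDir.resolve(POINTS_FILE),
            Collections.singletonList(point),
            StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        clusterDirectories.put(clusterId, clusterDir);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return clusterDirectories;
  }

  /** Reads back the points written to one cluster directory, one per line. */
  static List<String> readPoints(Path clusterDir) {
    try {
      return Files.readAllLines(clusterDir.resolve(POINTS_FILE));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In this sketch, calling writePointsByCluster with points tagged "0", "1" and "0" yields two directories named 0 and 1; the returned map plays a role loosely analogous to getPostProcessedClusterDirectories().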


Constructor Summary
ClusterOutputPostProcessor(org.apache.hadoop.fs.Path clusterOutputToBeProcessed, org.apache.hadoop.fs.Path output, org.apache.hadoop.conf.Configuration hadoopConfiguration)
           
 
Method Summary
 Map<String,org.apache.hadoop.fs.Path> getPostProcessedClusterDirectories()
           
 void process()
          Takes the clustered points produced by a clustering algorithm as input and writes each point into the directory of its cluster.
 void setClusteredPoints(org.apache.hadoop.fs.Path clusteredPoints)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClusterOutputPostProcessor

public ClusterOutputPostProcessor(org.apache.hadoop.fs.Path clusterOutputToBeProcessed,
                                  org.apache.hadoop.fs.Path output,
                                  org.apache.hadoop.conf.Configuration hadoopConfiguration)
                           throws IOException
Throws:
IOException
Method Detail

process

public void process()
             throws IOException
Takes the clustered points produced by a clustering algorithm as input and writes each point into the directory of its cluster.

Throws:
IOException

getPostProcessedClusterDirectories

public Map<String,org.apache.hadoop.fs.Path> getPostProcessedClusterDirectories()
Returns:
a map containing the paths of all post-processed cluster directories.

setClusteredPoints

public void setClusteredPoints(org.apache.hadoop.fs.Path clusteredPoints)


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.