org.apache.mahout.clustering.topdown.postprocessor
Class ClusterOutputPostProcessor
java.lang.Object
org.apache.mahout.clustering.topdown.postprocessor.ClusterOutputPostProcessor
public final class ClusterOutputPostProcessor
- extends Object
This class reads the output of any clustering algorithm and creates a separate directory for each
cluster. Each cluster directory is named after its clusterId, and every point is written to the
directory of the cluster it belongs to.
This class implements a sequential algorithm and is appropriate for data that has been clustered
sequentially.
Both the sequential and the non-sequential versions are invoked from ClusterOutputPostProcessorDriver.
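The directory-per-cluster idea described above can be illustrated with a minimal, standard-library-only sketch. It is not Mahout's implementation: it uses java.nio on the local file system instead of Hadoop paths and sequence files, and the names ClusterDirectorySketch, ClusteredPoint, postProcess, readPoints and the file name "part-0" are all hypothetical, chosen only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: groups clustered points into one directory per cluster ID.
public class ClusterDirectorySketch {

    /** Hypothetical stand-in for one clustered point: a cluster ID plus a text payload. */
    public static final class ClusteredPoint {
        public final String clusterId;
        public final String point;

        public ClusteredPoint(String clusterId, String point) {
            this.clusterId = clusterId;
            this.point = point;
        }
    }

    /**
     * Sequentially writes every point into a directory named after its cluster ID
     * and returns a map from cluster ID to that directory.
     */
    public static Map<String, Path> postProcess(List<ClusteredPoint> points, Path outputRoot) {
        Map<String, Path> clusterDirs = new LinkedHashMap<>();
        try {
            for (ClusteredPoint p : points) {
                Path dir = clusterDirs.get(p.clusterId);
                if (dir == null) {
                    // First point seen for this cluster: create its directory.
                    dir = Files.createDirectories(outputRoot.resolve(p.clusterId));
                    clusterDirs.put(p.clusterId, dir);
                }
                // Append the point to the cluster's single output file ("part-0" is an assumed name).
                Files.write(dir.resolve("part-0"),
                        (p.point + "\n").getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return clusterDirs;
    }

    /** Reads back the points stored in one cluster directory, in the order they were written. */
    public static List<String> readPoints(Path clusterDir) {
        try {
            return Files.readAllLines(clusterDir.resolve("part-0"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("cluster-sketch");
        List<ClusteredPoint> points = Arrays.asList(
                new ClusteredPoint("1", "[0.1, 0.2]"),
                new ClusteredPoint("2", "[5.0, 5.1]"),
                new ClusteredPoint("1", "[0.3, 0.1]"));
        // Print each cluster ID, its directory, and the points written there.
        for (Map.Entry<String, Path> e : postProcess(points, root).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue() + " " + readPoints(e.getValue()));
        }
    }
}
```

The returned map plays a role loosely analogous to the result of getPostProcessedClusterDirectories(): one entry per cluster directory that was created during processing.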
Constructor Summary
ClusterOutputPostProcessor(org.apache.hadoop.fs.Path clusterOutputToBeProcessed,
org.apache.hadoop.fs.Path output,
org.apache.hadoop.conf.Configuration hadoopConfiguration)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ClusterOutputPostProcessor
public ClusterOutputPostProcessor(org.apache.hadoop.fs.Path clusterOutputToBeProcessed,
org.apache.hadoop.fs.Path output,
org.apache.hadoop.conf.Configuration hadoopConfiguration)
throws IOException
- Throws:
IOException
process
public void process()
throws IOException
- This method takes the clustered points produced by a clustering algorithm as input and writes each
point into the directory of its respective cluster.
- Throws:
IOException
getPostProcessedClusterDirectories
public Map<String,org.apache.hadoop.fs.Path> getPostProcessedClusterDirectories()
- Returns:
- a map of all post-processed cluster directories, keyed by cluster ID, with each value being the path of that cluster's directory.
setClusteredPoints
public void setClusteredPoints(org.apache.hadoop.fs.Path clusteredPoints)
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.