org.apache.mahout.clustering.topdown
Class PathDirectory

java.lang.Object
  extended by org.apache.mahout.clustering.topdown.PathDirectory

public final class PathDirectory
extends Object

Contains the list of all internal paths used in top-down clustering.


Field Summary
static String BOTTOM_LEVEL_CLUSTER_DIRECTORY
           
static String CLUSTERED_POINTS_DIRECTORY
           
static String POST_PROCESS_DIRECTORY
           
static String TOP_LEVEL_CLUSTER_DIRECTORY
           
 
Method Summary
static org.apache.hadoop.fs.Path getBottomLevelClusterPath(org.apache.hadoop.fs.Path output, String clusterId)
          Each cluster produced by top level clustering is processed in output/"bottomLevelCluster"/clusterId.
static org.apache.hadoop.fs.Path getClusterOutputClusteredPoints(org.apache.hadoop.fs.Path output)
          The top level clustered points, before post processing, are generated here.
static org.apache.hadoop.fs.Path getClusterPathForClusterId(org.apache.hadoop.fs.Path clusterPostProcessorOutput, String clusterId)
          Each cluster's path name is its clusterId.
static org.apache.hadoop.fs.Path getClusterPostProcessorOutputDirectory(org.apache.hadoop.fs.Path outputPathProvidedByUser)
          The output of top level clusters is post processed and kept in this path.
static org.apache.hadoop.fs.Path getTopLevelClusterPath(org.apache.hadoop.fs.Path output)
          All output of top level clustering is stored in output directory/topLevelCluster.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TOP_LEVEL_CLUSTER_DIRECTORY

public static final String TOP_LEVEL_CLUSTER_DIRECTORY
See Also:
Constant Field Values

POST_PROCESS_DIRECTORY

public static final String POST_PROCESS_DIRECTORY
See Also:
Constant Field Values

CLUSTERED_POINTS_DIRECTORY

public static final String CLUSTERED_POINTS_DIRECTORY
See Also:
Constant Field Values

BOTTOM_LEVEL_CLUSTER_DIRECTORY

public static final String BOTTOM_LEVEL_CLUSTER_DIRECTORY
See Also:
Constant Field Values
Method Detail

getTopLevelClusterPath

public static org.apache.hadoop.fs.Path getTopLevelClusterPath(org.apache.hadoop.fs.Path output)
All output of top level clustering is stored in output directory/topLevelCluster.

Parameters:
output - the output path of clustering.
Returns:
the top level cluster directory.

getClusterPostProcessorOutputDirectory

public static org.apache.hadoop.fs.Path getClusterPostProcessorOutputDirectory(org.apache.hadoop.fs.Path outputPathProvidedByUser)
The output of top level clusters is post processed and kept in this path.

Parameters:
outputPathProvidedByUser - the output path of clustering.
Returns:
the path where the output of top level cluster post processor is kept.

getClusterOutputClusteredPoints

public static org.apache.hadoop.fs.Path getClusterOutputClusteredPoints(org.apache.hadoop.fs.Path output)
The top level clustered points, before post processing, are generated here.

Parameters:
output - the output path of clustering.
Returns:
the clustered points directory.

getBottomLevelClusterPath

public static org.apache.hadoop.fs.Path getBottomLevelClusterPath(org.apache.hadoop.fs.Path output,
                                                                  String clusterId)
Each cluster produced by top level clustering is processed in output/"bottomLevelCluster"/clusterId.

Parameters:
output - the output path of clustering.
clusterId - the id of the cluster.
Returns:
the bottom level clustering path.
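
The layout described for the top level and bottom level directories can be sketched as below. This is an illustration only, not the Mahout implementation: it uses plain strings with '/' separators as a stand-in for org.apache.hadoop.fs.Path, and the class name PathLayoutSketch and the helper child are hypothetical. The directory names "topLevelCluster" and "bottomLevelCluster" are taken from the method descriptions on this page.

```java
// Illustrative stand-in: plain strings replace org.apache.hadoop.fs.Path.
public class PathLayoutSketch {

    // Directory names as quoted in the method descriptions on this page.
    static final String TOP_LEVEL = "topLevelCluster";
    static final String BOTTOM_LEVEL = "bottomLevelCluster";

    // Hypothetical helper: appends a child name to a parent with a single '/'.
    public static String child(String parent, String name) {
        return parent.endsWith("/") ? parent + name : parent + "/" + name;
    }

    // Mirrors the documented layout: output/topLevelCluster
    public static String topLevelClusterPath(String output) {
        return child(output, TOP_LEVEL);
    }

    // Mirrors the documented layout: output/bottomLevelCluster/clusterId
    public static String bottomLevelClusterPath(String output, String clusterId) {
        return child(child(output, BOTTOM_LEVEL), clusterId);
    }

    public static void main(String[] args) {
        System.out.println(topLevelClusterPath("/tmp/out"));          // /tmp/out/topLevelCluster
        System.out.println(bottomLevelClusterPath("/tmp/out", "42")); // /tmp/out/bottomLevelCluster/42
    }
}
```

With Hadoop on the classpath, the same composition would presumably be written with the Path(Path parent, String child) constructor rather than string concatenation.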

getClusterPathForClusterId

public static org.apache.hadoop.fs.Path getClusterPathForClusterId(org.apache.hadoop.fs.Path clusterPostProcessorOutput,
                                                                   String clusterId)
Each cluster's path name is its clusterId. The vectors reside in separate files inside it.

Parameters:
clusterPostProcessorOutput - the path of cluster post processor output.
clusterId - the id of the cluster.
Returns:
the cluster path for cluster id.
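
Since each cluster's directory is named after its clusterId, a caller that knows the cluster ids can derive every per-cluster directory from the post processor output path. The sketch below shows that idea under stated assumptions: strings stand in for org.apache.hadoop.fs.Path, and the class ClusterDirSketch, the method clusterDirs, and the sample ids are hypothetical, not part of Mahout.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in: plain strings replace org.apache.hadoop.fs.Path.
public class ClusterDirSketch {

    // Mirrors the documented rule: the cluster's path name is its clusterId.
    public static String clusterPathForClusterId(String clusterPostProcessorOutput, String clusterId) {
        return clusterPostProcessorOutput + "/" + clusterId;
    }

    // Hypothetical convenience: maps each cluster id to its directory, in input order.
    public static Map<String, String> clusterDirs(String clusterPostProcessorOutput, List<String> clusterIds) {
        Map<String, String> dirs = new LinkedHashMap<>();
        for (String id : clusterIds) {
            dirs.put(id, clusterPathForClusterId(clusterPostProcessorOutput, id));
        }
        return dirs;
    }

    public static void main(String[] args) {
        // Sample post processor output path and ids, chosen for illustration only.
        Map<String, String> dirs = clusterDirs("/tmp/out/postProcessed", Arrays.asList("0", "7"));
        for (Map.Entry<String, String> e : dirs.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The vectors for a cluster would then be read from the files inside the directory returned for its id.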


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.