org.apache.mahout.clustering.kmeans
Class RandomSeedGenerator

java.lang.Object
  extended by org.apache.mahout.clustering.kmeans.RandomSeedGenerator

public final class RandomSeedGenerator
extends Object

Given an input Path containing a SequenceFile, randomly selects k vectors and writes them to the output file, each as a Kluster representing an initial centroid to use. This implementation uses reservoir sampling, as described in http://en.wikipedia.org/wiki/Reservoir_sampling
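
The reservoir sampling technique named above can be illustrated with a minimal, standalone sketch. This is not Mahout's source: the class name ReservoirSampler and the method sample are hypothetical, and the sketch samples plain Java objects from an Iterable rather than vectors read from a SequenceFile. It shows the basic single-pass variant (often called Algorithm R), which keeps the first k items and then replaces a random slot with decreasing probability, so that every item ends up in the result with equal probability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Mahout.
final class ReservoirSampler {

    private ReservoirSampler() {
    }

    /**
     * Picks up to k items uniformly at random in a single pass.
     * If the input holds fewer than k items, all of them are returned.
     */
    static <T> List<T> sample(Iterable<T> items, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0; // number of items read so far
        for (T item : items) {
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else {
                // Replace phase: draw j uniformly from [0, seen);
                // the new item is kept with probability k / seen.
                int j = rng.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<String> points = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        // Choose 3 of the 7 items as stand-ins for initial centroids.
        System.out.println(sample(points, 3, new Random()));
    }
}
```

Because the input is read only once and at most k items are held in memory, this approach suits inputs whose size is not known in advance, which is presumably why it fits seeding from a file of vectors.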


Field Summary
static String K
           
 
Method Summary
static org.apache.hadoop.fs.Path buildRandom(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path output, int k, DistanceMeasure measure)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

K

public static final String K
See Also:
Constant Field Values
Method Detail

buildRandom

public static org.apache.hadoop.fs.Path buildRandom(org.apache.hadoop.conf.Configuration conf,
                                                    org.apache.hadoop.fs.Path input,
                                                    org.apache.hadoop.fs.Path output,
                                                    int k,
                                                    DistanceMeasure measure)
                                             throws IOException
Throws:
IOException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.