org.apache.mahout.classifier.df.mapreduce.partial
Class Step1Mapper

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
      extended by org.apache.mahout.classifier.df.mapreduce.MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
          extended by org.apache.mahout.classifier.df.mapreduce.partial.Step1Mapper

public class Step1Mapper
extends MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>

First step of the Partial Data Builder. Builds the trees using the data available in the InputSplit, then predicts the out-of-bag (oob) classes for each tree in its growing partition (input split).


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
 
Constructor Summary
Step1Mapper()
           
 
Method Summary
protected  void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
           
protected  void configure(Long seed, int partition, int numMapTasks, int numTrees)
          Configures the mapper from explicit values; useful when testing.
 int getFirstTreeId()
           
protected  void map(org.apache.hadoop.io.LongWritable key, org.apache.hadoop.io.Text value, org.apache.hadoop.mapreduce.Mapper.Context context)
           
static int nbTrees(int numMaps, int numTrees, int partition)
          Compute the number of trees for a given partition.
protected  void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
           
 
Methods inherited from class org.apache.mahout.classifier.df.mapreduce.MapredMapper
configure, getDataset, getTreeBuilder, isOutput
 
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Step1Mapper

public Step1Mapper()

Method Detail

getFirstTreeId

public int getFirstTreeId()

setup

protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
              throws IOException,
                     InterruptedException
Overrides:
setup in class MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
IOException
InterruptedException

configure

protected void configure(Long seed,
                         int partition,
                         int numMapTasks,
                         int numTrees)
Configures the mapper from explicit values; useful when testing.

Parameters:
seed - seed for the random number generator
partition - partition (input split) index of the current mapper
numMapTasks - number of running map tasks
numTrees - total number of trees in the forest

nbTrees

public static int nbTrees(int numMaps,
                          int numTrees,
                          int partition)
Compute the number of trees for a given partition. The first partition (0) may be assigned more trees than the other partitions because it also receives the remainder.

Parameters:
numMaps - total number of maps (partitions)
numTrees - total number of trees to build
partition - partition to compute the number of trees for
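
The distribution described above can be illustrated with a minimal sketch. It assumes that each partition gets an equal share by integer division and that partition 0 additionally takes the whole remainder. The class name NbTreesSketch and its method body are hypothetical and written for illustration only; they are not taken from the Mahout source.

```java
// Hypothetical illustration of the documented behaviour, not the Mahout implementation.
final class NbTreesSketch {

  private NbTreesSketch() {}

  /**
   * Assumed rule: every partition builds numTrees / numMaps trees,
   * and partition 0 also builds the numTrees % numMaps leftover trees.
   */
  static int nbTrees(int numMaps, int numTrees, int partition) {
    int share = numTrees / numMaps;
    if (partition == 0) {
      share += numTrees % numMaps;
    }
    return share;
  }

  public static void main(String[] args) {
    int numMaps = 4;
    int numTrees = 10;
    int total = 0;
    // Print the per-partition tree count and check that the counts add up.
    for (int p = 0; p < numMaps; p++) {
      int n = nbTrees(numMaps, numTrees, p);
      total += n;
      System.out.println("partition " + p + " builds " + n + " trees");
    }
    System.out.println("total = " + total);
  }
}
```

Under this assumption, 10 trees over 4 maps gives 4 trees to partition 0 and 2 trees to each of the other partitions, so the per-partition counts always sum to numTrees.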

map

protected void map(org.apache.hadoop.io.LongWritable key,
                   org.apache.hadoop.io.Text value,
                   org.apache.hadoop.mapreduce.Mapper.Context context)
            throws IOException,
                   InterruptedException
Overrides:
map in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
IOException
InterruptedException

cleanup

protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
                throws IOException,
                       InterruptedException
Overrides:
cleanup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
IOException
InterruptedException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.