org.apache.mahout.vectorizer.document
Class SequenceFileTokenizerMapper
java.lang.Object
org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,StringTuple>
org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper
public class SequenceFileTokenizerMapper
- extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,StringTuple>
Tokenizes a text document and outputs tokens in a StringTuple
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper |
org.apache.hadoop.mapreduce.Mapper.Context |
Method Summary |
protected void |
map(org.apache.hadoop.io.Text key,
org.apache.hadoop.io.Text value,
org.apache.hadoop.mapreduce.Mapper.Context context)
|
protected void |
setup(org.apache.hadoop.mapreduce.Mapper.Context context)
|
Methods inherited from class org.apache.hadoop.mapreduce.Mapper |
cleanup, run |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
SequenceFileTokenizerMapper
public SequenceFileTokenizerMapper()
map
protected void map(org.apache.hadoop.io.Text key,
org.apache.hadoop.io.Text value,
org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException,
InterruptedException
- Overrides:
map
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,StringTuple>
- Throws:
IOException
InterruptedException
setup
protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException,
InterruptedException
- Overrides:
setup
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,StringTuple>
- Throws:
IOException
InterruptedException
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.