org.apache.mahout.text
Class SequenceFilesFromLuceneStorageDriver

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.mahout.common.AbstractJob
          extended by org.apache.mahout.text.SequenceFilesFromLuceneStorageDriver
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class SequenceFilesFromLuceneStorageDriver
extends AbstractJob

Driver class for the lucene2seq program. Converts the text of stored fields in a Lucene index into a Hadoop SequenceFile. Each key in the sequence file is a document ID, and each value is the concatenated text of the specified stored field(s).
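
The key/value mapping described above can be illustrated with a small sketch in plain Java. This is not the Mahout implementation and uses no Lucene or Hadoop classes. The class name Lucene2SeqSketch, the method toRecord, the single-space separator, and the choice to skip absent fields are all assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: models one document as a map of
// stored field name -> stored text, and builds the (key, value) pair
// that the class description says ends up in the sequence file.
class Lucene2SeqSketch {

    static Map.Entry<String, String> toRecord(Map<String, String> storedFields,
                                              String idField,
                                              List<String> fields) {
        // The key is the value of the ID field.
        String id = storedFields.get(idField);
        if (id == null) {
            throw new IllegalArgumentException("Missing id field: " + idField);
        }
        // The value is the text of the requested fields, concatenated in order.
        StringBuilder text = new StringBuilder();
        for (String field : fields) {
            String value = storedFields.get(field);
            if (value == null) {
                continue; // assumption: fields absent from the document are skipped
            }
            if (text.length() > 0) {
                text.append(' '); // assumption: a single space separates fields
            }
            text.append(value);
        }
        return new AbstractMap.SimpleEntry<>(id, text.toString());
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "doc-1");
        doc.put("title", "Hello");
        doc.put("body", "world");

        Map.Entry<String, String> record =
                toRecord(doc, "id", Arrays.asList("title", "body"));
        System.out.println(record.getKey() + " -> " + record.getValue());
    }
}
```

The idField and fields parameters here mirror the parameters of the same names on newLucene2SeqConfiguration. How the real driver separates fields or handles missing ones is not stated on this page.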


Field Summary
 
Fields inherited from class org.apache.mahout.common.AbstractJob
argMap, inputFile, inputPath, outputFile, outputPath, tempPath
 
Constructor Summary
SequenceFilesFromLuceneStorageDriver()
           
 
Method Summary
static void main(String[] args)
           
 LuceneStorageConfiguration newLucene2SeqConfiguration(org.apache.hadoop.conf.Configuration configuration, List<org.apache.hadoop.fs.Path> indexPaths, org.apache.hadoop.fs.Path sequenceFilesOutputPath, String idField, List<String> fields)
           
 int run(String[] args)
           
 
Methods inherited from class org.apache.mahout.common.AbstractJob
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, buildOption, getAnalyzerClassFromOption, getCLIOption, getConf, getDimensions, getFloat, getFloat, getGroup, getInputFile, getInputPath, getInt, getInt, getOption, getOption, getOption, getOptions, getOutputFile, getOutputPath, getOutputPath, getTempPath, getTempPath, hasOption, keyFor, maybePut, parseArguments, parseArguments, parseDirectories, prepareJob, prepareJob, prepareJob, prepareJob, setConf, setS3SafeCombinedInputPath, shouldRunNextPhase
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SequenceFilesFromLuceneStorageDriver

public SequenceFilesFromLuceneStorageDriver()

Method Detail

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception

run

public int run(String[] args)
        throws Exception
Throws:
Exception

newLucene2SeqConfiguration

public LuceneStorageConfiguration newLucene2SeqConfiguration(org.apache.hadoop.conf.Configuration configuration,
                                                             List<org.apache.hadoop.fs.Path> indexPaths,
                                                             org.apache.hadoop.fs.Path sequenceFilesOutputPath,
                                                             String idField,
                                                             List<String> fields)


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.