org.apache.mahout.text
Class LuceneStorageConfiguration

java.lang.Object
  extended by org.apache.mahout.text.LuceneStorageConfiguration
All Implemented Interfaces:
org.apache.hadoop.io.Writable

public class LuceneStorageConfiguration
extends Object
implements org.apache.hadoop.io.Writable

Holds the configuration for SequenceFilesFromLuceneStorage, which generates a sequence file with the document id as the key and the content of the selected field(s) as the value.


Constructor Summary
LuceneStorageConfiguration()
           
LuceneStorageConfiguration(org.apache.hadoop.conf.Configuration conf)
          Deserializes a LuceneStorageConfiguration from a Configuration.
LuceneStorageConfiguration(org.apache.hadoop.conf.Configuration configuration, List<org.apache.hadoop.fs.Path> indexPaths, org.apache.hadoop.fs.Path sequenceFilesOutputPath, String idField, List<String> fields)
          Creates a configuration bean with all mandatory parameters.
 
Method Summary
 boolean equals(Object o)
           
 org.apache.hadoop.conf.Configuration getConfiguration()
           
 List<String> getFields()
           
 String getIdField()
           
 List<org.apache.hadoop.fs.Path> getIndexPaths()
           
 int getMaxHits()
           
 org.apache.lucene.search.Query getQuery()
           
 Iterator<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getSequenceFileIterator()
          Returns an Iterator over the (Text, Text) Pairs stored in the produced sequence files.
 org.apache.hadoop.fs.Path getSequenceFilesOutputPath()
           
 org.apache.lucene.document.DocumentStoredFieldVisitor getStoredFieldVisitor()
           
 int hashCode()
           
 void readFields(DataInput in)
           
 org.apache.hadoop.conf.Configuration serialize()
          Serializes this object into a Hadoop Configuration.
 void setMaxHits(int maxHits)
           
 void setQuery(org.apache.lucene.search.Query query)
           
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LuceneStorageConfiguration

public LuceneStorageConfiguration(org.apache.hadoop.conf.Configuration configuration,
                                  List<org.apache.hadoop.fs.Path> indexPaths,
                                  org.apache.hadoop.fs.Path sequenceFilesOutputPath,
                                  String idField,
                                  List<String> fields)
Creates a configuration bean with all mandatory parameters.

Parameters:
configuration - Hadoop configuration used for writing the sequence files
indexPaths - paths to the Lucene index(es) to read from
sequenceFilesOutputPath - path where the sequence files are written
idField - field used for the key of the sequence file
fields - field(s) used for the value of the sequence file

LuceneStorageConfiguration

public LuceneStorageConfiguration()

LuceneStorageConfiguration

public LuceneStorageConfiguration(org.apache.hadoop.conf.Configuration conf)
                           throws IOException
Deserializes a LuceneStorageConfiguration from a Configuration.

Parameters:
conf - the Configuration object with a serialized LuceneStorageConfiguration
Throws:
IOException - if deserialization fails

Method Detail

serialize

public org.apache.hadoop.conf.Configuration serialize()
                                               throws IOException
Serializes this object into a Hadoop Configuration.

Returns:
a Configuration object containing a String serialization of this object
Throws:
IOException - if serialization fails
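
The pairing of serialize() with the Configuration-based constructor can be pictured as a round trip through a single string-valued property. The following is only a sketch of that idea using the Java standard library: java.util.Properties stands in for org.apache.hadoop.conf.Configuration, and the property key, the encoding, and the class and method names are all hypothetical. This page does not state how Mahout encodes the object or which key it uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Properties;

/** Illustrative stand-in only; this is not Mahout code. */
public class StringSerializationSketch {

  // Hypothetical property key; the key Mahout uses is not documented on this page.
  static final String KEY = "example.lucene.storage.configuration";

  /** Stores an encoded form of the settings under one string property. */
  static Properties serialize(String idField, int maxHits) {
    String raw = idField + "\n" + maxHits;
    String encoded = Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    Properties conf = new Properties();
    conf.setProperty(KEY, encoded);
    return conf;
  }

  /** Restores the settings; returns {idField, maxHits as text}. */
  static String[] deserialize(Properties conf) {
    String encoded = conf.getProperty(KEY);
    if (encoded == null) {
      throw new IllegalArgumentException("no serialized settings under " + KEY);
    }
    String raw = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    return raw.split("\n", 2);
  }

  public static void main(String[] args) {
    String[] restored = deserialize(serialize("id", 100));
    System.out.println(restored[0] + " / " + restored[1]);
  }
}
```

Keeping the whole object under one property means the receiving side only needs the Configuration to rebuild it, which is why a missing property is treated as an error in this sketch.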

getSequenceFileIterator

public Iterator<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getSequenceFileIterator()
Returns an Iterator over the (Text, Text) Pairs stored in the produced sequence files.

Returns:
an iterator over the (Text, Text) pairs of the produced sequence files

getConfiguration

public org.apache.hadoop.conf.Configuration getConfiguration()

getSequenceFilesOutputPath

public org.apache.hadoop.fs.Path getSequenceFilesOutputPath()

getIndexPaths

public List<org.apache.hadoop.fs.Path> getIndexPaths()

getIdField

public String getIdField()

getFields

public List<String> getFields()

setQuery

public void setQuery(org.apache.lucene.search.Query query)

getQuery

public org.apache.lucene.search.Query getQuery()

setMaxHits

public void setMaxHits(int maxHits)

getMaxHits

public int getMaxHits()

getStoredFieldVisitor

public org.apache.lucene.document.DocumentStoredFieldVisitor getStoredFieldVisitor()

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException
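
The Writable contract behind write and readFields requires that readFields consumes the fields in exactly the order write emitted them. The following is a minimal sketch of that symmetry using only java.io. The StorageSettings class, its fields, and the roundTrip helper are hypothetical stand-ins chosen for illustration; they are not the fields or wire format that LuceneStorageConfiguration uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the write/readFields contract; this is not Mahout code. */
public class WritableRoundTripSketch {

  /** Hypothetical bean with an id field, value fields and a hit limit. */
  static final class StorageSettings {
    String idField = "";
    List<String> fields = new ArrayList<>();
    int maxHits;

    // Writes every field in a fixed order, prefixing the list with its size.
    void write(DataOutput out) throws IOException {
      out.writeUTF(idField);
      out.writeInt(fields.size());
      for (String field : fields) {
        out.writeUTF(field);
      }
      out.writeInt(maxHits);
    }

    // Reads the fields back in the same order they were written.
    void readFields(DataInput in) throws IOException {
      idField = in.readUTF();
      int count = in.readInt();
      fields = new ArrayList<>(count);
      for (int i = 0; i < count; i++) {
        fields.add(in.readUTF());
      }
      maxHits = in.readInt();
    }
  }

  /** Serializes the settings to bytes and rebuilds a fresh copy from them. */
  static StorageSettings roundTrip(StorageSettings original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      original.write(new DataOutputStream(bytes));
      StorageSettings copy = new StorageSettings();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    StorageSettings settings = new StorageSettings();
    settings.idField = "id";
    settings.fields.add("title");
    settings.fields.add("body");
    settings.maxHits = 50;
    StorageSettings copy = roundTrip(settings);
    System.out.println(copy.idField + " " + copy.fields + " " + copy.maxHits);
  }
}
```

Writing the list size before its elements lets readFields know how many entries to consume, so the stream needs no delimiters.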

equals

public boolean equals(Object o)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.