org.apache.mahout.vectorizer.encoders
Class CachingValueEncoder

java.lang.Object
  extended by org.apache.mahout.vectorizer.encoders.FeatureVectorEncoder
      extended by org.apache.mahout.vectorizer.encoders.CachingValueEncoder
Direct Known Subclasses:
ConstantValueEncoder, ContinuousValueEncoder

public abstract class CachingValueEncoder
extends FeatureVectorEncoder

Provides basic hashing semantics for encoders where the probe locations depend only on the name of the variable.


Field Summary
 
Fields inherited from class org.apache.mahout.vectorizer.encoders.FeatureVectorEncoder
CONTINUOUS_VALUE_HASH_SEED, WORD_LIKE_VALUE_HASH_SEED
 
Constructor Summary
protected CachingValueEncoder(String name, int seed)
           
 
Method Summary
protected abstract  int getSeed()
           
protected  int hashForProbe(byte[] originalForm, int dataSize, String name, int probe)
          Provides the unique hash for a particular probe.
 void setProbes(int probes)
          Sets the number of locations in the feature vector that a value should be in.
 
Methods inherited from class org.apache.mahout.vectorizer.encoders.FeatureVectorEncoder
addToVector, addToVector, addToVector, addToVector, asString, bytesForString, getName, getProbes, getWeight, hash, hash, hash, hash, hash, hashesForProbe, isTraceEnabled, setTraceDictionary, trace, trace
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CachingValueEncoder

protected CachingValueEncoder(String name,
                              int seed)

Method Detail

setProbes

public void setProbes(int probes)
Sets the number of locations in the feature vector that a value should be in. This causes the cached probe locations to be recomputed.

Overrides:
setProbes in class FeatureVectorEncoder
Parameters:
probes - Number of locations to increment.
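
The recompute-on-change behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the Mahout implementation: the class name CachedProbeSketch, the helper rawHash, the FNV-1a style mixing, and the choice of hashing the variable name with seed + i for probe i are all assumptions made for illustration. The only point carried over from this page is that probe locations depend only on the name, so they can be computed once and rebuilt whenever the probe count changes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in (not the Mahout class): caches one raw hash per probe. */
final class CachedProbeSketch {
    private final String name;
    private final int seed;
    private int[] cachedProbes;

    CachedProbeSketch(String name, int seed, int probes) {
        this.name = name;
        this.seed = seed;
        setProbes(probes);
    }

    /** Changing the probe count invalidates the cache, so it is rebuilt here. */
    void setProbes(int probes) {
        if (probes < 1) {
            throw new IllegalArgumentException("probes must be at least 1");
        }
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        int[] fresh = new int[probes];
        for (int i = 0; i < probes; i++) {
            // Assumption: each probe uses a different seed derived from the base seed.
            fresh[i] = rawHash(nameBytes, seed + i);
        }
        cachedProbes = fresh;
    }

    /** Returns a defensive copy of the cached raw hashes. */
    int[] cachedProbes() {
        return cachedProbes.clone();
    }

    /** Placeholder hash with FNV-1a style mixing; an assumption, not Mahout's hash. */
    static int rawHash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }
}
```

Because the cache depends only on the name, the seed, and the probe count, two instances built with the same arguments hold identical cached values, and encoding a new value never needs to rehash the name.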

getSeed

protected abstract int getSeed()

hashForProbe

protected int hashForProbe(byte[] originalForm,
                           int dataSize,
                           String name,
                           int probe)
Description copied from class: FeatureVectorEncoder
Provides the unique hash for a particular probe. For all encoders except text, this is all that is needed, and the default implementation of hashesForProbe will do the right thing. For text and similar values, hashesForProbe should be overridden and this method should not be used.

Specified by:
hashForProbe in class FeatureVectorEncoder
Parameters:
originalForm - The original byte array value
dataSize - The length of the vector being encoded
name - The name of the variable being encoded
probe - The probe number
Returns:
The hash of the current probe
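
Since dataSize is the length of the vector being encoded, one plausible reading is that the returned hash is usable as an index into that vector. The sketch below shows how a raw, possibly negative, 32-bit hash could be reduced to an index in the range 0 to dataSize - 1. The class ProbeIndexSketch and the method toIndex are hypothetical names, and the assumption that the reduction happens at lookup time rather than when the hash is cached is not confirmed by this page.

```java
/** Hypothetical helper (not part of Mahout): reduces a raw hash to a vector index. */
final class ProbeIndexSketch {
    private ProbeIndexSketch() {
    }

    /** Maps a raw, possibly negative, hash to an index in [0, dataSize). */
    static int toIndex(int rawHash, int dataSize) {
        if (dataSize <= 0) {
            throw new IllegalArgumentException("dataSize must be positive");
        }
        // Java's % keeps the sign of the dividend, so negative results are shifted up.
        int h = rawHash % dataSize;
        if (h < 0) {
            h += dataSize;
        }
        return h;
    }
}
```

The result matches Math.floorMod(rawHash, dataSize) for positive dataSize. Deferring the reduction in this way would let the same cached hash be reused with vectors of different lengths.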


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.