org.apache.mahout.cf.taste.impl.model.cassandra
Class CassandraDataModel

java.lang.Object
  extended by org.apache.mahout.cf.taste.impl.model.cassandra.CassandraDataModel
All Implemented Interfaces:
Closeable, Serializable, Refreshable, DataModel

public final class CassandraDataModel
extends Object
implements DataModel, Closeable

A DataModel based on a Cassandra keyspace. By default it uses the keyspace "recommender", but this can be configured. Create the keyspace before using this class; this can be done on the Cassandra command line with a command like create keyspace recommender;.

Within the keyspace, this model uses four column families:

First, it uses a column family called "users". This is keyed by the user ID as an 8-byte long. It contains a column for every preference the user expresses. The column name is the item ID, again as an 8-byte long, and the value is the preference, represented as an IEEE 32-bit floating point value.
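The byte layout described above can be sketched with java.nio.ByteBuffer. This is only an illustration of an 8-byte long key and a 4-byte IEEE float value; the class and method names are hypothetical, and big-endian byte order (ByteBuffer's default) is an assumption, since this page does not state the byte order used.

```java
import java.nio.ByteBuffer;

public class KeyEncodingSketch {

    // Encodes a user or item ID as 8 bytes.
    // Assumption: big-endian order, which is ByteBuffer's default.
    static byte[] encodeId(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Decodes an 8-byte key or column name back into an ID.
    static long decodeId(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Encodes a preference value as a 4-byte IEEE 754 float.
    static byte[] encodePreference(float value) {
        return ByteBuffer.allocate(Float.BYTES).putFloat(value).array();
    }

    // Decodes a 4-byte column value back into a preference.
    static float decodePreference(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getFloat();
    }

    public static void main(String[] args) {
        byte[] key = encodeId(42L);
        byte[] value = encodePreference(4.5f);
        System.out.println(key.length + " " + value.length);               // 8 4
        System.out.println(decodeId(key) + " " + decodePreference(value)); // 42 4.5
    }
}
```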

It uses an analogous column family called "items" for the same data, but keyed by item ID rather than user ID. In this column family, column names are user IDs instead.

It also uses a column family called "userIDs", with an identical schema. It has one row under key 0. It contains a column for every user ID in the model. The columns have no values.

Finally it also uses an analogous column family "itemIDs" containing item IDs.
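To make the four-column-family layout concrete, the following is a minimal in-memory sketch of it using plain Java collections. It is not how this class talks to Cassandra; ColumnFamilyLayoutSketch and its fields are hypothetical names, and the assumption that a single preference write touches all four column families is made for illustration only.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ColumnFamilyLayoutSketch {

    // "users": row key = user ID, column name = item ID, value = preference.
    final Map<Long, Map<Long, Float>> users = new TreeMap<>();
    // "items": row key = item ID, column name = user ID, value = preference.
    final Map<Long, Map<Long, Float>> items = new TreeMap<>();
    // "userIDs" and "itemIDs": the single row under key 0, modeled here as
    // the set of its column names, since the columns carry no values.
    final Set<Long> userIDs = new TreeSet<>();
    final Set<Long> itemIDs = new TreeSet<>();

    // Assumption: one preference is mirrored into all four structures.
    void setPreference(long userID, long itemID, float value) {
        users.computeIfAbsent(userID, k -> new TreeMap<>()).put(itemID, value);
        items.computeIfAbsent(itemID, k -> new TreeMap<>()).put(userID, value);
        userIDs.add(userID);
        itemIDs.add(itemID);
    }

    public static void main(String[] args) {
        ColumnFamilyLayoutSketch model = new ColumnFamilyLayoutSketch();
        model.setPreference(1L, 100L, 4.5f);
        model.setPreference(1L, 101L, 2.0f);
        model.setPreference(2L, 100L, 3.0f);
        System.out.println(model.users.get(1L));   // {100=4.5, 101=2.0}
        System.out.println(model.items.get(100L)); // {1=4.5, 2=3.0}
        System.out.println(model.userIDs);         // [1, 2]
        System.out.println(model.itemIDs);         // [100, 101]
    }
}
```

Storing the same preference under both "users" and "items" trades extra writes for cheap lookups in either direction, which matches the per-user and per-item accessors listed in the method summary.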

Each of these four column families needs to be created ahead of time. Again the Cassandra CLI can be used to do so, with commands like create column family users;.

Note that this class uses a long-lived Cassandra client which will run until terminated. You must close() this implementation when done or the JVM will not terminate.
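Because this class implements Closeable, a try-with-resources block is one way to guarantee close() is called. The sketch below uses a hypothetical stand-in, LongLivedModel, rather than CassandraDataModel itself, so that it runs without a Cassandra server; the non-daemon worker thread is an assumption chosen to mimic a client that keeps the JVM alive.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CloseSketch {

    // Hypothetical stand-in for a model that owns a long-lived client.
    static final class LongLivedModel implements Closeable {
        // The default thread factory creates non-daemon threads, so an
        // active pool keeps the JVM alive until it is shut down.
        private final ExecutorService client = Executors.newSingleThreadExecutor();

        LongLivedModel() {
            client.submit(() -> { }); // starts the worker thread
        }

        boolean isClosed() {
            return client.isShutdown();
        }

        @Override
        public void close() {
            client.shutdown(); // lets the worker thread exit
        }
    }

    // Uses the model and closes it even if the body throws.
    static LongLivedModel useAndClose() {
        LongLivedModel model = new LongLivedModel();
        try (LongLivedModel m = model) {
            // ... query the model here ...
        }
        return model;
    }

    public static void main(String[] args) {
        System.out.println(useAndClose().isClosed()); // true
    }
}
```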

This implementation still relies heavily on reading data into memory and caching, as it remains too data-intensive to be effective even against Cassandra. It will take some time to "warm up" as the first few requests will block loading user and item data into caches. This is still going to send a great deal of query traffic to Cassandra. It would be advisable to employ caching wrapper classes in your implementation, like CachingRecommender or CachingItemSimilarity.

See Also:
Serialized Form

Constructor Summary
CassandraDataModel()
          Uses the standard Cassandra host and port (localhost:9160), and keyspace name ("recommender").
CassandraDataModel(String host, int port, String keyspaceName)
           
 
Method Summary
 void close()
           
 LongPrimitiveIterator getItemIDs()
           
 FastIDSet getItemIDsFromUser(long userID)
           
 float getMaxPreference()
           
 float getMinPreference()
           
 int getNumItems()
           
 int getNumUsers()
           
 int getNumUsersWithPreferenceFor(long itemID)
           
 int getNumUsersWithPreferenceFor(long itemID1, long itemID2)
           
 PreferenceArray getPreferencesForItem(long itemID)
           
 PreferenceArray getPreferencesFromUser(long userID)
           
 Long getPreferenceTime(long userID, long itemID)
           
 Float getPreferenceValue(long userID, long itemID)
           
 LongPrimitiveIterator getUserIDs()
           
 boolean hasPreferenceValues()
           
 void refresh(Collection<Refreshable> alreadyRefreshed)
           
 void removePreference(long userID, long itemID)
           
 void setPreference(long userID, long itemID, float value)
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CassandraDataModel

public CassandraDataModel()
Uses the standard Cassandra host and port (localhost:9160), and keyspace name ("recommender").


CassandraDataModel

public CassandraDataModel(String host,
                          int port,
                          String keyspaceName)
Parameters:
host - Cassandra server host name
port - Cassandra server port
keyspaceName - name of Cassandra keyspace to use

Method Detail

getUserIDs

public LongPrimitiveIterator getUserIDs()
Specified by:
getUserIDs in interface DataModel

getPreferencesFromUser

public PreferenceArray getPreferencesFromUser(long userID)
                                       throws TasteException
Specified by:
getPreferencesFromUser in interface DataModel
Throws:
TasteException

getItemIDsFromUser

public FastIDSet getItemIDsFromUser(long userID)
                             throws TasteException
Specified by:
getItemIDsFromUser in interface DataModel
Throws:
TasteException

getItemIDs

public LongPrimitiveIterator getItemIDs()
Specified by:
getItemIDs in interface DataModel

getPreferencesForItem

public PreferenceArray getPreferencesForItem(long itemID)
                                      throws TasteException
Specified by:
getPreferencesForItem in interface DataModel
Throws:
TasteException

getPreferenceValue

public Float getPreferenceValue(long userID,
                                long itemID)
Specified by:
getPreferenceValue in interface DataModel

getPreferenceTime

public Long getPreferenceTime(long userID,
                              long itemID)
Specified by:
getPreferenceTime in interface DataModel

getNumItems

public int getNumItems()
Specified by:
getNumItems in interface DataModel

getNumUsers

public int getNumUsers()
Specified by:
getNumUsers in interface DataModel

getNumUsersWithPreferenceFor

public int getNumUsersWithPreferenceFor(long itemID)
                                 throws TasteException
Specified by:
getNumUsersWithPreferenceFor in interface DataModel
Throws:
TasteException

getNumUsersWithPreferenceFor

public int getNumUsersWithPreferenceFor(long itemID1,
                                        long itemID2)
                                 throws TasteException
Specified by:
getNumUsersWithPreferenceFor in interface DataModel
Throws:
TasteException

setPreference

public void setPreference(long userID,
                          long itemID,
                          float value)
Specified by:
setPreference in interface DataModel

removePreference

public void removePreference(long userID,
                             long itemID)
Specified by:
removePreference in interface DataModel

hasPreferenceValues

public boolean hasPreferenceValues()
Specified by:
hasPreferenceValues in interface DataModel
Returns:
true

getMaxPreference

public float getMaxPreference()
Specified by:
getMaxPreference in interface DataModel
Returns:
Float.NaN

getMinPreference

public float getMinPreference()
Specified by:
getMinPreference in interface DataModel
Returns:
Float.NaN
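
Since getMinPreference() and getMaxPreference() return NaN here, callers that clamp estimates to the preference range need to test for NaN explicitly: a comparison with == never matches NaN, and Math.min/Math.max propagate it. A minimal sketch follows; PreferenceBoundsSketch and its methods are hypothetical helpers, not part of this API.

```java
public class PreferenceBoundsSketch {

    // True only when both bounds are real numbers.
    static boolean hasKnownBounds(float min, float max) {
        return !Float.isNaN(min) && !Float.isNaN(max);
    }

    // Clamps an estimate to [min, max], or returns it unchanged when the
    // model reports no bounds (NaN).
    static float clamp(float estimate, float min, float max) {
        if (!hasKnownBounds(min, max)) {
            return estimate;
        }
        return Math.max(min, Math.min(max, estimate));
    }

    public static void main(String[] args) {
        float nan = Float.NaN;
        System.out.println(nan == Float.NaN);           // false
        System.out.println(Float.isNaN(nan));           // true
        System.out.println(clamp(7.0f, nan, nan));      // 7.0
        System.out.println(clamp(7.0f, 1.0f, 5.0f));    // 5.0
    }
}
```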

refresh

public void refresh(Collection<Refreshable> alreadyRefreshed)
Specified by:
refresh in interface Refreshable

toString

public String toString()
Overrides:
toString in class Object

close

public void close()
Specified by:
close in interface Closeable


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.