org.apache.mahout.cf.taste.impl.similarity
Class GenericItemSimilarity

java.lang.Object
  extended by org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity
All Implemented Interfaces:
Refreshable, ItemSimilarity

public final class GenericItemSimilarity
extends Object
implements ItemSimilarity

A "generic" ItemSimilarity that takes a static list of precomputed item similarities and bases its responses on that list alone. The values may have been precomputed offline by another process, stored in a file, and then read and fed into an instance of this class.

This is perhaps the best ItemSimilarity to use with GenericItemBasedRecommender, for now. The point of item-based recommenders is that item similarity is relatively static, so it can be precomputed and then reused during recommendation for a significant performance advantage.


Nested Class Summary
static class GenericItemSimilarity.ItemItemSimilarity
          Encapsulates a similarity between two items.
 
Constructor Summary
GenericItemSimilarity(ItemSimilarity otherSimilarity, DataModel dataModel)
           Builds a list of item-item similarities given an ItemSimilarity implementation and a DataModel, rather than a list of GenericItemSimilarity.ItemItemSimilarity objects.
GenericItemSimilarity(ItemSimilarity otherSimilarity, DataModel dataModel, int maxToKeep)
           Like GenericItemSimilarity(ItemSimilarity, DataModel), but will only keep the specified number of similarities from the given DataModel.
GenericItemSimilarity(Iterable<GenericItemSimilarity.ItemItemSimilarity> similarities)
           Creates a GenericItemSimilarity from a precomputed list of GenericItemSimilarity.ItemItemSimilarity objects.
GenericItemSimilarity(Iterable<GenericItemSimilarity.ItemItemSimilarity> similarities, int maxToKeep)
           Like GenericItemSimilarity(Iterable), but will only keep the specified number of similarities from the given Iterable of similarities.
 
Method Summary
 long[] allSimilarItemIDs(long itemID)
           
 double[] itemSimilarities(long itemID1, long[] itemID2s)
          A bulk-get version of ItemSimilarity.itemSimilarity(long, long).
 double itemSimilarity(long itemID1, long itemID2)
           Returns the similarity between two items.
 void refresh(Collection<Refreshable> alreadyRefreshed)
           Triggers "refresh" -- whatever that means -- of the implementation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GenericItemSimilarity

public GenericItemSimilarity(Iterable<GenericItemSimilarity.ItemItemSimilarity> similarities)

Creates a GenericItemSimilarity from a precomputed list of GenericItemSimilarity.ItemItemSimilarity objects. Each represents the similarity between two distinct items. Since similarity is assumed to be symmetric, it is not necessary to specify both the similarity between item1 and item2 and the similarity between item2 and item1; they are the same. It is also not necessary to specify a similarity between any item and itself; these are assumed to be 1.0.

Note that specifying a similarity between two items twice is not an error; the later value wins.

Parameters:
similarities - set of GenericItemSimilarity.ItemItemSimilarity objects on which to base this instance
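
The semantics described for this constructor (symmetric lookup, an implicit self-similarity of 1.0, and later duplicates overriding earlier ones) can be illustrated with a small standalone sketch. It uses only the JDK and is not Mahout's implementation; the class name SymmetricSimilaritySketch and its put method are hypothetical, and returning NaN for an unknown pair is an assumption of the sketch rather than something this page states.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical standalone sketch; not Mahout's implementation. */
final class SymmetricSimilaritySketch {

    // Outer key: the smaller item ID of a pair; inner key: the larger item ID.
    private final Map<Long, Map<Long, Double>> byLowerId = new HashMap<>();

    /** Records a similarity; a later call for the same pair overwrites the earlier value. */
    void put(long itemID1, long itemID2, double value) {
        if (itemID1 == itemID2) {
            return; // self-similarity is implicit (1.0), so nothing is stored
        }
        long lower = Math.min(itemID1, itemID2);
        long upper = Math.max(itemID1, itemID2);
        byLowerId.computeIfAbsent(lower, k -> new HashMap<>()).put(upper, value);
    }

    /** Looks up a pair in either argument order. */
    double itemSimilarity(long itemID1, long itemID2) {
        if (itemID1 == itemID2) {
            return 1.0;
        }
        long lower = Math.min(itemID1, itemID2);
        long upper = Math.max(itemID1, itemID2);
        Map<Long, Double> inner = byLowerId.get(lower);
        Double value = (inner == null) ? null : inner.get(upper);
        // Assumption of this sketch: an unknown pair yields NaN.
        return (value == null) ? Double.NaN : value;
    }
}
```

Storing each pair once under its smaller ID is one simple way to make lookups order-independent; other layouts would work equally well.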

GenericItemSimilarity

public GenericItemSimilarity(Iterable<GenericItemSimilarity.ItemItemSimilarity> similarities,
                             int maxToKeep)

Like GenericItemSimilarity(Iterable), but will only keep the specified number of similarities from the given Iterable of similarities. It will keep those with the highest similarity -- those that are therefore most important.

Thanks to tsmorton for suggesting this and providing part of the implementation.

Parameters:
similarities - set of GenericItemSimilarity.ItemItemSimilarity objects on which to base this instance
maxToKeep - maximum number of similarities to keep
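
As a rough illustration of what keeping only the maxToKeep highest similarities can look like, the following standalone sketch retains the strongest pairs with a bounded min-heap. It is not taken from Mahout's source; TopSimilaritiesSketch, Pair, and keepTop are hypothetical names, and the use of a heap is one possible approach among several.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical standalone sketch; not Mahout's implementation. */
final class TopSimilaritiesSketch {

    /** Hypothetical holder standing in for one item-item similarity. */
    static final class Pair {
        final long itemID1;
        final long itemID2;
        final double value;

        Pair(long itemID1, long itemID2, double value) {
            this.itemID1 = itemID1;
            this.itemID2 = itemID2;
            this.value = value;
        }
    }

    /** Returns at most maxToKeep pairs with the highest values, strongest first. */
    static List<Pair> keepTop(Iterable<Pair> similarities, int maxToKeep) {
        if (maxToKeep <= 0) {
            return new ArrayList<>();
        }
        Comparator<Pair> byValue = Comparator.comparingDouble(p -> p.value);
        // Min-heap: the head is always the weakest pair retained so far.
        PriorityQueue<Pair> heap = new PriorityQueue<>(byValue);
        for (Pair p : similarities) {
            if (heap.size() < maxToKeep) {
                heap.add(p);
            } else if (p.value > heap.peek().value) {
                heap.poll(); // evict the weakest retained pair
                heap.add(p);
            }
        }
        List<Pair> result = new ArrayList<>(heap);
        result.sort(byValue.reversed());
        return result;
    }
}
```

The heap never holds more than maxToKeep entries, so memory stays bounded even when the input Iterable is large.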

GenericItemSimilarity

public GenericItemSimilarity(ItemSimilarity otherSimilarity,
                             DataModel dataModel)
                      throws TasteException

Builds a list of item-item similarities given an ItemSimilarity implementation and a DataModel, rather than a list of GenericItemSimilarity.ItemItemSimilarity objects.

It is valid to build a GenericItemSimilarity this way, but doing so perhaps misses some of the point of an item-based recommender. Item-based recommenders assume that item-item similarities are relatively fixed and may already be known independently of user preferences. Hence it is useful to inject that information using GenericItemSimilarity(Iterable).

Parameters:
otherSimilarity - other ItemSimilarity to get similarities from
dataModel - data model to get items from
Throws:
TasteException - if an error occurs while accessing the DataModel items
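
Conceptually, this constructor turns a similarity function plus a set of item IDs into a fixed list of pairwise values. The standalone sketch below shows one way such a precomputation could be written with plain JDK types. It is not Mahout's implementation: PairwisePrecomputeSketch, SimilaritySource, and Row are hypothetical names, and skipping NaN results is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical standalone sketch; not Mahout's implementation. */
final class PairwisePrecomputeSketch {

    /** Hypothetical stand-in for a similarity source; not a Mahout interface. */
    interface SimilaritySource {
        double similarity(long itemID1, long itemID2);
    }

    /** One precomputed similarity between two distinct items. */
    static final class Row {
        final long itemID1;
        final long itemID2;
        final double value;

        Row(long itemID1, long itemID2, double value) {
            this.itemID1 = itemID1;
            this.itemID2 = itemID2;
            this.value = value;
        }
    }

    /** Evaluates the source once per unordered pair of distinct item IDs. */
    static List<Row> precompute(long[] itemIDs, SimilaritySource source) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < itemIDs.length; i++) {
            // Starting at i + 1 visits each unordered pair once and skips self-pairs.
            for (int j = i + 1; j < itemIDs.length; j++) {
                double value = source.similarity(itemIDs[i], itemIDs[j]);
                // Assumption of this sketch: NaN means "no similarity known", so it is skipped.
                if (!Double.isNaN(value)) {
                    rows.add(new Row(itemIDs[i], itemIDs[j], value));
                }
            }
        }
        return rows;
    }
}
```

The loop evaluates a number of pairs that grows quadratically with the number of items, which is one reason a bounded variant that keeps only the strongest similarities can be attractive for large catalogs.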

GenericItemSimilarity

public GenericItemSimilarity(ItemSimilarity otherSimilarity,
                             DataModel dataModel,
                             int maxToKeep)
                      throws TasteException

Like GenericItemSimilarity(ItemSimilarity, DataModel), but will only keep the specified number of similarities from the given DataModel. It will keep those with the highest similarity -- those that are therefore most important.

Thanks to tsmorton for suggesting this and providing part of the implementation.

Parameters:
otherSimilarity - other ItemSimilarity to get similarities from
dataModel - data model to get items from
maxToKeep - maximum number of similarities to keep
Throws:
TasteException - if an error occurs while accessing the DataModel items
Method Detail

itemSimilarity

public double itemSimilarity(long itemID1,
                             long itemID2)

Returns the similarity between two items. Note that similarity is assumed to be symmetric, that is, itemSimilarity(item1, item2) == itemSimilarity(item2, item1), and that itemSimilarity(item1, item1) == 1.0 for all items.

Specified by:
itemSimilarity in interface ItemSimilarity
Parameters:
itemID1 - first item
itemID2 - second item
Returns:
similarity between the two

itemSimilarities

public double[] itemSimilarities(long itemID1,
                                 long[] itemID2s)
Description copied from interface: ItemSimilarity

A bulk-get version of ItemSimilarity.itemSimilarity(long, long).

Specified by:
itemSimilarities in interface ItemSimilarity
Parameters:
itemID1 - first item ID
itemID2s - second item IDs to compute similarity with
Returns:
similarity between itemID1 and other items
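
A bulk lookup of this kind can be pictured as a loop over the single-pair lookup. The standalone sketch below illustrates that reading; it is not Mahout's implementation, BulkSimilaritySketch and PairSimilarity are hypothetical names, and the index-by-index correspondence between the result and itemID2s is an assumption of the sketch.

```java
/** Hypothetical standalone sketch; not Mahout's implementation. */
final class BulkSimilaritySketch {

    /** Hypothetical stand-in for a single-pair similarity lookup. */
    interface PairSimilarity {
        double itemSimilarity(long itemID1, long itemID2);
    }

    /** Returns one similarity per entry of itemID2s, in the same order (assumed). */
    static double[] itemSimilarities(PairSimilarity single, long itemID1, long[] itemID2s) {
        double[] result = new double[itemID2s.length];
        for (int i = 0; i < itemID2s.length; i++) {
            result[i] = single.itemSimilarity(itemID1, itemID2s[i]);
        }
        return result;
    }
}
```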

allSimilarItemIDs

public long[] allSimilarItemIDs(long itemID)
Specified by:
allSimilarItemIDs in interface ItemSimilarity
Returns:
all IDs of similar items, in no particular order

refresh

public void refresh(Collection<Refreshable> alreadyRefreshed)
Description copied from interface: Refreshable

Triggers "refresh" -- whatever that means -- of the implementation. The general contract is that any Refreshable should always leave itself in a consistent, operational state, and that the refresh atomically updates internal state from old to new.

Specified by:
refresh in interface Refreshable
Parameters:
alreadyRefreshed - Refreshables that are known to have already been refreshed as a result of an initial call to a Refreshable.refresh(Collection) method on some object. This ensures that objects in a refresh dependency graph aren't refreshed twice needlessly.


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.