org.apache.mahout.text
Class MailArchivesClusteringAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.util.StopwordAnalyzerBase
org.apache.mahout.text.MailArchivesClusteringAnalyzer
- All Implemented Interfaces:
- Closeable
public final class MailArchivesClusteringAnalyzer
- extends org.apache.lucene.analysis.util.StopwordAnalyzerBase
Custom Lucene Analyzer designed for aggressive feature reduction
when clustering the ASF Mail Archives. It uses an extended set of
stop words, excludes non-alphanumeric tokens, and applies Porter stemming.
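To make the feature reduction described above concrete, here is a minimal, dependency-free sketch of two of the three steps: dropping stop words and dropping tokens that are not purely alphanumeric. It is not the analyzer's implementation. The class name FeatureReductionSketch, the tiny stop-word set, the delimiter pattern, and the alphanumeric pattern are all assumptions made for illustration, and Porter stemming is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Mahout or Lucene.
class FeatureReductionSketch {

  // Assumed, deliberately tiny stop-word set; the analyzer's own list is not reproduced here.
  static final Set<String> STOP_WORDS =
      new HashSet<>(Arrays.asList("a", "an", "and", "for", "is", "of", "re", "the", "to"));

  // Assumed definition of "alphanumeric": lowercase ASCII letters and digits only.
  private static final Pattern ALPHA_NUMERIC = Pattern.compile("[a-z0-9]+");

  /** Lowercases the text, splits it on whitespace and basic punctuation, then filters tokens. */
  static List<String> reduce(String text, Set<String> stopWords) {
    List<String> kept = new ArrayList<>();
    for (String token : text.toLowerCase(Locale.ROOT).split("[\\s,;:!?()]+")) {
      if (token.isEmpty()) {
        continue; // a leading delimiter produces one empty token
      }
      if (!ALPHA_NUMERIC.matcher(token).matches()) {
        continue; // drops tokens such as "mahout-123" or "user@example.org"
      }
      if (stopWords.contains(token)) {
        continue; // drops common words that carry little signal for clustering
      }
      kept.add(token); // a stemming step would follow here; omitted in this sketch
    }
    return kept;
  }

  public static void main(String[] args) {
    String subject = "Re: The patch for MAHOUT-123 is ready, see user@example.org";
    System.out.println(reduce(subject, STOP_WORDS)); // prints [patch, ready, see]
  }
}
```

Passing the stop-word set as a parameter loosely mirrors the constructor that accepts a custom stop set, though the sketch uses a plain java.util.Set rather than Lucene's CharArraySet.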
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.Analyzer.GlobalReuseStrategy, org.apache.lucene.analysis.Analyzer.PerFieldReuseStrategy, org.apache.lucene.analysis.Analyzer.ReuseStrategy, org.apache.lucene.analysis.Analyzer.TokenStreamComponents
Fields inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
matchVersion, stopwords
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Methods inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, tokenStream, tokenStream
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MailArchivesClusteringAnalyzer
public MailArchivesClusteringAnalyzer()
MailArchivesClusteringAnalyzer
public MailArchivesClusteringAnalyzer(org.apache.lucene.analysis.util.CharArraySet stopSet)
createComponents
protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents createComponents(String fieldName, Reader reader)
- Specified by:
- createComponents in class org.apache.lucene.analysis.Analyzer
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.