cc.mallet.topics
Class NPTopicModel
java.lang.Object
cc.mallet.topics.NPTopicModel
- All Implemented Interfaces:
- java.io.Serializable
public class NPTopicModel
- extends java.lang.Object
- implements java.io.Serializable
A non-parametric topic model that uses the "minimal path" assumption
to reduce bookkeeping.
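The following is a minimal sketch of the bookkeeping that a "minimal path" assumption is usually taken to imply: each document contributes at most one count per topic to the global level, so the global statistics reduce to the number of documents in which each topic appears. The class and method names (MinimalPathCounts, docsPerTopic, totalDocTopics as a method) are hypothetical and only mirror the field names documented on this page; this is not the class's actual implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of cc.mallet.topics.
final class MinimalPathCounts {

    // For each topic, count the documents that have at least one token
    // assigned to it. docTopicAssignments[d][i] is the topic of token i
    // in document d (an assumed input layout).
    static Map<Integer, Integer> docsPerTopic(int[][] docTopicAssignments) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int[] doc : docTopicAssignments) {
            Set<Integer> seen = new HashSet<>();
            for (int topic : doc) {
                // Count a topic once per document, however many tokens use it.
                if (seen.add(topic)) {
                    counts.merge(topic, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Sum of the per-topic document counts.
    static int totalDocTopics(Map<Integer, Integer> docsPerTopic) {
        int total = 0;
        for (int c : docsPerTopic.values()) {
            total += c;
        }
        return total;
    }
}
```

Under this reading, no per-table assignments need to be stored, which is the bookkeeping saving the description refers to.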
- Author:
- David Mimno
- See Also:
- Serialized Form
Constructor Summary
NPTopicModel(double alpha, double gamma, double beta)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
data
protected java.util.ArrayList<TopicAssignment> data
alphabet
protected Alphabet alphabet
topicAlphabet
protected LabelAlphabet topicAlphabet
maxTopic
protected int maxTopic
numTopics
protected int numTopics
numTypes
protected int numTypes
alpha
protected double alpha
gamma
protected double gamma
beta
protected double beta
betaSum
protected double betaSum
DEFAULT_BETA
public static final double DEFAULT_BETA
- See Also:
- Constant Field Values
typeTopicCounts
protected gnu.trove.TIntIntHashMap[] typeTopicCounts
tokensPerTopic
protected gnu.trove.TIntIntHashMap tokensPerTopic
docsPerTopic
protected gnu.trove.TIntIntHashMap docsPerTopic
totalDocTopics
protected int totalDocTopics
showTopicsInterval
public int showTopicsInterval
wordsPerTopic
public int wordsPerTopic
random
protected Randoms random
formatter
protected java.text.NumberFormat formatter
printLogLikelihood
protected boolean printLogLikelihood
NPTopicModel
public NPTopicModel(double alpha,
double gamma,
double beta)
- Parameters:
alpha - this parameter balances the local document topic counts with the global distribution over topics.
gamma - this parameter is the weight on a completely new, never-before-seen topic in the global distribution.
beta - this parameter controls the variability of the topic-word distributions.
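To make the roles of alpha, gamma, and beta concrete, here is a minimal sketch of how such parameters are commonly combined into unnormalized sampling weights in a hierarchical, non-parametric topic sampler. The class TopicWeightSketch, its method names, and the exact formulas are assumptions for illustration; they are not taken from this class's source. The sketch also assumes a symmetric prior, so that the smoothing total equals beta times the vocabulary size.

```java
// Hypothetical illustration; not part of cc.mallet.topics.
final class TopicWeightSketch {

    // Unnormalized weight for assigning a token to an existing topic.
    static double existingTopicWeight(int docTopicCount, int docsWithTopic,
                                      int totalDocTopics, int typeTopicCount,
                                      int tokensInTopic, double alpha,
                                      double gamma, double beta, int numTypes) {
        // Share of the global distribution held by this topic; gamma
        // reserves mass for topics that have not been seen yet.
        double globalShare = docsWithTopic / (totalDocTopics + gamma);
        // Smoothed probability of the word type under this topic.
        double wordProb = (typeTopicCount + beta) / (tokensInTopic + beta * numTypes);
        // alpha trades off the local count against the global share.
        return (docTopicCount + alpha * globalShare) * wordProb;
    }

    // Unnormalized weight for opening a completely new topic.
    static double newTopicWeight(int totalDocTopics, double alpha,
                                 double gamma, int numTypes) {
        double globalShare = gamma / (totalDocTopics + gamma);
        // A new topic has no counts, so every word type is equally likely.
        return alpha * globalShare / numTypes;
    }
}
```

In this sketch, a larger alpha pulls each document toward the global topic distribution, a larger gamma makes new topics more likely, and a larger beta flattens the topic-word distributions.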
setTopicDisplay
public void setTopicDisplay(int interval,
int n)
setRandomSeed
public void setRandomSeed(int seed)
addInstances
public void addInstances(InstanceList training,
int initialTopics)
sample
public void sample(int iterations)
throws java.io.IOException
- Throws:
java.io.IOException
sampleTopicsForOneDoc
protected void sampleTopicsForOneDoc(FeatureSequence tokenSequence,
FeatureSequence topicSequence)
topWords
public java.lang.String topWords(int numWords)
printState
public void printState(java.io.File f)
throws java.io.IOException
- Throws:
java.io.IOException
printState
public void printState(java.io.PrintStream out)
main
public static void main(java.lang.String[] args)
throws java.io.IOException
- Throws:
java.io.IOException