cc.mallet.classify
Class NaiveBayes

java.lang.Object
  extended by cc.mallet.classify.Classifier
      extended by cc.mallet.classify.NaiveBayes
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class NaiveBayes
extends Classifier
implements java.io.Serializable

A classifier that classifies instances according to the Naive Bayes method. In a Bayes classifier, p(Classification|Data) = p(Data|Classification) p(Classification) / p(Data).

To compute the likelihood:
p(Data|Classification) = p(d1,d2,...,dn | Classification)
Naive Bayes assumes that all of the data are conditionally independent given the Classification:
p(d1,d2,...,dn | Classification) = p(d1|Classification) p(d2|Classification) ... p(dn|Classification)
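
The factorization above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the MALLET implementation: the class name NaiveBayesSketch, the method posterior, and the array layout (logPrior[c], logFeatureProb[c][f], featureCounts[f]) are hypothetical and chosen only for this example.

```java
// Hypothetical stand-alone sketch; it does not use any MALLET classes.
final class NaiveBayesSketch {

    /**
     * Returns p(Classification | Data) for every class.
     *
     * logPrior[c]          = log p(Classification = c)
     * logFeatureProb[c][f] = log p(feature f | Classification = c)
     * featureCounts[f]     = how often feature f occurs in the instance
     */
    static double[] posterior(double[] logPrior,
                              double[][] logFeatureProb,
                              int[] featureCounts) {
        int numClasses = logPrior.length;
        double[] scores = new double[numClasses];

        // log p(c) + sum over features of count * log p(d_f | c):
        // the conditional-independence assumption turns the joint
        // likelihood into a sum of per-feature log terms.
        for (int c = 0; c < numClasses; c++) {
            scores[c] = logPrior[c];
            for (int f = 0; f < featureCounts.length; f++) {
                scores[c] += featureCounts[f] * logFeatureProb[c][f];
            }
        }

        // Dividing by p(Data) is a normalization over classes.
        // Subtracting the maximum score first keeps exp() stable.
        double max = scores[0];
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        for (double s : scores) {
            sum += Math.exp(s - max);
        }
        double logEvidence = max + Math.log(sum);

        double[] result = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            result[c] = Math.exp(scores[c] - logEvidence);
        }
        return result;
    }
}
```

The returned array sums to 1 across classes, and the class with the largest entry is the most probable Classification under these assumptions.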

As with other classifiers in Mallet, NaiveBayes is implemented as two classes: a trainer and a classifier. The NaiveBayesTrainer produces estimates of the various p(dn|Classification) and constructs this class with those estimates.

The data field of each instance is assumed to be a FeatureVector.

As with other Mallet classifiers, classification may only be performed on instances processed with the pipe associated with this classifier, i.e. naiveBayes.getInstancePipe(). The NaiveBayesTrainer sets this pipe to the pipe used to process the training instances.

A NaiveBayes classifier can be persisted and reused using serialization.
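
Because NaiveBayes implements java.io.Serializable, standard Java object streams are one way to persist it. The sketch below uses only java.io; the helper class ClassifierIO and its save and load methods are hypothetical names introduced for this example and are not part of MALLET.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper: works for any Serializable object,
// which includes a trained NaiveBayes classifier.
final class ClassifierIO {

    /** Writes the object to the given file using Java serialization. */
    static void save(Serializable classifier, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeObject(classifier);
        }
    }

    /** Reads back an object previously written by save(). */
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            return in.readObject();
        }
    }
}
```

A caller would cast the result of load to NaiveBayes. The MALLET classes must be on the classpath when the file is read, and only files from a trusted source should be deserialized.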

Author:
Andrew McCallum mccallum@cs.umass.edu
See Also:
NaiveBayesTrainer, FeatureVector, Serialized Form

Field Summary
 
Fields inherited from class cc.mallet.classify.Classifier
instancePipe
 
Constructor Summary
NaiveBayes(Pipe instancePipe, Multinomial.Logged prior, Multinomial.Logged[] classIndex2FeatureProb)
          Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification.
NaiveBayes(Pipe dataPipe, Multinomial prior, Multinomial[] classIndex2FeatureProb)
          Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification.
 
Method Summary
 Classification classify(Instance instance)
          Classify an instance using NaiveBayes according to the trained data.
 double dataLogLikelihood(InstanceList ilist)
           
 Multinomial.Logged[] getMultinomials()
           
 Multinomial.Logged getPriors()
           
 double labelLogLikelihood(InstanceList ilist)
           
 void printWords(int numToPrint)
           
 
Methods inherited from class cc.mallet.classify.Classifier
alphabetsMatch, classify, classify, classify, getAccuracy, getAlphabet, getAlphabets, getAverageRank, getF1, getF1, getF1, getFeatureSelection, getInstancePipe, getLabelAlphabet, getPerClassFeatureSelection, getPrecision, getPrecision, getPrecision, getRecall, getRecall, getRecall, print, print
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NaiveBayes

public NaiveBayes(Pipe instancePipe,
                  Multinomial.Logged prior,
                  Multinomial.Logged[] classIndex2FeatureProb)
Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification. A NaiveBayes classifier is generally generated from a NaiveBayesTrainer, not constructed directly by users. Probability estimates are converted and saved as logarithms internally.

Parameters:
instancePipe - Used to check that the feature vector dictionary for each instance is the same as that associated with the pipe. Null suppresses the check.
prior - Multinomial that gives an estimate of the prior probability for each Classification.
classIndex2FeatureProb - An array of multinomials, indexed by class, each giving an estimate of the probability of each feature given that Classification.
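
One common reason to store estimates as logarithms is numerical underflow: a product of many small probabilities can round to zero in double precision, while the equivalent sum of logs stays finite. The following sketch illustrates this; the class LogSpaceSketch and its methods are hypothetical and are not MALLET code.

```java
// Hypothetical illustration of why log-space storage is useful.
final class LogSpaceSketch {

    /** Multiplies the probabilities directly; may underflow to 0.0. */
    static double product(double[] probs) {
        double p = 1.0;
        for (double x : probs) {
            p *= x;
        }
        return p;
    }

    /** Sums the natural logs instead; equals log(product) without underflow. */
    static double logSum(double[] probs) {
        double s = 0.0;
        for (double x : probs) {
            s += Math.log(x);
        }
        return s;
    }
}
```

With several hundred features each having probability around 0.001, product underflows to zero while logSum remains a usable finite score for comparing classes.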

NaiveBayes

public NaiveBayes(Pipe dataPipe,
                  Multinomial prior,
                  Multinomial[] classIndex2FeatureProb)
Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification. A NaiveBayes classifier is generally generated from a NaiveBayesTrainer, not constructed directly by users.

Parameters:
dataPipe - Used to check that the feature vector dictionary for each instance is the same as that associated with the pipe. Null suppresses the check.
prior - Multinomial that gives an estimate of the prior probability for each Classification.
classIndex2FeatureProb - An array of multinomials, indexed by class, each giving an estimate of the probability of each feature given that Classification.

Method Detail

getMultinomials

public Multinomial.Logged[] getMultinomials()

getPriors

public Multinomial.Logged getPriors()

printWords

public void printWords(int numToPrint)

classify

public Classification classify(Instance instance)
Classify an instance using NaiveBayes according to the trained data. The alphabet of the FeatureVector of the instance must match the alphabet of the pipe used to train the classifier.

Specified by:
classify in class Classifier
Parameters:
instance - The instance to be classified. Its data field must be a FeatureVector.
Returns:
Classification containing the labeling of the instance

dataLogLikelihood

public double dataLogLikelihood(InstanceList ilist)

labelLogLikelihood

public double labelLogLikelihood(InstanceList ilist)