cc.mallet.classify
Class MaxEntTrainer

java.lang.Object
  extended by cc.mallet.classify.ClassifierTrainer<MaxEnt>
      extended by cc.mallet.classify.MaxEntTrainer
All Implemented Interfaces:
Boostable, ClassifierTrainer.ByOptimization<MaxEnt>, java.io.Serializable
Direct Known Subclasses:
RankMaxEntTrainer

public class MaxEntTrainer
extends ClassifierTrainer<MaxEnt>
implements ClassifierTrainer.ByOptimization<MaxEnt>, Boostable, java.io.Serializable

The trainer for a Maximum Entropy classifier.

Author:
Andrew McCallum mccallum@cs.umass.edu
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class cc.mallet.classify.ClassifierTrainer
ClassifierTrainer.ByActiveLearning<C extends Classifier>, ClassifierTrainer.ByIncrements<C extends Classifier>, ClassifierTrainer.ByInstanceIncrements<C extends Classifier>, ClassifierTrainer.ByOptimization<C extends Classifier>, ClassifierTrainer.Factory<CT extends ClassifierTrainer<? extends Classifier>>
 
Field Summary
static java.lang.String EXP_GAIN
           
static java.lang.String GRADIENT_GAIN
           
static java.lang.String INFORMATION_GAIN
           
 
Fields inherited from class cc.mallet.classify.ClassifierTrainer
finishedTraining, validationSet
 
Constructor Summary
MaxEntTrainer()
           
MaxEntTrainer(double gaussianPriorVariance)
          Constructs a trainer with the given Gaussian prior variance, a regularization parameter that helps avoid overtraining.
MaxEntTrainer(MaxEnt theClassifierToTrain)
           
 
Method Summary
 MaxEnt getClassifier()
           
 int getIteration()
           
 Optimizable getOptimizable()
           
 MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet)
           
 MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet, MaxEnt initialClassifier)
           
 Optimizer getOptimizer()
           
 Optimizer getOptimizer(InstanceList trainingSet)
           
 void setClassifier(MaxEnt theClassifierToTrain)
           
 MaxEntTrainer setGaussianPriorVariance(double gaussianPriorVariance)
          Sets the Gaussian prior variance, a regularization parameter that helps prevent overtraining.
 MaxEntTrainer setNumIterations(int i)
          Specifies the maximum number of iterations to run during a single call to train or trainWithFeatureInduction.
 java.lang.String toString()
          Returns a string representation of this trainer.
 MaxEnt train(InstanceList trainingSet)
           
 MaxEnt train(InstanceList trainingSet, int numIterations)
           
 
Methods inherited from class cc.mallet.classify.ClassifierTrainer
getValidationInstances, isFinishedTraining, setValidationInstances
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

EXP_GAIN

public static final java.lang.String EXP_GAIN
See Also:
Constant Field Values

GRADIENT_GAIN

public static final java.lang.String GRADIENT_GAIN
See Also:
Constant Field Values

INFORMATION_GAIN

public static final java.lang.String INFORMATION_GAIN
See Also:
Constant Field Values

Constructor Detail

MaxEntTrainer

public MaxEntTrainer()

MaxEntTrainer

public MaxEntTrainer(MaxEnt theClassifierToTrain)

MaxEntTrainer

public MaxEntTrainer(double gaussianPriorVariance)
Constructs a trainer with the given Gaussian prior variance, a regularization parameter that helps avoid overtraining. A value of 1.0 is usually a reasonable default.

Method Detail

getClassifier

public MaxEnt getClassifier()
Specified by:
getClassifier in class ClassifierTrainer<MaxEnt>

setClassifier

public void setClassifier(MaxEnt theClassifierToTrain)

getOptimizable

public Optimizable getOptimizable()

getOptimizable

public MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet)

getOptimizable

public MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet,
                                                         MaxEnt initialClassifier)

getOptimizer

public Optimizer getOptimizer()
Specified by:
getOptimizer in interface ClassifierTrainer.ByOptimization<MaxEnt>

getOptimizer

public Optimizer getOptimizer(InstanceList trainingSet)

setNumIterations

public MaxEntTrainer setNumIterations(int i)
Specifies the maximum number of iterations to run during a single call to train or trainWithFeatureInduction. Not currently functional.

Returns:
This trainer

getIteration

public int getIteration()
Specified by:
getIteration in interface ClassifierTrainer.ByOptimization<MaxEnt>

setGaussianPriorVariance

public MaxEntTrainer setGaussianPriorVariance(double gaussianPriorVariance)
Sets the Gaussian prior variance, a regularization parameter that helps prevent overtraining. A smaller variance means that feature weights are expected to stay closer to 0, so more evidence is required before a feature receives a large weight.

Returns:
This trainer
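
The effect of the variance can be illustrated with a small standalone sketch. It assumes the textbook form of a zero-mean Gaussian prior, in which the log-likelihood is reduced by the sum of squared weights divided by twice the variance. This page does not state the exact formula MALLET uses, so treat the formula as an assumption. The class GaussianPriorSketch and its penalty method are hypothetical names and are not part of MALLET.

```java
/** Illustrative sketch only; this class is hypothetical and not part of MALLET. */
final class GaussianPriorSketch {

    /**
     * Penalty for a zero-mean Gaussian prior: sum of w_i^2 / (2 * variance).
     * Assumption: this is the standard textbook form, not taken from this page.
     */
    static double penalty(double[] weights, double variance) {
        if (variance <= 0.0) {
            throw new IllegalArgumentException("variance must be positive");
        }
        double sumOfSquares = 0.0;
        for (double w : weights) {
            sumOfSquares += w * w;
        }
        return sumOfSquares / (2.0 * variance);
    }

    public static void main(String[] args) {
        // Same weights, three variances: the penalty shrinks as the variance grows.
        double[] weights = {2.0, -1.0, 0.5};
        for (double variance : new double[] {0.1, 1.0, 10.0}) {
            System.out.printf("variance=%.1f penalty=%.4f%n",
                    variance, penalty(weights, variance));
        }
    }
}
```

Under this assumption, the same weights cost more when the variance is small, which is why a small variance keeps weights near 0 unless the training data strongly supports a larger value.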

train

public MaxEnt train(InstanceList trainingSet)
Specified by:
train in class ClassifierTrainer<MaxEnt>

train

public MaxEnt train(InstanceList trainingSet,
                    int numIterations)
Specified by:
train in interface ClassifierTrainer.ByOptimization<MaxEnt>
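
A minimal usage sketch of the two train overloads follows. It uses only the constructor and methods listed on this page and needs the MALLET library on the classpath to compile. The helper class MaxEntTrainingSketch and its method names are hypothetical, and building the InstanceList (which depends on the pipes used for the data) is assumed to happen elsewhere.

```java
import cc.mallet.classify.MaxEnt;
import cc.mallet.classify.MaxEntTrainer;
import cc.mallet.types.InstanceList;

/** Illustrative helper; this class is hypothetical and not part of MALLET. */
final class MaxEntTrainingSketch {

    /** Trains with the given Gaussian prior variance via train(InstanceList). */
    static MaxEnt trainWithPrior(InstanceList trainingSet, double gaussianPriorVariance) {
        MaxEntTrainer trainer = new MaxEntTrainer(gaussianPriorVariance);
        return trainer.train(trainingSet);
    }

    /** Same, but passes an iteration count to train(InstanceList, int). */
    static MaxEnt trainWithPrior(InstanceList trainingSet,
                                 double gaussianPriorVariance,
                                 int numIterations) {
        MaxEntTrainer trainer = new MaxEntTrainer(gaussianPriorVariance);
        return trainer.train(trainingSet, numIterations);
    }
}
```

A call such as MaxEntTrainingSketch.trainWithPrior(trainingSet, 1.0) uses the variance that the constructor documentation describes as a reasonable default.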

toString

public java.lang.String toString()

Returns a string representation of this trainer.

Overrides:
toString in class java.lang.Object

Note: the remaining text of this entry documents a variant of trainWithFeatureInduction, a method that does not appear in this class's method summary, rather than toString, which takes no parameters.

Like the other version of trainWithFeatureInduction, but allows some default options to be changed.

Parameters:
maxent - An initial partially-trained classifier (default null). This classifier may be modified during training.
gainName - The estimate of gain (log-likelihood increase) that the chosen features should maximize. Should be one of MaxEntTrainer.EXP_GAIN, MaxEntTrainer.GRADIENT_GAIN, or MaxEntTrainer.INFORMATION_GAIN (default EXP_GAIN).
Returns:
The trained MaxEnt classifier