cc.mallet.classify
Class MaxEntTrainer

java.lang.Object
  extended by cc.mallet.classify.ClassifierTrainer<MaxEnt>
      extended by cc.mallet.classify.MaxEntTrainer
All Implemented Interfaces:
Boostable, ClassifierTrainer.ByOptimization<MaxEnt>, java.io.Serializable
Direct Known Subclasses:
MaxEntL1Trainer, RankMaxEntTrainer

public class MaxEntTrainer
extends ClassifierTrainer<MaxEnt>
implements ClassifierTrainer.ByOptimization<MaxEnt>, Boostable, java.io.Serializable

The trainer for a Maximum Entropy classifier.

Author:
Andrew McCallum mccallum@cs.umass.edu
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class cc.mallet.classify.ClassifierTrainer
ClassifierTrainer.ByActiveLearning<C extends Classifier>, ClassifierTrainer.ByIncrements<C extends Classifier>, ClassifierTrainer.ByInstanceIncrements<C extends Classifier>, ClassifierTrainer.ByOptimization<C extends Classifier>, ClassifierTrainer.Factory<CT extends ClassifierTrainer<? extends Classifier>>
 
Field Summary
 
Fields inherited from class cc.mallet.classify.ClassifierTrainer
finishedTraining, validationSet
 
Constructor Summary
MaxEntTrainer()
           
MaxEntTrainer(double gaussianPriorVariance)
          Constructs a trainer with the given Gaussian prior variance, a parameter that helps avoid overtraining.
MaxEntTrainer(MaxEnt theClassifierToTrain)
          Constructs a MaxEnt trainer that uses the parameters of a trained classifier as initial values.
 
Method Summary
 MaxEnt getClassifier()
           
 int getIteration()
           
 Optimizable getOptimizable()
           
 MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet)
           
 MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet, MaxEnt initialClassifier)
           
 Optimizer getOptimizer()
           
 Optimizer getOptimizer(InstanceList trainingSet)
          This method is called by the train method.
 void setClassifier(MaxEnt theClassifierToTrain)
          Initialize parameters using the provided classifier.
 MaxEntTrainer setGaussianPriorVariance(double gaussianPriorVariance)
          Sets a parameter to prevent overtraining.
 MaxEntTrainer setL1Weight(double l1Weight)
          Use an L1 prior.
 MaxEntTrainer setNumIterations(int i)
          Specifies the maximum number of iterations to run during a single call to train or trainWithFeatureInduction.
 java.lang.String toString()
          Returns a string representation of this trainer.
 MaxEnt train(InstanceList trainingSet)
           
 MaxEnt train(InstanceList trainingSet, int numIterations)
           
 
Methods inherited from class cc.mallet.classify.ClassifierTrainer
getValidationInstances, isFinishedTraining, setValidationInstances
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

MaxEntTrainer

public MaxEntTrainer()

MaxEntTrainer

public MaxEntTrainer(MaxEnt theClassifierToTrain)
Constructs a MaxEnt trainer that uses the parameters of a trained classifier as initial values.


MaxEntTrainer

public MaxEntTrainer(double gaussianPriorVariance)
Constructs a trainer with the given Gaussian prior variance, a parameter that helps avoid overtraining. The default value is 1.0.

Method Detail

getClassifier

public MaxEnt getClassifier()
Specified by:
getClassifier in class ClassifierTrainer<MaxEnt>

setClassifier

public void setClassifier(MaxEnt theClassifierToTrain)
Initialize parameters using the provided classifier.


getOptimizable

public Optimizable getOptimizable()

getOptimizable

public MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet)

getOptimizable

public MaxEntOptimizableByLabelLikelihood getOptimizable(InstanceList trainingSet,
                                                         MaxEnt initialClassifier)

getOptimizer

public Optimizer getOptimizer()
Specified by:
getOptimizer in interface ClassifierTrainer.ByOptimization<MaxEnt>

getOptimizer

public Optimizer getOptimizer(InstanceList trainingSet)
This method is called by the train method. It is the main entry point for the optimizable and optimizer components.


setNumIterations

public MaxEntTrainer setNumIterations(int i)
Specifies the maximum number of iterations to run during a single call to train or trainWithFeatureInduction. Not currently functional.

Returns:
This trainer

getIteration

public int getIteration()
Specified by:
getIteration in interface ClassifierTrainer.ByOptimization<MaxEnt>

setGaussianPriorVariance

public MaxEntTrainer setGaussianPriorVariance(double gaussianPriorVariance)
Sets a parameter to prevent overtraining. A smaller variance for the prior means that feature weights are expected to hover closer to 0, so extra evidence is required to set a higher weight.

Returns:
This trainer
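
The effect of the variance can be sketched numerically. The example below assumes the conventional zero-mean Gaussian prior, whose penalty on the weights is sum(w_i^2) / (2 * variance); the class name, method name, and weight values are hypothetical and are not part of MALLET, and the library's internal computation may differ in detail.

```java
public class GaussianPriorSketch {

    /**
     * Penalty implied by a zero-mean Gaussian prior on the weights:
     * sum(w_i^2) / (2 * variance). This is the textbook form, assumed
     * here for illustration only.
     */
    public static double gaussianPenalty(double[] weights, double variance) {
        double sumOfSquares = 0.0;
        for (double w : weights) {
            sumOfSquares += w * w;
        }
        return sumOfSquares / (2.0 * variance);
    }

    public static void main(String[] args) {
        // Hypothetical feature weights, chosen only for the demonstration.
        double[] weights = {2.0, -1.0, 0.5};

        // The same weights are penalized more heavily as the variance shrinks.
        for (double variance : new double[] {0.1, 1.0, 10.0}) {
            System.out.printf("variance=%.1f penalty=%.4f%n",
                    variance, gaussianPenalty(weights, variance));
        }
    }
}
```

Under this assumption, a small variance makes any nonzero weight expensive, so the optimizer needs more evidence from the training data before moving a weight away from 0.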

setL1Weight

public MaxEntTrainer setL1Weight(double l1Weight)
Use an L1 prior. Larger values mean parameters will be closer to 0. Note that this setting overrides any Gaussian prior.
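
To illustrate why larger L1 weights pull parameters toward 0, the sketch below applies the soft-thresholding operator, a common way of applying an L1 penalty in proximal optimization methods. It is an assumption made for illustration and is not necessarily how MALLET's optimizer handles the L1 prior; the class name, method name, and sample values are hypothetical.

```java
public class L1PriorSketch {

    /**
     * Soft-thresholding: shrinks w toward 0 by l1Weight and clamps it
     * to 0 once its magnitude falls below l1Weight.
     */
    public static double softThreshold(double w, double l1Weight) {
        return Math.signum(w) * Math.max(Math.abs(w) - l1Weight, 0.0);
    }

    public static void main(String[] args) {
        // Hypothetical parameter values, chosen only for the demonstration.
        double[] weights = {2.0, -0.3, 0.05};

        // Larger l1Weight values shrink every parameter further and
        // zero out more of them.
        for (double l1Weight : new double[] {0.01, 0.1, 0.5}) {
            StringBuilder line = new StringBuilder("l1Weight=" + l1Weight + ":");
            for (double w : weights) {
                line.append(' ').append(softThreshold(w, l1Weight));
            }
            System.out.println(line);
        }
    }
}
```

Unlike a Gaussian penalty, which only shrinks weights, this kind of L1 penalty can set small weights to exactly 0, which is why it is often used to obtain sparse models.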


train

public MaxEnt train(InstanceList trainingSet)
Specified by:
train in class ClassifierTrainer<MaxEnt>

train

public MaxEnt train(InstanceList trainingSet,
                    int numIterations)
Specified by:
train in interface ClassifierTrainer.ByOptimization<MaxEnt>

toString

public java.lang.String toString()

Returns a string representation of this trainer.

Overrides:
toString in class java.lang.Object