java.lang.Object cc.mallet.fst.TransducerTrainer cc.mallet.fst.CRFTrainerByLabelLikelihood
public class CRFTrainerByLabelLikelihood
Unlike ClassifierTrainer, TransducerTrainer is not "stateless" between calls to train. A TransducerTrainer is constructed paired with a specific Transducer, and can only train that Transducer. CRF stores and has methods for FeatureSelection and weight freezing. CRFTrainer stores and has methods for determining the contents/dimensions/sparsity/FeatureInduction of the CRF's weights as determined by training data.
Note: In the future this class may go away in favor of some default version of CRFTrainerByValueGradients.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class cc.mallet.fst.TransducerTrainer |
---|
TransducerTrainer.ByIncrements, TransducerTrainer.ByInstanceIncrements, TransducerTrainer.ByOptimization |
Field Summary | |
---|---|
boolean |
printGradient
|
Constructor Summary | |
---|---|
CRFTrainerByLabelLikelihood(CRF crf)
|
Method Summary | |
---|---|
CRF |
getCRF()
|
double |
getGaussianPriorVariance()
|
int |
getIteration()
|
CRFOptimizableByLabelLikelihood |
getOptimizableCRF(InstanceList trainingSet)
|
Optimizer |
getOptimizer()
|
Optimizer |
getOptimizer(InstanceList trainingSet)
|
Transducer |
getTransducer()
|
double |
getUseHyperbolicPriorSharpness()
|
double |
getUseHyperbolicPriorSlope()
|
boolean |
getUseSparseWeights()
|
boolean |
isConverged()
|
boolean |
isFinishedTraining()
|
void |
setAddNoFactors(boolean flag)
Use this method to specify whether or not factors are added to the CRF by this trainer. |
void |
setGaussianPriorVariance(double p)
|
void |
setHyperbolicPriorSharpness(double p)
|
void |
setHyperbolicPriorSlope(double p)
|
void |
setUseHyperbolicPrior(boolean f)
|
void |
setUseSomeUnsupportedTrick(boolean b)
Sets whether to use the 'some unsupported trick.' If a CRF has already been partially trained and sparse weights are used, this trick adds a few weights for features that do not occur in the training data. |
void |
setUseSparseWeights(boolean b)
|
boolean |
train(InstanceList trainingSet,
int numIterations)
Train the transducer associated with this TransducerTrainer. |
boolean |
train(InstanceList training,
int numIterationsPerProportion,
double[] trainingProportions)
Train a CRF on various-sized subsets of the data. |
boolean |
trainIncremental(InstanceList training)
|
boolean |
trainWithFeatureInduction(InstanceList trainingData,
InstanceList validationData,
InstanceList testingData,
TransducerEvaluator eval,
int numIterations,
int numIterationsBetweenFeatureInductions,
int numFeatureInductions,
int numFeaturesPerFeatureInduction,
double trueLabelProbThreshold,
boolean clusteredFeatureInduction,
double[] trainingProportions)
|
boolean |
trainWithFeatureInduction(InstanceList trainingData,
InstanceList validationData,
InstanceList testingData,
TransducerEvaluator eval,
int numIterations,
int numIterationsBetweenFeatureInductions,
int numFeatureInductions,
int numFeaturesPerFeatureInduction,
double trueLabelProbThreshold,
boolean clusteredFeatureInduction,
double[] trainingProportions,
java.lang.String gainName)
Train a CRF using feature induction to generate conjunctions of features. |
Methods inherited from class cc.mallet.fst.TransducerTrainer |
---|
addEvaluator, addEvaluators, removeEvaluator, runEvaluators, train |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public boolean printGradient
Constructor Detail |
---|
public CRFTrainerByLabelLikelihood(CRF crf)
Method Detail |
---|
public Transducer getTransducer()
Specified by: getTransducer in class TransducerTrainer
public CRF getCRF()
public Optimizer getOptimizer()
Specified by: getOptimizer in interface TransducerTrainer.ByOptimization
public boolean isConverged()
public boolean isFinishedTraining()
Specified by: isFinishedTraining in class TransducerTrainer
public int getIteration()
Specified by: getIteration in class TransducerTrainer
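public void setAddNoFactors(boolean flag)
Use this method to specify whether or not factors are added to the CRF by this trainer.
Parameters:
flag - If true, this trainer adds no factors to the CRF.
public CRFOptimizableByLabelLikelihood getOptimizableCRF(InstanceList trainingSet)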
public Optimizer getOptimizer(InstanceList trainingSet)
public boolean trainIncremental(InstanceList training)
public boolean train(InstanceList trainingSet, int numIterations)
Train the transducer associated with this TransducerTrainer.
Specified by: train in class TransducerTrainer
public boolean train(InstanceList training, int numIterationsPerProportion, double[] trainingProportions)
Parameters:
training - The training Instances.
numIterationsPerProportion - Maximum number of Maximizer iterations per training proportion.
trainingProportions - If non-null, train on increasingly larger portions of the data, e.g. new double[] {0.2, 0.5, 1.0}. This can sometimes speed up convergence. Be sure to end in 1.0 if you want to train on all the data in the end.
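As a rough illustration of a `trainingProportions` schedule, the sketch below computes how many instances each stage would cover and checks whether the schedule ends on the full data set. The class and method names (`ProportionSchedule`, `subsetSizes`, `endsOnFullData`) are hypothetical and not part of MALLET, and the rounding rule is an assumption; the library may select its subsets differently.

```java
import java.util.Arrays;

// Hypothetical helper, not part of MALLET: illustrates a trainingProportions schedule.
class ProportionSchedule {

    // Number of instances covered at each stage, assuming simple rounding
    // (an assumption; the library may choose its subsets differently).
    static int[] subsetSizes(int numInstances, double[] trainingProportions) {
        int[] sizes = new int[trainingProportions.length];
        for (int i = 0; i < trainingProportions.length; i++) {
            double p = trainingProportions[i];
            if (p <= 0.0 || p > 1.0) {
                throw new IllegalArgumentException("proportion out of (0, 1]: " + p);
            }
            sizes[i] = (int) Math.round(p * numInstances);
        }
        return sizes;
    }

    // True if the final stage trains on all of the data.
    static boolean endsOnFullData(double[] trainingProportions) {
        return trainingProportions.length > 0
            && trainingProportions[trainingProportions.length - 1] == 1.0;
    }

    public static void main(String[] args) {
        double[] proportions = {0.2, 0.5, 1.0};
        System.out.println(Arrays.toString(subsetSizes(1000, proportions)));
        System.out.println(endsOnFullData(proportions));
    }
}
```

A check like `endsOnFullData` can be run before calling `train` to catch a schedule that would never see the whole training set.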
public boolean trainWithFeatureInduction(InstanceList trainingData, InstanceList validationData, InstanceList testingData, TransducerEvaluator eval, int numIterations, int numIterationsBetweenFeatureInductions, int numFeatureInductions, int numFeaturesPerFeatureInduction, double trueLabelProbThreshold, boolean clusteredFeatureInduction, double[] trainingProportions)
public boolean trainWithFeatureInduction(InstanceList trainingData, InstanceList validationData, InstanceList testingData, TransducerEvaluator eval, int numIterations, int numIterationsBetweenFeatureInductions, int numFeatureInductions, int numFeaturesPerFeatureInduction, double trueLabelProbThreshold, boolean clusteredFeatureInduction, double[] trainingProportions, java.lang.String gainName)
Train a CRF using feature induction to generate conjunctions of features, using the FeatureInducer specified by gainName.
Parameters:
trainingData - The training Instances.
validationData - The validation Instances.
testingData - The testing Instances.
eval - For evaluation during training.
numIterations - Maximum number of Maximizer iterations.
numIterationsBetweenFeatureInductions - Number of Maximizer iterations between each call to the FeatureInducer.
numFeatureInductions - Maximum number of rounds of feature induction.
numFeaturesPerFeatureInduction - Maximum number of features to induce at each round of induction.
trueLabelProbThreshold - If the model's probability of the true Label of an Instance is less than this value, it is added as an error instance to the FeatureInducer.
clusteredFeatureInduction - If true, a separate FeatureInducer is constructed for each label pair. This can avoid inducing a disproportionate number of features for a single label.
trainingProportions - If non-null, train on increasingly larger portions of the data (e.g. [0.2, 0.5, 1.0]). This can sometimes speed up convergence.
gainName - The type of FeatureInducer to use. One of "exp", "grad", or "info" for ExpGain, GradientGain, or InfoGain.
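Two of the parameters above can be illustrated in isolation. The sketch below selects error instances by comparing a true-label probability against `trueLabelProbThreshold`, and maps a `gainName` string to the criterion names listed above. The class `FeatureInductionSketch` and its methods are hypothetical and not part of MALLET; using one probability per instance is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MALLET.
class FeatureInductionSketch {

    // Indices whose true-label probability is strictly below the threshold;
    // these would be treated as error instances for feature induction.
    static List<Integer> errorInstances(double[] trueLabelProbs,
                                        double trueLabelProbThreshold) {
        List<Integer> errors = new ArrayList<>();
        for (int i = 0; i < trueLabelProbs.length; i++) {
            if (trueLabelProbs[i] < trueLabelProbThreshold) {
                errors.add(i);
            }
        }
        return errors;
    }

    // Maps a gainName value to the criterion name given in the parameter list.
    static String gainCriterion(String gainName) {
        switch (gainName) {
            case "exp":  return "ExpGain";
            case "grad": return "GradientGain";
            case "info": return "InfoGain";
            default:
                throw new IllegalArgumentException("Unknown gainName: " + gainName);
        }
    }

    public static void main(String[] args) {
        double[] probs = {0.9, 0.3, 0.5, 0.1};
        System.out.println(errorInstances(probs, 0.5));
        System.out.println(gainCriterion("grad"));
    }
}
```

Raising the threshold marks more instances as errors, which gives the inducer more material to work with at the cost of focusing less on the worst mistakes.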
public void setUseHyperbolicPrior(boolean f)
public void setHyperbolicPriorSlope(double p)
public void setHyperbolicPriorSharpness(double p)
public double getUseHyperbolicPriorSlope()
public double getUseHyperbolicPriorSharpness()
public void setGaussianPriorVariance(double p)
public double getGaussianPriorVariance()
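This page does not describe how the Gaussian prior variance enters training. As a sketch under the standard assumption that a zero-mean Gaussian prior with a given variance adds -sum(w_i^2) / (2 * variance) to the log-likelihood, the hypothetical helper below (not part of MALLET) computes that term and its gradient.

```java
// Hypothetical helper, not part of MALLET: standard zero-mean Gaussian prior on weights.
class GaussianPriorSketch {

    // Log-prior term: -sum(w_i^2) / (2 * variance).
    static double logPrior(double[] weights, double variance) {
        if (variance <= 0.0) {
            throw new IllegalArgumentException("variance must be positive");
        }
        double sumSquares = 0.0;
        for (double w : weights) {
            sumSquares += w * w;
        }
        return -sumSquares / (2.0 * variance);
    }

    // Gradient of the log-prior with respect to each weight: -w_i / variance.
    static double[] logPriorGradient(double[] weights, double variance) {
        if (variance <= 0.0) {
            throw new IllegalArgumentException("variance must be positive");
        }
        double[] gradient = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            gradient[i] = -weights[i] / variance;
        }
        return gradient;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0};
        System.out.println(logPrior(weights, 10.0));
    }
}
```

Under this assumption, a smaller variance penalizes large weights more strongly, and a larger variance weakens the regularization.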
public void setUseSparseWeights(boolean b)
public boolean getUseSparseWeights()
public void setUseSomeUnsupportedTrick(boolean b)
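Sets whether to use the 'some unsupported trick.' If a CRF has already been partially trained and sparse weights are used, this trick adds a few weights for features that do not occur in the training data. This generally leads to better accuracy at only a small memory cost.
Parameters:
b - Whether to use the trick.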