cc.mallet.fst.semi_supervised
Class CRFTrainerByEntropyRegularization
java.lang.Object
cc.mallet.fst.TransducerTrainer
cc.mallet.fst.semi_supervised.CRFTrainerByEntropyRegularization
- All Implemented Interfaces:
- TransducerTrainer.ByOptimization
public class CRFTrainerByEntropyRegularization
- extends TransducerTrainer
- implements TransducerTrainer.ByOptimization
A CRF trainer that maximizes the log-likelihood plus
a weighted entropy regularization term on unlabeled
data. Intuitively, it aims to make the CRF's predictions
on unlabeled data more confident.
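As a rough guide to what is being optimized, the objective can be sketched in the form used by Jiao et al. (2006). The notation is an assumption made for illustration: N labeled sequences, M unlabeled sequences, entropy weight \gamma (see setEntropyWeight), and a Gaussian prior with variance \sigma^2 (assumed to correspond to setGaussianPriorVariance). The exact form used by this class may differ in detail.

```latex
\mathcal{O}(\theta) =
  \sum_{i=1}^{N} \log p_\theta\bigl(y^{(i)} \mid x^{(i)}\bigr)
  \;-\; \frac{\lVert \theta \rVert^2}{2\sigma^2}
  \;+\; \gamma \sum_{j=1}^{M} \sum_{y} p_\theta\bigl(y \mid x^{(j)}\bigr)
        \log p_\theta\bigl(y \mid x^{(j)}\bigr)
```

The last term is the negative conditional entropy of the model on each unlabeled sequence, so maximizing it lowers the entropy, which is what makes the predictions more confident.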
References:
Feng Jiao, Shaojun Wang, Chi-Hoon Lee, Russell Greiner, Dale Schuurmans
"Semi-supervised conditional random fields for improved sequence segmentation and labeling"
ACL 2006
Gideon Mann, Andrew McCallum
"Efficient Computation of Entropy Gradient for Semi-Supervised Conditional Random Fields"
HLT/NAACL 2007
- Author:
- Gregory Druck
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CRFTrainerByEntropyRegularization
public CRFTrainerByEntropyRegularization(CRF crf)
setGaussianPriorVariance
public void setGaussianPriorVariance(double variance)
setEntropyWeight
public void setEntropyWeight(double gamma)
- Sets the scaling factor for the entropy regularization term.
In [Jiao et al. 06], this is gamma.
- Parameters:
gamma
- Scaling factor for the entropy regularization term.
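To illustrate what the scaling factor does, here is a minimal, self-contained sketch that does not use MALLET. It treats each prediction as a plain discrete distribution, whereas a CRF computes entropy over whole label sequences, so this is a simplification. The class and method names (EntropyWeightSketch, entropy, entropyTerm) are hypothetical and exist only for this example.

```java
public class EntropyWeightSketch {

    /** Shannon entropy in nats of a discrete distribution; 0 * log 0 is treated as 0. */
    public static double entropy(double[] p) {
        double h = 0.0;
        for (double pi : p) {
            if (pi > 0.0) {
                h -= pi * Math.log(pi);
            }
        }
        return h;
    }

    /** Regularization term: gamma times the summed negative entropy of the predictions. */
    public static double entropyTerm(double gamma, double[][] predictions) {
        double sum = 0.0;
        for (double[] p : predictions) {
            sum -= entropy(p);
        }
        return gamma * sum;
    }

    public static void main(String[] args) {
        // Hypothetical predicted label distributions on two unlabeled items.
        double[][] uncertain = { {0.5, 0.5}, {0.6, 0.4} };
        double[][] confident = { {0.99, 0.01}, {0.95, 0.05} };
        for (double gamma : new double[] {0.1, 1.0}) {
            System.out.printf("gamma=%.1f uncertain=%.4f confident=%.4f%n",
                    gamma, entropyTerm(gamma, uncertain), entropyTerm(gamma, confident));
        }
    }
}
```

For any positive gamma, the confident predictions give a term closer to zero, so a maximizer prefers them. A larger gamma makes that preference count for more relative to the label likelihood.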
getIteration
public int getIteration()
- Specified by:
getIteration
in class TransducerTrainer
getTransducer
public Transducer getTransducer()
- Specified by:
getTransducer
in class TransducerTrainer
isFinishedTraining
public boolean isFinishedTraining()
- Specified by:
isFinishedTraining
in class TransducerTrainer
train
public boolean train(InstanceList trainingSet,
int numIterations)
- Description copied from class:
TransducerTrainer
- Train the transducer associated with this TransducerTrainer.
You should be able to call this method with different trainingSet objects.
Whether this causes the TransducerTrainer to combine both trainingSets or
to view the second as a new alternative is at the discretion of the particular
TransducerTrainer subclass involved.
- Specified by:
train
in class TransducerTrainer
train
public boolean train(InstanceList labeled,
InstanceList unlabeled,
int numIterations)
- Performs CRF training with label likelihood and entropy regularization.
The CRF is first trained with label likelihood only. The resulting
parameters are then used as the starting point for the combined optimization.
- Parameters:
labeled
- Labeled data, only used for label likelihood term.
unlabeled
- Unlabeled data, only used for entropy regularization term.
numIterations
- Number of iterations.
- Returns:
- True if training has converged.
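The two-stage procedure described above can be illustrated with a toy model. The sketch below is not MALLET code and not a CRF. It assumes a one-feature binary logistic model trained by plain gradient ascent, and all names (TwoStageEntropySketch, ascend, meanEntropy) and data values are hypothetical. Stage 1 uses the labeled data only. Stage 2 starts from those weights and adds the entropy term on the unlabeled data.

```java
public class TwoStageEntropySketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Mean binary entropy in nats of the model's predictions on unlabeled inputs. */
    public static double meanEntropy(double[] w, double[] xs) {
        double h = 0.0;
        for (double x : xs) {
            double p = sigmoid(w[0] + w[1] * x);
            if (p > 0.0 && p < 1.0) {
                h -= p * Math.log(p) + (1.0 - p) * Math.log(1.0 - p);
            }
        }
        return h / xs.length;
    }

    /**
     * Gradient ascent, updating w in place, on:
     * log-likelihood(labeled) - |w|^2 / (2 * variance) - gamma * entropy(unlabeled).
     */
    public static void ascend(double[] w, double[] lx, int[] ly, double[] ux,
                              double gamma, double variance, int iterations, double step) {
        for (int it = 0; it < iterations; it++) {
            // Gaussian prior gradient.
            double g0 = -w[0] / variance;
            double g1 = -w[1] / variance;
            // Label likelihood gradient (labeled data only).
            for (int i = 0; i < lx.length; i++) {
                double err = ly[i] - sigmoid(w[0] + w[1] * lx[i]);
                g0 += err;
                g1 += err * lx[i];
            }
            // Negative entropy gradient (unlabeled data only).
            for (double x : ux) {
                double z = w[0] + w[1] * x;
                double p = sigmoid(z);
                double d = gamma * p * (1.0 - p) * z; // d(-H)/dz for a Bernoulli with logit z
                g0 += d;
                g1 += d * x;
            }
            w[0] += step * g0;
            w[1] += step * g1;
        }
    }

    public static void main(String[] args) {
        double[] lx = {-2, -1, 1, 2};
        int[] ly = {0, 0, 1, 1};
        double[] ux = {-3, -2.5, -1.5, 1.5, 2.5, 3};
        double[] w = new double[2];

        // Stage 1: label likelihood only (gamma = 0).
        ascend(w, lx, ly, ux, 0.0, 10.0, 500, 0.05);
        System.out.println("entropy after stage 1: " + meanEntropy(w, ux));

        // Stage 2: combined objective, starting from the stage 1 weights.
        ascend(w, lx, ly, ux, 1.0, 10.0, 500, 0.05);
        System.out.println("entropy after stage 2: " + meanEntropy(w, ux));
    }
}
```

Starting the combined optimization from the supervised solution matters because the entropy term is not concave. A poor starting point can lead the optimizer to a confident but wrong solution.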
getOptimizer
public Optimizer getOptimizer()
- Specified by:
getOptimizer
in interface TransducerTrainer.ByOptimization