cc.mallet.fst.semi_supervised
Class GECriteria

java.lang.Object
  extended by cc.mallet.fst.semi_supervised.GECriteria
Direct Known Subclasses:
GEKLCriteria, GEL2Criteria

public abstract class GECriteria
extends java.lang.Object

Represents GE criteria specified in the form of feature-label associations.

Author:
Gaurav Chandalia, Gregory Druck

Field Summary
protected  java.util.BitSet constraintBits
           
protected  java.util.Map<java.lang.Integer,GECriterion> constraints
           
protected  cc.mallet.fst.semi_supervised.GECriteria.FeatureLabelExpExecutor labelExpExecutor
           
protected  int numStates
           
protected  StateLabelMap stateLabelMap
           
 
Constructor Summary
GECriteria(int numStates, StateLabelMap stateLabelMap, java.util.Map<java.lang.Integer,GECriterion> constraints)
          Initializes the feature-label association constraints.
 
Method Summary
protected  void assertLabelExpNonNull()
           
 void calculateExpectations(InstanceList ilist, Transducer transducer, java.util.Map<java.lang.Integer,SumLattice> lattices)
          Calculates the model expectation of all feature constraints.
 GECriterion getConstraint(int featureIndex)
          Returns the GECriterion object mapped to the feature index.
 java.util.BitSet getConstraintBits()
          Returns one bit per instance; a bit is set if that instance has at least one feature constraint.
 java.util.BitSet getConstraintBitsForInstance(FeatureVectorSequence fvs)
          Returns bits for one instance; each bit corresponds to a feature index and is set if that feature is present in the instance.
 double[] getExpectationForInstance(int featureIndex, FeatureVectorSequence fvs, double[][] gammas)
          Returns the expectation of a feature in one instance.
protected  int getExpectationForInstance(int featureIndex, FeatureVectorSequence fvs, double[][] gammas, double[] expectation)
          Returns the number of times the feature occurred in the sequence (an instance).
 java.util.Iterator<java.lang.Integer> getFeatureIndexIterator()
          Returns an iterator to the indices of the feature constraints.
abstract  double getGEValue()
          Computes sum of GE constraint values.
 StateLabelMap getStateLabelMap()
          Returns the state-label mapping.
 void print(Alphabet targetAlphabet)
          Prints the constraints.
 void setConstraintBits(InstanceList ilist, int start, int end)
          Sets a bit for each instance that has at least one feature constraint (anywhere in the sequence).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

numStates

protected int numStates

stateLabelMap

protected StateLabelMap stateLabelMap

constraints

protected java.util.Map<java.lang.Integer,GECriterion> constraints

constraintBits

protected java.util.BitSet constraintBits

labelExpExecutor

protected transient cc.mallet.fst.semi_supervised.GECriteria.FeatureLabelExpExecutor labelExpExecutor
Constructor Detail

GECriteria

public GECriteria(int numStates,
                  StateLabelMap stateLabelMap,
                  java.util.Map<java.lang.Integer,GECriterion> constraints)
Initializes the feature-label association constraints.

Parameters:
numStates - Number of states in the lattice.
stateLabelMap - Mapping of states to labels (used when a custom FST is used to train a CRF).
constraints - Map from feature index (key) to GECriterion object (value).
Method Detail

getStateLabelMap

public StateLabelMap getStateLabelMap()
Returns the state-label mapping.


getConstraint

public GECriterion getConstraint(int featureIndex)
Returns the GECriterion object mapped to the feature index. Note: The feature index is not validated, so this method can return null.
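
Because the lookup can return null, callers may want to guard the result. The following is a minimal sketch, assuming the lookup behaves like a plain map access; Criterion, ConstraintLookupSketch, and describe are hypothetical stand-ins introduced for illustration and are not part of MALLET.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstraintLookupSketch {
    // Hypothetical stand-in for GECriterion; the real class is not shown here.
    static final class Criterion {
        final String name;
        Criterion(String name) { this.name = name; }
    }

    // Assumption: the lookup is a plain map access with no validation,
    // so a feature index without a constraint yields null.
    static Criterion getConstraint(Map<Integer, Criterion> constraints, int featureIndex) {
        return constraints.get(featureIndex);
    }

    // Caller-side guard: check for null before using the returned object.
    static String describe(Map<Integer, Criterion> constraints, int featureIndex) {
        Criterion c = getConstraint(constraints, featureIndex);
        return (c == null) ? "no constraint" : c.name;
    }

    public static void main(String[] args) {
        Map<Integer, Criterion> constraints = new HashMap<>();
        constraints.put(3, new Criterion("feature-3"));
        System.out.println(describe(constraints, 3)); // constrained feature
        System.out.println(describe(constraints, 7)); // unconstrained feature
    }
}
```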


getFeatureIndexIterator

public java.util.Iterator<java.lang.Integer> getFeatureIndexIterator()
Returns an iterator to the indices of the feature constraints.


getConstraintBits

public java.util.BitSet getConstraintBits()
Returns one bit per instance; a bit is set if that instance has at least one feature constraint.


setConstraintBits

public void setConstraintBits(InstanceList ilist,
                              int start,
                              int end)
Sets a bit for each instance that has at least one feature constraint (anywhere in the sequence). start and end give the range of instance indices that will be used for semi-supervised computations.
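
The per-instance marking described above could look roughly like the sketch below. It is an illustration only: instances are modeled as plain int arrays of feature indices rather than MALLET types, the class and method names are hypothetical, and treating end as exclusive is an assumption the documentation does not confirm.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class InstanceBitsSketch {
    // instances[i][p] lists the feature indices active at position p of instance i
    // (a hypothetical stand-in for an InstanceList of FeatureVectorSequence data).
    static BitSet constraintBits(int[][][] instances, Set<Integer> constrained,
                                 int start, int end) {
        BitSet bits = new BitSet(instances.length);
        // Assumption: start is inclusive and end is exclusive.
        for (int i = start; i < end; i++) {
            if (hasConstrainedFeature(instances[i], constrained)) {
                bits.set(i);
            }
        }
        return bits;
    }

    // True if any position in the sequence contains a constrained feature.
    static boolean hasConstrainedFeature(int[][] sequence, Set<Integer> constrained) {
        for (int[] position : sequence) {
            for (int feature : position) {
                if (constrained.contains(feature)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][][] data = { { {0, 1}, {2} }, { {4} }, { {5}, {3} } };
        Set<Integer> constrained = new HashSet<>(Arrays.asList(2, 3));
        System.out.println(constraintBits(data, constrained, 0, data.length));
    }
}
```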


getConstraintBitsForInstance

public final java.util.BitSet getConstraintBitsForInstance(FeatureVectorSequence fvs)
Returns bits for one instance; each bit corresponds to a feature index and is set if that feature is present in the instance.

Returns:
Constraint bits, size == number of feature constraints

getExpectationForInstance

protected final int getExpectationForInstance(int featureIndex,
                                              FeatureVectorSequence fvs,
                                              double[][] gammas,
                                              double[] expectation)
Returns the number of times the feature occurred in the sequence (an instance).

Also updates the expectation of a feature in one instance.

Parameters:
featureIndex - Feature to look for.
fvs - Observation sequence.
gammas - Log probability of being in state 'i' at input position 'j'.
expectation - Model expectation (filled by this method).
Returns:
Number of times the feature occurred in the input sequence.
Throws:
java.lang.IndexOutOfBoundsException - If an invalid feature index is specified.
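
To make the count-and-accumulate behavior concrete, here is a minimal sketch under stated assumptions: gammas is indexed as gammas[position][state], the observation sequence is modeled as an int array of active feature indices per position, and stateToLabel is a hypothetical stand-in for StateLabelMap. The sum is left unnormalized, and the exception behavior is not modeled. None of these names come from MALLET.

```java
public class ExpectationSketch {
    // featureAt[p]: feature indices active at position p (stand-in for FeatureVectorSequence).
    // gammas[p][s]: log probability of state s at position p (index order is an assumption).
    // stateToLabel[s]: label index for state s (stand-in for StateLabelMap).
    // expectation: per-label accumulator, filled in place.
    static int accumulate(int featureIndex, int[][] featureAt, double[][] gammas,
                          int[] stateToLabel, double[] expectation) {
        int count = 0;
        for (int p = 0; p < featureAt.length; p++) {
            if (!contains(featureAt[p], featureIndex)) {
                continue; // feature absent at this position
            }
            count++;
            for (int s = 0; s < gammas[p].length; s++) {
                // Convert the log probability back to a probability before summing.
                expectation[stateToLabel[s]] += Math.exp(gammas[p][s]);
            }
        }
        return count;
    }

    static boolean contains(int[] features, int featureIndex) {
        for (int f : features) {
            if (f == featureIndex) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] featureAt = { {1}, {2}, {1} };
        double[][] gammas = {
            { Math.log(0.25), Math.log(0.75) },
            { Math.log(0.9), Math.log(0.1) },
            { Math.log(0.5), Math.log(0.5) }
        };
        double[] expectation = new double[2];
        int count = accumulate(1, featureAt, gammas, new int[] {0, 1}, expectation);
        System.out.println(count + " occurrences, expectation = "
                + java.util.Arrays.toString(expectation));
    }
}
```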

getExpectationForInstance

public final double[] getExpectationForInstance(int featureIndex,
                                                FeatureVectorSequence fvs,
                                                double[][] gammas)
Returns the expectation of a feature in one instance.

Note: These expectations are not normalized.


calculateExpectations

public void calculateExpectations(InstanceList ilist,
                                  Transducer transducer,
                                  java.util.Map<java.lang.Integer,SumLattice> lattices)
Calculates the model expectation of all feature constraints.

lattices contains the SumLattice objects of the instances to be used for semi-supervised computations.


getGEValue

public abstract double getGEValue()
Computes sum of GE constraint values.

Note: Label expectations are not recomputed here. To refresh them, call calculateExpectations before calling this method.
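
As an illustration of summing per-constraint values over already-computed expectations, the sketch below uses a squared-difference score. This is one plausible reading of an L2-style criterion and is an assumption; the actual subclass implementations are not shown in this documentation. The class name, the array layout, and the negative sign convention (higher is better) are all hypothetical.

```java
public class GEValueSketch {
    // target[c][l]: target label distribution for constraint c.
    // modelExpectation[c][l]: unnormalized model expectation, computed earlier.
    // counts[c]: number of times the constrained feature occurred.
    static double l2Value(double[][] target, double[][] modelExpectation, int[] counts) {
        double value = 0.0;
        for (int c = 0; c < target.length; c++) {
            if (counts[c] == 0) {
                continue; // feature never observed, nothing to compare
            }
            for (int l = 0; l < target[c].length; l++) {
                // Normalize the cached expectation by the feature count.
                double diff = target[c][l] - modelExpectation[c][l] / counts[c];
                // Assumption: the value is a negated squared distance.
                value -= diff * diff;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        double[][] target = { {0.8, 0.2} };
        double[][] model = { {1.0, 1.0} };
        int[] counts = { 2 };
        System.out.println(l2Value(target, model, counts));
    }
}
```

The expectations are passed in rather than recomputed, mirroring the note above that this method works from cached values.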


assertLabelExpNonNull

protected void assertLabelExpNonNull()

print

public void print(Alphabet targetAlphabet)
Prints the constraints.