cc.mallet.fst.semi_supervised
Class StateLabelMap

java.lang.Object
  extended by cc.mallet.fst.semi_supervised.StateLabelMap

public class StateLabelMap
extends java.lang.Object

Maps states in the lattice to labels.

When a custom state machine is constructed while training a CRF, it is possible that several states map to the same label. In this case, there will be a discrepancy between the number of states used in the lattice and the number of output labels (targets). Use this mapping if such an FST is used in training a CRF model.

If the number of states in the lattice is expected to be equal to the number of output labels, then set isOneToOneMap to true in the constructor.

This map associates each state with the appropriate label. State and label indices are zero-based.

Note: Add the states to the map in the same order in which they are added to the CRF while constructing the FST. This is necessary to keep a correct mapping of the state indices in this map to the state indices used within the CRF.
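
The many-to-one mapping described above can be sketched in plain Java. This is a simplified stand-in written for illustration, not MALLET's implementation: the class name StateLabelMapSketch, its list-based label storage, and the state and label names ("B-PER", "I-PER", "PER", "O") are all assumptions. Only the method names and the documented return conventions are borrowed from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Simplified stand-in for illustration only; this is NOT MALLET's StateLabelMap.
final class StateLabelMapSketch {
    // A label's index is its position in this list (zero-based).
    private final List<String> labels;
    // State name -> state index, in order of addition.
    private final Map<String, Integer> stateIndex = new LinkedHashMap<>();
    // Position = state index, value = label index.
    private final List<Integer> stateToLabel = new ArrayList<>();
    // Label index -> indices of all states mapped to it.
    private final Map<Integer, LinkedHashSet<Integer>> labelToStates = new LinkedHashMap<>();

    StateLabelMapSketch(List<String> labels) {
        this.labels = new ArrayList<>(labels);
    }

    // Assigns the next zero-based index to the state, so indices follow the order of addition.
    int addState(String stateName, String labelName) {
        int label = labels.indexOf(labelName);
        if (label < 0) {
            throw new IllegalArgumentException("Unknown label: " + labelName);
        }
        if (stateIndex.containsKey(stateName)) {
            throw new IllegalArgumentException("Duplicate state: " + stateName);
        }
        int state = stateToLabel.size();
        stateIndex.put(stateName, state);
        stateToLabel.add(label);
        labelToStates.computeIfAbsent(label, k -> new LinkedHashSet<>()).add(state);
        return state;
    }

    // Returns -1 when no label is mapped to the given state index.
    int getLabelIndex(int state) {
        return (state >= 0 && state < stateToLabel.size()) ? stateToLabel.get(state) : -1;
    }

    // Returns null when no state maps to the given label index.
    LinkedHashSet<Integer> getStateIndices(int label) {
        return labelToStates.get(label);
    }

    int getNumStates() { return stateToLabel.size(); }

    int getNumLabels() { return labels.size(); }

    public static void main(String[] args) {
        StateLabelMapSketch map = new StateLabelMapSketch(Arrays.asList("O", "PER"));
        map.addState("O", "O");       // state 0 -> label 0
        map.addState("B-PER", "PER"); // state 1 -> label 1
        map.addState("I-PER", "PER"); // state 2 -> label 1
        System.out.println(map.getStateIndices(1)); // prints [1, 2]
        System.out.println(map.getLabelIndex(2));   // prints 1
    }
}
```

Here three states share two labels, which is the discrepancy this class exists to bridge. Because each state's index is simply its position in the order of addition, adding states in a different order than the CRF would silently misalign the two index spaces.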

Author:
Gaurav Chandalia

Field Summary
static int START_LABEL
           
 
Constructor Summary
StateLabelMap(Alphabet labelAlphabet, boolean isOneToOneMap)
           
StateLabelMap(Alphabet labelAlphabet, boolean isOneToOneMap, int startStateIndex)
          Initializes the state and label maps.
 
Method Summary
 void addStartState(int index)
          If there is a special start state in the CRF that is not included in the label set, then we need to add it here.
 int addState(java.lang.String stateName, java.lang.String labelName)
          Adds a state to the map.
 Alphabet getLabelAlphabet()
          Returns the label (target) alphabet.
 int getLabelIndex(int stateIndex)
          Returns the label index mapped to the state index.
 int getNumLabels()
          Returns the number of labels in the map.
 int getNumStates()
          Returns the number of states in the map.
 Alphabet getStateAlphabet()
          Returns the state alphabet.
 java.util.LinkedHashSet<java.lang.Integer> getStateIndices(int labelIndex)
          Returns the state indices that map to the label index.
 boolean isOneToOneMapping()
          Returns true if there is a one-to-one mapping between the states and labels and false otherwise.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

START_LABEL

public static final int START_LABEL
See Also:
Constant Field Values
Constructor Detail

StateLabelMap

public StateLabelMap(Alphabet labelAlphabet,
                     boolean isOneToOneMap)

StateLabelMap

public StateLabelMap(Alphabet labelAlphabet,
                     boolean isOneToOneMap,
                     int startStateIndex)
Initializes the state and label maps. Note: If a standard FST is used (one built with the methods provided in the CRF class), the state and label alphabets are the same, so there is a one-to-one mapping between the states and labels. Set isOneToOneMap to true in this case. The addState method can then no longer be used.

Parameters:
labelAlphabet - Target alphabet that maps label names to integers.
isOneToOneMap - True if a one-to-one mapping of states and labels is to be created (ignoring the start label).
startStateIndex - Index of the special START state, or -1 if there is none.
Method Detail

addStartState

public void addStartState(int index)
If there is a special start state in the CRF that is not included in the label set, then we need to add it here. Constraints can then check if a state maps to the special START_LABEL, and handle this appropriately.

Parameters:
index - Index of the special start state in the CRF.
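
A constraint that consumes this map might handle the start label roughly as sketched below. The sketch uses a plain int array in place of getLabelIndex lookups, and both the helper countStatesPerLabel and the sentinel value given to START_LABEL are placeholders chosen for illustration. The actual constant is the one listed under Constant Field Values.

```java
import java.util.Arrays;

// Illustration only; does not use MALLET classes.
final class StartLabelSketch {
    // Placeholder sentinel; the real value of StateLabelMap.START_LABEL is not assumed here.
    static final int START_LABEL = Integer.MIN_VALUE;

    // Hypothetical helper: counts how many states map to each label,
    // skipping any state that maps to the special start label.
    static int[] countStatesPerLabel(int[] stateToLabel, int numLabels) {
        int[] counts = new int[numLabels];
        for (int label : stateToLabel) {
            if (label == START_LABEL) {
                continue; // the start state carries no real label
            }
            counts[label]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // State 0 is the special start state; states 1..3 map to labels 0, 1, 1.
        int[] stateToLabel = {START_LABEL, 0, 1, 1};
        System.out.println(Arrays.toString(countStatesPerLabel(stateToLabel, 2))); // prints [1, 2]
    }
}
```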

isOneToOneMapping

public boolean isOneToOneMapping()
Returns true if there is a one-to-one mapping between the states and labels and false otherwise.


getNumLabels

public int getNumLabels()
Returns the number of labels in the map.


getNumStates

public int getNumStates()
Returns the number of states in the map.


getLabelAlphabet

public Alphabet getLabelAlphabet()
Returns the label (target) alphabet.


getStateAlphabet

public Alphabet getStateAlphabet()
Returns the state alphabet.


getLabelIndex

public int getLabelIndex(int stateIndex)
Returns the label index mapped to the state index.

Parameters:
stateIndex - State index.
Returns:
Index of the label that is mapped to the state. Returns -1 if there is no label (index) that maps to the specified state.

getStateIndices

public java.util.LinkedHashSet<java.lang.Integer> getStateIndices(int labelIndex)
Returns the state indices that map to the label index.

Parameters:
labelIndex - Label (target) index.
Returns:
Indices of the states that map to the label. Returns null if there are no states that map to the label.
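
Because the return value may be null, callers need a guard before iterating. The sketch below shows one way to aggregate per-state values into a per-label value. The helper sumOverStates and the sample numbers are hypothetical, and the set passed in merely stands in for a getStateIndices result.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;

// Illustration only; does not use MALLET classes.
final class LabelMarginalSketch {
    // Sums per-state values over the states mapped to one label.
    // `states` plays the role of a getStateIndices(labelIndex) result and may be null.
    static double sumOverStates(LinkedHashSet<Integer> states, double[] stateValues) {
        if (states == null) {
            return 0.0; // no state maps to this label
        }
        double total = 0.0;
        for (int s : states) {
            total += stateValues[s];
        }
        return total;
    }

    public static void main(String[] args) {
        LinkedHashSet<Integer> states = new LinkedHashSet<>(Arrays.asList(1, 2));
        double[] stateValues = {0.5, 0.25, 0.25};
        System.out.println(sumOverStates(states, stateValues)); // prints 0.5
        System.out.println(sumOverStates(null, stateValues));   // prints 0.0
    }
}
```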

addState

public int addState(java.lang.String stateName,
                    java.lang.String labelName)
Adds a state to the map.

Parameters:
stateName - Name of the state.
labelName - Label (target) name with which the state is associated.
Returns:
The index associated with the state that was added.
Throws:
java.lang.IllegalArgumentException - If an invalid label name or a duplicate state name is provided.
java.lang.IllegalStateException - If this method is called when there is a one-to-one mapping between the states and labels.