cc.mallet.fst
Class HMM
java.lang.Object
cc.mallet.fst.Transducer
cc.mallet.fst.HMM
- All Implemented Interfaces:
- java.io.Serializable
public class HMM
- extends Transducer
- implements java.io.Serializable
Hidden Markov Model
- See Also:
- Serialized Form
Method Summary
void addFullyConnectedStates(java.lang.String[] stateNames)
void addFullyConnectedStatesForBiLabels()
void addFullyConnectedStatesForLabels()
void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
void addFullyConnectedStatesForTriLabels()
java.lang.String addOrderNStates(InstanceList trainingSet, int[] orders, boolean[] defaults, java.lang.String start, java.util.regex.Pattern forbidden, java.util.regex.Pattern allowed, boolean fullyConnected)
- Assumes that the HMM's output alphabet contains Strings.
void addSelfTransitioningStateForAllLabels(java.lang.String name)
void addState(java.lang.String name, double initialWeight, double finalWeight, java.lang.String[] destinationNames, java.lang.String[] labelNames)
void addState(java.lang.String name, java.lang.String[] destinationNames)
void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet)
- Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet)
- Add as many states as there are labels, but don't create separate weights for each source-destination pair of states.
void addStatesForLabelsConnectedAsIn(InstanceList trainingSet)
- Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet)
- Add as many states as there are labels, but don't create separate observational-test-weights for each source-destination pair of states. Instead, have all the incoming transitions to a state share the same observational-feature-test weights.
void estimate()
Alphabet getInputAlphabet()
Alphabet getOutputAlphabet()
Transducer.State getState(int index)
HMM.State getState(java.lang.String name)
java.util.Iterator initialStateIterator()
boolean isTrainable()
int numStates()
void print()
void reset()
boolean train(InstanceList ilist)
boolean train(InstanceList ilist, InstanceList validation, InstanceList testing)
boolean train(InstanceList ilist, InstanceList validation, InstanceList testing, TransducerEvaluator eval)
void write(java.io.File f)
Methods inherited from class cc.mallet.fst.Transducer
averageTokenAccuracy, canIterateAllTransitions, generatePath, getInputPipe, getMaxLatticeFactory, getOutputPipe, getSumLatticeFactory, isGenerative, label, less_efficient_sumLogProb, no_longer_needed_sumNegLogProb, setMaxLatticeFactory, setSumLatticeFactory, stateIndexOfString, sumLogProb, transduce, transduce
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HMM
public HMM(Pipe inputPipe,
Pipe outputPipe)
HMM
public HMM(Alphabet inputAlphabet,
Alphabet outputAlphabet)
getInputAlphabet
public Alphabet getInputAlphabet()
getOutputAlphabet
public Alphabet getOutputAlphabet()
print
public void print()
- Overrides:
print in class Transducer
addState
public void addState(java.lang.String name,
double initialWeight,
double finalWeight,
java.lang.String[] destinationNames,
java.lang.String[] labelNames)
addState
public void addState(java.lang.String name,
java.lang.String[] destinationNames)
addFullyConnectedStates
public void addFullyConnectedStates(java.lang.String[] stateNames)
addFullyConnectedStatesForLabels
public void addFullyConnectedStatesForLabels()
addStatesForLabelsConnectedAsIn
public void addStatesForLabelsConnectedAsIn(InstanceList trainingSet)
- Add states to create a first-order Markov model on labels,
adding only those transitions that occur in the given
trainingSet.
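The idea of connecting states "as in" the training data can be illustrated without MALLET. The following is a minimal sketch, not the library's implementation: it assumes label sequences are plain lists of strings, and the class name ObservedTransitions and the method collect are hypothetical. It creates one state per label and records a transition only when one label directly follows another in some sequence.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ObservedTransitions {
    /** Maps each label to the set of labels that directly follow it in the data. */
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> destinations = new TreeMap<>();
        for (List<String> sequence : labelSequences) {
            for (int i = 0; i < sequence.size(); i++) {
                // Every label gets a state, even if nothing ever follows it.
                Set<String> next =
                    destinations.computeIfAbsent(sequence.get(i), k -> new TreeSet<>());
                if (i + 1 < sequence.size()) {
                    next.add(sequence.get(i + 1));
                }
            }
        }
        return destinations;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: two label sequences.
        List<List<String>> data = Arrays.asList(
            Arrays.asList("O", "B", "I", "O"),
            Arrays.asList("B", "I", "I"));
        // prints {B=[I], I=[I, O], O=[B]}
        System.out.println(collect(data));
    }
}
```

In this toy data the transition from B to O never occurs, so a model built this way would have no such transition, whereas a fully connected model would include it.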
addStatesForHalfLabelsConnectedAsIn
public void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet)
- Add as many states as there are labels, but don't create separate weights
for each source-destination pair of states. Instead have all the incoming
transitions to a state share the same weights.
addStatesForThreeQuarterLabelsConnectedAsIn
public void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet)
- Add as many states as there are labels, but don't create
separate observational-test-weights for each source-destination
pair of states. Instead, have all the incoming transitions to a
state share the same observational-feature-test weights.
However, do create a separate default feature for each transition
(which acts as an HMM-style transition probability).
addFullyConnectedStatesForThreeQuarterLabels
public void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
addFullyConnectedStatesForBiLabels
public void addFullyConnectedStatesForBiLabels()
addStatesForBiLabelsConnectedAsIn
public void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet)
- Add states to create a second-order Markov model on labels,
adding only those transitions that occur in the given
trainingSet.
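In a second-order model each state stands for a pair of consecutive labels, so a transition goes from the pair (a,b) to the pair (b,c). The sketch below illustrates that construction under stated assumptions: label sequences are plain lists of strings, state names are formed by joining the two labels with a comma, and start-of-sequence context is ignored. The class name BiLabelTransitions and the naming scheme are hypothetical and are not taken from the MALLET source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class BiLabelTransitions {
    /**
     * Each state is a pair of consecutive labels written "a,b". A transition
     * "a,b" -> "b,c" is recorded only when a, b, c occur consecutively.
     */
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> destinations = new TreeMap<>();
        for (List<String> seq : labelSequences) {
            for (int i = 0; i + 1 < seq.size(); i++) {
                String state = seq.get(i) + "," + seq.get(i + 1);
                // Register the pair state even if no trigram continues it.
                Set<String> next =
                    destinations.computeIfAbsent(state, k -> new TreeSet<>());
                if (i + 2 < seq.size()) {
                    next.add(seq.get(i + 1) + "," + seq.get(i + 2));
                }
            }
        }
        return destinations;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: a single label sequence.
        List<List<String>> data = Arrays.asList(Arrays.asList("O", "B", "I", "O"));
        System.out.println(collect(data));
    }
}
```

Because a transition needs three consecutive labels to be observed, a second-order model built this way is usually sparser than a fully connected one over the same label pairs.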
addFullyConnectedStatesForTriLabels
public void addFullyConnectedStatesForTriLabels()
addSelfTransitioningStateForAllLabels
public void addSelfTransitioningStateForAllLabels(java.lang.String name)
addOrderNStates
public java.lang.String addOrderNStates(InstanceList trainingSet,
int[] orders,
boolean[] defaults,
java.lang.String start,
java.util.regex.Pattern forbidden,
java.util.regex.Pattern allowed,
boolean fullyConnected)
- Assumes that the HMM's output alphabet contains
Strings. Creates an order-n HMM with input
predicates and output labels given by trainingSet
and order, connectivity, and weights given by the remaining
arguments.
- Parameters:
trainingSet - the training instances
orders - an array of increasing non-negative numbers giving
the orders of the features for this HMM. The largest number
n is the Markov order of the HMM. States are
n-tuples of output labels. Each of the other numbers
k in orders represents a weight set shared
by all destination states whose last (most recent) k
labels agree. If orders is null, an
order-0 HMM is built.
defaults - If non-null, it must be the same length as
orders, with true positions indicating
that the weight set for the corresponding order contains only the
weight for a default feature; otherwise, the weight set has
weights for all features built from input predicates.
start - The label that represents the context of the start of
a sequence. It may also be used for sequence labels.
forbidden - If non-null, specifies which pairs of successive
labels are not allowed, both for constructing order-n
states and for transitions. A label pair (u,v)
is not allowed if u + "," + v matches
forbidden.
allowed - If non-null, specifies which pairs of successive
labels are allowed, both for constructing order-n
states and for transitions. A label pair (u,v)
is allowed only if u + "," + v matches
allowed.
fullyConnected - Whether to include all allowed transitions,
even those not occurring in trainingSet.
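The forbidden and allowed parameters both test the string u + "," + v against a regular expression. The following sketch shows one way such a check could be written with java.util.regex alone. It is an illustration, not the MALLET code: the class name LabelPairFilter is hypothetical, "matches" is assumed to mean a match of the entire string (Matcher.matches) rather than a partial match, and checking forbidden before allowed is likewise an assumption.

```java
import java.util.regex.Pattern;

public class LabelPairFilter {
    /**
     * Returns true if the label pair (u,v) passes both filters.
     * A null pattern means that filter is not applied.
     */
    static boolean isAllowed(String u, String v, Pattern forbidden, Pattern allowed) {
        String pair = u + "," + v;
        // Assumption: the whole pair string must match, not just a substring.
        if (forbidden != null && forbidden.matcher(pair).matches()) {
            return false;
        }
        if (allowed != null && !allowed.matcher(pair).matches()) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical BIO-style constraint: an I- label may not follow O.
        Pattern forbidden = Pattern.compile("O,I-.*");
        System.out.println(isAllowed("O", "I-PER", forbidden, null));     // prints false
        System.out.println(isAllowed("B-PER", "I-PER", forbidden, null)); // prints true
    }
}
```

Since the pair is joined with a comma, labels that themselves contain commas would make such patterns ambiguous; the sketch assumes they do not.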
getState
public HMM.State getState(java.lang.String name)
numStates
public int numStates()
- Specified by:
numStates in class Transducer
getState
public Transducer.State getState(int index)
- Specified by:
getState in class Transducer
initialStateIterator
public java.util.Iterator initialStateIterator()
- Specified by:
initialStateIterator in class Transducer
isTrainable
public boolean isTrainable()
reset
public void reset()
estimate
public void estimate()
train
public boolean train(InstanceList ilist)
train
public boolean train(InstanceList ilist,
InstanceList validation,
InstanceList testing)
train
public boolean train(InstanceList ilist,
InstanceList validation,
InstanceList testing,
TransducerEvaluator eval)
write
public void write(java.io.File f)