cc.mallet.grmm.learning
Class GenericAcrfData2TokenSequence

java.lang.Object
  extended by cc.mallet.pipe.Pipe
      extended by cc.mallet.grmm.learning.GenericAcrfData2TokenSequence
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class GenericAcrfData2TokenSequence
extends Pipe

Generic pipe that takes a line group, with one line per sequence position, of the form:

  LABEL1 LABEL2 ... LABELk word feature1 feature2 ... featuren
 
and converts it into an input FeatureVectorSequence and target LabelsSequence.

If the number of labels can vary from one sequence position to the next, use this format instead:

  LABEL1 LABEL2 ... LABELk ---- word feature1 feature2 ... featuren
  
The four dashes ---- must be there to separate the features from the labels. Whitespace is ignored. The difference between this pipe and edu.umass.cs.iesl.casutton.experiments.dcrf.GenericDcrfPipe is that this pipe allows for a different number of labels at each sequence position.
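
To make the two layouts concrete, here is a minimal sketch of how a single line could be split into labels and features. It is not MALLET's implementation: the class AcrfLineSketch and its methods are hypothetical names introduced only for illustration, and the sample labels and features are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of MALLET: it only illustrates the two line layouts.
final class AcrfLineSketch {
    static final String SEPARATOR = "----";

    final List<String> labels;
    final List<String> features;

    AcrfLineSketch(List<String> labels, List<String> features) {
        this.labels = labels;
        this.features = features;
    }

    // Fixed layout: the first numLabels tokens are labels, the rest are the word and features.
    static AcrfLineSketch parseFixed(String line, int numLabels) {
        List<String> tokens = Arrays.asList(line.trim().split("\\s+"));
        if (tokens.size() < numLabels) {
            throw new IllegalArgumentException("expected " + numLabels + " labels: " + line);
        }
        return new AcrfLineSketch(
                new ArrayList<>(tokens.subList(0, numLabels)),
                new ArrayList<>(tokens.subList(numLabels, tokens.size())));
    }

    // Variable layout: tokens before "----" are labels, tokens after it are the word and features.
    static AcrfLineSketch parseWithSeparator(String line) {
        List<String> tokens = Arrays.asList(line.trim().split("\\s+"));
        int idx = tokens.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("missing ---- separator: " + line);
        }
        return new AcrfLineSketch(
                new ArrayList<>(tokens.subList(0, idx)),
                new ArrayList<>(tokens.subList(idx + 1, tokens.size())));
    }

    public static void main(String[] args) {
        AcrfLineSketch fixed = parseFixed("B-NP NN dog WORD=dog", 2);
        System.out.println(fixed.labels + " | " + fixed.features);

        AcrfLineSketch variable = parseWithSeparator("B-NP ---- dog WORD=dog");
        System.out.println(variable.labels + " | " + variable.features);
    }
}
```

In the fixed layout the label count must be known in advance (compare the numLabels constructor argument), whereas the ---- separator lets each line carry its own number of labels.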

Explicitly specifying which word is the token allows the use of the HTML output from the extract package.

Created: Aug 22, 2005

Version:
$Id: GenericAcrfData2TokenSequence.java,v 1.1 2007/10/22 21:37:43 mccallum Exp $
Author:
See Also:
Serialized Form

Constructor Summary
GenericAcrfData2TokenSequence()
           
GenericAcrfData2TokenSequence(int numLabels)
           
 
Method Summary
 boolean getFeaturesIncludeToken()
           
 LabelAlphabet getLabelAlphabet(int lvl)
           
 boolean isLabelsAtEnd()
           
 int numLevels()
           
 Instance pipe(Instance carrier)
          Really this should be 'protected', but isn't for historical reasons.
 void setFeaturesIncludeToken(boolean featuresIncludeToken)
          If true, then the first feature in the list is considered to be the token's text.
 void setIncludeTokenText(boolean includeTokenText)
           
 void setLabelsAtEnd(boolean labelsAtEnd)
           
 void setTextFeaturePrefix(java.lang.String textFeaturePrefix)
           
 
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GenericAcrfData2TokenSequence

public GenericAcrfData2TokenSequence()

GenericAcrfData2TokenSequence

public GenericAcrfData2TokenSequence(int numLabels)

Method Detail

setIncludeTokenText

public void setIncludeTokenText(boolean includeTokenText)

setFeaturesIncludeToken

public void setFeaturesIncludeToken(boolean featuresIncludeToken)
If true, then the first feature in the list is considered to be the token's text. If false, then no feature is designated as the token text.

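Parameters:
featuresIncludeToken - true if the first feature in each line's feature list is the token's text; false if no feature is designated as the token text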
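
The effect of this flag can be sketched as follows. This is an illustration under stated assumptions, not the MALLET source: TokenTextSketch and its methods are hypothetical, and whether the token text is removed from the remaining feature list is an assumption made here for clarity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the featuresIncludeToken flag; NOT the MALLET implementation.
final class TokenTextSketch {

    // Returns the designated token text, or null when no feature is designated.
    static String tokenText(List<String> features, boolean featuresIncludeToken) {
        if (featuresIncludeToken && !features.isEmpty()) {
            return features.get(0);
        }
        return null;
    }

    // Assumption: when the first entry is the token text, it is not also kept as a plain feature.
    static List<String> remainingFeatures(List<String> features, boolean featuresIncludeToken) {
        if (featuresIncludeToken && !features.isEmpty()) {
            return features.subList(1, features.size());
        }
        return features;
    }

    public static void main(String[] args) {
        List<String> features = Arrays.asList("dog", "CAP=false", "SUFFIX=og");
        System.out.println(tokenText(features, true) + " -> " + remainingFeatures(features, true));
        System.out.println(tokenText(features, false) + " -> " + remainingFeatures(features, false));
    }
}
```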

getFeaturesIncludeToken

public boolean getFeaturesIncludeToken()

setTextFeaturePrefix

public void setTextFeaturePrefix(java.lang.String textFeaturePrefix)

getLabelAlphabet

public LabelAlphabet getLabelAlphabet(int lvl)

numLevels

public int numLevels()

pipe

public Instance pipe(Instance carrier)
Description copied from class: Pipe
Really this should be 'protected', but isn't for historical reasons.

Overrides:
pipe in class Pipe

isLabelsAtEnd

public boolean isLabelsAtEnd()

setLabelsAtEnd

public void setLabelsAtEnd(boolean labelsAtEnd)