cc.mallet.pipe
Class SimpleTaggerSentence2StringTokenization

java.lang.Object
  extended by cc.mallet.pipe.Pipe
      extended by cc.mallet.pipe.SimpleTaggerSentence2TokenSequence
          extended by cc.mallet.pipe.SimpleTaggerSentence2StringTokenization
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class SimpleTaggerSentence2StringTokenization
extends SimpleTaggerSentence2TokenSequence

Extends SimpleTaggerSentence2TokenSequence to produce StringTokenization objects for use with the extract package.

See Also:
Serialized Form

Field Summary
 
Fields inherited from class cc.mallet.pipe.SimpleTaggerSentence2TokenSequence
setTokensAsFeatures
 
Constructor Summary
SimpleTaggerSentence2StringTokenization()
          Creates a new SimpleTaggerSentence2StringTokenization instance.
SimpleTaggerSentence2StringTokenization(boolean inc)
          Creates a new SimpleTaggerSentence2StringTokenization instance that includes tokens as features if and only if the supplied argument is true.
 
Method Summary
 Instance pipe(Instance carrier)
          Takes an instance with data of type String or String[][] and creates an Instance whose data is of type StringTokenization.
 
Methods inherited from class cc.mallet.pipe.SimpleTaggerSentence2TokenSequence
makeText, parseSentence
 
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleTaggerSentence2StringTokenization

public SimpleTaggerSentence2StringTokenization()
Creates a new SimpleTaggerSentence2StringTokenization instance. By default, tokens are included as features.


SimpleTaggerSentence2StringTokenization

public SimpleTaggerSentence2StringTokenization(boolean inc)
Creates a new SimpleTaggerSentence2StringTokenization instance that includes tokens as features if and only if the supplied argument is true.

Method Detail

pipe

public Instance pipe(Instance carrier)
Takes an instance with data of type String or String[][] and creates an Instance whose data is of type StringTokenization. Each Token in the sequence gets the text of its line and one feature of value 1 for each "Feature" in the line. For example, if the String[][] is {{a,b},{c,d,e}} (and target processing is off), then the text is "a b" for the first token and "c d e" for the second, and the features "a" and "b" are set for the first token and "c", "d" and "e" for the second. When target processing is on, the last element in the String[] for the current token is taken as the target (label), so in the previous example "b" would be the label of the first token.

Overrides:
pipe in class SimpleTaggerSentence2TokenSequence
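
The row-to-token mapping described for pipe can be illustrated with a small plain-Java sketch. It does not call MALLET and is not the library's implementation: the names TokenRowSketch, Row and describe are hypothetical, and the choice to build the text from the feature strings only (excluding the label when target processing is on) is an assumption made for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the documented row-to-token mapping.
// NOT MALLET code: TokenRowSketch, Row and describe are hypothetical names.
final class TokenRowSketch {

    /** What a single String[] row contributes to its token. */
    static final class Row {
        final String text;            // token text
        final List<String> features;  // each listed feature has value 1
        final String label;           // null when target processing is off

        Row(String text, List<String> features, String label) {
            this.text = text;
            this.features = features;
            this.label = label;
        }
    }

    static Row describe(String[] row, boolean targetProcessing) {
        if (targetProcessing && row.length == 0) {
            throw new IllegalArgumentException("row has no label");
        }
        // With target processing on, the last element is the label
        // and the remaining elements are features.
        int nFeatures = targetProcessing ? row.length - 1 : row.length;
        String[] features = Arrays.copyOfRange(row, 0, nFeatures);
        String label = targetProcessing ? row[nFeatures] : null;
        // Assumption: the text is the feature strings joined by single spaces.
        return new Row(String.join(" ", features), Arrays.asList(features), label);
    }

    public static void main(String[] args) {
        // The example from the method description, target processing off.
        String[][] sentence = {{"a", "b"}, {"c", "d", "e"}};
        for (String[] row : sentence) {
            Row r = describe(row, false);
            System.out.println(r.text + " -> " + r.features);
        }
    }
}
```

Calling describe with the second argument set to true on the row {a,b} yields the label "b" and the single feature "a", matching the labelled case in the description above.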