cc.mallet.pipe.tsf
Class Target2BIOFormat

java.lang.Object
  extended by cc.mallet.pipe.Pipe
      extended by cc.mallet.pipe.tsf.Target2BIOFormat
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class Target2BIOFormat
extends Pipe
implements java.io.Serializable

Creates a LabelSequence from the TokenSequence that is the target of an Instance. A label is constructed from each Token in the TokenSequence so that the result conforms to BIO format (Begin, Inside, Outside of a Segment). "B-" is prepended to the tag of a Token that leaves the background state, and "I-" is prepended to the tag of a Token that has the same tag as the previous Token. NOTE: This class assumes that consecutive identical tags belong to the same Segment, so two adjacent Segments with the same tag cannot be told apart: the output can contain B I I, but never B B I.
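
The tagging rule described above can be illustrated with a minimal sketch in plain Java. It does not call MALLET and is not the library's implementation. The class BioTagSketch, the helper toBio, the background tag "O", and the tags PER and LOC are hypothetical names chosen for illustration. The description does not say what happens when one non-background tag directly follows a different one; the sketch assumes that this starts a new Segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BioTagSketch {

    // Hypothetical helper, not part of MALLET: rewrites raw tags into BIO form.
    static List<String> toBio(List<String> tags, String background) {
        List<String> out = new ArrayList<>();
        String previous = background; // the sequence starts in the background state
        for (String tag : tags) {
            if (tag.equals(background)) {
                out.add(tag);        // background tags are left unchanged
            } else if (tag.equals(previous)) {
                out.add("I-" + tag); // same tag as the previous token: inside the Segment
            } else {
                // Leaving the background state begins a Segment. Assumption: a direct
                // change between two different non-background tags also begins one.
                out.add("B-" + tag);
            }
            previous = tag;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("O", "PER", "PER", "O", "LOC");
        System.out.println(toBio(raw, "O")); // prints [O, B-PER, I-PER, O, B-LOC]
    }
}
```

Because the rule only compares each tag with the one before it, an input such as PER PER PER always yields B-PER I-PER I-PER, which is why two adjacent Segments with the same tag are merged into one.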

See Also:
Serialized Form

Constructor Summary
Target2BIOFormat()
           
Target2BIOFormat(java.lang.String background)
           
 
Method Summary
 Instance pipe(Instance carrier)
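          Converts the target TokenSequence of the given Instance to a LabelSequence in BIO format. (Note inherited from Pipe: really this should be 'protected', but isn't for historical reasons.)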
 
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Target2BIOFormat

public Target2BIOFormat()

Target2BIOFormat

public Target2BIOFormat(java.lang.String background)
Parameters:
background - the tag that marks Tokens that are not part of a target Segment.
Method Detail

pipe

public Instance pipe(Instance carrier)
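Converts the target TokenSequence of carrier to a LabelSequence in BIO format, following the rules given in the class description.
Description copied from class: Pipe
Really this should be 'protected', but isn't for historical reasons.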

Overrides:
pipe in class Pipe