cc.mallet.pipe
Class SelectiveSGML2TokenSequence

java.lang.Object
  extended by cc.mallet.pipe.Pipe
      extended by cc.mallet.pipe.SelectiveSGML2TokenSequence
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class SelectiveSGML2TokenSequence
extends Pipe
implements java.io.Serializable

Similar to SGML2TokenSequence, except that only the tags listed in allowedTags are converted to Labels.
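
The behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use MALLET and is not the library's implementation: the class SelectiveTagSketch, its label method, the tag regex, and the whitespace tokenizer are all hypothetical stand-ins. It also assumes that markup for tags outside the allowed set is simply dropped and leaves the current label unchanged; this page does not say how the real class treats such tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of MALLET.
public class SelectiveTagSketch {
    // Matches <TAG> or </TAG>: group 1 is "/" for a closing tag, group 2 is the tag name.
    private static final Pattern TAG = Pattern.compile("<(/?)([^>\\s]+)[^>]*>");

    /** Returns "token/LABEL" strings; only tags in 'allowed' ever become labels. */
    public static List<String> label(String text, String backgroundTag, Set<String> allowed) {
        List<String> out = new ArrayList<>();
        String current = backgroundTag;   // label applied outside any allowed tag
        Matcher m = TAG.matcher(text);
        int start = 0;
        while (m.find()) {
            // Label the text that precedes this tag with the label in force so far.
            emit(text.substring(start, m.start()), current, out);
            start = m.end();
            if (allowed.contains(m.group(2))) {
                // An allowed opening tag sets the label; its closing tag restores the background.
                current = m.group(1).isEmpty() ? m.group(2) : backgroundTag;
            }
            // Assumption: any other tag is dropped and does not change the label.
        }
        emit(text.substring(start), current, out);
        return out;
    }

    // Splits a chunk on whitespace (a stand-in for a real lexer) and labels each token.
    private static void emit(String chunk, String label, List<String> out) {
        for (String tok : chunk.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                out.add(tok + "/" + label);
            }
        }
    }

    public static void main(String[] args) {
        String text = "<PER>Ada Lovelace</PER> visited <LOC>Turin</LOC> in <B>1840</B>";
        // prints [Ada/PER, Lovelace/PER, visited/O, Turin/O, in/O, 1840/O]
        System.out.println(label(text, "O", Set.of("PER")));
    }
}
```

With only PER in the allowed set, the tokens inside LOC and B receive the background label "O" in this sketch, which is the selective behaviour the description refers to.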

Author:
Aron Culotta culotta@cs.umass.edu
See Also:
Serialized Form

Constructor Summary
SelectiveSGML2TokenSequence(CharSequenceLexer lex, java.util.Set allowed)
SelectiveSGML2TokenSequence(CharSequenceLexer lexer, java.lang.String backgroundTag, java.util.Set allowed)
SelectiveSGML2TokenSequence(java.util.Set allowed)
SelectiveSGML2TokenSequence(java.lang.String regex, java.lang.String backgroundTag, java.util.Set allowed)
 
Method Summary
 Instance pipe(Instance carrier)
          This method should really be 'protected'; it is public only for historical reasons.
 java.lang.String toString()
 
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SelectiveSGML2TokenSequence

public SelectiveSGML2TokenSequence(CharSequenceLexer lexer,
                                   java.lang.String backgroundTag,
                                   java.util.Set allowed)
Parameters:
lexer - to tokenize input
backgroundTag - default tag when not in any other tag
allowed - set of tags (Strings) that will be converted to labels

SelectiveSGML2TokenSequence

public SelectiveSGML2TokenSequence(java.lang.String regex,
                                   java.lang.String backgroundTag,
                                   java.util.Set allowed)

SelectiveSGML2TokenSequence

public SelectiveSGML2TokenSequence(java.util.Set allowed)

SelectiveSGML2TokenSequence

public SelectiveSGML2TokenSequence(CharSequenceLexer lex,
                                   java.util.Set allowed)

Method Detail

pipe

public Instance pipe(Instance carrier)
Description copied from class: Pipe
This method should really be 'protected'; it is public only for historical reasons.

Overrides:
pipe in class Pipe

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object