cc.mallet.pipe
Class TokenSequenceRemoveNonAlpha

java.lang.Object
  extended by cc.mallet.pipe.Pipe
      extended by cc.mallet.pipe.TokenSequenceRemoveNonAlpha
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable

public class TokenSequenceRemoveNonAlpha
extends Pipe

Removes tokens that contain non-alphabetic characters. This class is used in conjunction with CharSequenceLexer.LEX_NON_WHITESPACE_CLASSES and FeatureSequenceWithBigrams, which in turn is used by TopicalNGrams.
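
The filtering rule described above can be illustrated without MALLET on the classpath. The following is a minimal sketch in plain Java, not the library's implementation. The class name NonAlphaFilterSketch, the method removeNonAlpha, and the choice of the regex \p{Alpha}+ as the definition of "alphabetic" are all assumptions made for illustration. The real pipe operates on a TokenSequence carried by an Instance, not on a list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the pipe's filtering step; not part of MALLET.
public class NonAlphaFilterSketch {

    // Assumption: a token counts as alphabetic only if every character
    // matches \p{Alpha} (ASCII letters under java.util.regex defaults).
    private static final Pattern ALPHA_ONLY = Pattern.compile("\\p{Alpha}+");

    // Returns a new list holding only the purely alphabetic tokens,
    // preserving their original order.
    static List<String> removeNonAlpha(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (ALPHA_ONLY.matcher(token).matches()) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("topic", "model2", "n-gram", "42", "MALLET");
        // "model2", "n-gram", and "42" each contain a non-letter and are dropped.
        System.out.println(removeNonAlpha(tokens)); // prints [topic, MALLET]
    }
}
```

Under this assumed definition, a token with any digit or punctuation character is dropped whole; it is not trimmed to its alphabetic part.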

Author:
Andrew McCallum
See Also:
Serialized Form

Constructor Summary
TokenSequenceRemoveNonAlpha()
           
TokenSequenceRemoveNonAlpha(boolean markDeletions)
           
 
Method Summary
 Instance pipe(Instance carrier)
          Really this should be 'protected', but isn't for historical reasons.
 
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenSequenceRemoveNonAlpha

public TokenSequenceRemoveNonAlpha(boolean markDeletions)

TokenSequenceRemoveNonAlpha

public TokenSequenceRemoveNonAlpha()

Method Detail

pipe

public Instance pipe(Instance carrier)
Description copied from class: Pipe
Really this should be 'protected', but isn't for historical reasons.

Overrides:
pipe in class Pipe