cc.mallet.pipe
Class CharSequence2TokenSequence
java.lang.Object
cc.mallet.pipe.Pipe
cc.mallet.pipe.CharSequence2TokenSequence
- All Implemented Interfaces:
- AlphabetCarrying, java.io.Serializable
public class CharSequence2TokenSequence
- extends Pipe
- implements java.io.Serializable
Pipe that tokenizes a character sequence. Expects a CharSequence
in the Instance data, and converts the sequence into a token
sequence using the given regex or CharSequenceLexer.
(The regex / lexer should specify what counts as a token.)
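The note above says the regex describes the tokens themselves, not the separators between them. The sketch below illustrates that convention using only java.util.regex. It is not MALLET's implementation; the class name RegexTokenSketch and the method tokenize are hypothetical names chosen for this example, and it returns plain strings rather than a MALLET TokenSequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of MALLET.
class RegexTokenSketch {

    // Collects every non-overlapping match of tokenPattern, in order.
    // Text that the pattern does not match (spaces, punctuation) is skipped.
    static List<String> tokenize(CharSequence text, Pattern tokenPattern) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = tokenPattern.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A token is a maximal run of Unicode letters.
        Pattern letters = Pattern.compile("\\p{L}+");
        System.out.println(tokenize("Hello, world! It's 2024.", letters));
        // prints [Hello, world, It, s]
    }
}
```

Because the pattern defines what a token is, changing it changes the token inventory: with "\\p{L}+" the digits in the input are dropped and "It's" splits at the apostrophe, whereas a pattern such as "[\\p{L}\\p{N}']+" would keep both intact.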
- See Also:
- Serialized Form
Method Summary
static void main(java.lang.String[] args)
Instance pipe(Instance carrier)
- Really this should be 'protected', but isn't for historical reasons.
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CharSequence2TokenSequence
public CharSequence2TokenSequence(CharSequenceLexer lexer)
CharSequence2TokenSequence
public CharSequence2TokenSequence(java.lang.String regex)
CharSequence2TokenSequence
public CharSequence2TokenSequence(java.util.regex.Pattern regex)
CharSequence2TokenSequence
public CharSequence2TokenSequence()
pipe
public Instance pipe(Instance carrier)
- Description copied from class: Pipe
- Really this should be 'protected', but isn't for historical reasons.
- Overrides: pipe in class Pipe
main
public static void main(java.lang.String[] args)