| Class Summary | |
|---|---|
| AddClassifierTokenPredictions | This pipe uses a Classifier to label each token (i.e., using 0-th order Markov assumption), then adds the predictions as features to each token. | 
| AddClassifierTokenPredictions.TokenClassifiers | This inner class represents the trained token classifiers. | 
| Array2FeatureVector | Converts a Java array of numerical types to a FeatureVector, where the Alphabet is the data array index wrapped in an Integer object. | 
| AugmentableFeatureVectorAddConjunctions | Add specified conjunctions to each instance. | 
| AugmentableFeatureVectorLogScale | Given an AugmentableFeatureVector, set those values greater than or equal to 1 to log(value)+1. | 
| BranchingPipe | Deprecated. | 
| CharSequence2CharNGrams | Transform a character sequence into a token sequence of character N grams. | 
| CharSequence2TokenSequence | Pipe that tokenizes a character sequence. | 
| CharSequenceArray2TokenSequence | Transform an array of character Sequences into a token sequence. | 
| CharSequenceLowercase | Replace the data string with a lowercased version. | 
| CharSequenceRemoveHTML | This pipe removes HTML from a CharSequence. | 
| CharSequenceRemoveUUEncodedBlocks | |
| CharSequenceReplace | Given a string, repeatedly look for matches of the regex, and replace the entire match with the given replacement string. | 
| CharSubsequence | Given a string, return only the portion of the string inside a regex parenthesized group. | 
| Classification2ConfidencePredictingFeatureVector | Pipe features from the underlying classifier to the confidence prediction instance list. | 
| Csv2Array | Converts a string of comma separated values to an array. | 
| Csv2FeatureVector | Converts a string of the form feature_1:val_1 feature_2:val_2 ... | 
| Directory2FileIterator | Convert a File object representing a directory into a FileIterator which iterates over files in the directory matching a pattern and which extracts a label from each file path to become the target field of the instance. | 
| FeatureCountPipe | Pruning low-count features can be a good way to save memory and computation. | 
| FeatureDocFreqPipe | Pruning low-count features can be a good way to save memory and computation. | 
| FeatureSequence2AugmentableFeatureVector | Convert the data field from a feature sequence to an augmentable feature vector. | 
| FeatureSequence2FeatureVector | Convert the data field from a feature sequence to a feature vector. | 
| FeatureSequenceConvolution | |
| FeatureValueString2FeatureVector | |
| FeatureVectorConjunctions | Include in the FeatureVector conjunctions of all its features. | 
| FeatureVectorSequence2FeatureVectors | Given instances with a FeatureVectorSequence in the data field, break up the sequence into the individual FeatureVectors, producing one FeatureVector per Instance. | 
| Filename2CharSequence | Given a filename contained in a string, read in contents of file into a CharSequence. | 
| FilterEmptyFeatureVectors | |
| Input2CharSequence | Pipe that can read from various kinds of text sources (URI, File, or Reader) into a CharSequence. | 
| InstanceListTrimFeaturesByCount | Unimplemented. | 
| LineGroupString2TokenSequence | |
| MakeAmpersandXMLFriendly | Convert `&` to `&amp;` in the tokens of a token sequence. | 
| Noop | A pipe that does nothing to the instance fields but which has side effects on the dictionary. | 
| Pipe | The abstract superclass of all Pipes, which transform one data type to another. | 
| PipeUtils | Created: Aug 28, 2005 | 
| PrintInput | Print the data field of each instance. | 
| PrintInputAndTarget | Print the data and target fields of each instance. | 
| PrintTokenSequenceFeatures | Print properties of the token sequence in the data field and the corresponding value of any token in a token sequence or feature in a feature sequence in the target field. | 
| SaveDataInSource | Set the source field of each instance to its data field. | 
| SelectiveSGML2TokenSequence | Similar to SGML2TokenSequence, except that only the tags listed in allowedTags are converted to Labels. | 
| SerialPipes | Convert an instance through a sequence of pipes. | 
| SGML2TokenSequence | Converts a string containing simple SGML tags into a data TokenSequence of words, paired with a target TokenSequence containing the SGML tags in effect for each word. | 
| SimpleTaggerSentence2StringTokenization | This extends SimpleTaggerSentence2TokenSequence to use StringTokenizations for use with the extract package. | 
| SimpleTaggerSentence2TokenSequence | Converts an external encoding of a sequence of elements with binary features to a TokenSequence. | 
| SimpleTokenizer | A simple unicode tokenizer that accepts sequences of letters as tokens. | 
| SourceLocation2TokenSequence | Read from the File or BufferedReader in the data field and produce a TokenSequence. | 
| StringAddNewLineDelimiter | Pipe that adds special text between lines to explicitly represent line breaks. | 
| StringList2FeatureSequence | Convert a list of strings into a feature sequence. | 
| SvmLight2FeatureVectorAndLabel | This Pipe converts a line in SVMLight format to a Mallet instance with FeatureVector data and Label target. | 
| Target2FeatureSequence | Convert a token sequence in the target field into a feature sequence in the target field. | 
| Target2Label | Convert object in the target field into a label in the target field. | 
| Target2LabelSequence | Convert a token sequence in the target field into a label sequence in the target field. | 
| TargetRememberLastLabel | For each position in the target, remember the last non-background label. | 
| TargetStringToFeatures | |
| Token2FeatureVector | Convert the property list on a token into a feature vector. | 
| TokenSequence2FeatureSequence | Convert the token sequence in the data field of each instance to a feature sequence. | 
| TokenSequence2FeatureSequenceWithBigrams | Convert the token sequence in the data field of each instance to a feature sequence that preserves bigram information. | 
| TokenSequence2FeatureVectorSequence | Convert the token sequence in the data field of each instance to a feature vector sequence. | 
| TokenSequence2TokenInstances | |
| TokenSequenceLowercase | Convert the text in each token in the token sequence in the data field to lower case. | 
| TokenSequenceMatchDataAndTarget | Run a regular expression over the text of each token; replace the text with the substring matching one regex group; create a target TokenSequence from the text matching another regex group. | 
| TokenSequenceNGrams | Convert the token sequence in the data field to a token sequence of ngrams. | 
| TokenSequenceParseFeatureString | Convert the string in each field Token.text to a list of Strings (space delimited). | 
| TokenSequenceRemoveNonAlpha | Remove tokens that contain non-alphabetic characters. | 
| TokenSequenceRemoveStopwords | Remove tokens from the token sequence in the data field whose text is in the stopword list. | 
| Exception Summary | |
|---|---|
| PipeException | |
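
Three of the transformations summarized in the class table are simple enough to sketch in plain Java. The block below is only an illustration of what those one-line descriptions say; it does not use or reproduce the library's classes. The class name `PipeSummarySketch` and its method names are hypothetical, the logarithm is assumed to be the natural log, the n-gram helper assumes no boundary padding, and the regex helper assumes the input is returned unchanged when nothing matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: plain-Java illustrations, NOT the library's code.
final class PipeSummarySketch {

    // AugmentableFeatureVectorLogScale row: values >= 1 become log(value) + 1;
    // smaller values are left unchanged. Natural log is an assumption.
    static double logScale(double value) {
        return value >= 1.0 ? Math.log(value) + 1.0 : value;
    }

    // CharSequence2CharNGrams row: every contiguous substring of length n.
    // Assumption: no padding is added at the start or end of the text.
    static List<String> charNGrams(CharSequence text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.subSequence(i, i + n).toString());
        }
        return grams;
    }

    // CharSubsequence row: keep only the text inside one parenthesized group.
    // Assumption: the input is returned unchanged when the regex does not match.
    static String groupOnly(String text, Pattern regex, int group) {
        Matcher m = regex.matcher(text);
        return m.find() ? m.group(group) : text;
    }

    public static void main(String[] args) {
        System.out.println(logScale(1.0));
        System.out.println(charNGrams("pipe", 2));
        System.out.println(groupOnly("Subject: hello", Pattern.compile("Subject: (.*)"), 1));
    }
}
```

Under these assumptions, a value of exactly 1 maps to itself under log scaling, which keeps the scaled values continuous at the threshold named in the table.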

Classes for processing arbitrary data into instances. Every class in this directory should be a subclass of Pipe. Other classes should go in base.pipe.util.
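
The idea of small transformation classes sharing one superclass and being run in sequence can be sketched without the library. Everything in the block below is a hypothetical stand-in: `SketchPipe`, `LowercaseSketch`, `TokenizeSketch`, `RemoveStopwordsSketch`, and `SerialSketch` loosely mirror the roles that the table gives to Pipe, CharSequenceLowercase, CharSequence2TokenSequence, TokenSequenceRemoveStopwords, and SerialPipes, but they are not those classes and do not follow their signatures. The tokenizer assumes that tokens are runs of letters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a common pipe superclass; NOT the library's API.
abstract class SketchPipe {
    abstract Object apply(Object data);
}

// Lowercases the data string.
final class LowercaseSketch extends SketchPipe {
    @Override Object apply(Object data) {
        return data.toString().toLowerCase(Locale.ROOT);
    }
}

// Splits a string into tokens. Assumption: a token is a run of letters.
final class TokenizeSketch extends SketchPipe {
    @Override Object apply(Object data) {
        List<String> tokens = new ArrayList<>();
        for (String t : data.toString().split("[^\\p{L}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }
}

// Drops tokens whose text is in the stopword set.
final class RemoveStopwordsSketch extends SketchPipe {
    private final Set<String> stopwords;
    RemoveStopwordsSketch(Set<String> stopwords) { this.stopwords = stopwords; }
    @Override Object apply(Object data) {
        List<String> kept = new ArrayList<>();
        for (Object token : (List<?>) data) {
            if (!stopwords.contains(token.toString())) kept.add(token.toString());
        }
        return kept;
    }
}

// Runs the given stages in order, feeding each output to the next stage.
final class SerialSketch extends SketchPipe {
    private final List<SketchPipe> stages;
    SerialSketch(SketchPipe... stages) { this.stages = Arrays.asList(stages); }
    @Override Object apply(Object data) {
        Object current = data;
        for (SketchPipe stage : stages) current = stage.apply(current);
        return current;
    }
}

final class SerialSketchDemo {
    public static void main(String[] args) {
        SketchPipe pipeline = new SerialSketch(
            new LowercaseSketch(),
            new TokenizeSketch(),
            new RemoveStopwordsSketch(Set.of("the", "a")));
        System.out.println(pipeline.apply("The pipe transforms a String"));
    }
}
```

Because the sequential wrapper is itself a subclass of the same superclass, a whole pipeline can be passed anywhere a single stage is expected.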