Class Summary | |
---|---|
AddClassifierTokenPredictions | This pipe uses a Classifier to label each token (i.e., using 0-th order Markov assumption), then adds the predictions as features to each token. |
AddClassifierTokenPredictions.TokenClassifiers | This inner class represents the trained token classifiers. |
Array2FeatureVector | Converts a Java array of numerical types to a FeatureVector, where the Alphabet is the data array index wrapped in an Integer object. |
AugmentableFeatureVectorAddConjunctions | Add specified conjunctions to each instance. |
AugmentableFeatureVectorLogScale | Given an AugmentableFeatureVector, set those values greater than or equal to 1 to log(value)+1. |
BranchingPipe | Deprecated. |
CharSequence2CharNGrams | Transform a character sequence into a token sequence of character N grams. |
CharSequence2TokenSequence | Pipe that tokenizes a character sequence. |
CharSequenceArray2TokenSequence | Transform an array of character Sequences into a token sequence. |
CharSequenceLowercase | Replace the data string with a lowercased version. |
CharSequenceRemoveHTML | This pipe removes HTML from a CharSequence. |
CharSequenceRemoveUUEncodedBlocks | |
CharSequenceReplace | Given a string, repeatedly look for matches of the regex, and replace the entire match with the given replacement string. |
CharSubsequence | Given a string, return only the portion of the string inside a regex parenthesized group. |
Classification2ConfidencePredictingFeatureVector | Pipe features from underlying classifier to the confidence prediction instance list |
Csv2Array | Converts a string of comma separated values to an array. |
Csv2FeatureVector | Converts a string of the form feature_1:val_1 feature_2:val_2 ... |
Directory2FileIterator | Convert a File object representing a directory into a FileIterator which iterates over files in the directory matching a pattern and which extracts a label from each file path to become the target field of the instance. |
FeatureCountPipe | Pruning low-count features can be a good way to save memory and computation. |
FeatureDocFreqPipe | Pruning low-count features can be a good way to save memory and computation. |
FeatureSequence2AugmentableFeatureVector | Convert the data field from a feature sequence to an augmentable feature vector. |
FeatureSequence2FeatureVector | Convert the data field from a feature sequence to a feature vector. |
FeatureSequenceConvolution | |
FeatureValueString2FeatureVector | |
FeatureVectorConjunctions | Include in the FeatureVector conjunctions of all its features. |
FeatureVectorSequence2FeatureVectors | Given instances with a FeatureVectorSequence in the data field, break up the sequence into the individual FeatureVectors, producing one FeatureVector per Instance. |
Filename2CharSequence | Given a filename contained in a string, read in contents of file into a CharSequence. |
FilterEmptyFeatureVectors | |
Input2CharSequence | Pipe that can read from various kinds of text sources (either URI, File, or Reader) into a CharSequence |
InstanceListTrimFeaturesByCount | Unimplemented. |
LineGroupString2TokenSequence | |
MakeAmpersandXMLFriendly | Convert & to &amp; in tokens of a token sequence. |
Noop | A pipe that does nothing to the instance fields but which has side effects on the dictionary. |
Pipe | The abstract superclass of all Pipes, which transform one data type to another. |
PipeUtils | Created: Aug 28, 2005 |
PrintInput | Print the data field of each instance. |
PrintInputAndTarget | Print the data and target fields of each instance. |
PrintTokenSequenceFeatures | Print properties of the token sequence in the data field and the corresponding value of any token in a token sequence or feature in a feature sequence in the target field. |
SaveDataInSource | Set the source field of each instance to its data field. |
SelectiveSGML2TokenSequence | Similar to SGML2TokenSequence, except that only the tags listed in allowedTags are converted to Labels. |
SerialPipes | Convert an instance through a sequence of pipes. |
SGML2TokenSequence | Converts a string containing simple SGML tags into a data TokenSequence of words, paired with a target TokenSequence containing the SGML tags in effect for each word. |
SimpleTaggerSentence2StringTokenization | This extends SimpleTaggerSentence2TokenSequence to use StringTokenization objects for use with the extract package. |
SimpleTaggerSentence2TokenSequence | Converts an external encoding of a sequence of elements with binary features to a TokenSequence. |
SimpleTokenizer | A simple unicode tokenizer that accepts sequences of letters as tokens. |
SourceLocation2TokenSequence | Read from the File or BufferedReader in the data field and produce a TokenSequence. |
StringAddNewLineDelimiter | Pipe that adds special text between lines to explicitly represent line breaks. |
StringList2FeatureSequence | Convert a list of strings into a feature sequence |
SvmLight2FeatureVectorAndLabel | This Pipe converts a line in SVMLight format to a Mallet instance with FeatureVector data and Label target. |
Target2FeatureSequence | Convert a token sequence in the target field into a feature sequence in the target field. |
Target2Label | Convert object in the target field into a label in the target field. |
Target2LabelSequence | Convert a token sequence in the target field into a label sequence in the target field. |
TargetRememberLastLabel | For each position in the target, remember the last non-background label. |
TargetStringToFeatures | |
Token2FeatureVector | Convert the property list on a token into a feature vector. |
TokenSequence2FeatureSequence | Convert the token sequence in the data field of each instance to a feature sequence. |
TokenSequence2FeatureSequenceWithBigrams | Convert the token sequence in the data field of each instance to a feature sequence that preserves bigram information. |
TokenSequence2FeatureVectorSequence | Convert the token sequence in the data field of each instance to a feature vector sequence. |
TokenSequence2TokenInstances | |
TokenSequenceLowercase | Convert the text in each token in the token sequence in the data field to lower case. |
TokenSequenceMatchDataAndTarget | Run a regular expression over the text of each token; replace the text with the substring matching one regex group; create a target TokenSequence from the text matching another regex group. |
TokenSequenceNGrams | Convert the token sequence in the data field to a token sequence of ngrams. |
TokenSequenceParseFeatureString | Convert the string in each field Token.text to a list of Strings (space delimited). |
TokenSequenceRemoveNonAlpha | Remove tokens that contain non-alphabetic characters. |
TokenSequenceRemoveStopwords | Remove tokens from the token sequence in the data field whose text is in the stopword list. |
Exception Summary | |
---|---|
PipeException | |

Classes for processing arbitrary data into instances. Every class in this directory should be a subclass of Pipe; other classes should go in base.pipe.util.
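
The idea behind the classes listed above, in which each pipe transforms one data type into another and several pipes run in sequence, can be sketched with plain JDK types. This is only an illustration of the concept, not this package's API: the class `PipeSketch` and its methods `lowercase`, `tokenize`, `removeStopwords`, and `process` are hypothetical names, the stopword list is made up, and instances are reduced to their data field. The sketch assumes Java 9 or later (for `Set.of`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

public class PipeSketch {

    // Hypothetical analogue of a lowercasing pipe: String -> String.
    static String lowercase(String data) {
        return data.toLowerCase(Locale.ROOT);
    }

    // Hypothetical analogue of a tokenizing pipe: String -> list of tokens.
    // Runs of Unicode letters are kept as tokens; everything else is a separator.
    static List<String> tokenize(String data) {
        List<String> tokens = new ArrayList<>();
        for (String t : data.split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical analogue of a stopword-removal pipe: tokens -> filtered tokens.
    static List<String> removeStopwords(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!stopwords.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Hypothetical analogue of running pipes serially: the output type of
    // each step must match the input type of the next one.
    static List<String> process(String raw) {
        Set<String> stopwords = Set.of("the", "a", "of"); // illustrative list only
        Function<String, String> lower = PipeSketch::lowercase;
        Function<String, List<String>> tokenized = lower.andThen(PipeSketch::tokenize);
        Function<String, List<String>> pipeline =
                tokenized.andThen(tokens -> removeStopwords(tokens, stopwords));
        return pipeline.apply(raw);
    }

    public static void main(String[] args) {
        System.out.println(process("The Pipe of a Tokenizer")); // prints [pipe, tokenizer]
    }
}
```

Order matters in such a chain: lowercasing runs before stopword removal here because the illustrative stopword list is lowercase, so swapping the two steps would let "The" through.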