Package cc.mallet.extract



Interface Summary
ExtractionEvaluator Created: Oct 8, 2004
Extractor Generic interface for objects that do information extraction.
FieldCleaner Interface for functions that are used to clean up field values after extraction has been performed.
FieldComparator Interface for functions that compare extracted values of a field to see if they match.
Span A sub-section of a document, either linear or two-dimensional.
TokenizationFilter Created: Nov 12, 2004
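
The FieldCleaner and FieldComparator roles described above can be illustrated with a minimal, self-contained sketch. It deliberately does not use the package's own types: the Cleaner and ValueComparator interfaces, their method names, and the FieldCleaningSketch class are hypothetical stand-ins invented for this example, and the real interfaces may declare different signatures. The sketch only shows the general idea of cleaning an extracted value with a regex and then comparing it to a reference value, either exactly or ignoring punctuation.

```java
import java.util.regex.Pattern;

public class FieldCleaningSketch {

    /** Hypothetical stand-in for a field cleaner: maps a raw extracted value to a cleaned one. */
    interface Cleaner {
        String clean(String rawValue);
    }

    /** Hypothetical stand-in for a field comparator: decides whether two values match. */
    interface ValueComparator {
        boolean matches(String predicted, String truth);
    }

    /** Builds a cleaner that removes every occurrence of the given regex. */
    static Cleaner regexCleaner(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return raw -> pattern.matcher(raw).replaceAll("");
    }

    /** Matches only when the two strings are identical. */
    static final ValueComparator EXACT = String::equals;

    /** Matches when the two strings are identical after punctuation is stripped from both. */
    static final ValueComparator IGNORE_PUNCTUATION = (a, b) ->
            a.replaceAll("\\p{Punct}", "").equals(b.replaceAll("\\p{Punct}", ""));

    public static void main(String[] args) {
        // Strip trailing commas, periods, and semicolons from an extracted value.
        Cleaner stripTrailing = regexCleaner("[,.;]+$");
        String cleaned = stripTrailing.clean("Amherst, MA.");

        System.out.println(cleaned);
        System.out.println(EXACT.matches(cleaned, "Amherst MA"));
        System.out.println(IGNORE_PUNCTUATION.matches(cleaned, "Amherst MA"));
    }
}
```

Keeping cleaning and comparison as separate small functions means an evaluator can mix them freely, for example cleaning once and then scoring the same value under several matching policies.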

Class Summary
AccuracyCoverageEvaluator Constructs an accuracy-coverage graph, using confidence values to sort Fields.
BIOTokenizationFilter Created: Nov 12, 2004
ConfidenceTokenizationFilter Created: Oct 26, 2005
CRFExtractor Created: Oct 12, 2004
DefaultTokenizationFilter Created: Nov 12, 2004
DocumentExtraction Created: Oct 12, 2004
DocumentViewer Diagnostic class that outputs HTML pages that allow you to view errors on a more global per-instance basis.
ExactMatchComparator Created: Nov 23, 2004
Extraction The results of doing information extraction.
ExtractionConfidenceEstimator Estimates the confidence in the labeling of a LabeledSpan.
Field Created: Oct 12, 2004
HierarchicalTokenizationFilter Tokenization filter that will create nested spans based on a hierarchical labeling of the data.
LabeledSpan Created: Oct 12, 2004
LabeledSpans Created: Oct 31, 2004
LatticeViewer Created: Oct 31, 2004
PerDocumentF1Evaluator Created: Oct 8, 2004
PerFieldF1Evaluator Created: Oct 8, 2004
PunctuationIgnoringComparator Created: Nov 23, 2004
Record Created: Oct 12, 2004
RegexFieldCleaner A field cleaner that removes all occurrences of a given regex.
StringSpan A sub-section of a linear string.
TransducerExtractionConfidenceEstimator Estimates the confidence in the labeling of a LabeledSpan using a TransducerConfidenceEstimator.
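
As an illustration of what an accuracy-coverage graph built from confidence-sorted fields can look like, here is a minimal sketch. It is not the AccuracyCoverageEvaluator implementation and uses none of the package's classes: ScoredField, curve, and AccuracyCoverageSketch are hypothetical names, and the exact definition the evaluator uses may differ. The assumption here is that fields are sorted by descending confidence and that, for each prefix of that ordering, coverage is the fraction of fields kept and accuracy is the fraction of kept fields that are correct.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AccuracyCoverageSketch {

    /** Hypothetical record of one extracted field: its confidence and whether it was correct. */
    static final class ScoredField {
        final double confidence;
        final boolean correct;

        ScoredField(double confidence, boolean correct) {
            this.confidence = confidence;
            this.correct = correct;
        }
    }

    /** Returns one {coverage, accuracy} pair per prefix of the confidence-sorted fields. */
    static List<double[]> curve(List<ScoredField> fields) {
        // Sort a copy so the caller's list is left untouched; highest confidence first.
        List<ScoredField> sorted = new ArrayList<>(fields);
        sorted.sort(Comparator.comparingDouble((ScoredField f) -> f.confidence).reversed());

        List<double[]> points = new ArrayList<>();
        int numCorrect = 0;
        for (int i = 0; i < sorted.size(); i++) {
            if (sorted.get(i).correct) {
                numCorrect++;
            }
            double coverage = (i + 1) / (double) sorted.size(); // fraction of fields kept
            double accuracy = numCorrect / (double) (i + 1);    // fraction of kept fields correct
            points.add(new double[] {coverage, accuracy});
        }
        return points;
    }

    public static void main(String[] args) {
        List<ScoredField> fields = new ArrayList<>();
        fields.add(new ScoredField(0.9, true));
        fields.add(new ScoredField(0.4, false));
        fields.add(new ScoredField(0.8, true));
        fields.add(new ScoredField(0.6, false));

        for (double[] point : curve(fields)) {
            System.out.println("coverage=" + point[0] + " accuracy=" + point[1]);
        }
    }
}
```

If the confidence estimates are informative, accuracy should be highest at low coverage and fall as lower-confidence fields are admitted, which is what makes the curve useful for choosing a confidence threshold.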

Package cc.mallet.extract Description

Unimplemented. This code is mostly notes on how things might be implemented in the future.