public interface Extractor
Generic interface for objects that perform information extraction. Typically, this means extracting database records (see Record) from Strings, but the interface is not specific to that case.
Method Summary | |
---|---|
Extraction |
extract(java.util.Iterator<Instance> source)
Performs extraction on a set of raw documents. |
Extraction |
extract(java.lang.Object o)
Performs extraction given a raw object. |
Extraction |
extract(Tokenization toks)
Performs extraction from an object that has already been tokenized. |
Pipe |
getFeaturePipe()
Returns the feature pipe used by this extractor. |
Alphabet |
getInputAlphabet()
Returns an alphabet of the features used by the extractor. |
LabelAlphabet |
getTargetAlphabet()
Returns an alphabet of the labels used by the extractor. |
Pipe |
getTokenizationPipe()
Returns the pipe used by this extractor to tokenize the input. |
void |
setTokenizationPipe(Pipe pipe)
Sets the pipe used by this extractor for tokenization. |
Method Detail |
---|
Extraction extract(java.lang.Object o)
o - The document to extract from (often a String).
Extraction extract(Tokenization toks)
toks - A tokenized document.
Extraction extract(java.util.Iterator<Instance> source)
source - A source of raw documents.
Pipe getFeaturePipe()
Pipe getTokenizationPipe()
void setTokenizationPipe(Pipe pipe)
edu.umass.cs.mallet.base.pipe.CharSequence2TokenSequence is an example of a pipe that could be used here.
Alphabet getInputAlphabet()
LabelAlphabet getTargetAlphabet()
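The following is a minimal, self-contained sketch of how the three extract overloads and the tokenization pipe accessors might relate to one another. It does not use MALLET: TokenizerPipe, ToyExtraction, and CapitalizedWordExtractor are hypothetical stand-ins invented for illustration, and the real Pipe, Tokenization, Instance, and Extraction classes have different, richer APIs. The delegation from the raw-object overload to the tokenized overload is an assumption about a typical implementation, not something this interface requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ExtractorSketch {

    /** Hypothetical stand-in for a tokenization Pipe: raw text in, tokens out. */
    public interface TokenizerPipe extends Function<String, List<String>> {}

    /** Hypothetical stand-in for Extraction: one list of labels per document. */
    public static final class ToyExtraction {
        public final List<List<String>> labelsPerDocument = new ArrayList<>();
    }

    /** Toy extractor that labels capitalized tokens "CAP" and all others "O". */
    public static final class CapitalizedWordExtractor {
        // Default pipe (an assumption of this sketch) splits on whitespace.
        private TokenizerPipe tokenizationPipe =
                text -> Arrays.asList(text.trim().split("\\s+"));

        public TokenizerPipe getTokenizationPipe() {
            return tokenizationPipe;
        }

        public void setTokenizationPipe(TokenizerPipe pipe) {
            this.tokenizationPipe = pipe;
        }

        /** Analogue of extract(Object o): tokenize the raw object, then delegate. */
        public ToyExtraction extract(Object o) {
            return extract(tokenizationPipe.apply(o.toString()));
        }

        /** Analogue of extract(Tokenization toks): the input is already tokenized. */
        public ToyExtraction extract(List<String> toks) {
            ToyExtraction result = new ToyExtraction();
            result.labelsPerDocument.add(label(toks));
            return result;
        }

        /** Analogue of extract(Iterator<Instance> source): one entry per raw document. */
        public ToyExtraction extract(Iterator<String> source) {
            ToyExtraction result = new ToyExtraction();
            while (source.hasNext()) {
                List<String> toks = tokenizationPipe.apply(source.next());
                result.labelsPerDocument.add(label(toks));
            }
            return result;
        }

        // Assigns "CAP" to tokens starting with an uppercase letter, "O" otherwise.
        private static List<String> label(List<String> toks) {
            List<String> labels = new ArrayList<>();
            for (String tok : toks) {
                boolean capitalized = !tok.isEmpty() && Character.isUpperCase(tok.charAt(0));
                labels.add(capitalized ? "CAP" : "O");
            }
            return labels;
        }
    }

    public static void main(String[] args) {
        CapitalizedWordExtractor extractor = new CapitalizedWordExtractor();
        // prints [[CAP, O, CAP, O, CAP]]
        System.out.println(extractor.extract("Alice met Bob in Paris").labelsPerDocument);
    }
}
```

In this sketch, replacing the pipe through setTokenizationPipe changes how raw input is split without touching the labeling logic, which is the separation the tokenization pipe accessors suggest.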