cc.mallet.extract
Class Extraction

java.lang.Object
  extended by cc.mallet.extract.Extraction

public class Extraction
extends java.lang.Object

The results of information extraction. This class is designed to handle field extraction from a single document, as well as relation extraction and coreference across multiple documents.


Constructor Summary
Extraction(Extractor extractor, LabelAlphabet dict)
          Creates an empty Extraction object.
Extraction(Extractor extractor, LabelAlphabet dict, java.lang.String name, Tokenization input, Sequence output, java.lang.String background)
          Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF.
 
Method Summary
 void addDocumentExtraction(DocumentExtraction docseq)
           
 void cleanFields(FieldCleaner cleaner)
           
 DocumentExtraction getDocumentExtraction(int idx)
           
 Extractor getExtractor()
           
 LabelAlphabet getLabelAlphabet()
           
 int getNumDocuments()
           
 int getNumRecords()
           
 Record getRecord(int idx)
           
 Record getTargetRecord(int docnum)
           
 void print(java.io.PrintWriter writer)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Extraction

public Extraction(Extractor extractor,
                  LabelAlphabet dict)
Creates an empty Extraction object. DocumentExtractions can be added later with the addDocumentExtraction method.
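
The create-empty-then-add pattern described above can be sketched as follows. This is a minimal illustration, not MALLET code: the nested DocumentExtraction and Extraction classes are hypothetical stand-ins so that the sketch compiles without the MALLET jar. They mirror only the documented signatures addDocumentExtraction, getNumDocuments, and getDocumentExtraction. The stand-in constructor omits the Extractor and LabelAlphabet arguments, and storing documents in insertion order is an assumption about behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractionSketch {

    // Hypothetical stand-in for cc.mallet.extract.DocumentExtraction.
    // The real class carries the tokenized input and extracted fields.
    static class DocumentExtraction {
        final String name;

        DocumentExtraction(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in mirroring three documented Extraction signatures.
    // Keeping documents in a list, in insertion order, is an assumption.
    static class Extraction {
        private final List<DocumentExtraction> docs = new ArrayList<>();

        void addDocumentExtraction(DocumentExtraction docseq) {
            docs.add(docseq);
        }

        int getNumDocuments() {
            return docs.size();
        }

        DocumentExtraction getDocumentExtraction(int idx) {
            return docs.get(idx);
        }
    }

    // Start with an empty Extraction, then add one DocumentExtraction
    // per processed document.
    static Extraction collect(List<String> documentNames) {
        Extraction extraction = new Extraction();
        for (String name : documentNames) {
            extraction.addDocumentExtraction(new DocumentExtraction(name));
        }
        return extraction;
    }

    public static void main(String[] args) {
        Extraction extraction = collect(Arrays.asList("doc-a", "doc-b"));
        // Walk the collected documents by index.
        for (int i = 0; i < extraction.getNumDocuments(); i++) {
            System.out.println(i + ": " + extraction.getDocumentExtraction(i).name);
        }
    }
}
```

With the real library, the same loop would call new Extraction(extractor, dict) and pass DocumentExtraction instances obtained elsewhere; how those instances are built is outside what this page documents.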


Extraction

public Extraction(Extractor extractor,
                  LabelAlphabet dict,
                  java.lang.String name,
                  Tokenization input,
                  Sequence output,
                  java.lang.String background)
Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF. The resulting extraction contains a single document.

Method Detail

addDocumentExtraction

public void addDocumentExtraction(DocumentExtraction docseq)

getRecord

public Record getRecord(int idx)

getNumRecords

public int getNumRecords()

getDocumentExtraction

public DocumentExtraction getDocumentExtraction(int idx)

getNumDocuments

public int getNumDocuments()

getExtractor

public Extractor getExtractor()

getTargetRecord

public Record getTargetRecord(int docnum)

getLabelAlphabet

public LabelAlphabet getLabelAlphabet()

cleanFields

public void cleanFields(FieldCleaner cleaner)

print

public void print(java.io.PrintWriter writer)