cc.mallet.extract
Interface Tokenization

All Superinterfaces:
Sequence
All Known Implementing Classes:
StringTokenization

public interface Tokenization
extends Sequence


Method Summary
 java.lang.Object getDocument()
          Returns the document of which this is a tokenization.
 Span getSpan(int i)
          Returns the span of the token at index i.
 Span subspan(int start, int end)
          Returns a span formed by concatenating the spans of the tokens from start (inclusive) to end (exclusive).
 
Methods inherited from interface cc.mallet.types.Sequence
get, size
 

Method Detail

getDocument

java.lang.Object getDocument()
Returns the document of which this is a tokenization.


getSpan

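Span getSpan(int i)
Returns the span of the token at index i. This is an index of a token, *not* an index into the document.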

subspan

Span subspan(int start,
             int end)
Returns a span formed by concatenating the spans of the tokens from start (inclusive) to end (exclusive).

Parameters:
start - The index of the first token in the new span (inclusive). This is an index of a token, *not* an index into the document.
end - The index one past the last token in the new span (exclusive). This is an index of a token, *not* an index into the document.
Returns:
A span into this tokenization's document
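
Example (illustrative sketch only):
The token-index semantics of subspan can be sketched without the library. The classes below (SubspanSketch, CharSpan, SimpleTokenization) are hypothetical stand-ins and are not part of MALLET. The sketch assumes that a span is a pair of character offsets into the document, that tokens are split on whitespace, and that the combined span runs from the start of token `start` to the end of token `end - 1`. A real implementation may handle boundaries and the empty case differently.

```java
import java.util.ArrayList;
import java.util.List;

public class SubspanSketch {

    /** Hypothetical stand-in for a span: character offsets [start, end) into a document. */
    static final class CharSpan {
        final String document;
        final int start;
        final int end;

        CharSpan(String document, int start, int end) {
            this.document = document;
            this.start = start;
            this.end = end;
        }

        String text() {
            return document.substring(start, end);
        }
    }

    /** Hypothetical whitespace tokenization that records one span per token. */
    static final class SimpleTokenization {
        private final String document;
        private final List<CharSpan> spans = new ArrayList<>();

        SimpleTokenization(String document) {
            this.document = document;
            int i = 0;
            int n = document.length();
            while (i < n) {
                // Skip whitespace, then consume one token.
                while (i < n && Character.isWhitespace(document.charAt(i))) i++;
                int tokenStart = i;
                while (i < n && !Character.isWhitespace(document.charAt(i))) i++;
                if (i > tokenStart) spans.add(new CharSpan(document, tokenStart, i));
            }
        }

        String getDocument() { return document; }

        int size() { return spans.size(); }

        /** i is a token index, not a character offset. */
        CharSpan getSpan(int i) { return spans.get(i); }

        /** start and end are token indices: start inclusive, end exclusive. */
        CharSpan subspan(int start, int end) {
            if (start < 0 || end > spans.size() || start > end) {
                throw new IndexOutOfBoundsException("bad token range " + start + ".." + end);
            }
            if (start == end) {
                // Assumption: an empty token range yields an empty span.
                int at = start < spans.size() ? spans.get(start).start : document.length();
                return new CharSpan(document, at, at);
            }
            return new CharSpan(document, spans.get(start).start, spans.get(end - 1).end);
        }
    }

    public static void main(String[] args) {
        SimpleTokenization tokens = new SimpleTokenization("The quick brown fox");
        CharSpan span = tokens.subspan(1, 3);   // tokens 1 and 2; token 3 is excluded
        System.out.println(span.text());        // prints "quick brown"
        System.out.println(span.start + ".." + span.end); // prints "4..15"
    }
}
```

Note that the arguments 1 and 3 select tokens, while the resulting offsets 4 and 15 index characters in the document. That is the distinction the parameter descriptions draw.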