java.lang.Object
java.util.AbstractCollection<E>
java.util.AbstractList<E>
java.util.ArrayList<Token>
cc.mallet.types.TokenSequence
cc.mallet.extract.StringTokenization
public class StringTokenization
extends TokenSequence
implements Tokenization
| Field Summary |
|---|
| Fields inherited from class java.util.AbstractList |
|---|
| modCount |
| Constructor Summary | |
|---|---|
| StringTokenization(java.lang.CharSequence seq) | Creates an empty StringTokenization. |
| StringTokenization(java.lang.CharSequence string, CharSequenceLexer lexer) | Creates a tokenization of the given string. |
| Method Summary | | |
|---|---|---|
| java.lang.Object | getDocument() | Returns the document of which this is a tokenization. |
| Span | getSpan(int i) | |
| Span | subspan(int firstToken, int lastToken) | Returns a span formed by concatenating the spans from firstToken (inclusive) to lastToken (exclusive). |
| Methods inherited from class cc.mallet.types.TokenSequence |
|---|
| add, addAll, getNumericProperty, getProperties, getProperty, hasProperty, removeLast, setNumericProperty, setProperty, toFeatureSequence, toFeatureVector, toString, toStringShort |
| Methods inherited from class java.util.ArrayList |
|---|
| add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, remove, removeRange, set, size, toArray, toArray, trimToSize |
| Methods inherited from class java.util.AbstractList |
|---|
| equals, hashCode, iterator, listIterator, listIterator, subList |
| Methods inherited from class java.util.AbstractCollection |
|---|
| containsAll, removeAll, retainAll |
| Methods inherited from class java.lang.Object |
|---|
| finalize, getClass, notify, notifyAll, wait, wait, wait |
| Methods inherited from interface cc.mallet.types.Sequence |
|---|
| get, size |
| Methods inherited from interface java.util.List |
|---|
| containsAll, equals, hashCode, iterator, listIterator, listIterator, removeAll, retainAll, subList |
| Constructor Detail |
|---|
public StringTokenization(java.lang.CharSequence seq)
Creates an empty StringTokenization.

public StringTokenization(java.lang.CharSequence string,
CharSequenceLexer lexer)
Creates a tokenization of the given string.
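
The two-argument constructor pairs a document string with a lexer that decides where each token begins and ends. The sketch below is not MALLET code and does not use `CharSequenceLexer`. It relies only on `java.util.regex` and a hypothetical `TokenSpan` class to illustrate the general idea of recording, for every token, its character offsets into the original document. The `\w+` pattern and all names in the block are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenOffsetsSketch {

    /** Hypothetical stand-in for a token span: the token text plus [start, end) offsets. */
    public static final class TokenSpan {
        public final String text;
        public final int start; // index of the first character (inclusive)
        public final int end;   // index one past the last character (exclusive)

        public TokenSpan(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Scans the document with a regex and records each match with its offsets. */
    public static List<TokenSpan> tokenize(CharSequence document, Pattern tokenPattern) {
        List<TokenSpan> spans = new ArrayList<>();
        Matcher m = tokenPattern.matcher(document);
        while (m.find()) {
            spans.add(new TokenSpan(m.group(), m.start(), m.end()));
        }
        return spans;
    }

    public static void main(String[] args) {
        // Assumed token rule for this sketch: runs of word characters.
        List<TokenSpan> spans = tokenize("The quick fox", Pattern.compile("\\w+"));
        for (TokenSpan s : spans) {
            System.out.println(s.text + " [" + s.start + ", " + s.end + ")");
        }
    }
}
```

Keeping offsets alongside the token text is what lets a tokenization map a token index back to a region of the document, which the span-returning methods depend on.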
| Method Detail |
|---|
public Span subspan(int firstToken,
int lastToken)
Description copied from interface: Tokenization
Returns a span formed by concatenating the spans from firstToken (inclusive) to lastToken (exclusive).
Specified by: subspan in interface Tokenization
Parameters:
firstToken - The index of the first token in the new span (inclusive).
This is an index of a token, *not* an index into the document.
lastToken - The index one past the last token in the new span (exclusive).
This is an index of a token, *not* an index into the document.
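
Because both parameters are token indices and the upper bound is exclusive, it can help to see how a token range might translate into a character range. The following is a minimal sketch under stated assumptions, not the MALLET implementation: token offsets are held in two hypothetical `int` arrays, the resulting range is assumed to end where the last included token ends, and an empty or reversed token range is rejected.

```java
public class SubspanSketch {

    /**
     * Returns the [start, end) character range covering tokens
     * firstToken (inclusive) through lastToken (exclusive).
     * starts[i] and ends[i] are the character offsets of token i.
     * Hypothetical helper for illustration only.
     */
    public static int[] subspan(int[] starts, int[] ends, int firstToken, int lastToken) {
        if (firstToken < 0 || lastToken > starts.length) {
            throw new IndexOutOfBoundsException("token index out of range");
        }
        if (firstToken >= lastToken) {
            // Assumption of this sketch: at least one token must be selected.
            throw new IllegalArgumentException("firstToken must be less than lastToken");
        }
        // Start of the first included token, end of the last included token.
        return new int[] { starts[firstToken], ends[lastToken - 1] };
    }

    public static void main(String[] args) {
        String document = "The quick brown fox";
        int[] starts = { 0, 4, 10, 16 };
        int[] ends   = { 3, 9, 15, 19 };

        // Tokens 1 and 2 are selected; token 3 is excluded.
        int[] range = subspan(starts, ends, 1, 3);
        System.out.println(document.substring(range[0], range[1])); // prints "quick brown"
    }
}
```

Note that the arguments `1, 3` select two tokens, not three, and that neither value is a character position in the document.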
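
public Span getSpan(int i)
Specified by: getSpan in interface Tokenization

public java.lang.Object getDocument()
Description copied from interface: Tokenization
Returns the document of which this is a tokenization.
Specified by: getDocument in interface Tokenization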