java.lang.Object
  cc.mallet.pipe.Pipe
    cc.mallet.pipe.TokenSequenceRemoveStopwords
public class TokenSequenceRemoveStopwords extends Pipe
Remove tokens from the token sequence in the data field whose text is in the stopword list.
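The filtering described above can be sketched in plain Java, so that it runs without MALLET on the classpath. `StopwordFilterSketch` and its `filter` method are hypothetical stand-ins, not MALLET code. The sketch assumes that when `caseSensitive` is false, each token is lowercased before the stoplist lookup, and it does not model `markDeletions`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration only; not the MALLET implementation.
final class StopwordFilterSketch {
    private final Set<String> stoplist = new HashSet<>();
    private final boolean caseSensitive;

    StopwordFilterSketch(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Returns this so that calls can be chained, mirroring the return types
    // listed in the method summary of the documented class.
    StopwordFilterSketch addStopWords(String[] words) {
        stoplist.addAll(Arrays.asList(words));
        return this;
    }

    StopwordFilterSketch removeStopWords(String[] words) {
        stoplist.removeAll(Arrays.asList(words));
        return this;
    }

    // Keeps every token whose text is not in the stoplist.
    // Assumption: a case-insensitive lookup lowercases the token only, so
    // stop words are expected to be stored in lowercase in that mode.
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String key = caseSensitive ? token : token.toLowerCase(Locale.ROOT);
            if (!stoplist.contains(key)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With the stoplist `{"the", "of"}`, filtering the tokens `The end of time` yields `end time` in the case-insensitive sketch and `The end time` in the case-sensitive one.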
| Constructor Summary | |
|---|---|
| TokenSequenceRemoveStopwords() | |
| TokenSequenceRemoveStopwords(boolean caseSensitive) | |
| TokenSequenceRemoveStopwords(boolean caseSensitive, boolean markDeletions) | |
| TokenSequenceRemoveStopwords(java.io.File stoplistFile, java.lang.String encoding, boolean includeDefault, boolean caseSensitive, boolean markDeletions) | Load a stoplist from a file. |
| Method Summary | |
|---|---|
| TokenSequenceRemoveStopwords | addStopWords(java.io.File wordlist): Add whitespace-separated tokens in file "wordlist" to the stoplist. |
| TokenSequenceRemoveStopwords | addStopWords(java.lang.String[] words) |
| Instance | pipe(Instance carrier): Really this should be 'protected', but isn't for historical reasons. |
| TokenSequenceRemoveStopwords | removeStopWords(java.io.File wordlist): Remove whitespace-separated tokens in file "wordlist" from the stoplist. |
| TokenSequenceRemoveStopwords | removeStopWords(java.lang.String[] words) |
| TokenSequenceRemoveStopwords | setCaseSensitive(boolean flag) |
| TokenSequenceRemoveStopwords | setMarkDeletions(boolean flag) |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public TokenSequenceRemoveStopwords(boolean caseSensitive,
boolean markDeletions)
public TokenSequenceRemoveStopwords(boolean caseSensitive)
public TokenSequenceRemoveStopwords()
public TokenSequenceRemoveStopwords(java.io.File stoplistFile,
java.lang.String encoding,
boolean includeDefault,
boolean caseSensitive,
boolean markDeletions)
Parameters:
stoplistFile - The file to load
encoding - The encoding of the stoplist file (e.g. UTF-8)
includeDefault - Whether to include the standard MALLET English stoplist

| Method Detail |
|---|
public TokenSequenceRemoveStopwords setCaseSensitive(boolean flag)
public TokenSequenceRemoveStopwords setMarkDeletions(boolean flag)
public TokenSequenceRemoveStopwords addStopWords(java.lang.String[] words)
public TokenSequenceRemoveStopwords removeStopWords(java.lang.String[] words)
public TokenSequenceRemoveStopwords removeStopWords(java.io.File wordlist)
public TokenSequenceRemoveStopwords addStopWords(java.io.File wordlist)
public Instance pipe(Instance carrier)
Overrides:
pipe in class Pipe
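The file-based methods and the file-loading constructor take a list of whitespace-separated tokens and, in the constructor's case, an explicit encoding. The following is a minimal sketch of how such a file could be read, assuming tokens may be split across lines and separated by any run of whitespace. `StoplistFileSketch` and `readStopWords` are hypothetical names used for illustration; this is not MALLET's own loading code.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of MALLET.
final class StoplistFileSketch {

    // Reads every whitespace-separated token in the file, decoding it with
    // the named charset (for example "UTF-8").
    static Set<String> readStopWords(Path wordlist, String encoding) throws IOException {
        Set<String> words = new HashSet<>();
        for (String line : Files.readAllLines(wordlist, Charset.forName(encoding))) {
            for (String word : line.trim().split("\\s+")) {
                // A blank line splits into one empty string, which is skipped.
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

Passing the encoding explicitly, rather than relying on the platform default, keeps non-ASCII stop words intact when the file is read on a different machine.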