cc.mallet.util
Class BulkLoader

java.lang.Object
  extended by cc.mallet.util.BulkLoader

public class BulkLoader
extends java.lang.Object

This class reads through a single file, breaking each line into data and (optional) name and label fields.
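
The line-splitting step described above might look roughly like the following. This is a minimal sketch, not BulkLoader's own code: the class name LineFieldsSketch, the method splitLine, and the regular expression (name first, then label, then data, separated by whitespace or commas) are assumptions made for illustration, and the real field layout may be configured differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of MALLET.
class LineFieldsSketch {
    // Assumed layout: "<name> <label> <data...>", separated by whitespace or commas.
    private static final Pattern LINE =
            Pattern.compile("^(\\S*)[\\s,]*(\\S*)[\\s,]*(.*)$");

    /** Returns {name, label, data}, or null if the line does not match the pattern. */
    static String[] splitLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] fields = splitLine("doc1 sports the team won the game");
        System.out.println("name=" + fields[0]
                + " label=" + fields[1]
                + " data=" + fields[2]);
    }
}
```

Since name and label are optional, a loader could use a simpler pattern with only a data group; the three-group form is shown here only because it covers all of the fields mentioned in the description.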


Constructor Summary
BulkLoader()
           
 
Method Summary
static void generateStoplist(SimpleTokenizer prunedTokenizer)
          Reads the data from inputFile, then writes every word that occurs fewer than pruneCount.value times to the pruned word file.
static void main(java.lang.String[] args)
           
static void writeInstanceList(SimpleTokenizer prunedTokenizer)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BulkLoader

public BulkLoader()

Method Detail

generateStoplist

public static void generateStoplist(SimpleTokenizer prunedTokenizer)
                             throws java.io.IOException
Reads the data from inputFile, then writes every word that occurs fewer than pruneCount.value times to the pruned word file.

Parameters:
prunedTokenizer - the tokenizer that will be used to write instances
Throws:
java.io.IOException
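
The pruning rule for generateStoplist can be illustrated with a small standalone sketch. It is not the MALLET implementation: the class PruneSketch, the method rareWords, and the letters-only lowercase tokenization are hypothetical, and it assumes the threshold applies to total occurrences across all lines (the actual method may count differently, and it writes its result to a file rather than returning a list).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of MALLET.
class PruneSketch {
    /** Returns, in sorted order, the words that occur fewer than pruneCount times. */
    static List<String> rareWords(List<String> lines, int pruneCount) {
        // Count total occurrences of each token (assumption: not document frequency).
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            // Assumed tokenization: lowercase, split on runs of non-letters.
            for (String token : line.toLowerCase().split("\\P{L}+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        // Keep only the words below the threshold; TreeSet gives a stable order.
        TreeSet<String> rare = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() < pruneCount) {
                rare.add(entry.getKey());
            }
        }
        return new ArrayList<>(rare);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat sat", "the dog sat", "the end");
        System.out.println(rareWords(lines, 2));
    }
}
```

With a threshold of 2, words seen only once land in the returned list, while words seen two or more times are kept out of it.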

writeInstanceList

public static void writeInstanceList(SimpleTokenizer prunedTokenizer)
                              throws java.io.IOException
Throws:
java.io.IOException

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Throws:
java.io.IOException