cc.mallet.share.upenn
Class MaxEntShell

java.lang.Object
  extended by cc.mallet.share.upenn.MaxEntShell

public class MaxEntShell
extends java.lang.Object

Simple wrapper for training, testing, and running a MALLET maxent classifier.

Version:
1.0
Author:
Fernando Pereira

Method Summary
static Classification[] classify(Classifier classifier, java.util.Iterator<Instance> data)
          Compute the maxent classifications for unlabeled instances given by an iterator.
static Classification classify(Classifier classifier, java.lang.String[] features)
          Compute the maxent classification of an instance.
static Classification[] classify(Classifier classifier, java.lang.String[][] features)
          Compute the maxent classifications of an array of instances.
static Classifier load(java.io.File modelFile)
          Load a classifier from a file.
static void main(java.lang.String[] args)
          Command-line wrapper to train, test, or run a maxent classifier.
static double test(Classifier classifier, java.util.Iterator<Instance> data)
          Test a maxent classifier.
static double test(Classifier classifier, java.lang.String[][] features, java.lang.String[] labels)
          Test a maxent classifier.
static Classifier train(java.util.Iterator<Instance> data, double var, java.io.File save)
          Train a maxent classifier.
static Classifier train(java.lang.String[][] features, java.lang.String[] labels, double var, java.io.File save)
          Train a maxent classifier.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

train

public static Classifier train(java.lang.String[][] features,
                               java.lang.String[] labels,
                               double var,
                               java.io.File save)
                        throws java.io.IOException
Train a maxent classifier. Each row of features represents the features of a training instance. The label for that instance is in the corresponding position of labels.

Parameters:
features - each row lists the features that are on (active) for one instance
labels - each position gives the label of the instance in the same row of features
var - Gaussian prior variance for training
save - if non-null, save the trained model to this file
Returns:
the maxent classifier
Throws:
java.io.IOException - if the trained model cannot be saved
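
The row-per-instance layout can be read as a sparse binary encoding: a feature is on when its name appears in the row. The sketch below illustrates that reading with plain Java collections. OnFeaturesSketch, buildVocabulary and toBinaryVector are hypothetical names, not MALLET classes or methods, and the treatment of a repeated feature name as still just "on" is an assumption that this page does not confirm.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of MALLET: views rows of "on" feature
// names as binary vectors over a shared vocabulary.
class OnFeaturesSketch {
    // Assign each distinct feature name an index in order of first appearance.
    static Map<String, Integer> buildVocabulary(String[][] features) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String[] row : features) {
            for (String name : row) {
                vocab.putIfAbsent(name, vocab.size());
            }
        }
        return vocab;
    }

    // Expand one row into a dense 0/1 vector: 1.0 where the feature is on.
    static double[] toBinaryVector(String[] row, Map<String, Integer> vocab) {
        double[] vector = new double[vocab.size()];
        for (String name : row) {
            Integer index = vocab.get(name);
            if (index != null) { // names missing from the vocabulary are skipped
                vector[index] = 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Parallel arrays: labels[i] is the label of features[i].
        String[][] features = {
            {"sunny", "warm"},
            {"rainy", "cold"},
            {"sunny", "cold"}
        };
        String[] labels = {"play", "stay", "play"};
        Map<String, Integer> vocab = buildVocabulary(features);
        for (int i = 0; i < features.length; i++) {
            System.out.println(labels[i] + " " + Arrays.toString(toBinaryVector(features[i], vocab)));
        }
    }
}
```

Only the two parallel arrays are passed to train; the vocabulary step is shown purely to make the meaning of "on" concrete.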

train

public static Classifier train(java.util.Iterator<Instance> data,
                               double var,
                               java.io.File save)
                        throws java.io.IOException
Train a maxent classifier. The iterator data returns training instances, each with a TokenSequence as its data and a target object. The tokens in each instance's data are converted to features.

Parameters:
data - the iterator over training instances
var - Gaussian prior variance for training
save - if non-null, save the trained model to this file
Returns:
the maxent classifier
Throws:
java.io.IOException - if the trained model cannot be saved
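
The effect of var can be made concrete under the usual reading of a Gaussian prior: a zero-mean prior with variance var = σ² on every feature weight λ_k. The objective below is the standard regularized maxent formulation, given here as an assumption; this page does not state the exact objective that the training code optimizes.

```latex
\hat{\lambda}
  = \arg\max_{\lambda}\;
    \sum_{i=1}^{N} \log p_{\lambda}(y_i \mid x_i)
    \;-\; \sum_{k} \frac{\lambda_k^{2}}{2\sigma^{2}},
\qquad
p_{\lambda}(y \mid x)
  = \frac{\exp\bigl(\sum_{k} \lambda_k f_k(x, y)\bigr)}
         {\sum_{y'} \exp\bigl(\sum_{k} \lambda_k f_k(x, y')\bigr)}
```

Under this reading, a smaller var penalizes large weights more strongly, while a larger var moves training toward the unregularized maximum-likelihood fit.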

test

public static double test(Classifier classifier,
                          java.lang.String[][] features,
                          java.lang.String[] labels)
Test a maxent classifier. The data representation is the same as for training.

Parameters:
classifier - the classifier to test
features - an array of instances represented as arrays of features
labels - corresponding labels
Returns:
accuracy on the data
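
The returned value can be interpreted with a small sketch, assuming that accuracy means the fraction of instances whose predicted label equals the given label. AccuracySketch is a hypothetical stand-in and does not call MALLET; returning 0 for an empty test set is likewise an assumption.

```java
// Hypothetical helper, not MALLET code: accuracy as the fraction of
// positions where the predicted label equals the gold label.
class AccuracySketch {
    static double accuracy(String[] predicted, String[] gold) {
        if (predicted.length != gold.length) {
            throw new IllegalArgumentException("predicted and gold must have the same length");
        }
        if (gold.length == 0) {
            return 0.0; // assumption: report 0 for an empty test set
        }
        int correct = 0;
        for (int i = 0; i < gold.length; i++) {
            if (predicted[i].equals(gold[i])) {
                correct++;
            }
        }
        return (double) correct / gold.length;
    }

    public static void main(String[] args) {
        String[] gold = {"play", "stay", "play", "stay"};
        String[] predicted = {"play", "play", "play", "stay"};
        System.out.println(accuracy(predicted, gold)); // 3 of 4 correct, prints 0.75
    }
}
```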

test

public static double test(Classifier classifier,
                          java.util.Iterator<Instance> data)
Test a maxent classifier. The data representation is the same as for training.

Parameters:
classifier - the classifier to test
data - an iterator over labeled instances
Returns:
accuracy on the data

classify

public static Classification classify(Classifier classifier,
                                      java.lang.String[] features)
Compute the maxent classification of an instance.

Parameters:
classifier - the classifier
features - the features that are on for this instance
Returns:
the classification

classify

public static Classification[] classify(Classifier classifier,
                                        java.lang.String[][] features)
Compute the maxent classifications of an array of instances.

Parameters:
classifier - the classifier
features - each row lists the features that are on for one instance
Returns:
the array of classifications for the given instances

classify

public static Classification[] classify(Classifier classifier,
                                        java.util.Iterator<Instance> data)
Compute the maxent classifications for unlabeled instances given by an iterator.

Parameters:
classifier - the classifier
data - the iterator over unlabeled instances
Returns:
the array of classifications for the given instances

load

public static Classifier load(java.io.File modelFile)
                       throws java.io.IOException,
                              java.lang.ClassNotFoundException
Load a classifier from a file.

Parameters:
modelFile - the file containing the serialized classifier
Returns:
the classifier serialized in the file
Throws:
java.io.IOException - if the file cannot be opened or read
java.lang.ClassNotFoundException - if the file cannot be deserialized because a required class cannot be found
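
The two declared exceptions match those of standard Java object serialization. The round trip below is a sketch under the assumption that the model file is an ordinary Java-serialized object; ToyModel and SerializationSketch are hypothetical stand-ins, and no MALLET Classifier is involved.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {
    // Stand-in for a trained model; not a MALLET class.
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String[] labels;
        final double[] weights;

        ToyModel(String[] labels, double[] weights) {
            this.labels = labels;
            this.weights = weights;
        }
    }

    // Write the object to the file; IOException covers open and write failures.
    static void save(Serializable model, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // Read the object back; ClassNotFoundException is raised when the stored
    // object's class is not available on the classpath.
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("toy-model", ".ser");
        file.deleteOnExit();
        save(new ToyModel(new String[]{"play", "stay"}, new double[]{0.5, -0.5}), file);
        ToyModel restored = (ToyModel) load(file);
        System.out.println(restored.labels.length + " labels restored");
    }
}
```

A practical consequence of this design is that a saved model can only be read back by a program that has the model's classes on its classpath.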

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Command-line wrapper to train, test, or run a maxent classifier. Instances are represented as follows:
Labeled:
label feature-1 ... feature-n
Unlabeled:
feature-1 ... feature-n

Parameters:
args - the command line arguments. Options (shell and Java quoting should be added as needed):
--help boolean
Print this command line option usage information. Give true for longer documentation. Default is false.
--prefix-code Java-code
Java code you want run before any other interpreted code. Note that the text is interpreted without modification, so unlike some other Java code options, you need to include any necessary 'new's. Default is null.
--gaussian-variance positive-number
The Gaussian prior variance used for training. Default is 1.0.
--train filename
Train on labeled instances stored in filename. Default is no training.
--test filename
Test on the labeled instances stored in filename. Default is no testing.
--classify filename
Classify the unlabeled instances stored in filename. Default is no classification.
--model filename
The filename for reading (test/classify) or saving (train) the model. Default is no model file.
Throws:
java.lang.Exception - if an error occurs
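
The labeled and unlabeled line formats described for main can be turned into the array arguments used by the other methods. The sketch below assumes that the label and features are separated by whitespace, which this page implies but does not state; InstanceLineSketch and its methods are hypothetical names, not part of MALLET.

```java
import java.util.Arrays;

// Hypothetical parser for the line formats shown above; not MALLET code.
class InstanceLineSketch {
    static final class Labeled {
        final String label;
        final String[] features;

        Labeled(String label, String[] features) {
            this.label = label;
            this.features = features;
        }
    }

    // "label feature-1 ... feature-n": first token is the label, the rest are features.
    static Labeled parseLabeled(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            throw new IllegalArgumentException("empty line has no label");
        }
        return new Labeled(tokens[0], Arrays.copyOfRange(tokens, 1, tokens.length));
    }

    // "feature-1 ... feature-n": every token is a feature.
    static String[] parseUnlabeled(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String[] lines = {"play sunny warm", "stay rainy cold"};
        // Build the parallel arrays expected by the array-based methods.
        String[] labels = new String[lines.length];
        String[][] features = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            Labeled instance = parseLabeled(lines[i]);
            labels[i] = instance.label;
            features[i] = instance.features;
        }
        System.out.println(Arrays.toString(labels) + " " + Arrays.deepToString(features));
    }
}
```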