MALLET 2.0.5 Release Notes
November 6, 2009

Major updates:

  Better Windows support. In addition to the Linux/Mac "bin/mallet" script, there is now a functionally identical "bin/mallet.bat" script. Windows support is still limited, but with this batch file you no longer need to install Cygwin to run MALLET from the command line.

  Configuration files. All "bin/mallet" commands now take an optional configuration file, which allows you to specify command line parameters. For example, you could replace this command

    > bin/mallet import-file --input input.txt --output output.mallet --remove-stopwords

  with this command

    > bin/mallet import-file --config import.config

  where import.config contains

    input = input.txt
    output = output.mallet
    remove-stopwords = true

  Note that configuration files do not support more than one instance of the same parameter (for example, specifying multiple classifier trainers). To do this you will need to use true command line parameters.

  A new class, cc.mallet.util.MVNormal, implements several utilities for working with multivariate normal distributions and symmetric positive definite matrices, represented as one-dimensional arrays.

  Several topic model package enhancements:

    Support for aligned corpora in multiple languages (the Polylingual Topic Model). Use the option "--language-inputs en.sequences de.sequences fr.sequences ..." to enable this mode. All languages must be imported in separate serialized instance lists, with empty instances inserted so that each list is the same length and aligned instances are at the same position in each list.

    An initial version of topic held-out likelihood evaluation has been added. Use the option "--evaluator-filename" when training topics and then the "bin/mallet evaluate-topics" command to estimate the probability of new documents.

    In addition to optimizing the document-topic hyperparameters, you can also optimize a topic-word hyperparameter. This is triggered automatically by "--optimize-interval".

    Bug fixes to topic training and topic inference.
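
As an illustration of the configuration file format described above, the following Java sketch turns "key = value" lines into the equivalent command line arguments. It is not MALLET's own parsing code: the class ConfigToArgs and the method toArgs are hypothetical names, and reading the text with java.util.Properties is an assumption. Under that assumption a repeated key keeps only its last value, which mirrors the one-instance-per-parameter restriction noted above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of MALLET.
final class ConfigToArgs {

    // Converts "key = value" lines into ["--key", "value", ...].
    static List<String> toArgs(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> result = new ArrayList<>();
        // Properties does not preserve file order, so keys are sorted
        // to make the output deterministic.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            result.add("--" + key);
            result.add(props.getProperty(key));
        }
        return result;
    }

    public static void main(String[] argv) {
        String config = "input = input.txt\n"
                + "output = output.mallet\n"
                + "remove-stopwords = true\n";
        // Prints: --input input.txt --output output.mallet --remove-stopwords true
        System.out.println(String.join(" ", toArgs(config)));
    }
}
```

Note that the sketch emits an explicit "true" for boolean options, matching the "remove-stopwords = true" line in the configuration file.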
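
To make the one-dimensional array representation mentioned for cc.mallet.util.MVNormal more concrete, here is a minimal sketch of a Cholesky factorization on a flat array. The row-major layout (entry (i, j) stored at index i * n + j), the class name FlatCholesky, and the method signature are assumptions made for illustration; they are not taken from MVNormal itself.

```java
// Hypothetical example, not the MVNormal implementation.
final class FlatCholesky {

    // Returns the lower-triangular factor L (row-major, n * n entries)
    // such that L * L^T equals the symmetric positive definite matrix a.
    static double[] cholesky(double[] a, int n) {
        if (a.length != n * n) {
            throw new IllegalArgumentException("expected " + (n * n) + " entries");
        }
        double[] l = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = a[i * n + j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i * n + k] * l[j * n + k];
                }
                if (i == j) {
                    // A non-positive pivot means the input is not positive definite.
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("matrix is not positive definite");
                    }
                    l[i * n + i] = Math.sqrt(sum);
                } else {
                    l[i * n + j] = sum / l[j * n + j];
                }
            }
        }
        return l;
    }

    public static void main(String[] argv) {
        // The 2 x 2 matrix [[4, 2], [2, 5]] stored as a flat array.
        double[] factor = cholesky(new double[] {4, 2, 2, 5}, 2);
        System.out.println(java.util.Arrays.toString(factor));
    }
}
```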
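
The alignment requirement for polylingual input can be sketched as follows. The example assumes that each language's documents are keyed by a shared document id, which is a hypothetical convention. It uses plain strings in place of MALLET instances and shows only the padding step, not how the lists are imported or serialized.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of MALLET.
final class AlignCorpora {

    // Takes one map per language (document id -> text) and returns one list
    // per language. All lists have the same length, and position p in every
    // list refers to the same document id. Missing documents become "".
    static List<List<String>> align(List<Map<String, String>> corpora) {
        TreeSet<String> ids = new TreeSet<>();
        for (Map<String, String> corpus : corpora) {
            ids.addAll(corpus.keySet());
        }
        List<List<String>> aligned = new ArrayList<>();
        for (Map<String, String> corpus : corpora) {
            List<String> column = new ArrayList<>();
            for (String id : ids) {
                column.add(corpus.getOrDefault(id, ""));
            }
            aligned.add(column);
        }
        return aligned;
    }

    public static void main(String[] argv) {
        Map<String, String> en = new HashMap<>();
        en.put("doc1", "hello");
        en.put("doc2", "world");
        Map<String, String> de = new HashMap<>();
        de.put("doc2", "welt");

        List<Map<String, String>> corpora = new ArrayList<>();
        corpora.add(en);
        corpora.add(de);
        // The German list gets an empty placeholder at the position of doc1.
        System.out.println(align(corpora));
    }
}
```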