Mallet
Advanced Machine Learning for Language

TestMaximizable

Background

We have a very convenient framework for unconstrained maximization in MALLET. It handles problems of the form maximize f(x) with respect to a real vector x. To use it, you just need to implement one of the interfaces contained in the class Maximizable. Most commonly, you'll want to implement Maximizable.ByGradient, which requires you to write two functions: one that computes the value of your function and one that computes its gradient.

The problem is, it's easy to mess up those two functions. What you'd like to do as a test is to calculate the finite-difference gradient from the value function, then calculate the analytic gradient from your gradient function, and make sure they match. This would be a very good test, don't you think? That's why we have this implemented in TestMaximizable.testValueAndGradient().
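To make the idea concrete, here is a minimal, self-contained sketch of such a check. It does not use MALLET at all: the class name, the toy function, and the central-difference scheme are assumptions chosen for illustration, and TestMaximizable's own procedure may differ in its details.

```java
// Illustrative sketch only; none of these names come from MALLET.
final class GradientCheckSketch {

    // Toy objective: f(x) = -(x0 - 1)^2 - 2 * (x1 + 3)^2
    static double value(double[] x) {
        return -(x[0] - 1.0) * (x[0] - 1.0) - 2.0 * (x[1] + 3.0) * (x[1] + 3.0);
    }

    // Hand-derived gradient of the toy objective.
    static double[] analyticGradient(double[] x) {
        return new double[] { -2.0 * (x[0] - 1.0), -4.0 * (x[1] + 3.0) };
    }

    // Central-difference estimate of the gradient, built only from value().
    static double[] finiteDifferenceGradient(double[] x, double eps) {
        double[] grad = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double saved = x[i];
            x[i] = saved + eps;
            double plus = value(x);
            x[i] = saved - eps;
            double minus = value(x);
            x[i] = saved;                       // restore the coordinate
            grad[i] = (plus - minus) / (2.0 * eps);
        }
        return grad;
    }

    // Largest coordinate-wise disagreement between two gradients.
    static double maxAbsDifference(double[] a, double[] b) {
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        double[] x = { 0.5, -1.0 };
        double diff = maxAbsDifference(analyticGradient(x),
                                       finiteDifferenceGradient(x, 1e-5));
        System.out.println("max |analytic - finite difference| = " + diff);
    }
}
```

If the two gradients disagree by much more than rounding error, either the value function or the gradient function has a bug.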

Whenever you write an instance of Maximizable.ByGradient, the first thing you should do is to write a unit test that calls TestMaximizable.

Basic Usage

Here's a skeleton of Java code you might write to use TestMaximizable.

  Maximizable.ByGradient maxable = createMyMaximizable(); // Create an instance of your Maximizable
  TestMaximizable.testValueAndGradient (maxable);

It's most convenient to make this a unit test.

Troubleshooting

If TestMaximizable fails, here's a rather vague step-by-step guide for troubleshooting.

  1. Try to reproduce the failure on the smallest data set you can. Ideally, the data set should be small enough that you can compute the true value and gradient by hand (or in an Emacs scratch buffer).
  2. If you have a default feature (i.e., always 1, corresponding to a bias weight), check that. I always screw up the default feature.
  3. Check that the signs are correct on the terms in your value and gradient. Also, if the angle between the empirical and analytic gradient is about 3.14 radians (that is, pi), the two gradients point in opposite directions: you left out a minus sign in the analytic gradient.
  4. A NaN appearing in your code is usually caused by computing infinity - infinity.
  5. Once you know that your code is broken, debug one coordinate at a time. See the lines in testValueAndGradientCurrentParameters that say:
     for (int i = 0; i < parameters.length; i++) {
 //      { int i = 0;   // Uncomment this line to debug one parameter at a time -cas
        if ((parameters.length >= sampleParameterInterval) &&
              (i % sampleParameterInterval != 0))
           continue;

Recomment them to say:

 //      for (int i = 0; i < parameters.length; i++) {
     { int i = 1729;   // Uncomment this line to debug one parameter at a time -cas
 //         if ((parameters.length >= sampleParameterInterval) &&
 //               (i % sampleParameterInterval != 0))
 //            continue;

where 1729 is the coordinate of your Maximizable that has the largest slope difference. After you make this change, TestMaximizable will check only coordinate 1729, so that you can add copious debugging print statements with less chance of scroll blindness.
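The angle check mentioned in step 3 can be reproduced by hand. The following is a minimal sketch, assuming both gradients are available as plain double arrays; the class and method names are hypothetical and are not part of MALLET.

```java
// Illustrative sketch only; none of these names come from MALLET.
final class GradientAngleSketch {

    // Angle in radians between two nonzero vectors of equal length.
    static double angle(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double cos = dot / (Math.sqrt(normA) * Math.sqrt(normB));
        // Clamp to [-1, 1] so rounding error cannot push acos into NaN.
        cos = Math.max(-1.0, Math.min(1.0, cos));
        return Math.acos(cos);
    }

    public static void main(String[] args) {
        double[] empirical = { 0.3, -1.2, 2.0 };
        // Simulate an analytic gradient with a missing minus sign.
        double[] flipped = { -0.3, 1.2, -2.0 };
        System.out.println("angle, matching gradients: " + angle(empirical, empirical));
        System.out.println("angle, sign-flipped:       " + angle(empirical, flipped));
    }
}
```

An angle near 0 means the two gradients agree in direction, and an angle near pi means one is the negation of the other.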

Other tips

Here are some tips that are less troubleshooting-focused:

  1. There's also a function TestMaximizable.testGetSetParameters that makes sure your getParameters and setParameters functions are consistent. Be aware that running this test will leave the parameters of the maximizable object in a strange state.
  2. Sometimes TestMaximizable takes a really long time to run. In that case, you might want to tell it "Just check 500 coordinates". To do this, call TestMaximizable.setNumComponents (500); before you call testValueAndGradient.
  3. The function testValueAndGradient checks three different parameter settings: all zero, zero + one gradient step, and random. If there's a specific parameter setting you'd like to test (maybe one that caused an error during training), you can directly call testValueAndGradientCurrentParameters.
  4. It is possible for TestMaximizable to fail spuriously. Most commonly, this happens because the step size is the same in every direction, based on the two-norm of the gradient. If your function has large curvature, or if your coordinates have very different scales, then TestMaximizable is likely to fail spuriously. Feel free to modify the test to make it more accurate.
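To see why a single step size can cause the spurious failures described in tip 4, consider the following sketch. It is not how TestMaximizable computes its finite differences: the forward-difference scheme, the toy function, and all names are assumptions chosen to show the effect in isolation.

```java
// Illustrative sketch only; none of these names come from MALLET.
final class StepSizeSketch {

    // Toy objective whose two coordinates have very different scales:
    // f(x) = -exp(x0) - exp(1000 * x1)
    static double value(double[] x) {
        return -Math.exp(x[0]) - Math.exp(1000.0 * x[1]);
    }

    // Hand-derived partial derivative with respect to coordinate i (0 or 1).
    static double analyticPartial(double[] x, int i) {
        return i == 0 ? -Math.exp(x[0]) : -1000.0 * Math.exp(1000.0 * x[1]);
    }

    // Forward-difference estimate of the same partial derivative.
    static double forwardDifference(double[] x, int i, double eps) {
        double[] shifted = x.clone();
        shifted[i] += eps;
        return (value(shifted) - value(x)) / eps;
    }

    // Relative disagreement between the estimate and the exact derivative.
    static double relativeError(double[] x, int i, double eps) {
        double exact = analyticPartial(x, i);
        return Math.abs(forwardDifference(x, i, eps) - exact) / Math.abs(exact);
    }

    public static void main(String[] args) {
        double[] x = { 0.0, 0.0 };
        System.out.println("coordinate 0, step 1e-3: " + relativeError(x, 0, 1e-3));
        System.out.println("coordinate 1, step 1e-3: " + relativeError(x, 1, 1e-3));
        System.out.println("coordinate 1, step 1e-6: " + relativeError(x, 1, 1e-6));
    }
}
```

With the shared step of 1e-3, the estimate for coordinate 0 is accurate while the estimate for coordinate 1 is far off, even though the analytic derivative is correct for both. Shrinking the step for coordinate 1 alone brings the two back into agreement, which suggests a per-coordinate step as one possible way to make the test more accurate.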

Retrieved from "http://mallet.cs.umass.edu/index.php/TestMaximizable"

This page was last modified 17:05, 25 Feb 2005.

